Prompt -> LLMs -> Completion
Prompt space: the context window (typically on the order of 1,000 words).
Act of using the model to generate text: inference.
Completion: the model's output, consisting of the original prompt plus the generated answer.
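A minimal sketch of this prompt -> inference -> completion loop, assuming the Hugging Face `transformers` library; the model name and the prompt text are illustrative choices only:

```python
# Minimal inference sketch (assumes `pip install transformers torch`).
from transformers import pipeline

# "gpt2" is only an illustrative, small model choice.
generator = pipeline("text-generation", model="gpt2")

prompt = "Where is Ganymede located in the solar system?"

# Inference: using the model to generate text from the prompt.
outputs = generator(prompt, max_new_tokens=40, do_sample=False)

# The completion includes the original prompt followed by the generated text.
print(outputs[0]["generated_text"])
```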
Before Transformer: RNNs
Transformer:
Positional encoding: preserves word-order information (each token's position in the sentence). Self-attention: learns the relevance of each word to every other word in the input.
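A toy NumPy sketch of these two ingredients: sinusoidal positional encoding to inject word order, and a projection-free scaled dot-product self-attention to weigh each token's relevance to every other token. The shapes, the random embeddings, and the identity query/key/value projections are simplifying assumptions for illustration, not a full transformer layer:

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding: gives each position a unique
    pattern of sines and cosines, preserving word-order information."""
    positions = np.arange(seq_len)[:, None]             # (seq_len, 1)
    dims = np.arange(d_model)[None, :]                  # (1, d_model)
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates                    # (seq_len, d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])
    pe[:, 1::2] = np.cos(angles[:, 1::2])
    return pe

def self_attention(x):
    """Scaled dot-product self-attention: every token attends to every
    other token; here Q, K, V are just the inputs for brevity."""
    q, k, v = x, x, x
    d_k = x.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                     # pairwise relevance
    scores -= scores.max(axis=-1, keepdims=True)        # softmax stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                                  # weighted mix of values

seq_len, d_model = 6, 16
embeddings = np.random.randn(seq_len, d_model)          # stand-in token embeddings
x = embeddings + positional_encoding(seq_len, d_model)  # add position info
print(self_attention(x).shape)                          # (6, 16)
```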
Revising the prompt language so the model behaves the way we want -> prompt engineering.
In-context learning: including examples (additional data) in the prompt.
No examples: zero-shot; a single example: one-shot; multiple examples: few-shot.
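A small sketch of how such prompts can be assembled; the sentiment-classification task, the review strings, and the `build_prompt` helper are all hypothetical, used only to show where the in-context examples sit:

```python
# In-context learning: the examples inside the prompt are extra data the
# model sees at inference time, not additional training.

def build_prompt(task, examples, new_input):
    """Zero-shot if `examples` is empty, one-shot with one example,
    few-shot with several."""
    parts = [task]
    for example_input, example_output in examples:
        parts.append(f"Review: {example_input}\nSentiment: {example_output}")
    parts.append(f"Review: {new_input}\nSentiment:")
    return "\n\n".join(parts)

one_shot = build_prompt(
    task="Classify the sentiment of each review.",
    examples=[("I loved this movie, the acting was great.", "Positive")],
    new_input="The plot was dull and far too long.",
)
print(one_shot)  # this string is what gets sent to the LLM
```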
Generative AI project lifecycle:
- Scope: define the use case.
- Select: choose an existing model or pretrain your own.
- Adapt and align:
  - Prompt engineering
  - Fine-tuning
  - Align with human feedback
  - Evaluate
- Application integration:
  - Optimize and deploy the model for inference
  - Augment the model and build LLM-powered applications
Question | Answer |
---|---|
1. Interacting with Large Language Models (LLMs) differs from traditional machine learning models. Working with LLMs involves natural language input, known as a _____, resulting in output from the Large Language Model, known as the ______. | (prompt, completion) |
2. Large Language Models (LLMs) are capable of performing multiple tasks supporting a variety of use cases. Which of the following tasks supports the use case of converting code comments into executable code? | (translation) |
3. What is the self-attention mechanism that powers the transformer architecture? | (A mechanism that allows a model to focus on different parts of the input sequence during computation) |
4. Which of the following stages are part of the generative AI model lifecycle mentioned in the course? (Select all that apply) | (defining, selecting, manipulating, deploying) |
5. “RNNs are better than Transformers for generative AI tasks.” | (False) |
6. Which transformer-based model architecture has the objective of guessing a masked token based on the previous sequence of tokens by building bidirectional representations of the input sequence? | (Autoencoder) |
7. Which transformer-based model architecture is well-suited to the task of text translation? | (Seq2Seq) |
8. Do we always need to increase the model size to improve its performance? | (False) |
9. Scaling laws for pre-training large language models consider several aspects to maximize performance of a model within a set of constraints and available scaling choices. Select all the choices that should be considered for scaling when performing model pre-training. | (compute budget, dataset size & model size; a rough compute sketch follows this table) |
10. “You can combine data parallelism with model parallelism to train LLMs.” | (True) |
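Following up on questions 8 and 9: scaling laws weigh the compute budget against model size and dataset size rather than growing the model alone. The sketch below uses the commonly quoted approximation FLOPs ≈ 6 × parameters × training tokens and the Chinchilla-style heuristic of roughly 20 training tokens per parameter; both are approximations from the scaling-laws literature, and the 70B-parameter figure is only an example:

```python
# Rough, illustrative compute-budget arithmetic (approximations, not exact formulas).

def training_flops(n_params, n_tokens):
    """Approximate total training compute: ~6 FLOPs per parameter per token."""
    return 6 * n_params * n_tokens

n_params = 70e9                     # e.g. a 70B-parameter model (illustrative)
n_tokens = 20 * n_params            # Chinchilla-style "compute-optimal" data size
flops = training_flops(n_params, n_tokens)

# Express the budget in petaFLOP/s-days (1 petaFLOP/s sustained for one day).
petaflop_s_day = 1e15 * 60 * 60 * 24
print(f"training tokens needed: {n_tokens:.2e}")
print(f"compute budget: {flops / petaflop_s_day:.0f} petaFLOP/s-days")
```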