Coursera

Week 1 note

Overview

Prompt -> LLMs -> Completion

Prompt space: the context window (typically a few thousand words; varies by model).
Act of using the model to generate text: inference.
Completion: the model's output, i.e. the original prompt followed by the generated text.
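
A minimal sketch of the prompt -> model -> completion loop, here using the Hugging Face transformers pipeline (the model name gpt2 is only an illustrative placeholder, not the course's model):

    from transformers import pipeline

    # Load a small causal language model for text generation (placeholder model)
    generator = pipeline("text-generation", model="gpt2")

    prompt = "Where is Ganymede located in the solar system?"

    # Inference: the model generates a completion from the prompt
    result = generator(prompt, max_new_tokens=50)

    # The completion contains the original prompt followed by the generated text
    print(result[0]["generated_text"])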

Use cases and tasks

Generating text

Prompting and prompt engineering

Revising the language of the prompt so the model behaves the way we want -> prompt engineering

In-context learning: including examples (additional data) in the prompt

A single example: one-shot; multiple examples: few-shot (no example: zero-shot).
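
A small sketch of how a zero-, one-, or few-shot prompt can be assembled by prepending solved examples to the new task input (the review examples below are made up for illustration):

    def build_prompt(task_input, examples, shots):
        """Assemble a prompt with `shots` solved examples followed by the new input."""
        parts = [f"Review: {review}\nSentiment: {sentiment}\n"
                 for review, sentiment in examples[:shots]]
        parts.append(f"Review: {task_input}\nSentiment:")
        return "\n".join(parts)

    # Solved examples used for in-context learning
    examples = [
        ("I loved this movie!", "Positive"),
        ("The plot was dull and the acting worse.", "Negative"),
    ]

    # shots=0 -> zero-shot, shots=1 -> one-shot, shots>1 -> few-shot
    print(build_prompt("A wonderful, heartfelt story.", examples, shots=2))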

Generative configurations
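
A sketch of the common inference-time configuration parameters (max new tokens, greedy vs. sampling, temperature, top-k, top-p), shown here with the Hugging Face generate API; the model and the specific values are illustrative only:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Placeholder model; any causal LM exposes the same generation parameters
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tokenizer("The weather today is", return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=50,  # cap on how many new tokens are generated
        do_sample=True,     # sample from the distribution instead of greedy decoding
        temperature=0.7,    # <1 sharpens the distribution, >1 flattens it
        top_k=50,           # keep only the 50 most likely tokens at each step
        top_p=0.9,          # nucleus sampling: keep the smallest set covering 90% probability
    )
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))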

Generative AI project lifecycle

Scope: define the use case.
Select: choose an existing model or pretrain your own.
Adapt and align:
- Prompt engineering
- Fine-tuning
- Align with human feedback
- Evaluate
Application integration:
- Optimize and deploy the model for inference
- Augment the model and build LLM-powered applications

Select

Pretraining for domain adaptation

Quiz

Questions (answers in parentheses)
1. Interacting with Large Language Models (LLMs) differs from traditional machine learning models. Working with LLMs involves natural language input, known as a _____, resulting in output from the Large Language Model, known as the ______. (prompt, completion)
2. Large Language Models (LLMs) are capable of performing multiple tasks supporting a variety of use cases. Which of the following tasks supports the use case of converting code comments into executable code? (translation)
3. What is the self-attention that powers the transformer architecture? (A mechanism that allows a model to focus on different parts of the input sequence during computation; see the sketch after this quiz)
4. Which of the following stages are part of the generative AI model lifecycle mentioned in the course? (Select all that apply) (define, select, manipulate, deploy)
5. “RNNs are better than Transformers for generative AI tasks.” (False)
6. Which transformer-based model architecture has the objective of guessing a masked token based on the previous sequence of tokens by building bidirectional representations of the input sequence? (Autoencoder)
7. Which transformer-based model architecture is well-suited to the task of text translation? (Seq2Seq)
8. Do we always need to increase the model size to improve its performance? (False)
9. Scaling laws for pre-training large language models consider several aspects to maximize performance of a model within a set of constraints and available scaling choices. Select all alternatives that should be considered for scaling when performing model pre-training. (compute budget, dataset & model size)
10. “You can combine data parallelism with model parallelism to train LLMs.” (True)
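
For question 3, a minimal NumPy sketch of scaled dot-product self-attention, the core computation behind the mechanism the answer describes; the shapes and values are made up for illustration:

    import numpy as np

    def self_attention(x, wq, wk, wv):
        """Scaled dot-product self-attention over a sequence x of shape (seq_len, d_model)."""
        q, k, v = x @ wq, x @ wk, x @ wv          # project inputs to queries, keys, values
        scores = q @ k.T / np.sqrt(k.shape[-1])   # similarity of every position with every other
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax: attention weights per position
        return weights @ v                        # weighted sum of values

    rng = np.random.default_rng(0)
    seq_len, d_model = 4, 8
    x = rng.normal(size=(seq_len, d_model))
    wq, wk, wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
    print(self_attention(x, wq, wk, wv).shape)  # (4, 8)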