Tuning & Optimization
Executive Summary
Technique | Goal | Cost | Key Method
1. Fine-Tuning Strategies
SFT (Supervised Fine-Tuning)
PEFT: LoRA (Low-Rank Adaptation)
2. Alignment: RLHF & DPO
RLHF (Reinforcement Learning from Human Feedback)
DPO (Direct Preference Optimization)
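As a rough illustration of the DPO objective for a single preference pair, the sketch below computes the loss from summed sequence log-probabilities. The function name `dpo_loss`, the argument names, and the default `beta=0.1` are assumptions made for this example, not values taken from this document.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of responses.

    Each argument is the summed log-probability of a full response under
    either the trainable policy or the frozen reference model.
    beta=0.1 is an assumed, illustrative default.
    """
    # How much more the policy favors the chosen response over the rejected
    # one, measured relative to the reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # loss = -log(sigmoid(z)) = log(1 + exp(-z)), in a numerically stable form
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))
```

When the policy and the reference assign identical log-probabilities, the margin is zero and the loss is log 2; the loss falls as the policy shifts probability toward the chosen response. No separate reward model or RL loop appears in this computation.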
3. Optimization: Quantization
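A minimal sketch of one common quantization scheme, symmetric absmax int8, is given here to make the idea concrete. It is a pure-Python toy operating on a list of floats; the function names are hypothetical, and production libraries typically quantize per channel or per block rather than per tensor.

```python
def quantize_int8(values):
    """Symmetric absmax quantization of a list of floats to int8 codes."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole list; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map int8 codes back to approximate float values."""
    return [c * scale for c in codes]


weights = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the precision traded for storing one byte per weight instead of two or four.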
Interview Questions
Code: LoRA Config (PEFT)
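A possible LoRA configuration using the Hugging Face `peft` library is sketched below. All hyperparameter values are illustrative assumptions, and the `target_modules` names (`q_proj`, `v_proj`) assume a LLaMA-style attention layout; other architectures name these layers differently. The helper `lora_trainable_params` is a hypothetical function added only to show why LoRA is cheap.

```python
# Illustrative values only; tune rank, alpha, and dropout per model and task.
LORA_KWARGS = {
    "r": 8,                                  # rank of the low-rank update
    "lora_alpha": 16,                        # scaling factor (update is scaled by alpha / r)
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "v_proj"],  # assumed module names (LLaMA-style)
    "bias": "none",
    "task_type": "CAUSAL_LM",
}


def build_lora_config():
    """Build a peft LoraConfig from the settings above."""
    # Imported lazily so the rest of this sketch loads without peft installed.
    from peft import LoraConfig
    return LoraConfig(**LORA_KWARGS)


def lora_trainable_params(d_in, d_out, r):
    """Parameters added by one adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)
```

With a base model already loaded, the config would typically be applied via `peft.get_peft_model(base_model, build_lora_config())`. For a 4096 x 4096 projection at rank 8, `lora_trainable_params(4096, 4096, 8)` gives 65,536 adapter parameters against 16,777,216 in the frozen weight matrix.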