Intermediate
Fine-tuning Models
Train pre-trained models on your own data using the Trainer API. Learn dataset preparation, training configuration, evaluation, and publishing to the Hub.
Why Fine-tune?
Pre-trained models learn from broad, general-purpose data. Fine-tuning adapts them to your specific task: your domain, your labels, your data. A model fine-tuned on as few as 1,000 task-specific examples often outperforms a general model trained on millions.
Preparing Your Dataset
Python
from datasets import load_dataset
from transformers import AutoTokenizer

# Load a dataset from the Hub
dataset = load_dataset("imdb")
print(dataset)
# DatasetDict({
#     train: Dataset({features: ['text', 'label'], num_rows: 25000})
#     test: Dataset({features: ['text', 'label'], num_rows: 25000})
# })

# Tokenize the dataset
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)

tokenized_dataset = dataset.map(tokenize_function, batched=True)
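To see what `padding="max_length"` and `truncation=True` actually produce, here is a toy sketch in plain Python. This is not the real tokenizer: the token IDs and the `max_length` of 8 are made up for illustration, and a real tokenizer also adds special tokens ([CLS], [SEP]) and uses a learned subword vocabulary.

```python
PAD_ID = 0
MAX_LENGTH = 8  # hypothetical; distilbert-base-uncased allows up to 512

def pad_and_truncate(token_ids, max_length=MAX_LENGTH, pad_id=PAD_ID):
    token_ids = token_ids[:max_length]       # truncation=True: cut long inputs
    attention_mask = [1] * len(token_ids)    # 1 = real token
    padding = max_length - len(token_ids)
    return {
        "input_ids": token_ids + [pad_id] * padding,          # padding="max_length"
        "attention_mask": attention_mask + [0] * padding,     # 0 = padding
    }

short = pad_and_truncate([101, 2023, 3185, 102])
long = pad_and_truncate(list(range(1, 13)))
print(short["input_ids"])       # [101, 2023, 3185, 102, 0, 0, 0, 0]
print(short["attention_mask"])  # [1, 1, 1, 1, 0, 0, 0, 0]
print(long["input_ids"])        # [1, 2, 3, 4, 5, 6, 7, 8]
```

Every example ends up the same length, which is what lets the Trainer stack them into fixed-shape batches. The attention mask tells the model which positions to ignore.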
The Trainer API
Python
from transformers import (
    AutoModelForSequenceClassification,
    TrainingArguments,
    Trainer,
)
import numpy as np
import evaluate

# Load pre-trained model with a fresh classification head
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

# Define training arguments
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    eval_strategy="epoch",  # called evaluation_strategy in older transformers versions
    save_strategy="epoch",
    load_best_model_at_end=True,
)

# Define evaluation metric
# (datasets.load_metric has been removed; use the evaluate library instead)
metric = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)

# Create Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    compute_metrics=compute_metrics,
)

# Train!
trainer.train()
Evaluation
Python
# Evaluate the fine-tuned model
results = trainer.evaluate()
print(f"Accuracy: {results['eval_accuracy']:.4f}")
print(f"Loss: {results['eval_loss']:.4f}")
Pushing to the Hub
Python
# Log in to Hugging Face
from huggingface_hub import login
login(token="your_token_here")

# Push model and tokenizer to the Hub.
# Trainer takes the repo name from TrainingArguments (hub_model_id,
# or output_dir by default); its first argument is the commit message.
trainer.push_to_hub("End of training")

# Or push individually to a named repo
model.push_to_hub("my-fine-tuned-sentiment-model")
tokenizer.push_to_hub("my-fine-tuned-sentiment-model")
Tip: Always create a model card when pushing to the Hub. Include training details, intended use, limitations, and evaluation results so others can understand and use your model responsibly.
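As a starting point, a minimal model card (the repository's README.md) might look like the sketch below. The license, tags, and wording are placeholders to adapt, not values produced by this lesson's training run:

```markdown
---
license: apache-2.0
base_model: distilbert-base-uncased
datasets:
- imdb
metrics:
- accuracy
---

# my-fine-tuned-sentiment-model

DistilBERT fine-tuned for binary sentiment classification on IMDB.

## Training details
3 epochs, batch size 16, weight decay 0.01, 500 warmup steps.

## Intended use and limitations
English movie reviews; not validated for other domains or languages.

## Evaluation results
Fill in the eval_accuracy and eval_loss from your own run.
```

The YAML block at the top is Hub metadata: it powers search filters and the model page widgets, so it is worth filling in even for a quick experiment.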
GPU Required: Fine-tuning typically requires a GPU. Use Google Colab (free tier includes a T4 GPU), or cloud services like AWS, GCP, or Lambda Labs for larger training runs.
What's Next?
Once your model is trained, you need to serve it efficiently. The next lesson covers optimized inference with quantization, ONNX export, and Text Generation Inference.
Lilly Tech Systems