Intermediate
Models & Tokenizers
Go beyond pipelines. Learn to work directly with AutoModel, AutoTokenizer, and understand the architectures behind transformer models.
AutoModel & AutoTokenizer
The Auto classes automatically detect and load the correct model architecture based on the checkpoint's configuration:
Python
from transformers import AutoTokenizer, AutoModel

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# Tokenize input text
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
print(inputs.keys())
# dict_keys(['input_ids', 'token_type_ids', 'attention_mask'])

# Run through model
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # torch.Size([1, 8, 768])
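The last_hidden_state above holds one 768-dimensional vector per token. A common way to collapse these into a single sentence vector is masked mean pooling, where the attention mask zeroes out padding positions before averaging. A minimal sketch, assuming the same bert-base-uncased checkpoint:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Zero out padding positions, then average the remaining token vectors
mask = inputs["attention_mask"].unsqueeze(-1)           # [1, seq_len, 1]
summed = (outputs.last_hidden_state * mask).sum(dim=1)  # [1, 768]
sentence_embedding = summed / mask.sum(dim=1)
print(sentence_embedding.shape)  # torch.Size([1, 768])
```

With a single unpadded sentence this is just a plain mean over tokens, but the masked form also works unchanged on padded batches.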
Task-Specific Auto Classes
For specific tasks, use the appropriate Auto class that adds the correct head on top of the base model:
Python
from transformers import (
    AutoModelForSequenceClassification,
    AutoModelForTokenClassification,
    AutoModelForQuestionAnswering,
    AutoModelForCausalLM,
    AutoModelForSeq2SeqLM,
    AutoModelForImageClassification,
)

# Classification model (e.g., sentiment analysis)
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english"
)

# Text generation model
model = AutoModelForCausalLM.from_pretrained("gpt2")
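The causal LM head loaded above is what makes text generation possible. A minimal greedy-decoding sketch with the same gpt2 checkpoint (pad_token_id is set to the EOS token because GPT-2 has no dedicated padding token):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Transformers are", return_tensors="pt")

# Greedy decoding: append the 10 most likely next tokens, one at a time
output_ids = model.generate(
    **inputs,
    max_new_tokens=10,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)
generated = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(generated)
```

The output always begins with the prompt, since generate returns the prompt tokens followed by the newly generated ones.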
Understanding Tokenizers
Tokenizers convert raw text into numerical tokens that models can process. Different models use different tokenization strategies:
Python
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Basic tokenization
tokens = tokenizer.tokenize("Transformers are amazing!")
print(tokens)  # ['transformers', 'are', 'amazing', '!']

# Full encoding with special tokens
encoded = tokenizer("Transformers are amazing!", return_tensors="pt")
print(encoded['input_ids'])
# tensor([[  101, 19081,  2024,  6429,   999,   102]])

# Decode back to text
decoded = tokenizer.decode(encoded['input_ids'][0])
print(decoded)  # '[CLS] transformers are amazing! [SEP]'
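To see that different models really do tokenize differently, compare BERT's WordPiece tokenizer with GPT-2's byte-level BPE on the same sentence. WordPiece marks word-internal subwords with a '##' prefix, while GPT-2's BPE encodes a leading space as the 'Ġ' character:

```python
from transformers import AutoTokenizer

bert_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
gpt2_tok = AutoTokenizer.from_pretrained("gpt2")

text = "Tokenization splits words into subwords."

# WordPiece: continuation pieces are prefixed with '##'
bert_tokens = bert_tok.tokenize(text)
print(bert_tokens)

# Byte-level BPE: a leading space is encoded as 'Ġ'
gpt2_tokens = gpt2_tok.tokenize(text)
print(gpt2_tokens)
```

This is why token IDs are never interchangeable between checkpoints: always load the tokenizer that matches the model.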
Model Architectures
Transformer models fall into three main architecture categories:
- Encoder-only (BERT, RoBERTa, DistilBERT): Best for understanding tasks — classification, NER, question answering. They process the entire input at once with bidirectional attention.
- Decoder-only (GPT-2, LLaMA, Mistral): Best for generation tasks — text completion, code generation. They generate tokens one at a time, left to right.
- Encoder-Decoder (T5, BART, mBART): Best for sequence-to-sequence tasks — translation, summarization. The encoder processes input, and the decoder generates output.
How to choose: Need to classify or extract? Use an encoder model. Need to generate text? Use a decoder model. Need to transform text (translate, summarize)? Use an encoder-decoder model.
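The code examples so far cover encoder and decoder models; an encoder-decoder sketch rounds out the picture. This uses t5-small as an assumed checkpoint (any Seq2SeqLM checkpoint works); T5 frames every task as text-to-text, with a task prefix selecting the behavior:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# The prefix tells T5 which task to perform
inputs = tokenizer("translate English to German: How are you?",
                   return_tensors="pt")

# The encoder reads the full input; the decoder generates the output
output_ids = model.generate(**inputs, max_new_tokens=20)
translation = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(translation)
```

Swapping the prefix (e.g., "summarize: ") switches the same model to a different sequence-to-sequence task.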
Working with Model Outputs
Models return raw logits, not probabilities. To interpret a classifier's prediction, apply softmax to the logits:
Python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english"
)
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english"
)

inputs = tokenizer("I really enjoyed this course!", return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Raw logits
logits = outputs.logits

# Convert to probabilities
probs = torch.softmax(logits, dim=-1)
print(f"Negative: {probs[0][0]:.4f}, Positive: {probs[0][1]:.4f}")
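Rather than hard-coding which index means "negative" or "positive", you can read the label names from the model's config. Fine-tuned checkpoints ship an id2label mapping for exactly this purpose:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

inputs = tokenizer("I really enjoyed this course!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# config.id2label maps class indices to human-readable names
predicted_id = logits.argmax(dim=-1).item()
print(model.config.id2label[predicted_id])  # 'POSITIVE'
```

Using id2label keeps the code correct even if a checkpoint orders or names its classes differently.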
What's Next?
Now that you understand models and tokenizers, the next lesson covers fine-tuning — how to train pre-trained models on your own data using the Trainer API.
Lilly Tech Systems