Hugging Face Hub Beginner

Hugging Face Hub is the world's largest platform for sharing AI models, datasets, and demos. With over 900,000 models, it is the go-to destination for finding and using pretrained models across every AI domain.

Overview

The Hub hosts models from individual researchers, companies, and organizations. Every model has a model card with documentation, usage examples, performance benchmarks, and license information.

Browsing Models

You can filter models on the Hub by:

  • Task: text-generation, image-classification, object-detection, translation, etc.
  • Library: transformers, diffusers, timm, spaCy, sentence-transformers
  • Language: English, Chinese, multilingual, etc.
  • License: Apache 2.0, MIT, Llama, research-only
  • Size: Filter by parameter count
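The same filters are also available programmatically through the `huggingface_hub` library's `list_models` function. A minimal sketch (the filter tag and sort key below are one possible combination; any task tag from the list above works the same way):

```python
from huggingface_hub import list_models

# Find the five most-downloaded text-generation models on the Hub
models = list_models(filter="text-generation", sort="downloads", limit=5)
for model in models:
    print(model.id)
```

This queries the Hub's public API, so it needs network access; the result is an iterator of model metadata objects whose `id` is the `owner/name` repository identifier.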

Installing the transformers Library

Terminal
# Install transformers and required dependencies
$ pip install transformers torch

# Or with conda
$ conda install -c conda-forge transformers pytorch

Pipeline API (Easiest Way)

The Pipeline API is the simplest way to use a pretrained model. It handles tokenization, model loading, and post-processing automatically:

Python
from transformers import pipeline

# Sentiment analysis
classifier = pipeline("sentiment-analysis")
result = classifier("I love this product!")
print(result)
# [{'label': 'POSITIVE', 'score': 0.9998}]

# Text generation
generator = pipeline("text-generation", model="gpt2")
# max_new_tokens counts only generated tokens, not the prompt
text = generator("The future of AI is", max_new_tokens=50)

# Image classification
classifier = pipeline("image-classification", model="google/vit-base-patch16-224")
result = classifier("photo.jpg")

# Speech recognition
transcriber = pipeline("automatic-speech-recognition", model="openai/whisper-base")
result = transcriber("audio.mp3")

AutoModel and AutoTokenizer

For more control, use AutoModel and AutoTokenizer to load models and process inputs manually:

Python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load tokenizer and model
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Tokenize input
inputs = tokenizer("This movie was fantastic!", return_tensors="pt")

# Run inference
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.softmax(outputs.logits, dim=-1)

print(predictions)
# tensor([[0.0002, 0.9998]])  -> POSITIVE
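The softmax in the last step is what turns the model's raw logits into a probability distribution over the two labels. As a worked example in plain Python (the logit values below are illustrative, chosen to roughly match the output above; the real values come from the model):

```python
import math

# Illustrative logits for (NEGATIVE, POSITIVE)
logits = [-4.3, 4.3]

# softmax(x_i) = exp(x_i) / sum_j exp(x_j)
exps = [math.exp(x) for x in logits]
total = sum(exps)
probs = [e / total for e in exps]

print([round(p, 4) for p in probs])
# [0.0002, 0.9998] -> index 1 (POSITIVE) wins
```

Because softmax only exponentiates and normalizes, the largest logit always maps to the largest probability; the label is then read off with `argmax`.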

Downloading Models

Python
from huggingface_hub import snapshot_download

# Download entire model repository
snapshot_download(repo_id="bert-base-uncased")

# Download to a specific directory
snapshot_download(
    repo_id="bert-base-uncased",
    local_dir="./models/bert"
)
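When you only need a single file rather than the whole repository, `hf_hub_download` fetches one file and stores it in the local Hugging Face cache. A minimal sketch (requires network access on first run):

```python
from huggingface_hub import hf_hub_download

# Download just the config file from the repository
path = hf_hub_download(repo_id="bert-base-uncased", filename="config.json")
print(path)  # local path inside the Hugging Face cache
```

Subsequent calls with the same arguments return the cached copy without re-downloading.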

Model Cards

Every model on the Hub should have a model card that includes:

  • Model description: What the model does and how it was trained
  • Intended uses: What tasks the model is designed for
  • Training data: What data was used
  • Evaluation results: Benchmark scores and metrics
  • Limitations: Known weaknesses and biases
  • License: How you can use the model
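Model cards can also be read programmatically with `huggingface_hub`'s `ModelCard` class. A minimal sketch (requires network access; the `license` field is read from the card's YAML metadata header, which most but not all repositories include):

```python
from huggingface_hub import ModelCard

# Load the card for a model repository
card = ModelCard.load("bert-base-uncased")
print(card.data.license)  # license tag from the card's YAML header
print(card.text[:200])    # first part of the card's markdown body
```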

Trending and Popular Models

The Hub features trending models on its homepage and lets you sort by downloads, likes, and recency. Some of the most popular model families include:

| Model Family | Task | Downloads (monthly) |
| --- | --- | --- |
| Meta Llama | Text generation | 50M+ |
| OpenAI Whisper | Speech recognition | 30M+ |
| BERT / DistilBERT | Text classification, NER | 40M+ |
| Stable Diffusion | Image generation | 10M+ |
| sentence-transformers | Embeddings | 25M+ |

Next Up

Explore pretrained models for computer vision — from image classification to object detection and image generation.

Next: Vision Models →