Reference

Google Models

The full Google DeepMind model lineup: Gemini (commercial API), Gemma (open-weight), and PaLM 2 (legacy). Google offers some of the largest context windows in the industry, up to 2 million tokens (Gemini 1.5 Pro, and Gemini 2.5 Pro in preview).

Gemini 2.5 Series

Google's latest and most capable models, featuring native "thinking" capabilities for enhanced reasoning.

| Model | Released | Context | Max Output | Multimodal | Pricing (Input/Output per 1M) |
| --- | --- | --- | --- | --- | --- |
| Gemini 2.5 Pro | Mar 2025 | 1M (2M preview) | 65,536 | Text, Vision, Audio, Video | $1.25 / $10.00 (≤200K), $2.50 / $15.00 (>200K) |
| Gemini 2.5 Flash | Mar 2025 | 1M | 65,536 | Text, Vision, Audio, Video | $0.15 / $0.60 (non-thinking), $0.30 / $2.50 (thinking) |

Gemini 2.5 Pro

Google's most intelligent model. Features built-in "thinking" that allows the model to reason through complex problems before responding. Leads many public benchmarks in coding, math, and science.

  • Best for: Complex reasoning, coding, math, scientific analysis, long-document processing
  • Key features: Native thinking mode, 1M context (2M in preview), processes text/images/audio/video natively, tool use, code execution
  • Standout: Industry-leading context window allows processing of entire codebases or hundreds of documents at once
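
To make the tiered pricing concrete, here is a small cost-estimation sketch using the rates listed above. The function name is ours, and we assume the higher tier applies to the entire request once the prompt exceeds 200K tokens; verify against the current pricing page before relying on it.

```python
def gemini_25_pro_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate Gemini 2.5 Pro cost in USD from the tiered rates above.

    Prompts of up to 200K input tokens bill at $1.25/M in, $10.00/M out;
    larger prompts bill at $2.50/M in, $15.00/M out.
    """
    if input_tokens <= 200_000:
        in_rate, out_rate = 1.25, 10.00
    else:
        in_rate, out_rate = 2.50, 15.00
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# A 100K-token prompt with a 5K-token answer:
print(f"${gemini_25_pro_cost(100_000, 5_000):.4f}")  # → $0.1750
```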

Gemini 2.5 Flash

A fast, efficient thinking model. Offers strong reasoning capabilities at much lower cost and latency than 2.5 Pro. The "thinking" can be toggled or given a token budget for cost control.

  • Best for: Production applications needing reasoning, high-volume processing, cost-sensitive deployments
  • Key features: Configurable thinking budget, very fast, multimodal
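
The thinking budget is set per request. A hedged sketch of the relevant request-body fragment for the `generateContent` endpoint, with field names as documented in the public Gemini API and illustrative values:

```python
# Sketch of a Gemini API generateContent request body showing the
# thinking-budget knob for Gemini 2.5 Flash (values are illustrative).
request_body = {
    "contents": [{"role": "user", "parts": [{"text": "Plan a 3-step refactor."}]}],
    "generationConfig": {
        # 0 disables thinking on 2.5 Flash; a positive value caps the
        # number of tokens the model may spend reasoning before answering.
        "thinkingConfig": {"thinkingBudget": 1024},
    },
}
```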

Gemini 2.0 Series

| Model | Released | Context | Multimodal | Pricing (Input/Output per 1M) |
| --- | --- | --- | --- | --- |
| Gemini 2.0 Flash | Dec 2024 | 1M | Text, Vision, Audio, Video | $0.10 / $0.40 |

Gemini 2.0 Flash

Introduced multimodal output (text, images, audio) and agentic capabilities. Designed for building AI agents with tool use, code execution, and Google Search grounding.

  • Best for: Agent-based applications, multimodal generation, real-time interactions
  • Key features: Native tool use, multimodal output, Google Search grounding
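
Grounding is enabled by declaring a tool on the request. A hedged sketch of the request-body fragment, using the `google_search` tool name from the public Gemini API docs:

```python
# Sketch of a generateContent request body enabling Google Search
# grounding for Gemini 2.0 Flash (tool name per the public API docs).
request_body = {
    "contents": [{"role": "user", "parts": [{"text": "Who won the latest F1 race?"}]}],
    # An empty google_search tool entry lets the model decide when to
    # issue search queries and ground its answer in the results.
    "tools": [{"google_search": {}}],
}
```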

Gemini 1.5 Series

| Model | Released | Context | Multimodal | Pricing (Input/Output per 1M) |
| --- | --- | --- | --- | --- |
| Gemini 1.5 Pro | Feb 2024 | 2M | Text, Vision, Audio, Video | $1.25 / $5.00 (≤128K), $2.50 / $10.00 (>128K) |
| Gemini 1.5 Flash | May 2024 | 1M | Text, Vision, Audio, Video | $0.075 / $0.30 |

Gemini 1.5 Pro

The first model to offer a 2 million token context window, enough to process multiple entire books or hours of video at once. Pioneered the "long context" paradigm.

  • Best for: Very long document analysis, video understanding, large codebase processing
  • Historical significance: Demonstrated that million-token contexts could be practical and useful

Gemini 1.5 Flash

A lighter, faster variant of 1.5 Pro. Optimized for speed and efficiency while maintaining strong capabilities across modalities.

Gemini 1.0 & Legacy

Gemini 1.0 Pro

  • Released: December 2023
  • Context: 32K tokens
  • Status: Superseded by newer Gemini models
  • Significance: Google's first Gemini model; replaced PaLM 2 in most applications

Gemini Nano

  • Released: December 2023
  • Sizes: Nano-1 (1.8B) and Nano-2 (3.25B)
  • Context: 32K tokens
  • Purpose: On-device model for mobile phones (Pixel, Samsung Galaxy). Runs locally without internet. Used for Smart Reply, summarization, and other device features.

PaLM 2

  • Released: May 2023
  • Variants: Gecko (smallest), Otter, Bison, Unicorn (largest)
  • Significance: Preceded Gemini. Powered early Bard, MakerSuite, and Google Cloud AI
  • Status: Deprecated in favor of Gemini

Gemma Family (Open-Weight)

Gemma is Google's family of open-weight models based on the same research and technology as Gemini but released for the community to use, fine-tune, and deploy.

| Model | Released | Parameters | Context | License |
| --- | --- | --- | --- | --- |
| Gemma 2 27B | Jun 2024 | 27B | 8K | Gemma License |
| Gemma 2 9B | Jun 2024 | 9B | 8K | Gemma License |
| Gemma 2 2B | Jun 2024 | 2B | 8K | Gemma License |
| Gemma 7B | Feb 2024 | 7B | 8K | Gemma License |
| Gemma 2B | Feb 2024 | 2B | 8K | Gemma License |
| CodeGemma | Apr 2024 | 2B / 7B | 8K | Gemma License |
| RecurrentGemma | Apr 2024 | 2B / 9B | 8K | Gemma License |

Gemma 2

The second generation of Google's open models. Gemma 2 27B is particularly notable for outperforming many larger models. Uses a novel architecture with alternating local and global attention layers.

  • Best for: Self-hosted applications, fine-tuning, research, edge deployment (2B)
  • Key features: Strong benchmark scores for size class, instruction-tuned variants available

CodeGemma

Specialized for code generation and understanding. Available in 2B (fill-in-the-middle) and 7B (code generation) variants.
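
Fill-in-the-middle prompts wrap the code before and after the gap in sentinel tokens; the model then generates the missing middle. A sketch of the prompt format, with token names taken from the published CodeGemma formatting (the helper name is ours):

```python
def codegemma_fim_prompt(prefix: str, suffix: str) -> str:
    """Build a fill-in-the-middle prompt for CodeGemma.

    The model is expected to generate the code that belongs between
    prefix and suffix after the <|fim_middle|> sentinel.
    """
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

prompt = codegemma_fim_prompt("def add(a, b):\n    return ", "\n")
```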

RecurrentGemma

An experimental model using a novel recurrent architecture (Griffin) instead of pure Transformers. Designed for efficient inference on long sequences.

Gemma License: The Gemma License allows free commercial use but prohibits using model outputs to train other models that compete with Gemma. It is more permissive than Llama's license but more restrictive than Apache 2.0. Read the full license before building commercial products.
💡 Google's API access: Gemini models are available through Google AI Studio (free tier available) and Vertex AI (enterprise). Gemma models can be downloaded from Hugging Face or Kaggle.