Intermediate

Pinecone

Pinecone is a fully managed, serverless vector database designed for production AI applications. Zero infrastructure to manage — just create an index and start building.

What is Pinecone?

Pinecone is a cloud-native vector database that handles all the complexity of vector search infrastructure — indexing, scaling, replication, and optimization. You interact with it through a simple API, and Pinecone handles everything else.

  • Fully managed: No servers to provision, no indexes to tune manually.
  • Serverless: Pay only for what you use. Scales to zero when idle.
  • Fast: Typically sub-100 ms query latency, even on large indexes.
  • Metadata filtering: Combine vector similarity with attribute filters.
  • Namespaces: Partition data within a single index for multi-tenancy.

Getting Started

  1. Create an Account

    Sign up at pinecone.io. The free tier includes enough resources for development and small projects.

  2. Install the Python SDK

    Terminal
    pip install pinecone

  3. Get Your API Key

    Find your API key in the Pinecone dashboard under API Keys. Store it as an environment variable:

    Terminal
    export PINECONE_API_KEY="your-api-key-here"
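In application code, the key can then be read from that environment variable instead of being hard-coded. A minimal sketch (the helper name is an assumption, not part of the SDK):

```python
import os

def load_api_key(env_var="PINECONE_API_KEY"):
    """Read the Pinecone API key from the environment, failing loudly if missing."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before running")
    return key
```

You can then pass the result to the client, e.g. `Pinecone(api_key=load_api_key())`, so the key never appears in source control.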

Creating an Index

Python - Create a Serverless Index
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="your-api-key")

# Create a serverless index
pc.create_index(
    name="my-index",
    dimension=1536,          # Must match your embedding model
    metric="cosine",         # cosine, euclidean, or dotproduct
    spec=ServerlessSpec(
        cloud="aws",
        region="us-east-1"
    )
)

# Connect to the index
index = pc.Index("my-index")
print(index.describe_index_stats())

💡 Dimension must match your model: OpenAI's text-embedding-3-small outputs 1536 dimensions; text-embedding-3-large outputs 3072. Always check your embedding model's output dimensions.
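For intuition about the metric="cosine" option used above: cosine similarity compares the direction of two vectors and ignores their magnitude. A plain-Python sketch, no Pinecone required:

```python
import math

def cosine_similarity(a, b):
    # Dot product of the two vectors divided by the product of their lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1, 0], [1, 0]))  # identical direction -> 1.0
print(cosine_similarity([1, 0], [0, 1]))  # orthogonal -> 0.0
print(cosine_similarity([1, 0], [2, 0]))  # same direction, larger magnitude -> still 1.0
```

The last line is why cosine is a common default for text embeddings: it scores "same meaning" high regardless of vector length.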

Upserting Vectors

Upsert (update or insert) vectors into your index. Each vector needs a unique ID, the vector values, and optional metadata.

Python - Upsert Vectors
import openai

# Generate embeddings
client = openai.OpenAI()

documents = [
    {"id": "doc1", "text": "Python is a popular programming language",
     "category": "programming"},
    {"id": "doc2", "text": "Machine learning uses algorithms to learn from data",
     "category": "ai"},
    {"id": "doc3", "text": "Neural networks are inspired by the human brain",
     "category": "ai"},
]

# Batch embed
texts = [d["text"] for d in documents]
response = client.embeddings.create(
    input=texts,
    model="text-embedding-3-small"
)

# Prepare and upsert vectors
vectors = [
    {
        "id": doc["id"],
        "values": emb.embedding,
        "metadata": {"text": doc["text"], "category": doc["category"]}
    }
    for doc, emb in zip(documents, response.data)
]

index.upsert(vectors=vectors, namespace="articles")
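Real datasets are usually upserted in batches rather than in one request; Pinecone's guidance is to keep batches modest (on the order of 100 vectors per request). A minimal chunking helper (pure Python; the batch size here is an assumption you should tune):

```python
def chunked(items, size=100):
    """Yield successive slices of `items` with at most `size` elements each."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Usage with the `vectors` list built above:
# for batch in chunked(vectors):
#     index.upsert(vectors=batch, namespace="articles")
```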

Querying

Python - Query with Filters
# Create query embedding
query = "How do computers learn?"
query_response = client.embeddings.create(
    input=[query],
    model="text-embedding-3-small"
)
query_vector = query_response.data[0].embedding

# Search with metadata filter
results = index.query(
    vector=query_vector,
    top_k=5,
    namespace="articles",
    include_metadata=True,
    filter={"category": {"$eq": "ai"}}
)

for match in results.matches:
    print(f"Score: {match.score:.4f} | {match.metadata['text']}")
# Score: 0.9234 | Machine learning uses algorithms to learn from data
# Score: 0.8891 | Neural networks are inspired by the human brain

Namespaces

Namespaces partition data within a single index. Each namespace is isolated — queries only search within the specified namespace. This is perfect for multi-tenancy.

Python - Using Namespaces
# Upsert to different namespaces
index.upsert(vectors=team_a_vectors, namespace="team-a")
index.upsert(vectors=team_b_vectors, namespace="team-b")

# Query only Team A's data
results = index.query(
    vector=query_vector,
    top_k=10,
    namespace="team-a"
)
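A common multi-tenancy pattern is deriving the namespace name from a tenant identifier, so every read and write is automatically scoped. A hypothetical helper (the naming convention is an assumption, not a Pinecone API):

```python
def tenant_namespace(tenant_id: str) -> str:
    """Map a tenant ID to a stable, URL-safe namespace name (one namespace per tenant)."""
    safe = tenant_id.strip().lower().replace(" ", "-")
    return f"tenant-{safe}"

print(tenant_namespace("Team A"))  # tenant-team-a
```

You would then call, for example, `index.query(..., namespace=tenant_namespace(current_tenant))` so one tenant can never see another tenant's matches.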

Serverless vs Pod-Based

Feature     | Serverless                      | Pod-Based
Pricing     | Pay per query + storage         | Pay per pod (fixed hourly)
Scaling     | Automatic, scales to zero       | Manual, always-on
Best for    | Variable workloads, development | Predictable, high-throughput workloads
Cold start  | Possible after idle             | None (always warm)

Pricing Overview

  • Free tier: 100K vectors, 1 serverless index. Good for development.
  • Standard: Pay-as-you-go for storage ($0.33/GB/month) and queries.
  • Enterprise: Custom pricing, SLAs, dedicated support, SSO.

When to choose Pinecone: Choose Pinecone when you want zero infrastructure management, need production-ready reliability, and prefer a managed service. It is among the easiest vector databases to get started with, and serverless indexes scale without manual capacity planning.
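As a rough back-of-envelope check on the storage price above, assuming 4-byte float32 values and ignoring metadata and index overhead:

```python
dim = 1536                     # text-embedding-3-small
num_vectors = 1_000_000
bytes_per_vector = dim * 4     # float32 values only

gb = num_vectors * bytes_per_vector / 1e9
monthly_cost = gb * 0.33       # $0.33/GB/month from the Standard tier above

print(f"{gb:.2f} GB -> ${monthly_cost:.2f}/month")  # 6.14 GB -> $2.03/month
```

Actual bills also include query costs and any metadata you store, so treat this as a lower bound.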

💡 Try It Yourself

Sign up for a free Pinecone account, create a serverless index, and upsert 10 sample vectors. Then query them and experiment with metadata filters.

Hands-on practice is the fastest way to learn. Try different filter operators: $eq, $ne, $gt, $lt, $in, $nin.
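To build intuition for those operators before running live queries, here is a tiny in-memory model of how a filter matches metadata (illustration only; Pinecone evaluates filters server-side, and this sketch covers just the six operators above):

```python
def matches_filter(metadata, flt):
    """Return True if `metadata` satisfies every {field: {op: value}} condition in `flt`."""
    ops = {
        "$eq":  lambda v, x: v == x,
        "$ne":  lambda v, x: v != x,
        "$gt":  lambda v, x: v > x,
        "$lt":  lambda v, x: v < x,
        "$in":  lambda v, x: v in x,
        "$nin": lambda v, x: v not in x,
    }
    for field, cond in flt.items():
        for op, target in cond.items():
            if not ops[op](metadata.get(field), target):
                return False
    return True

doc = {"category": "ai", "year": 2023}
print(matches_filter(doc, {"category": {"$eq": "ai"}, "year": {"$gt": 2020}}))  # True
print(matches_filter(doc, {"category": {"$in": ["programming", "devops"]}}))    # False
```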