# Pinecone
Pinecone is a fully managed, serverless vector database designed for production AI applications. Zero infrastructure to manage — just create an index and start building.
## What is Pinecone?
Pinecone is a cloud-native vector database that handles all the complexity of vector search infrastructure — indexing, scaling, replication, and optimization. You interact with it through a simple API, and Pinecone handles everything else.
- Fully managed: No servers to provision, no indexes to tune manually.
- Serverless: Pay only for what you use. Scales to zero when idle.
- Fast: Sub-100ms query latency at any scale.
- Metadata filtering: Combine vector similarity with attribute filters.
- Namespaces: Partition data within a single index for multi-tenancy.
## Getting Started

1. **Create an Account**

   Sign up at pinecone.io. The free tier includes enough resources for development and small projects.

2. **Install the Python SDK**

   ```shell
   pip install pinecone
   ```

3. **Get Your API Key**

   Find your API key in the Pinecone dashboard under **API Keys**. Store it as an environment variable:

   ```shell
   export PINECONE_API_KEY="your-api-key-here"
   ```
## Creating an Index

```python
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="your-api-key")

# Create a serverless index
pc.create_index(
    name="my-index",
    dimension=1536,   # Must match your embedding model
    metric="cosine",  # cosine, euclidean, or dotproduct
    spec=ServerlessSpec(
        cloud="aws",
        region="us-east-1"
    )
)

# Connect to the index
index = pc.Index("my-index")
print(index.describe_index_stats())
```
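The `metric` parameter controls how similarity scores are computed at query time. As a rough illustration of the default, here is cosine similarity in plain Python (a sketch of the math, not Pinecone's implementation, using made-up 2-dimensional vectors):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Vectors pointing the same way score 1.0 regardless of length;
# orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

Because cosine ignores vector length, it is a common default for text embeddings; `dotproduct` is length-sensitive, which matters for some embedding models.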
> **Note:** text-embedding-3-small uses 1536 dimensions; text-embedding-3-large uses 3072. Always check your embedding model's output dimensions.

## Upserting Vectors
Upsert (update or insert) vectors into your index. Each vector needs a unique ID, the vector values, and optional metadata.
```python
import openai

# Generate embeddings
client = openai.OpenAI()

documents = [
    {"id": "doc1", "text": "Python is a popular programming language",
     "category": "programming"},
    {"id": "doc2", "text": "Machine learning uses algorithms to learn from data",
     "category": "ai"},
    {"id": "doc3", "text": "Neural networks are inspired by the human brain",
     "category": "ai"},
]

# Batch embed
texts = [d["text"] for d in documents]
response = client.embeddings.create(
    input=texts,
    model="text-embedding-3-small"
)

# Prepare and upsert vectors
vectors = [
    {
        "id": doc["id"],
        "values": emb.embedding,
        "metadata": {"text": doc["text"], "category": doc["category"]}
    }
    for doc, emb in zip(documents, response.data)
]
index.upsert(vectors=vectors, namespace="articles")
```
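For larger datasets, it's common to split the vector list into fixed-size batches before upserting rather than sending everything in one request (batch sizes around 100 are a typical starting point; check the current request limits in Pinecone's docs). A minimal chunking helper, with `chunked` being a name chosen here for illustration:

```python
def chunked(items, batch_size=100):
    # Yield successive fixed-size slices of a list.
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# Hypothetical usage with a large `vectors` list:
# for batch in chunked(vectors, 100):
#     index.upsert(vectors=batch, namespace="articles")

batches = list(chunked(list(range(250)), 100))
print([len(b) for b in batches])  # [100, 100, 50]
```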
## Querying

```python
# Create query embedding
query = "How do computers learn?"
query_response = client.embeddings.create(
    input=[query],
    model="text-embedding-3-small"
)
query_vector = query_response.data[0].embedding

# Search with metadata filter
results = index.query(
    vector=query_vector,
    top_k=5,
    namespace="articles",
    include_metadata=True,
    filter={"category": {"$eq": "ai"}}
)

for match in results.matches:
    print(f"Score: {match.score:.4f} | {match.metadata['text']}")
# Score: 0.9234 | Machine learning uses algorithms to learn from data
# Score: 0.8891 | Neural networks are inspired by the human brain
```
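To make the `$eq` filter concrete, here is a rough pure-Python model of what an equality filter selects. This is a sketch of the semantics only (Pinecone evaluates filters server-side, alongside the vector search):

```python
def matches_filter(metadata, flt):
    # Models only the $eq operator from the example above.
    for field, cond in flt.items():
        if metadata.get(field) != cond["$eq"]:
            return False
    return True

records = [
    {"id": "doc1", "metadata": {"category": "programming"}},
    {"id": "doc2", "metadata": {"category": "ai"}},
    {"id": "doc3", "metadata": {"category": "ai"}},
]
flt = {"category": {"$eq": "ai"}}
print([r["id"] for r in records if matches_filter(r["metadata"], flt)])
# ['doc2', 'doc3']
```

Only records passing the filter are eligible to appear in the `top_k` results, which is why doc1 (category "programming") never shows up in the query output above.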
## Namespaces
Namespaces partition data within a single index. Each namespace is isolated — queries only search within the specified namespace. This is perfect for multi-tenancy.
```python
# Upsert to different namespaces
index.upsert(vectors=team_a_vectors, namespace="team-a")
index.upsert(vectors=team_b_vectors, namespace="team-b")

# Query only Team A's data
results = index.query(
    vector=query_vector,
    top_k=10,
    namespace="team-a"
)
```
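The isolation guarantee can be pictured as one index holding separate partitions keyed by namespace. A toy in-memory model (purely illustrative; not how Pinecone stores data):

```python
# One "index" holding isolated partitions, keyed by namespace.
index_data = {}

def upsert(namespace, vectors):
    # Each namespace gets its own id -> vector map.
    index_data.setdefault(namespace, {}).update({v["id"]: v for v in vectors})

def query_ids(namespace):
    # A query only ever sees vectors in its own namespace.
    return sorted(index_data.get(namespace, {}))

upsert("team-a", [{"id": "a1"}, {"id": "a2"}])
upsert("team-b", [{"id": "b1"}])
print(query_ids("team-a"))  # ['a1', 'a2']
print(query_ids("team-b"))  # ['b1']
print(query_ids("team-c"))  # [] -- an unknown namespace returns nothing
```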
## Serverless vs Pod-Based
| Feature | Serverless | Pod-Based |
|---|---|---|
| Pricing | Pay per query + storage | Pay per pod (fixed hourly) |
| Scaling | Automatic, scales to zero | Manual, always-on |
| Best for | Variable workloads, development | Predictable, high-throughput |
| Cold start | Possible after idle | None (always warm) |
## Pricing Overview
- Free tier: 100K vectors, 1 serverless index. Good for development.
- Standard: Pay-as-you-go for storage ($0.33/GB/month) and queries.
- Enterprise: Custom pricing, SLAs, dedicated support, SSO.
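As a back-of-envelope reading of the storage rate above (assuming 4-byte float32 values and ignoring metadata and index overhead, so treat this as a lower bound):

```python
def monthly_storage_cost(num_vectors, dimension, price_per_gb=0.33):
    # float32 vectors: 4 bytes per dimension; using 1 GB = 10^9 bytes.
    gb = num_vectors * dimension * 4 / 1e9
    return gb * price_per_gb

# 1M vectors from text-embedding-3-small (1536 dims) is ~6.1 GB of raw floats:
print(round(monthly_storage_cost(1_000_000, 1536), 2))  # 2.03
```

Query costs are billed separately and usually dominate for read-heavy workloads, so this estimate covers storage only.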
## 💡 Try It Yourself
Sign up for a free Pinecone account, create a serverless index, and upsert 10 sample vectors. Then query them and experiment with metadata filters.