Bedrock Knowledge Bases (Advanced)

Bedrock Knowledge Bases enable Retrieval-Augmented Generation (RAG) by connecting foundation models to your organization's data. Documents are automatically chunked, embedded, and stored in a vector database, allowing the model to retrieve relevant context when answering questions.

Knowledge Base Architecture

Architecture Flow

Ingestion:
  S3 Bucket (documents) → Chunking → Embedding Model → Vector Store

Query:
  User Question → Embedding → Vector Search → Relevant Chunks
  Relevant Chunks + Question → Foundation Model → Grounded Answer

Supported Vector Stores:
  - Amazon OpenSearch Serverless (recommended)
  - Amazon Aurora PostgreSQL (pgvector)
  - Pinecone
  - Redis Enterprise Cloud
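
The query path in the flow above (question → embedding → vector search → relevant chunks) can be illustrated with a toy, in-memory sketch. Everything here is an assumption for illustration: the bag-of-words `embed` function stands in for a real embedding model, the Python list stands in for a vector store, and the names `embed`, `cosine_similarity`, and `search` are hypothetical. A managed knowledge base performs these steps for you.

```python
import math

def embed(text, vocabulary):
    """Toy embedding (stand-in for a real model): one word count per vocabulary term."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocabulary]

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def search(question, chunks, vocabulary, top_k=2):
    """Embed the question, score every chunk, and return the top_k (score, chunk) pairs."""
    query_vec = embed(question, vocabulary)
    scored = [(cosine_similarity(query_vec, embed(c, vocabulary)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Hypothetical vocabulary and document chunks, for illustration only.
VOCABULARY = ["refund", "shipping", "days", "policy", "return"]
CHUNKS = [
    "our refund policy allows a refund within 30 days",
    "shipping takes 5 business days",
    "our office is closed on public holidays",
]

top = search("what is the refund policy", CHUNKS, VOCABULARY, top_k=1)
print(top[0][1])
```

The retrieved chunk text would then be passed to the foundation model together with the original question to produce the grounded answer.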

Setting Up a Knowledge Base

1. Prepare data source: Upload documents (PDF, TXT, HTML, CSV, DOCX) to an S3 bucket.
2. Create knowledge base: Select an embedding model (Titan Embeddings), configure the chunking strategy, and choose a vector store.
3. Sync data: Trigger ingestion to process documents into vector embeddings. Re-sync when documents change.
4. Query: Use the RetrieveAndGenerate API or connect the knowledge base to a Bedrock Agent.
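
The sync step can also be triggered from code. The following is a minimal sketch, assuming a `bedrock-agent` client created with boto3 and an existing knowledge base and data source; the helper name `sync_data_source` and the polling interval are illustrative choices, not part of the service.

```python
import time

# Statuses after which an ingestion job will not change any further.
TERMINAL_STATUSES = {"COMPLETE", "FAILED", "STOPPED"}

def sync_data_source(client, knowledge_base_id, data_source_id, poll_seconds=10):
    """Start an ingestion job and poll until it reaches a terminal status.

    `client` is expected to be boto3.client("bedrock-agent").
    Returns the final ingestionJob dictionary.
    """
    job = client.start_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
    )["ingestionJob"]

    while job["status"] not in TERMINAL_STATUSES:
        time.sleep(poll_seconds)
        job = client.get_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]
    return job

# Example usage (requires AWS credentials; the IDs are placeholders):
#   import boto3
#   job = sync_data_source(boto3.client("bedrock-agent"), "KB_ID", "DATA_SOURCE_ID")
#   print(job["status"])
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub before pointing it at a real account.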

Chunking Strategies

Strategy	Chunk Size	Best For
Fixed size	300-500 tokens	Uniform documents (articles, docs)
Semantic	Variable	Mixed content (long-form, tables)
Hierarchical	Parent + child	Structured documents (manuals, specs)
No chunking	Whole document	Short documents (<500 tokens each)
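
To make the fixed-size strategy concrete, here is a rough sketch of fixed-size chunking with overlap. It is not how Bedrock implements chunking: whitespace-separated words stand in for model tokens, and the function name `chunk_fixed_size` and its defaults are assumptions chosen for illustration.

```python
def chunk_fixed_size(text, max_tokens=300, overlap_tokens=60):
    """Split text into chunks of at most max_tokens words.

    Consecutive chunks share overlap_tokens words so that context
    spanning a chunk boundary is not lost entirely.
    """
    if max_tokens <= 0 or not 0 <= overlap_tokens < max_tokens:
        raise ValueError("need max_tokens > 0 and 0 <= overlap_tokens < max_tokens")

    tokens = text.split()  # crude stand-in for a real tokenizer
    step = max_tokens - overlap_tokens
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break  # the last chunk already reaches the end of the text
    return chunks

sample = " ".join(str(i) for i in range(10))
print(chunk_fixed_size(sample, max_tokens=4, overlap_tokens=1))
# → ['0 1 2 3', '3 4 5 6', '6 7 8 9']
```

Larger overlap improves the chance that a relevant passage lands intact in at least one chunk, at the cost of storing and embedding more text.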

Querying the Knowledge Base

Python

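import boto3

# Runtime client for querying knowledge bases (uses your default AWS credentials and region)
client = boto3.client('bedrock-agent-runtime')

# Retrieve relevant chunks and generate a grounded answer in a single call
response = client.retrieve_and_generate(
    input={'text': 'What is our refund policy?'},
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB_ID',  # placeholder: replace with your knowledge base ID
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0'
        }
    }
)

# The generated answer text
print(response['output']['text'])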

Quality Tip: The quality of RAG responses depends heavily on chunking strategy and embedding model. Experiment with different chunk sizes and overlap amounts. Use the Retrieve API to inspect which chunks are being retrieved before the model generates a response.
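
One way to inspect retrieved chunks, as the tip suggests, is to call the Retrieve API directly and print each chunk with its relevance score. This is a sketch under assumptions: the helper name `retrieve_chunks` is hypothetical, `KB_ID` is a placeholder, and the source URI is only present for S3-backed data sources.

```python
def retrieve_chunks(client, knowledge_base_id, query, top_k=5):
    """Return the top_k retrieved chunks without generating an answer.

    `client` is expected to be boto3.client("bedrock-agent-runtime").
    """
    response = client.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    )
    return [
        {
            "score": result.get("score"),
            "text": result["content"]["text"],
            # Only populated when the chunk came from an S3 data source.
            "source": result.get("location", {}).get("s3Location", {}).get("uri"),
        }
        for result in response["retrievalResults"]
    ]

# Example usage (requires AWS credentials; KB_ID is a placeholder):
#   import boto3
#   client = boto3.client("bedrock-agent-runtime")
#   for chunk in retrieve_chunks(client, "KB_ID", "What is our refund policy?"):
#       print(chunk["score"], chunk["source"])
#       print(chunk["text"][:200])
```

If the chunks returned for a representative question look irrelevant or cut off mid-thought, that usually points to the chunking configuration rather than the generation model.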

