Advanced

Best Practices

Production-ready guidelines for prompt engineering, safety, cost optimization, and avoiding common mistakes when working with Claude.

Prompt Engineering Best Practices

These practices will consistently improve the quality of Claude's responses across all use cases.

1. Be Explicit About Output Format

Never assume Claude knows what format you want. Specify it clearly.

Example
// Good - explicit format
Analyze this text for sentiment. Return your
analysis as JSON with these exact fields:
{
  "sentiment": "positive" | "negative" | "neutral",
  "confidence": 0.0 to 1.0,
  "key_phrases": ["phrase1", "phrase2"],
  "reasoning": "one sentence explanation"
}

2. Use System Prompts for Consistency

When building applications, define Claude's behavior in the system prompt so every response follows the same pattern.

System Prompt Pattern
system = """You are a customer support assistant for Acme Corp.

Rules:
- Always be polite and professional
- If you don't know an answer, say so and offer
  to escalate to a human agent
- Never share internal company information
- Respond in the customer's language
- Keep responses under 150 words unless the
  customer asks for more detail
- Always end with: "Is there anything else I
  can help with?"
"""

3. Test with Diverse Inputs

Your prompt should handle edge cases gracefully. Test with:

  • Empty or minimal inputs
  • Very long inputs near the context limit
  • Inputs in different languages
  • Adversarial or confusing inputs
  • Inputs with special characters or formatting

4. Version Control Your Prompts

Treat prompts like code. Store them in version control, test changes systematically, and track which prompt version produces which results.

Python - Prompt Management
# Store prompts as named constants
CLASSIFY_PROMPT_V2 = """Classify the following support
ticket into one of these categories:
- billing
- technical
- account
- feature_request
- other

Return only the category name, nothing else.

Ticket: {ticket_text}"""

# Use with clear versioning
def classify_ticket(ticket_text: str) -> str:
    prompt = CLASSIFY_PROMPT_V2.format(
        ticket_text=ticket_text
    )
    response = client.messages.create(
        model="claude-haiku-3-5-20241022",
        max_tokens=20,
        temperature=0.0,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.content[0].text.strip()

Safety and Responsible Use

Using AI responsibly is not just ethical — it protects your users, your brand, and your business.

Content Filtering

When building user-facing applications, add your own content filtering on top of Claude's built-in safety:

  • Validate and sanitize user inputs before sending to Claude
  • Review Claude's outputs before showing them to end users in sensitive contexts
  • Implement feedback mechanisms so users can report inappropriate responses

Data Privacy

  • Do not send personally identifiable information (PII) unless necessary and compliant with your privacy policy
  • Anonymize or pseudonymize data before sending to the API when possible
  • Understand and communicate Anthropic's data retention policies to your users
  • Consider data residency requirements for your jurisdiction

Transparency

  • Disclose to users when they are interacting with AI
  • Do not present Claude's outputs as human-written when accuracy and attribution matter
  • Always verify critical information — Claude can make mistakes
Important: Never use Claude for autonomous decision-making in high-stakes scenarios (medical diagnosis, legal advice, financial trading) without human oversight. Claude is a tool to assist humans, not replace human judgment in critical situations.

Cost Optimization

API costs scale with token usage. Here are proven strategies to keep costs down without sacrificing quality.

Choose the Right Model

Task Type Recommended Model Why
Simple classification Haiku 60x cheaper than Opus, fast enough for real-time
Code generation Sonnet Best quality-to-cost ratio for coding tasks
Complex analysis Opus (or Sonnet) Only use Opus when Sonnet quality is insufficient
Routing / triage Haiku Use Haiku to classify, then route to bigger models

Reduce Token Usage

  • Set appropriate max_tokens: Do not default to 4096 if you only need 100 tokens back
  • Trim context: Only include relevant information in the prompt, not entire documents if only a section matters
  • Cache common responses: If you ask the same question repeatedly, cache the result
  • Use prompt compression: Remove unnecessary whitespace, filler words, and redundant instructions
Python - Token-Efficient Pattern
import hashlib, json

# Simple response cache
cache = {}

def call_claude_cached(prompt, model, max_tokens):
    cache_key = hashlib.md5(
        f"{model}:{prompt}".encode()
    ).hexdigest()

    if cache_key in cache:
        return cache[cache_key]

    response = client.messages.create(
        model=model,
        max_tokens=max_tokens,
        messages=[{"role": "user", "content": prompt}]
    )
    result = response.content[0].text
    cache[cache_key] = result
    return result

Rate Limits and Quotas

The API has rate limits that vary by plan tier. Key limits to be aware of:

  • Requests per minute (RPM): Number of API calls you can make per minute
  • Tokens per minute (TPM): Total input + output tokens per minute
  • Tokens per day (TPD): Daily token budget
Handling rate limits: Implement exponential backoff with jitter. The SDK handles this automatically, but if using raw HTTP requests, check the retry-after header in 429 responses.

Testing and Iteration

Treat prompt development like software development with a testing cycle.

  1. Define Success Criteria

    Before writing a prompt, define what a good response looks like. Create 5-10 test cases with expected outputs.

  2. Write the Initial Prompt

    Start with a clear, simple prompt. Do not over-engineer on the first try.

  3. Test Against Your Criteria

    Run all test cases. Score each response. Identify patterns in failures.

  4. Iterate and Refine

    Adjust the prompt based on failures. Add examples, clarify instructions, or add constraints.

  5. Regression Test

    After changes, re-run all test cases to ensure improvements did not break previously working cases.

Production Deployment Tips

Use Pinned Model Versions

Always use the full model name with date in production to avoid unexpected behavior changes.

Model Versioning
# Development - ok to use latest
model = "claude-sonnet-4-latest"

# Production - pin the version
model = "claude-sonnet-4-20250514"

Implement Timeouts

Set appropriate timeouts to prevent your application from hanging on slow responses.

Monitor Usage

Track token usage, response times, and error rates. Set up alerts for unusual spikes.

Graceful Degradation

Plan for API outages. Have fallback responses or alternative workflows ready.

Common Pitfalls to Avoid

Pitfall 1: Prompt Injection. If your application passes user input into prompts, users can inject instructions. Always validate and sanitize user input, and use XML tags to clearly separate user content from system instructions.
Pitfall 2: Trusting Output Blindly. Claude can generate incorrect information confidently. For critical applications, always verify outputs against trusted sources. Never use Claude output directly in safety-critical systems.
Pitfall 3: Not Setting max_tokens. Without a reasonable max_tokens value, responses can be unexpectedly long and expensive. Set this parameter based on what you actually need.
Pitfall 4: Ignoring Token Costs in Loops. If you call Claude in a loop (processing 1000 items), calculate the total cost before running. A prompt that costs $0.01 per call costs $10 for 1000 calls.
Pitfall 5: One-Size-Fits-All Prompts. Using the same prompt for vastly different inputs leads to inconsistent results. Create specialized prompts for different input types and use routing logic to select the right one.

Frequently Asked Questions

Claude pricing is based on token usage. Haiku starts at $0.25 per million input tokens, Sonnet at $3, and Opus at $15. Output tokens are more expensive. For casual use, claude.ai offers free and paid subscription plans. Check Anthropic's pricing page for current rates.

By default, Claude cannot browse the internet. Its knowledge comes from training data with a knowledge cutoff date. However, you can provide Claude with current information by pasting text, uploading documents, or using tool use / function calling to give Claude access to external data sources.

Claude's training data has a knowledge cutoff date that varies by model version. For the latest models, this is typically within the past few months. Claude will tell you when a question falls outside its knowledge window. Always provide current data directly when you need analysis of recent events.

Anthropic's API data policy states that they do not train on API customer data by default. Data sent through the API is handled according to their data retention policy and terms of service. For the consumer product (claude.ai), policies may differ. Review Anthropic's current privacy policy and terms for the most up-to-date information.

When a conversation approaches the context limit, you have several options: (1) Summarize older messages and replace them with the summary, (2) Start a new conversation with relevant context from the old one, (3) Use a sliding window approach that keeps only the most recent N messages plus the system prompt, or (4) Implement retrieval-augmented generation (RAG) to pull in relevant context on demand.

Claude can analyze and understand images (vision), but it cannot generate or create new images. For image generation, you would need to use a separate tool or service. Claude can, however, generate text descriptions, SVG code, or ASCII art.

Claude can write and analyze code in virtually all popular programming languages, including Python, JavaScript, TypeScript, Java, C++, C#, Go, Rust, Ruby, PHP, Swift, Kotlin, SQL, HTML/CSS, and many more. It handles framework-specific code (React, Django, Spring Boot, etc.) and can work with configuration files, build scripts, and infrastructure-as-code tools.

Course Complete!

Congratulations on completing the Claude AI course! You now have a solid foundation in:

  • What Claude is and how it works
  • Choosing between Haiku, Sonnet, and Opus
  • Writing effective prompts from basic to advanced
  • Applying Claude to real-world use cases
  • Integrating Claude via the API
  • Production best practices and safety considerations
🏆
Next steps: Explore the Claude Code CLI course to learn how to use AI-powered coding assistance directly in your terminal, or check out the Claude CoWork course for team collaboration.