XML Output Intermediate

XML is an often-overlooked format for structured AI output. While JSON dominates API usage, XML excels at mixed content (text interleaved with data), streaming extraction, and scenarios where Claude in particular produces more reliable results.

When to Use XML over JSON

Scenario JSON XML Recommendation
Pure data extraction Great Good Use JSON
Mixed text + data Awkward Natural Use XML
Multiple output sections Possible Excellent Use XML
Streaming partial parsing Hard Easy (tag-based) Use XML
Claude-specific prompting Good Excellent Use XML

Prompting for XML Output

Python
import anthropic
import re

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": """Analyze this review and provide your analysis in XML format:

"Great product but shipping was slow. The quality exceeded my expectations."

Use this format:
<analysis>
  <sentiment>positive/negative/neutral</sentiment>
  <score>1-10</score>
  <summary>Brief summary</summary>
  <pros>
    <item>Pro point</item>
  </pros>
  <cons>
    <item>Con point</item>
  </cons>
</analysis>"""
    }]
)

text = response.content[0].text

Parsing XML Output

Python
import xml.etree.ElementTree as ET
import re

def extract_xml(text, tag):
    """Extract content of an XML tag from model output."""
    pattern = f"<{tag}>(.*?)</{tag}>"
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1).strip() if match else None

def extract_xml_list(text, wrapper_tag, item_tag):
    """Extract a list of items from XML."""
    wrapper = extract_xml(text, wrapper_tag)
    if not wrapper:
        return []
    return re.findall(f"<{item_tag}>(.*?)</{item_tag}>", wrapper, re.DOTALL)

# Parse the response
sentiment = extract_xml(text, "sentiment")
score = int(extract_xml(text, "score"))
summary = extract_xml(text, "summary")
pros = extract_xml_list(text, "pros", "item")
cons = extract_xml_list(text, "cons", "item")

XML for Multi-Section Responses

XML shines when you need the model to produce multiple distinct sections:

Prompt Template
Provide your response in this format:

<thinking>
Your step-by-step reasoning (not shown to user)
</thinking>

<answer>
The concise answer for the user
</answer>

<confidence>
high/medium/low
</confidence>

<sources>
  <source>Reference 1</source>
  <source>Reference 2</source>
</sources>
Claude + XML: Claude models are particularly good at following XML formatting instructions. Anthropic's own documentation uses XML tags for structuring prompts, so the model has strong training signal for XML patterns.
Caveat: Unlike JSON mode, there is no provider-level guarantee that the output will be valid XML. Always wrap your XML parsing in try/except blocks and have fallback parsing strategies.