Intermediate

GraphQL for AI APIs

Build flexible, type-safe AI APIs with GraphQL, from schema design and resolvers to subscriptions for real-time AI features.

Why GraphQL for AI?

GraphQL offers advantages for complex AI applications where clients need different views of the same data:

  • Flexible queries: Clients request exactly the fields they need, reducing payload sizes for mobile and edge clients.
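  • Type safety: Every query is validated against a typed schema, so malformed requests and mistyped arguments are rejected before any resolver runs, and client tooling can flag them during development.
  • Single endpoint: One endpoint serves multiple AI capabilities, simplifying client code.
  • Subscriptions: A first-class operation type for pushing streamed AI results to clients, typically carried over WebSockets or server-sent events.
  • Introspection: The schema can be queried at runtime, so tools can generate documentation and autocompletion from it.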
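
To make the "flexible queries" point concrete, here is a minimal sketch of two clients asking one endpoint for different fields of the `models` query. The field names come from the schema in this lesson; the helper name `build_request` is hypothetical, and the body shape (`query` plus `variables`) follows the common GraphQL-over-HTTP JSON convention.

```python
import json
from typing import Optional

# A mobile client only needs enough data to render a model picker.
MOBILE_QUERY = """
query {
  models { id name }
}
"""

# A dashboard asks the same top-level field for more detail.
DASHBOARD_QUERY = """
query {
  models { id name version capabilities maxTokens }
}
"""


def build_request(query: str, variables: Optional[dict] = None) -> bytes:
    """Encode a GraphQL operation as the JSON body of an HTTP POST."""
    payload = {"query": query, "variables": variables or {}}
    return json.dumps(payload).encode("utf-8")


# Both bodies would be sent to the same endpoint; only the selection differs.
mobile_body = build_request(MOBILE_QUERY)
dashboard_body = build_request(DASHBOARD_QUERY)
```

The server returns only the selected fields, so the mobile response carries no `version`, `capabilities`, or `maxTokens` data.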

Schema Design for AI

type Query {
  models: [Model!]!
  model(id: ID!): Model
  predict(input: PredictionInput!): PredictionResult!
}

type Mutation {
  createFineTuneJob(input: FineTuneInput!): FineTuneJob!
  cancelJob(jobId: ID!): FineTuneJob!
}

type Subscription {
  streamCompletion(input: CompletionInput!): CompletionChunk!
  jobStatus(jobId: ID!): JobStatusUpdate!
}

type Model {
  id: ID!
  name: String!
  version: String!
  capabilities: [String!]!
  maxTokens: Int!
  pricing: Pricing!
}

input PredictionInput {
  modelId: ID!
  text: String!
  temperature: Float = 0.7
  maxTokens: Int = 256
}

type PredictionResult {
  id: ID!
  text: String!
  confidence: Float!
  tokens: TokenUsage!
  model: Model!
  latencyMs: Int!
}

type TokenUsage {
  input: Int!
  output: Int!
  total: Int!
}
💡 Design schemas around capabilities, not models: Instead of separate gpt4Predict and claudePredict fields, expose a single predict query whose input carries a modelId. This lets you add new models without changing the schema.
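
The same idea can be sketched on the server side as a registry lookup. This is an illustration under assumptions, not a prescribed design: `MODEL_REGISTRY`, `register_model`, and the two stand-in models are hypothetical names.

```python
from typing import Callable, Dict

# Hypothetical registry: model ID -> callable that produces text.
MODEL_REGISTRY: Dict[str, Callable[[str], str]] = {}


def register_model(model_id: str, handler: Callable[[str], str]) -> None:
    """Make a model available without touching the public API surface."""
    MODEL_REGISTRY[model_id] = handler


def predict(model_id: str, text: str) -> str:
    """Single entry point: the model is an argument, not part of the name."""
    handler = MODEL_REGISTRY.get(model_id)
    if handler is None:
        raise KeyError(f"Model {model_id} not found")
    return handler(text)


# Two stand-in "models" for illustration only.
register_model("echo-small", lambda text: text)
register_model("shout-small", lambda text: text.upper())

print(predict("echo-small", "hello"))   # hello
print(predict("shout-small", "hello"))  # HELLO
```

Adding a third model is one more `register_model` call; the signature of `predict`, like the schema field it stands in for, stays the same.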

Implementing Resolvers

import time

import strawberry
from strawberry.types import Info

# Assumed to be defined elsewhere: the PredictionInput / PredictionResult
# Strawberry types, plus the get_model, track_usage and generate_id helpers.


@strawberry.type
class Query:
    @strawberry.field
    async def predict(self, input: PredictionInput, info: Info) -> PredictionResult:
        # Look up the model; a raised exception is reported in the response's "errors" list
        model = await get_model(input.model_id)
        if not model:
            raise ValueError(f"Model {input.model_id} not found")

        # Run inference, timing it with a monotonic clock
        start_time = time.perf_counter()
        result = await model.predict(
            text=input.text,
            temperature=input.temperature,
            max_tokens=input.max_tokens,
        )
        latency = int((time.perf_counter() - start_time) * 1000)

        # Track usage (assumes a custom context object that exposes `user`)
        await track_usage(info.context.user, result.token_usage)

        return PredictionResult(
            id=generate_id(),
            text=result.output,
            confidence=result.confidence,
            tokens=result.token_usage,
            model=model,
            latency_ms=latency,
        )

GraphQL Subscriptions for Streaming

Use subscriptions to stream AI model output token by token:

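from typing import AsyncGenerator

import strawberry

# CompletionInput, CompletionChunk and get_model are assumed to be defined elsewhere.


@strawberry.type
class Subscription:
    @strawberry.subscription
    async def stream_completion(
        self, input: CompletionInput
    ) -> AsyncGenerator[CompletionChunk, None]:
        model = await get_model(input.model_id)
        if not model:
            raise ValueError(f"Model {input.model_id} not found")

        # Forward each chunk to the client as soon as the model produces it
        async for chunk in model.stream(
            text=input.text,
            temperature=input.temperature,
        ):
            yield CompletionChunk(
                id=chunk.id,
                delta=chunk.text,
                finish_reason=chunk.finish_reason,
                index=chunk.index,
            )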
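
On the receiving end, a client appends each `delta` until a chunk reports a finish reason. The following sketch shows that loop with only the standard library; `fake_stream` is a hypothetical stand-in for `model.stream()`, and the simplified `CompletionChunk` dataclass is an assumption that mirrors the fields used in the subscription.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, List, Optional


@dataclass
class CompletionChunk:
    index: int
    delta: str
    finish_reason: Optional[str] = None


async def fake_stream(text: str) -> AsyncGenerator[CompletionChunk, None]:
    """Stand-in for model.stream(): emits one chunk per word."""
    words = text.split()
    for i, word in enumerate(words):
        await asyncio.sleep(0)  # yield control, as a real network read would
        last = i == len(words) - 1
        yield CompletionChunk(
            index=i,
            delta=word if last else word + " ",
            finish_reason="stop" if last else None,
        )


async def collect(stream: AsyncGenerator[CompletionChunk, None]) -> str:
    """Client side: append each delta until a chunk carries a finish reason."""
    parts: List[str] = []
    async for chunk in stream:
        parts.append(chunk.delta)
        if chunk.finish_reason is not None:
            break
    return "".join(parts)


print(asyncio.run(collect(fake_stream("tokens arrive one by one"))))
# tokens arrive one by one
```

A UI would render each `delta` as it arrives instead of joining the parts at the end; the accumulation logic is the same.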

GraphQL vs REST for AI APIs

| Criterion | Choose REST | Choose GraphQL |
| --- | --- | --- |
| API complexity | Single model, simple interface | Multiple models, complex queries |
| Clients | Diverse (curl, any language) | Modern frontends (React, mobile) |
| Caching | HTTP caching is straightforward | Requires Apollo/Relay caching |
| Team familiarity | Universal knowledge | Requires GraphQL expertise |
| File uploads | Native multipart support | Requires extra configuration |

Consider a hybrid approach: Many production AI platforms offer both REST and GraphQL. Use REST for simple prediction endpoints and GraphQL for complex management and analytics queries. This gives developers the choice that best fits their use case.
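
One way to keep a hybrid API maintainable is to put the prediction logic in a shared function and make the REST handler and the GraphQL resolver thin wrappers around it. The sketch below assumes this layering; `run_prediction`, `rest_predict`, and `graphql_predict` are hypothetical names, and the "model" simply reverses its input.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Prediction:
    model_id: str
    text: str


def run_prediction(model_id: str, text: str) -> Prediction:
    """Shared core logic; a stand-in that reverses the input text."""
    return Prediction(model_id=model_id, text=text[::-1])


def rest_predict(request_body: bytes) -> bytes:
    """REST-style handler: JSON body in, JSON body out."""
    body = json.loads(request_body)
    result = run_prediction(body["model_id"], body["text"])
    return json.dumps(asdict(result)).encode("utf-8")


def graphql_predict(model_id: str, text: str) -> Prediction:
    """GraphQL-style resolver: typed arguments in, typed object out."""
    return run_prediction(model_id, text)


print(rest_predict(b'{"model_id": "m1", "text": "abc"}'))
print(graphql_predict("m1", "abc"))
```

Because both entry points call `run_prediction`, behavior such as validation or usage tracking only has to be implemented once.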