Beginner

Introduction to Azure OpenAI Infrastructure

Understand the architecture, available models, and enterprise capabilities of Azure OpenAI Service — Microsoft's managed platform for deploying OpenAI foundation models.

What is Azure OpenAI Service?

Azure OpenAI Service provides enterprise-grade access to OpenAI's models (GPT-4, GPT-4o, DALL-E, Whisper) with Azure's security, compliance, and networking capabilities. Unlike using OpenAI directly, Azure OpenAI gives you data residency, private networking, and integration with Azure's identity and monitoring ecosystem.

Available Model Families

Model | Capabilities | Best For
GPT-4o / GPT-4o mini | Text, vision, audio | General-purpose, multimodal applications
GPT-4 | Text generation, reasoning | Complex reasoning, code generation
GPT-3.5 Turbo | Fast text generation | High-volume, cost-sensitive workloads
Embeddings | Text embeddings | Search, RAG, classification
DALL-E 3 | Image generation | Creative content, design
Whisper | Speech to text | Transcription, voice interfaces

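Good to know: With regional deployment types, Azure OpenAI Service processes your data within the Azure region you deploy to; global and data zone deployment types may process requests in other regions. Your prompts and completions are not made available to OpenAI, are not used to train models, and are covered by Azure's enterprise compliance certifications.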

Architecture Overview

Resource & Deployments

An Azure OpenAI resource contains one or more model deployments, each with its own quota and configuration.

Quotas & Limits

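Tokens-per-minute (TPM) quota is granted per subscription, per region, and per model, and you divide it among your deployments. Each deployment also gets a requests-per-minute (RPM) limit set in proportion to its TPM.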
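
As a rough illustration of how quota gets divided, the sketch below treats a regional TPM quota as a budget shared by several deployments. The deployment names, the 240,000 TPM figure, and both helper functions are hypothetical, and the RPM-per-TPM ratio is an assumption that varies by model, so check the current limits for yours.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    name: str  # hypothetical deployment name
    tpm: int   # tokens-per-minute assigned to this deployment


def remaining_tpm(quota_tpm: int, deployments: list[Deployment]) -> int:
    """Return the unallocated TPM, or raise if the deployments exceed the quota."""
    used = sum(d.tpm for d in deployments)
    if used > quota_tpm:
        raise ValueError(f"allocated {used} TPM exceeds quota of {quota_tpm} TPM")
    return quota_tpm - used


def derived_rpm(tpm: int, rpm_per_1k_tpm: int = 6) -> int:
    """Estimate an RPM limit from TPM. The default ratio is an assumption."""
    return tpm // 1000 * rpm_per_1k_tpm


# Hypothetical allocation against an assumed 240,000 TPM quota.
deployments = [
    Deployment("chat-prod", 150_000),
    Deployment("chat-dev", 30_000),
]
print(remaining_tpm(240_000, deployments))  # 60000
print(derived_rpm(150_000))                 # 900
```

Planning allocations this way makes it easier to see how much headroom is left before adding another deployment in the same region.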

Content Safety

Built-in content filtering for harmful content detection, with configurable severity thresholds.
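
To make "configurable severity thresholds" concrete, here is a minimal sketch of the decision a threshold implies. It assumes four harm categories (hate, sexual, violence, self-harm), each rated safe, low, medium, or high; the `blocked_categories` helper and the sample ratings are hypothetical, and the code does not call any Azure API.

```python
# Severity levels ordered from least to most severe (assumed scale).
SEVERITY_ORDER = ["safe", "low", "medium", "high"]


def blocked_categories(ratings: dict[str, str], threshold: str = "medium") -> list[str]:
    """Return the categories whose severity meets or exceeds the threshold."""
    limit = SEVERITY_ORDER.index(threshold)
    return [
        category
        for category, severity in ratings.items()
        if SEVERITY_ORDER.index(severity) >= limit
    ]


# Hypothetical ratings for a single piece of text.
ratings = {
    "hate": "safe",
    "sexual": "safe",
    "violence": "medium",
    "self_harm": "low",
}
print(blocked_categories(ratings))          # ['violence']
print(blocked_categories(ratings, "low"))   # ['violence', 'self_harm']
print(blocked_categories(ratings, "high"))  # []
```

Lowering the threshold makes the filter stricter, because more severity levels count as a match.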

API Compatibility

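The OpenAI Python SDK also works with Azure OpenAI, so migrating from OpenAI is mostly a matter of changing the endpoint, credentials, and API version, and passing your deployment name where you would pass a model name.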
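
One way to see what changes in a migration is to compare the REST requests the two services expect. The sketch below only builds URLs and headers with the standard library and sends nothing; the resource name, deployment name, key, and `api-version` value are placeholders, and the two helper functions are hypothetical.

```python
from urllib.parse import urlencode


def openai_request(api_key: str) -> tuple[str, dict[str, str]]:
    """URL and auth header for a chat completion call to OpenAI directly."""
    url = "https://api.openai.com/v1/chat/completions"
    return url, {"Authorization": f"Bearer {api_key}"}


def azure_request(
    resource: str, deployment: str, api_version: str, api_key: str
) -> tuple[str, dict[str, str]]:
    """URL and auth header for the same call against an Azure OpenAI deployment."""
    query = urlencode({"api-version": api_version})
    url = (
        f"https://{resource}.openai.azure.com/openai/deployments/"
        f"{deployment}/chat/completions?{query}"
    )
    # Azure key authentication uses an `api-key` header instead of a bearer token.
    return url, {"api-key": api_key}


# Placeholder values; substitute your own resource, deployment, and version.
url, headers = azure_request("my-resource", "chat-prod", "2024-02-01", "KEY")
print(url)
print(headers)
```

The deployment name appears in the URL path, which is why Azure requests are addressed to a deployment you created and not to a model name. In the OpenAI Python SDK the same differences show up as the `azure_endpoint` and `api_version` arguments of the `AzureOpenAI` client.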

Key takeaway: Azure OpenAI Service is the enterprise path to production LLM deployments. It offers the same OpenAI models with Azure's security, compliance, and networking features. The main infrastructure decisions you'll make concern deployment topology, quota management, pricing model (provisioned throughput units, or PTU, versus pay-as-you-go), and network architecture.