Intermediate

Implement NLP Solutions (25-30%)

This is the largest exam domain. Master Conversational Language Understanding (CLU), Text Analytics, Translator, question answering, custom text classification, and speech services to maximize your score.

Highest-Weighted Domain: NLP accounts for 25-30% of the exam. Expect 12-18 questions on these topics. Invest extra study time here for the biggest impact on your score.

Conversational Language Understanding (CLU)

CLU, the successor to LUIS, is the Azure AI Language feature for building natural language understanding models that interpret user intents and extract entities from text:

Key Concepts

  • Intents — What the user wants to do (e.g., BookFlight, GetWeather, OrderFood). Every project has a built-in None intent for out-of-scope utterances.
  • Entities — Key data to extract from utterances (e.g., destination city, date, food item). Types include learned, list, prebuilt, and regex entities.
  • Utterances — Example sentences labeled with intents and entities used to train the model.

CLU Workflow

  1. Create a Language resource in Azure Portal
  2. Create a CLU project in Language Studio
  3. Define intents (include the None intent with examples of out-of-scope queries)
  4. Define entities (learned, list, prebuilt, or regex)
  5. Add utterances (15-30 per intent minimum, label entities in each)
  6. Train the model (standard or advanced training)
  7. Evaluate using precision, recall, and F1 score per intent and entity
  8. Deploy the trained model to a named deployment (for example, production) and call it via REST API or SDK
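
Step 7 of the workflow reports precision, recall, and F1 per intent. As a rough illustration of what those numbers mean, the sketch below computes them from two parallel lists of expected and predicted intents. The function name and the sample labels are hypothetical; Language Studio calculates these metrics for you from the test set.

```python
from collections import Counter


def per_intent_metrics(expected, predicted):
    """Return {intent: (precision, recall, f1)} from two parallel label lists."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for want, got in zip(expected, predicted):
        if want == got:
            tp[want] += 1
        else:
            fp[got] += 1   # predicted this intent, but it was wrong
            fn[want] += 1  # missed the intent that was expected

    metrics = {}
    for intent in sorted(set(expected) | set(predicted)):
        p_den = tp[intent] + fp[intent]
        r_den = tp[intent] + fn[intent]
        precision = tp[intent] / p_den if p_den else 0.0
        recall = tp[intent] / r_den if r_den else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        metrics[intent] = (precision, recall, f1)
    return metrics


# Hypothetical test-set labels for illustration only.
expected = ["BookFlight", "BookFlight", "GetWeather", "None"]
predicted = ["BookFlight", "GetWeather", "GetWeather", "None"]

for intent, (p, r, f1) in per_intent_metrics(expected, predicted).items():
    print(f"{intent}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Low precision for an intent means other utterances are being pulled into it; low recall means its own utterances are leaking into other intents.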

# Call a CLU model
from azure.ai.language.conversations import ConversationAnalysisClient
from azure.core.credentials import AzureKeyCredential

client = ConversationAnalysisClient(
    endpoint="https://<resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<key>")
)

result = client.analyze_conversation(
    task={
        "kind": "Conversation",
        "analysisInput": {
            "conversationItem": {
                "id": "1",
                "text": "Book a flight to London next Friday",
                "participantId": "user1"
            }
        },
        "parameters": {
            "projectName": "FlightBooking",
            "deploymentName": "production"
        }
    }
)

top_intent = result["result"]["prediction"]["topIntent"]
entities = result["result"]["prediction"]["entities"]
print(f"Intent: {top_intent}")
for entity in entities:
    print(f"Entity: {entity['category']} = {entity['text']}")

💡 Exam Tip: The None intent is critical. Always add example utterances to the None intent (about 10% of total utterances). Without it, the model may incorrectly classify out-of-scope queries with high confidence.
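
Client code can also guard against weak predictions. The sketch below treats any top intent whose confidence falls under a cutoff as None. It assumes the prediction object carries an "intents" list of category and confidenceScore pairs alongside "topIntent"; the 0.7 threshold and the sample prediction are hypothetical values to tune for your own project.

```python
def resolve_intent(prediction, threshold=0.7):
    """Return the top intent, or "None" when its confidence is below threshold."""
    top = prediction["topIntent"]
    scores = {item["category"]: item["confidenceScore"] for item in prediction["intents"]}
    if scores.get(top, 0.0) < threshold:
        return "None"
    return top


# Hypothetical prediction, shaped like result["result"]["prediction"].
sample = {
    "topIntent": "BookFlight",
    "intents": [
        {"category": "BookFlight", "confidenceScore": 0.55},
        {"category": "None", "confidenceScore": 0.30},
    ],
    "entities": [],
}

print(resolve_intent(sample))                 # falls back because 0.55 < 0.7
print(resolve_intent(sample, threshold=0.5))  # accepts the top intent
```

This complements a well-populated None intent; it does not replace it.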

Text Analytics / Azure AI Language

Pre-built NLP capabilities available through the Azure AI Language service:

Key Features

| Feature | Description | Output |
| --- | --- | --- |
| Sentiment Analysis | Determine positive, negative, neutral, or mixed sentiment | Sentiment label + confidence scores per sentence and document |
| Key Phrase Extraction | Extract main topics and talking points | List of key phrases |
| Named Entity Recognition (NER) | Identify people, places, organizations, dates, quantities | Entity text + category + subcategory + confidence |
| Entity Linking | Link entities to Wikipedia entries for disambiguation | Entity + Wikipedia URL + data source |
| Language Detection | Detect the language of input text | Language name + ISO code + confidence score |
| PII Detection | Detect and redact personally identifiable information | PII entities + redacted text |
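
These features share one request shape in the Language REST API (the analyze-text operation), differing mainly in the "kind" field. The sketch below builds such a request body without sending it. The helper name and the short feature keys are hypothetical, and the kind strings reflect my understanding of the API, so verify them against the current API reference before relying on them.

```python
import json

# Request "kind" values for the prebuilt features (assumed; check the API reference).
KINDS = {
    "sentiment": "SentimentAnalysis",
    "key_phrases": "KeyPhraseExtraction",
    "ner": "EntityRecognition",
    "entity_linking": "EntityLinking",
    "language_detection": "LanguageDetection",
    "pii": "PiiEntityRecognition",
}


def build_analyze_text_body(feature, texts, language="en"):
    """Build a JSON-serializable analyze-text body for one prebuilt feature."""
    documents = []
    for i, text in enumerate(texts, start=1):
        doc = {"id": str(i), "text": text}
        # Language detection has no known language to declare up front.
        if feature != "language_detection":
            doc["language"] = language
        documents.append(doc)
    return {
        "kind": KINDS[feature],
        "parameters": {"modelVersion": "latest"},
        "analysisInput": {"documents": documents},
    }


body = build_analyze_text_body("pii", ["Call me at 555-0100"])
print(json.dumps(body, indent=2))
```

The body would be sent with an HTTP POST to the resource's analyze-text endpoint, authenticated with the resource key.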

Custom Text Classification

When pre-built models do not fit your needs, you can train custom models:

  • Single-label classification — Assign exactly one category per document (e.g., support ticket routing)
  • Multi-label classification — Assign one or more categories per document (e.g., article tagging)
  • Training data is stored in Azure Blob Storage as text files with a JSON labels file
  • Minimum 10 labeled documents per class; 50+ recommended for good accuracy
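
Before training, it can help to confirm that every class meets the minimum document count. The sketch below does this over a simplified, hypothetical mapping of file names to class labels; it is not the actual labels-file schema, which you would parse into this shape first.

```python
from collections import Counter


def classes_below_minimum(doc_labels, minimum=10):
    """Return {class_name: count} for classes with fewer than `minimum` documents.

    doc_labels maps a document name to a list of class names
    (one name for single-label projects, one or more for multi-label).
    """
    counts = Counter(label for labels in doc_labels.values() for label in labels)
    return {cls: n for cls, n in counts.items() if n < minimum}


# Hypothetical labeled set: 12 billing tickets, 3 shipping tickets.
labels = {f"billing_{i}.txt": ["Billing"] for i in range(12)}
labels.update({f"shipping_{i}.txt": ["Shipping"] for i in range(3)})

print(classes_below_minimum(labels))
```

Classes with no labeled documents at all never appear in the mapping, so compare the result against your intended class list as well.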

Custom Named Entity Recognition

Extract domain-specific entities that pre-built NER does not cover:

  • Label training documents with custom entity types in Language Studio
  • Example: Extract product codes, contract numbers, or medical terms
  • Requires at least 10 labeled documents; 50+ recommended

Question Answering

Build a knowledge base that answers natural language questions from your content:

  • Sources — Import from URLs (FAQ pages), PDF/Word documents, or manual QnA pairs
  • Multi-turn conversations — Create follow-up prompts for guided conversations
  • Chit-chat — Add a pre-built personality (professional, friendly, witty, caring, enthusiastic)
  • Active learning — Review user queries that the model was uncertain about to improve accuracy
  • Precise answering — Return the exact answer span within a longer passage

💡 Exam Tip: Question Answering replaces QnA Maker. Know the migration path and that the new service is part of Azure AI Language (not a separate resource type).
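
To make precise answering concrete, the sketch below builds a query body that asks for an answer span and then picks the span (or the full answer) from a response. The helper names and the sample response are hypothetical, and the field names follow my understanding of the question answering REST API, so check them against the API reference.

```python
def build_qa_query(question, top=3, min_confidence=0.5, precise=True):
    """Build a query body; `precise` requests the short answer span."""
    body = {
        "question": question,
        "top": top,
        "confidenceScoreThreshold": min_confidence,
    }
    if precise:
        body["answerSpanRequest"] = {"enable": True}
    return body


def best_answer(response):
    """Return the precise span of the most confident answer, else its full text."""
    answers = response.get("answers", [])
    if not answers:
        return None
    top = max(answers, key=lambda a: a["confidenceScore"])
    span = top.get("answerSpan")
    return span["text"] if span else top["answer"]


# Hypothetical response for illustration only.
sample_response = {
    "answers": [
        {
            "answer": "You can return items within 30 days of delivery.",
            "confidenceScore": 0.91,
            "answerSpan": {"text": "within 30 days of delivery", "confidenceScore": 0.88},
        },
        {"answer": "Shipping takes 3 to 5 business days.", "confidenceScore": 0.42},
    ]
}

print(build_qa_query("What is the return policy?"))
print(best_answer(sample_response))
```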

Translator Service

Azure Translator provides real-time text and document translation:

  • Text Translation — Translate text between 100+ languages via REST API
  • Document Translation — Translate entire documents (PDF, DOCX, HTML, etc.) while preserving formatting
  • Custom Translator — Train custom translation models with your domain-specific terminology using parallel documents
  • Transliteration — Convert text from one script to another (e.g., Japanese Kanji to Latin)
  • Language detection — Auto-detect source language when not specified
  • Dictionary lookup — Get alternative translations and usage examples
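
As an illustration of the Text Translation REST call, the sketch below assembles the URL, headers, and JSON body for a version 3.0 translate request without sending it. The helper name is hypothetical, and the key and region values are placeholders; omitting the source language leaves detection to the service.

```python
import json
from urllib.parse import urlencode

TRANSLATOR_ENDPOINT = "https://api.cognitive.microsofttranslator.com"


def build_translate_request(texts, to_languages, source_language=None,
                            key="<key>", region="<region>"):
    """Return (url, headers, body) for a Translator v3.0 translate call."""
    params = [("api-version", "3.0")]
    if source_language:
        params.append(("from", source_language))
    # One "to" parameter per target language.
    params += [("to", lang) for lang in to_languages]
    url = f"{TRANSLATOR_ENDPOINT}/translate?{urlencode(params)}"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Ocp-Apim-Subscription-Region": region,
        "Content-Type": "application/json",
    }
    body = json.dumps([{"Text": text} for text in texts])
    return url, headers, body


url, headers, body = build_translate_request(["Hello"], ["fr", "de"])
print(url)
print(body)
```

The tuple can be passed to any HTTP client as a POST request.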

Speech Services

Azure Speech service provides speech-to-text, text-to-speech, speech translation, and speaker recognition:

Speech-to-Text

  • Real-time transcription — Streaming audio to text
  • Batch transcription — Process stored audio files at scale
  • Custom Speech — Train custom models for domain-specific vocabulary, accents, or noisy environments
  • Phrase lists — Boost recognition of specific terms without full custom training

Text-to-Speech

  • Neural voices — Natural-sounding synthesis in 100+ languages
  • Custom Neural Voice — Create a unique voice from training recordings
  • SSML — Speech Synthesis Markup Language for controlling pronunciation, rate, pitch, pauses, and emphasis

<!-- SSML example -->
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <voice name="en-US-JennyNeural">
    <prosody rate="-10%" pitch="+5%">
      Welcome to Azure AI services.
    </prosody>
    <break time="500ms"/>
    <emphasis level="strong">Let us get started.</emphasis>
  </voice>
</speak>
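
SSML like the example above is often generated in code so that user-supplied text is escaped correctly. The sketch below builds a similar document and parses it to confirm it is well-formed XML. The helper name and its default values are hypothetical; the resulting string would be handed to a speech synthesizer.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, voice="en-US-JennyNeural", rate="-10%", pitch="+5%",
               pause_ms=500, lang="en-US"):
    """Wrap text in speak/voice/prosody elements, followed by a pause."""
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice)}>"
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"{escape(text)}"  # escape &, <, > so the document stays valid
        "</prosody>"
        f'<break time="{int(pause_ms)}ms"/>'
        "</voice></speak>"
    )


ssml = build_ssml("Q&A starts now.")
ET.fromstring(ssml)  # raises ParseError if the markup is malformed
print(ssml)
```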

Practice Questions

📝 Question 1: You are building a chatbot that needs to understand user intents like "book a room" and extract entities like dates and room types. The existing pre-built models do not cover your domain-specific terminology. Which service should you use?

A. Text Analytics sentiment analysis
B. Conversational Language Understanding (CLU)
C. Question Answering
D. Azure OpenAI GPT-4

Answer: B. CLU is designed for intent classification and entity extraction from conversational text. You define custom intents, entities, and provide labeled utterances to train a model specific to your domain. Question Answering is for FAQ-style responses, not intent/entity extraction.

📝 Question 2: Your CLU model has high confidence scores but frequently misclassifies out-of-scope user queries as valid intents. What is the most likely cause?

A. Too many intents defined
B. The None intent has too few or no example utterances
C. The model needs more training iterations
D. The entities are not properly labeled

Answer: B. The None intent must have representative examples of out-of-scope queries. Without sufficient None utterances, the model has no examples of irrelevant input and will force-classify everything into one of the defined intents, even with high confidence.

📝 Question 3: A healthcare company needs to detect and redact patient names, medical record numbers, and Social Security numbers from clinical notes before sharing them with researchers. Which feature should you use?

A. Named Entity Recognition (NER)
B. PII Detection with redaction
C. Key Phrase Extraction
D. Custom text classification

Answer: B. PII Detection identifies personally identifiable information and can return redacted text with each PII entity masked (by default, its characters are replaced with asterisks). Standard NER identifies entities but does not provide redaction functionality.

📝 Question 4: You need to translate product documentation from English to 15 languages while preserving the original document formatting (tables, headers, images). Which Translator capability should you use?

A. Text Translation API
B. Document Translation
C. Custom Translator
D. Transliteration API

Answer: B. Document Translation translates entire documents (PDF, DOCX, HTML, PPTX, etc.) while preserving the original structure and formatting. Text Translation only handles plain text strings. Custom Translator is for training domain-specific translation models.

📝 Question 5: You are building a voice-enabled application. The application must pronounce technical terms correctly and add pauses between sections. What should you use to control the speech output?

A. Custom Neural Voice
B. Phrase lists
C. SSML (Speech Synthesis Markup Language)
D. Audio Content Creation tool

Answer: C. SSML provides fine-grained control over speech output including pronunciation (<phoneme>), pauses (<break>), rate, pitch, emphasis, and voice selection. Custom Neural Voice creates a new voice but does not control pronunciation of specific terms.

Key Takeaways

  • CLU (the successor to LUIS) is for intent classification and entity extraction. Always populate the None intent.
  • Text Analytics provides pre-built NLP: sentiment, key phrases, NER, PII detection, and language detection.
  • Question Answering replaces QnA Maker and is now part of Azure AI Language.
  • Document Translation preserves formatting; Text Translation handles plain text only.
  • SSML controls speech synthesis: pronunciation, pauses, rate, pitch, and emphasis.
  • This domain is 25-30% of the exam — know these services thoroughly.