Intermediate

Text Labeling

Create training data for NLP models with named entity recognition, text classification, sentiment analysis, and relation extraction using Label Studio.

Named Entity Recognition (NER)

NER identifies and classifies named entities in text, such as people, organizations, locations, and dates. Annotators highlight text spans and assign an entity label to each span.

XML - NER Template
<View>
  <Labels name="ner" toName="text">
    <Label value="Person" background="#FF6B6B" />
    <Label value="Organization" background="#4ECDC4" />
    <Label value="Location" background="#45B7D1" />
    <Label value="Date" background="#FFA07A" />
    <Label value="Money" background="#98D8C8" />
  </Labels>
  <Text name="text" value="$text" />
</View>

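Import data as a JSON list of tasks. Each task needs a text key, because value="$text" in the template reads the text to display from that field: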

JSON
[
  {"text": "Apple Inc. announced today that CEO Tim Cook will visit Paris on March 15, 2026 to discuss a $2B investment."},
  {"text": "Google's headquarters in Mountain View, California employs over 180,000 people worldwide."}
]
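
If the raw texts already live in Python, an import file like the one above can be produced in a few lines. This is a minimal sketch: the helper name build_tasks and the file name tasks.json are illustrative choices, not part of Label Studio. The only requirement taken from the template is that each task carries a text key matching $text.

```python
import json


def build_tasks(texts):
    """Wrap each non-empty string in a task dict keyed by "text" (matches value="$text")."""
    return [{"text": t.strip()} for t in texts if t and t.strip()]


if __name__ == "__main__":
    raw = [
        "Apple Inc. announced today that CEO Tim Cook will visit Paris.",
        "   ",  # whitespace-only entries are skipped
    ]
    # tasks.json is a hypothetical output path; import it through the Label Studio UI.
    with open("tasks.json", "w", encoding="utf-8") as fh:
        json.dump(build_tasks(raw), fh, ensure_ascii=False, indent=2)
```

Skipping empty strings up front avoids creating tasks that annotators can only submit as blank.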

Text Classification

Classify entire documents or sentences into categories. Useful for sentiment analysis, topic classification, spam detection, and intent recognition:

XML - Text Classification Template
<View>
  <Text name="text" value="$text" />

  <Header value="What is the sentiment?" />
  <Choices name="sentiment" toName="text"
           choice="single" showInline="true">
    <Choice value="Positive" />
    <Choice value="Neutral" />
    <Choice value="Negative" />
  </Choices>

  <Header value="Select all applicable topics" />
  <Choices name="topics" toName="text"
           choice="multiple" showInline="true">
    <Choice value="Technology" />
    <Choice value="Business" />
    <Choice value="Science" />
    <Choice value="Politics" />
    <Choice value="Sports" />
  </Choices>
</View>
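
In Label Studio's JSON export, each Choices block shows up as one entry in an annotation's result list, with from_name set to the tag's name and the selected values under value.choices. The sketch below flattens one such list into a dictionary. The function name choices_by_control and the sample data are hypothetical and written by hand for illustration.

```python
def choices_by_control(result):
    """Map each Choices control name to the list of values the annotator selected."""
    selected = {}
    for item in result:
        if item.get("type") == "choices":
            selected.setdefault(item["from_name"], []).extend(item["value"]["choices"])
    return selected


# Hand-written sample shaped like an exported annotation's "result" list.
sample_result = [
    {"from_name": "sentiment", "to_name": "text", "type": "choices",
     "value": {"choices": ["Positive"]}},
    {"from_name": "topics", "to_name": "text", "type": "choices",
     "value": {"choices": ["Technology", "Business"]}},
]

labels = choices_by_control(sample_result)
sentiment = labels.get("sentiment", [None])[0]  # single-choice control: take the one value
topics = labels.get("topics", [])               # multi-choice control: keep the whole list
```

Keeping the single-choice and multi-choice controls separate maps naturally onto a multi-class target (sentiment) and a multi-label target (topics).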

Relation Extraction

Extract relationships between entities in text. First label entities, then draw relations between them:

XML - Relation Extraction Template
<View>
  <Relations>
    <Relation value="works_at" />
    <Relation value="founded_by" />
    <Relation value="located_in" />
    <Relation value="acquired_by" />
  </Relations>
  <Labels name="ner" toName="text">
    <Label value="Person" background="#FF6B6B" />
    <Label value="Organization" background="#4ECDC4" />
    <Label value="Location" background="#45B7D1" />
  </Labels>
  <Text name="text" value="$text" />
</View>
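
In the JSON export, labeled spans carry an id, and each relation is a separate result entry of type relation that points at two spans through from_id and to_id. The sketch below resolves those ids into (source, relation, target) triples. The function name relation_triples and the sample data are hypothetical; a relation drawn without choosing a label is reported here as "unlabeled", which is this sketch's convention.

```python
def relation_triples(result):
    """Return (source_text, relation_label, target_text) for each relation in a result list."""
    # Index labeled spans by region id so relations can be resolved to their text.
    spans = {r["id"]: r["value"]["text"] for r in result if r.get("type") == "labels"}
    triples = []
    for r in result:
        if r.get("type") != "relation":
            continue
        # A relation without a chosen label has a missing or empty "labels" list.
        for label in r.get("labels") or ["unlabeled"]:
            triples.append((spans[r["from_id"]], label, spans[r["to_id"]]))
    return triples


# Hand-written sample shaped like an exported annotation's "result" list.
sample_result = [
    {"id": "e1", "from_name": "ner", "to_name": "text", "type": "labels",
     "value": {"start": 36, "end": 44, "text": "Tim Cook", "labels": ["Person"]}},
    {"id": "e2", "from_name": "ner", "to_name": "text", "type": "labels",
     "value": {"start": 0, "end": 10, "text": "Apple Inc.", "labels": ["Organization"]}},
    {"from_id": "e1", "to_id": "e2", "type": "relation",
     "direction": "right", "labels": ["works_at"]},
]

triples = relation_triples(sample_result)
```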

NER + Classification Combined

Combine entity tagging with document-level classification in a single annotation task:

XML - Combined NER + Classification
<View>
  <Labels name="entities" toName="text">
    <Label value="Product" background="#FF6B6B" />
    <Label value="Feature" background="#4ECDC4" />
    <Label value="Issue" background="#FFA07A" />
  </Labels>
  <Text name="text" value="$text" />
  <Choices name="review_type" toName="text">
    <Choice value="Positive Review" />
    <Choice value="Negative Review" />
    <Choice value="Bug Report" />
    <Choice value="Feature Request" />
  </Choices>
</View>

Annotation guidelines: Write clear annotation guidelines before starting a labeling project. Define what counts as each entity type, say how to handle ambiguous cases, and include worked examples. Annotators who resolve borderline cases the same way produce more consistent labels, which raises inter-annotator agreement.
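
Inter-annotator agreement can be measured rather than assumed. One common measure for two annotators who each assign one label per item is Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The sketch below is a plain-Python illustration with made-up labels; it is not a Label Studio feature.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who each gave one label per item."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items where both annotators chose the same label.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: per-label product of the two annotators' frequencies, summed.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Made-up labels for four items.
annotator_1 = ["Positive", "Negative", "Neutral", "Positive"]
annotator_2 = ["Positive", "Negative", "Positive", "Positive"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

Comparing kappa before and after a guideline revision on the same sample of items is one way to check whether the revision helped.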

Audio Transcription

Label Studio also supports audio annotation for speech-to-text, speaker diarization, and sound event detection:

XML - Audio Transcription Template
<View>
  <Audio name="audio" value="$audio" />
  <Header value="Transcribe the audio" />
  <TextArea name="transcription" toName="audio"
            rows="4" editable="true"
            maxSubmissions="1" />
  <Choices name="quality" toName="audio">
    <Choice value="Clear" />
    <Choice value="Noisy" />
    <Choice value="Unintelligible" />
  </Choices>
</View>
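
In the JSON export, a TextArea result stores what the annotator typed as a list of strings under value.text, and the Choices result stores the selected quality under value.choices. The sketch below pulls both out of one annotation. The function name transcript_and_quality and the sample data are hypothetical; the control names match the template above.

```python
def transcript_and_quality(result):
    """Pull the transcription text and quality flag out of one annotation's result list."""
    transcript, quality = "", None
    for item in result:
        if item.get("from_name") == "transcription":
            # TextArea values are a list of strings; join them into one transcript.
            transcript = " ".join(item["value"]["text"]).strip()
        elif item.get("from_name") == "quality":
            quality = item["value"]["choices"][0]
    return transcript, quality


# Hand-written sample shaped like an exported annotation's "result" list.
sample_result = [
    {"from_name": "transcription", "to_name": "audio", "type": "textarea",
     "value": {"text": ["hello and welcome to the show"]}},
    {"from_name": "quality", "to_name": "audio", "type": "choices",
     "value": {"choices": ["Clear"]}},
]

transcript, quality = transcript_and_quality(sample_result)
```
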
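
Exporting NER annotations: When exporting NER data, choose the format that matches your training framework. Use the CoNLL 2003 export for sequence labeling, and the native JSON export, which keeps character offsets, for custom training pipelines. Label Studio does not export spaCy's binary format directly; for spaCy models, export CoNLL 2003 and convert the file with spaCy's spacy convert command.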
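
For a custom pipeline, the native JSON export can be reduced to character-offset spans per text. The sketch below assumes the export has already been loaded with json.load and that each task keeps its text under data.text and its labels under annotations[0].result. The function name export_to_spans is hypothetical, and using only the first annotation is a simplification; projects with several annotators per task need a merge step.

```python
def export_to_spans(tasks, control="ner"):
    """Convert a parsed JSON export into (text, [(start, end, label), ...]) pairs."""
    examples = []
    for task in tasks:
        text = task["data"]["text"]
        annotations = task.get("annotations") or []
        if not annotations:
            continue  # task was never labeled
        # Assumption: take the first annotation only.
        spans = []
        for item in annotations[0]["result"]:
            if item.get("type") == "labels" and item.get("from_name") == control:
                value = item["value"]
                spans.append((value["start"], value["end"], value["labels"][0]))
        examples.append((text, sorted(spans)))
    return examples


# Hand-written sample shaped like a one-task JSON export.
sample_export = [{
    "data": {"text": "Tim Cook will visit Paris."},
    "annotations": [{"result": [
        {"id": "b", "from_name": "ner", "to_name": "text", "type": "labels",
         "value": {"start": 20, "end": 25, "text": "Paris", "labels": ["Location"]}},
        {"id": "a", "from_name": "ner", "to_name": "text", "type": "labels",
         "value": {"start": 0, "end": 8, "text": "Tim Cook", "labels": ["Person"]}},
    ]}],
}]

examples = export_to_spans(sample_export)
```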

What's Next?

In the next lesson, we will explore ML-assisted labeling: connecting ML backends to provide pre-annotations and using active learning to prioritize the most informative samples.