Best Practices
Learn team workflow management, quality control strategies, inter-annotator agreement, choosing export formats, and scaling your annotation pipeline for production ML projects.
Annotation Guidelines
The single most important factor for annotation quality is clear, comprehensive guidelines. Your guidelines should include:
- Definitions: Precise definition of each label category with boundary cases
- Examples: Positive and negative examples for each label
- Edge cases: How to handle ambiguous situations
- Conventions: Whether to include punctuation in NER spans, how tight bounding boxes should be, etc.
- Versioning: Track guideline changes and re-annotate affected tasks when definitions change
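The checklist above can also be kept in machine-readable form so each annotated task records which guideline version it was labeled under. A minimal sketch; the schema and field names here are illustrative assumptions, not a Label Studio feature:

```python
# Illustrative guideline record; this schema is an assumption,
# not part of Label Studio.
GUIDELINES = {
    "version": "2.1",
    "labels": {
        "Person": {
            "definition": "A named individual human.",
            "positive_examples": ["John", "Dr. Mary Smith"],
            "negative_examples": ["the CEO", "they"],
            "conventions": "Exclude titles unless part of the name.",
        },
    },
    "changelog": {
        "2.1": "Clarified that honorifics are excluded from Person spans.",
    },
}

def needs_reannotation(task_guideline_version: str,
                       current: str = GUIDELINES["version"]) -> bool:
    """Flag tasks annotated under an older guideline version."""
    return task_guideline_version != current
```

Storing the version alongside each task makes it cheap to find exactly which tasks to re-annotate after a definition changes.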
Quality Control Strategies
- Overlap / Redundancy: Have multiple annotators label the same tasks (typically 2-3x overlap). Use agreement metrics to identify problematic tasks and annotators who need retraining.
- Review Workflow: Assign senior annotators or domain experts as reviewers. They approve, reject, or correct annotations before they enter the training dataset.
- Gold Standard Tasks: Mix in pre-labeled "gold" tasks with known correct answers. Monitor annotator accuracy on these to detect quality drops.
- Spot Checks: Randomly sample completed annotations for manual review. Calculate per-annotator accuracy and provide feedback.
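Gold-standard monitoring from the list above fits in a few lines of code. The function below is a sketch: the annotation record shape (`annotator`, `task_id`, `label`) is an illustrative assumption, not Label Studio's export schema.

```python
from collections import defaultdict

def gold_accuracy(annotations, gold):
    """Per-annotator accuracy on gold tasks.

    `annotations`: list of {"annotator", "task_id", "label"} dicts
    (an illustrative shape, not Label Studio's export schema);
    `gold`: mapping of task_id -> correct label.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for a in annotations:
        if a["task_id"] in gold:  # only score tasks with known answers
            totals[a["annotator"]] += 1
            hits[a["annotator"]] += a["label"] == gold[a["task_id"]]
    return {who: hits[who] / totals[who] for who in totals}
```

Running this on each day's completed work gives an early-warning signal: a sudden drop in an annotator's gold accuracy usually precedes a drop in overall quality.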
Inter-Annotator Agreement
Measure how consistently your annotators label the same data. Common metrics include:
| Metric | Use Case | Range |
|---|---|---|
| Cohen's Kappa | Two annotators, categorical labels | -1 to 1 (>0.8 = excellent) |
| Fleiss' Kappa | Multiple annotators, categorical labels | -1 to 1 (>0.6 = good) |
| IoU (Jaccard) | Bounding boxes, segmentation | 0 to 1 (>0.7 = good) |
| F1 Score | NER span matching | 0 to 1 (>0.8 = good) |
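Cohen's kappa from the table above is straightforward to compute by hand: observed agreement minus chance agreement, normalized by the maximum possible improvement over chance. A minimal sketch for two annotators with categorical labels (it assumes the annotators do not agree perfectly by chance, i.e. expected agreement is below 1):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same tasks."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of tasks where both chose the same label.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)
```

For example, with labels `["cat", "cat", "dog", "dog"]` and `["cat", "cat", "dog", "cat"]`, observed agreement is 0.75 and chance agreement is 0.5, giving kappa = 0.5.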
Export Format Selection
Choose your export format based on your ML framework and task type:
```
# Object Detection
YOLO        → YOLOv5/v8 training
COCO        → Detectron2, MMDetection
Pascal VOC  → TensorFlow Object Detection API

# NLP / Text
spaCy  → spaCy NER training
CoNLL  → Sequence labeling (CRF, BiLSTM)
JSON   → Custom pipelines, HuggingFace

# General
JSON      → Most flexible, full annotation data
JSON-MIN  → Simplified, smaller file size
CSV       → Classification tasks, spreadsheets
```
Using the API
Automate your annotation pipeline with the Label Studio API:
```python
from label_studio_sdk import Client

# Connect to Label Studio
ls = Client(
    url="http://localhost:8080",
    api_key="your-api-key",
)

# Create a project
project = ls.start_project(
    title="NER Project",
    label_config="""
    <View>
      <Labels name="ner" toName="text">
        <Label value="Person" />
        <Label value="Organization" />
      </Labels>
      <Text name="text" value="$text" />
    </View>
    """,
)

# Import tasks
project.import_tasks([
    {"text": "John works at Google."},
    {"text": "Mary founded Acme Corp."},
])

# Export annotations
annotations = project.export_tasks(export_type="JSON")
```
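Exported JSON can then be flattened into training rows. The sketch below assumes Label Studio's standard JSON export layout, where each task carries `data` plus `annotations[].result[].value` spans with `start`, `end`, and `labels`; treat the exact field layout as something to verify against your own export:

```python
def extract_ner_spans(tasks):
    """Flatten exported tasks into (text, start, end, label) rows.

    Assumes the standard Label Studio JSON export layout:
    each task has data.text and annotations[].result[].value spans.
    Verify field names against your own export before relying on this.
    """
    rows = []
    for task in tasks:
        text = task["data"]["text"]
        for annotation in task.get("annotations", []):
            for region in annotation.get("result", []):
                value = region["value"]
                for label in value.get("labels", []):
                    rows.append((text, value["start"], value["end"], label))
    return rows
```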
Scaling Tips
Use PostgreSQL
Switch from SQLite to PostgreSQL for projects with more than 10,000 tasks or multiple concurrent annotators.
Cloud Storage
Store data files in S3/GCS/Azure instead of uploading directly. This reduces server load and enables larger datasets.
Task Distribution
Use task assignment rules to distribute work evenly across annotators and prevent duplicate effort.
Monitor Progress
Track annotation speed, quality metrics, and remaining tasks. Set daily targets and identify bottlenecks early.
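The progress tracking described above amounts to simple throughput arithmetic; a back-of-the-envelope sketch where all numbers and names are illustrative:

```python
import math

def days_to_finish(total_tasks, done_tasks,
                   tasks_per_annotator_per_day, annotators):
    """Estimate remaining working days at current team throughput."""
    remaining = total_tasks - done_tasks
    daily_throughput = tasks_per_annotator_per_day * annotators
    return math.ceil(remaining / daily_throughput)
```

For example, 6,000 remaining tasks with four annotators each completing 300 tasks per day works out to five working days; rerunning the estimate daily surfaces bottlenecks as soon as throughput dips.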
Common Mistakes to Avoid
- Starting annotation without clear guidelines
- Not measuring inter-annotator agreement
- Using the wrong export format for your ML framework
- Not backing up your Label Studio database regularly
- Ignoring annotator feedback about ambiguous cases
Course Summary
Congratulations on completing the Label Studio course! You have learned how to install and configure Label Studio, create labeling templates for images and text, connect ML backends for assisted labeling, and implement quality control workflows for production annotation pipelines.
Lilly Tech Systems