Span Categorization · Prodigy · An annotation tool for AI, Machine Learning & NLP

Fast and flexible annotation

Prodigy’s web-based annotation app has been carefully designed to be as efficient as possible. The manual interface lets you label spans by highlighting text by hand. Your annotations snap to token boundaries, and you can mark single-word spans by double-clicking.
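
To make the "snap to token boundaries" behaviour concrete, here is a minimal sketch in Python. It is not Prodigy's implementation: the whitespace tokenizer and the helper names `tokenize` and `snap_to_tokens` are assumptions made for illustration, and a real spaCy tokenizer would split text differently (for example around punctuation).

```python
import re


def tokenize(text):
    """Return (start, end) character offsets for each whitespace-delimited token.

    Assumption: whitespace tokenization stands in for a real tokenizer.
    """
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def snap_to_tokens(text, start, end):
    """Expand a character selection so that it covers whole tokens.

    Returns (start, end) character offsets, or None if the selection
    does not overlap any token.
    """
    touched = [(s, e) for s, e in tokenize(text) if s < end and e > start]
    if not touched:
        return None
    return touched[0][0], touched[-1][1]


text = "Patients with septic shock were enrolled."
# A sloppy selection from the middle of "septic" to the middle of "shock".
start, end = snap_to_tokens(text, 17, 24)
print(text[start:end])
```

Because the selection is expanded to the first and last token it touches, an imprecise drag still yields a clean span.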

patterns.jsonl
{"pattern": "septic shock", "label": "CONDITION"}
{"pattern": [{"like_num": true}, {"orth": "-"}, {"lower": "day"}, {"lower": "mortality"}], "label": "EFFECT"}

Bootstrap with powerful patterns

Prodigy is a fully scriptable annotation tool, letting you automate as much as possible with custom rule-based logic. You don’t want to waste time labeling every instance of common phrases by hand. Instead, give Prodigy rules or a list of examples, review the spans in context and annotate the exceptions.
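
As a rough illustration of turning "a list of examples" into rules, the sketch below converts seed phrases into pattern entries in the same JSONL shape as the patterns.jsonl sample on this page. The seed terms, the helper name `phrase_to_pattern`, and the whitespace splitting are assumptions for illustration only. spaCy's tokenizer may split some phrases differently, so generated patterns should be checked against real text.

```python
import json


def phrase_to_pattern(phrase, label):
    """Build one case-insensitive token pattern entry from a phrase.

    Assumption: splitting on whitespace approximates the tokenizer.
    """
    tokens = [{"lower": word.lower()} for word in phrase.split()]
    return {"pattern": tokens, "label": label}


# Hypothetical seed terms, grouped by label.
seed_terms = {"CONDITION": ["septic shock", "Acute kidney injury"]}

lines = [
    json.dumps(phrase_to_pattern(phrase, label))
    for label, phrases in seed_terms.items()
    for phrase in phrases
]

# One JSON object per line, ready to be saved as a .jsonl file.
print("\n".join(lines))
```

The printed lines could be saved to a file and passed to a recipe that accepts match patterns, for instance through the `--patterns` option of spans.manual.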

Immediately train spancat models

Once you've got your first annotations, you can have Prodigy train a spaCy model for span categorization right away. Point the train recipe at the datasets of interest and you immediately get a trained machine learning pipeline. You can even train a model that handles multiple tasks, and override the settings from the command line.

From here, you can reuse the model in spans.correct, which pre-highlights the model's suggested spans so that you only need to correct them.

Example
prodigy train ./spancat-model --spancat dataset_a,dataset_b --training.max-steps 1000

Example
prodigy spans.correct spans-dataset ./spancat-model examples.jsonl --label condition,effect
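
The examples.jsonl source in the command above is a newline-delimited JSON file. A minimal sketch of producing one from raw texts follows, assuming each record only needs a "text" field. The sample sentences and the helper name `to_jsonl` are hypothetical.

```python
import json

# Hypothetical raw texts to annotate.
texts = [
    "Patients with septic shock were enrolled.",
    "The primary outcome was 28-day mortality.",
]


def to_jsonl(texts):
    """Serialize each text as one JSON object per line."""
    return "".join(json.dumps({"text": t}) + "\n" for t in texts)


jsonl = to_jsonl(texts)
print(jsonl, end="")
```

Saving this output as examples.jsonl would give one annotation task per line.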

View the documentation