Radically efficient data annotation tool

Fully scriptable.
Made for machine learning and developers.

From the creators of spaCy
pip install ./prodigy.whl
Successfully installed prodigy

prodigy ner.manual reviews_ner en_core_web_sm ./data.jsonl --label PRODUCT,PERSON,ORG

✨ Starting the web server on port 8080...
Open the app in your browser and start annotating!

Train a new AI model in hours

Prodigy is a scriptable annotation tool so efficient that data scientists can do the annotation themselves, enabling a new level of rapid iteration.

Today’s transfer learning technologies mean you can train production-quality models with very few examples. With Prodigy you can take full advantage of modern machine learning by adopting a more agile approach to data collection. You'll move faster, be more independent and ship far more successful projects.

The missing piece in your data science workflow

Prodigy brings together state-of-the-art insights from machine learning and user experience. With its continuous active learning system, you're only asked to annotate examples the model does not already know the answer to. The web application is powerful, extensible and follows modern UX principles. The secret is very simple: it's designed to help you focus on one decision at a time and keep you clicking – like Tinder for data.
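To make the active learning idea concrete, here is a minimal sketch of uncertainty-based filtering in plain Python. It is not Prodigy's implementation: the `filter_uncertain` helper, the `toy_score` function, the `margin` parameter and the example scores are all hypothetical, and a real setup would take its scores from a model that keeps updating as you annotate.

```python
from typing import Callable, Dict, Iterable, Iterator


def filter_uncertain(
    stream: Iterable[Dict],
    score: Callable[[Dict], float],
    margin: float = 0.2,
) -> Iterator[Dict]:
    """Yield only the examples whose score lies within `margin` of 0.5."""
    for example in stream:
        probability = score(example)
        # A score near 0.5 means the model is unsure, so a human should decide.
        if abs(probability - 0.5) <= margin:
            yield example


def toy_score(example: Dict) -> float:
    # Hypothetical stand-in for a real model: reads a made-up, precomputed score.
    return example["score"]


examples = [
    {"text": "I love my new phone", "score": 0.97},
    {"text": "The battery is okay I guess", "score": 0.55},
    {"text": "Apple shipped it late", "score": 0.38},
    {"text": "Terrible, never again", "score": 0.02},
]

queue = list(filter_uncertain(examples, toy_score))
print([eg["text"] for eg in queue])
```

Only the two examples with scores close to 0.5 reach the annotation queue. The confident predictions are skipped, which is the intuition behind being asked only about examples the model does not already know the answer to.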

Everyone knows data scientists should spend more time looking at their data. When good habits are hard to form, the trick is to remove the friction. Prodigy makes the right thing easy, encouraging you to spend more time understanding your problem and interpreting your results.

Try the demo

Try it live and highlight entities!

This live demo requires JavaScript to be enabled.

Try it live and select text categories!

Try it live and draw bounding boxes!

Try it live and type some text!

Try out new ideas quickly

Annotation is usually the part where projects stall. Instead of having an idea and trying it out, you start scheduling meetings, writing specifications and dealing with quality control. With Prodigy, you can have an idea over breakfast and get your first results by lunch. Once the model is trained, you can export it as a versioned Python package, giving you a smooth path from prototype to production.

Fully scriptable and extensible

Prodigy is fully scriptable, and slots neatly into the rest of your Python-based data science workflow. As the makers of spaCy, a popular library for Natural Language Processing, we understand how to make tools programmers love. The simple secret is this: programmers want to be able to program. Good developer tools need to let you in, not lock you out. That's why Prodigy comes with a rich Python API, elegant command-line integration, and a super productive Jupyter extension. Using custom recipe scripts, you can adapt Prodigy to read and write data however you like, and plug in custom models using any of your favourite frameworks.

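recipe.py

import prodigy
from prodigy.components.loaders import JSONL

# Register the function below as a recipe named "custom"
@prodigy.recipe("custom")
def custom_recipe(dataset, source):
    return {
        "dataset": dataset,          # dataset to save annotations to
        "stream": JSONL(source),     # examples loaded from the JSONL file
        "view_id": "classification"  # annotation interface to show
    }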

Command-line usage

prodigy custom my_dataset ./data.jsonl -F recipe.py
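As an illustration of reading data "however you like", the sketch below turns CSV rows into a stream of task dictionaries using only the standard library. It assumes each task is a dict with a "text" key, like the lines of a JSONL file. The `csv_stream` helper, the `review` and `id` column names and the sample rows are hypothetical. A generator like this could plausibly stand in for `JSONL(source)` as the recipe's "stream", but treat that as an assumption to check against the Prodigy documentation.

```python
import csv
import io
from typing import Dict, Iterable, Iterator


def csv_stream(lines: Iterable[str], text_column: str = "review") -> Iterator[Dict]:
    """Turn CSV rows into task dicts with a "text" key, skipping empty rows."""
    for row in csv.DictReader(lines):
        text = (row.get(text_column) or "").strip()
        if text:
            # Keep the row ID as metadata so annotations can be traced back.
            yield {"text": text, "meta": {"id": row.get("id")}}


# In-memory sample standing in for an open CSV file (made-up data).
sample = io.StringIO("id,review\n1,Great camera\n2,\n3,Screen cracked in a week\n")

tasks = list(csv_stream(sample))
for task in tasks:
    print(task)
```

Because the function is a generator, rows are read lazily, so a large file does not have to fit in memory before annotation starts.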
