Potato Documentation
Potato (POrTable Annotation TOol) is a fully free data annotation tool supporting a wide range of features throughout your entire annotation pipeline.
Guides
Role-based guides that walk you through Potato for your specific use case:
- Getting Started Guide - First-time setup and your first annotation project
- Administrator Guide - Managing annotators, quality control, and monitoring
- Developer Guide - Extending Potato, API integration, and custom schemas
- Crowdsourcing Guide - Running tasks on Prolific and MTurk
- Agent Evaluation Guide - Evaluating AI agents, coding agents, and web agents
- AI-Assisted Annotation Guide - Using LLMs, active learning, and Solo Mode
Getting Started
- Quick Start - Get running in 5 minutes
- Installation & Usage - Detailed setup guide
- Configuration Reference - Complete config options
- Comparison with Other Tools - How Potato compares to alternatives
Annotation Schemas
- Choosing the Right Annotation Type - Decision guide for selecting the best schema for your task
- Schema Gallery - All annotation types with examples
- Instance Display - Display images, video, audio, and text separately from annotation collection
- Conditional Logic - Show/hide questions based on prior answers
- Image Annotation - Bounding boxes, polygons, and landmarks
- Audio Annotation - Audio segmentation with waveform visualization
- Video Annotation - Frame-by-frame video labeling
- Tiered Annotation - ELAN-style hierarchical multi-tier annotation
- Triage - Rapid accept/reject/skip data curation interface
- Entity Linking - Link spans to external knowledge bases (Wikidata, UMLS)
- Coreference Annotation - Group mentions of the same entity
- Conversation Tree Annotation - Annotate hierarchical conversation structures
- Format Support - PDF, Word, code, and spreadsheet annotation
- Span Linking - Relationship linking between text spans
- Soft Label - Probability distribution across labels via constrained sliders
- Confidence Annotation - Pair any annotation with an explicit confidence rating
- Constant Sum - Allocate a fixed budget of points across categories
- Semantic Differential - Bipolar adjective scales measuring connotative meaning
- Ranking - Drag-and-drop ordering of items by preference
- Range Slider - Dual-thumb slider for selecting an acceptable min-max range
- Hierarchical Multi-Label Selection - Select labels from an expandable tree taxonomy
- Visual Analog Scale (VAS) - Continuous analog scale for fine-grained magnitude estimation
- Extractive QA - SQuAD-style answer span highlighting
- Rubric Evaluation - Multi-criteria rubric grid for LLM evaluation
- Text Edit / Post-Edit - Inline text editing with diff tracking
- Error Span (MQM) - Error annotation with typed severity and quality scoring
- Card Sorting - Drag-and-drop grouping of items into categories
- Conjoint Analysis - Discrete choice between multi-attribute profiles
- Pairwise Comparison - Binary or scale-based A/B comparisons
- Multi-Dimensional Pairwise - Compare items on multiple axes simultaneously
- Best-Worst Scaling - Select best and worst from tuples
- Dialogue Annotation - Multi-turn conversation annotation
- Text Annotation - Free-text input and rationale annotation
- Event Annotation - N-ary event structures with triggers and arguments
- Trajectory Evaluation - Per-step error annotation for agent traces
Workflow & Quality
- Annotation Navigation - Navigation tools and status indicators
- Task Assignment - Assignment strategies and configuration
- Heterogeneous Coverage - Single-annotator default with a multi-annotator overlap sample, adaptive boost, per-annotator quotas, and full IAA reporting
- Diversity Ordering - Embedding-based clustering for diverse item presentation
- Training Phase - Annotator training and qualification
- Quality Control - Attention checks and gold standards
- Adjudication - Multi-annotator disagreement resolution
- MACE - Multi-Annotator Competence Estimation via variational inference
- Iterative BWS - Adaptive Best-Worst Scaling for fine-grained ordinal rankings
- Category Assignment - Category-based item assignment
- Surveyflow - Pre/post annotation surveys
- Annotation Filtering - Filter data based on prior annotations
- Survey Instruments - 55 pre-built validated psychological instruments
- Memos - Universal annotator notes (instance/span-anchored, private/shared)
- Search - Universal FTS5 search; admin search + guarded annotator search-and-claim
- Codebook - Universal mutable code set (nested, opt-in per scheme, on-the-fly add)
- Cases - Group instances into units of analysis; QDA auto-detect; crosstab integration
Agent Evaluation
- Coding Agent Annotation - Evaluate agentic coding systems (Claude Code, SWE-Agent, Aider) with diff rendering, PRM annotation, and code review
- Agent Traces - Evaluate AI agent traces and trajectories
- Live Agent Interaction - Observe and interact with a live AI agent in real time
- Web Agent Annotation - Review and create web agent browsing traces
Solo Mode
- Solo Mode - Human-LLM collaborative annotation workflow
- Solo Mode Advanced Features - Edge case rules, labeling functions, confidence routing
- Solo Mode Developer Guide - Architecture and extension points
AI & Intelligence
- AI Support - AI-powered label suggestions
- Active Learning - ML-based prioritization
- Active Learning Strategies - Query strategies reference (BADGE, BALD, hybrid, cold-start)
- ICL Labeling - In-context learning for labeling
- Visual AI Support - YOLO and vision LLM support for image/video annotation
- Chat Support - LLM-powered sidebar for annotator assistance
- Option Highlighting - AI-assisted highlighting of likely annotation options
- Embedding Visualization - UMAP-based instance similarity dashboard
Authentication & User Management
- Users & Collaboration - User registration, access control, and collaboration
- Password Management - Password security, reset flows, database backend, and shared credentials
- Passwordless Login - Authentication without passwords
- SSO & OAuth Authentication - Google, GitHub, and institutional SSO login
Crowdsourcing
- Crowdsourcing Guide - Prolific and MTurk integration
- MTurk Integration - Detailed Amazon MTurk setup guide
Administration
- Admin Dashboard - Monitoring and management
- Behavioral Tracking - User behavior analytics
- Annotation History - Tracking annotation changes
Data & Output
- Data Format - Input and output data formats
- Export Formats - Export to COCO, YOLO, CoNLL, and more
- HuggingFace Hub Export - Push annotations to HuggingFace Hub
- HuggingFace Datasets Integration - Load annotations as DatasetDict or DataFrame
- Remote Data Sources - Load data from S3, Google Drive, Dropbox, URLs, and databases
- Data Directory - Load data from a directory with optional live watching
UI & Customization
- UI Configuration - Interface customization
- Layout Customization - Custom CSS layouts and styling
- Form Layout - Grid layout, column spanning, styling, and alignment
- Multilingual - Localization and RTL support
Integrations
- Webhooks - Outgoing webhook notifications for annotation events
- HuggingFace Spaces - Deploy Potato on HuggingFace Spaces
- LangChain Integration - Send LangChain agent traces to Potato
Tools & Utilities
- Preview CLI - Preview configs without running the server
- Migration CLI - Upgrade v1 configs to v2
- Debugging Guide - Debug flags and troubleshooting
- Simulator - Annotation simulation tool
- API Reference - REST API endpoints documentation
Productivity Features
- Productivity - Tooltips, shortcuts, and highlights
Release Notes
- v2.4.4 - Span Annotation Fixes & UX Improvements
- v2.4.3 - Coding Agent Annotation, Localization & Stability
- v2.4.1 - Bug Fixes
- v2.4.0 - Agent Evaluation, AI-Assisted Annotation & Enterprise Integration
- v2.3.0 - Solo Mode, Agent Workflows & Security Hardening
- v2.2.0 - Comprehensive Annotation & Export Platform
- v2.1.0 - Adjudication & Multi-Modal Annotation
- v2.0.0 - Backend Refactor
Contributing
- Contributing Guide - How to contribute to Potato
Quick Links
| Task | Documentation |
|---|---|
| Set up a basic annotation task | Quick Start |
| Choose an annotation type | Schema Gallery |
| Display images/video with radio buttons | Instance Display |
| Show/hide questions based on answers | Conditional Logic |
| Annotate PDFs, Word docs, or code | Format Support |
| Set up SSO/OAuth login | SSO & OAuth Authentication |
| Reset a user's password | Password Management |
| Use a database for user storage | Password Management |
| Configure for MTurk | MTurk Integration |
| Configure for Prolific | Crowdsourcing Guide |
| Monitor annotation progress | Admin Dashboard |
| Add AI suggestions | AI Support |
| Set up quality control | Quality Control |
| Present items diversely | Diversity Ordering |
| Debug configuration issues | Debugging Guide |
| Create custom visual layouts | Layout Customization |
| Rapidly filter/triage data | Triage |
| Link entities to knowledge bases | Entity Linking |
| Annotate coreference chains | Coreference Annotation |
| Annotate conversation trees | Conversation Tree Annotation |
| Navigate efficiently through items | Annotation Navigation |
| Evaluate AI agent traces | Agent Traces |
| Use Solo Mode for collaborative annotation | Solo Mode |
| Export annotations to Parquet | Export Formats |
| Export to COCO/YOLO/CoNLL | Export Formats |
| Push annotations to HuggingFace Hub | HuggingFace Hub Export |
| Deploy on HuggingFace Spaces | HuggingFace Spaces |
| Set up webhook notifications | Webhooks |
| Use an LLM chat assistant for annotators | Chat Support |
| Evaluate coding agents | Coding Agent Annotation |
| Review web agent traces | Web Agent Annotation |
| Load data from S3/GDrive/Dropbox | Remote Data Sources |
| Use pre-built survey instruments | Survey Instruments |
| Set up a multilingual interface | Multilingual |
| Resolve annotator disagreements | Adjudication |
| Browse the REST API | API Reference |
Example Projects
Ready-to-use example configurations are available in the examples/ directory:
# Run a simple radio button example
python potato/flask_server.py start examples/classification/single-choice/config.yaml -p 8000
# Run a sophisticated layout example (content moderation, dialogue QA, medical review)
python potato/flask_server.py start examples/custom-layouts/content-moderation/config.yaml -p 8000
The examples directory is organized by category, including:
- classification/ - Classification annotation examples (radio, checkbox, likert, etc.)
- span/ - Span annotation examples (NER, linking, coreference, etc.)
- image/ - Image annotation examples
- video/ - Video annotation examples
- audio/ - Audio annotation examples
- advanced/ - Advanced features (conditional logic, quality control, etc.)
- agent-traces/ - Agent trace evaluation examples (RAG, GUI agents, comparisons)
- custom-layouts/ - Sophisticated custom layout examples
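The example commands in this section share one shape: the server script, the `start` subcommand, a config path, and a port passed with `-p`. As a minimal sketch, the helper below assembles that command line for any example config. The function name `potato_start_command` is hypothetical and not part of Potato, and the sketch assumes you run the resulting command from the repository root, as the commands above do.

```python
import shlex


def potato_start_command(config, port=8000, python="python"):
    """Build the argument list for starting Potato on one config file.

    Hypothetical helper for illustration only; it mirrors the command
    shape shown in this section and is not part of Potato itself.
    """
    return [python, "potato/flask_server.py", "start", str(config), "-p", str(port)]


# The two example configs named in this section.
EXAMPLE_CONFIGS = [
    "examples/classification/single-choice/config.yaml",
    "examples/custom-layouts/content-moderation/config.yaml",
]

for config in EXAMPLE_CONFIGS:
    # shlex.join renders the list as a copy-pasteable shell command.
    print(shlex.join(potato_start_command(config)))
```

Building the command as a list rather than a string keeps it safe to pass to `subprocess.run` later, since paths containing spaces need no manual quoting.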