Potato Documentation

Potato (POrTable Annotation TOol) is a free data annotation tool that supports every stage of the annotation pipeline, from task setup and annotator management to quality control and export.

Guides

Role-based guides that walk you through Potato for your specific use case:

Getting Started Guide - First-time setup and your first annotation project
Administrator Guide - Managing annotators, quality control, and monitoring
Developer Guide - Extending Potato, API integration, and custom schemas
Crowdsourcing Guide - Running tasks on Prolific and MTurk
Agent Evaluation Guide - Evaluating AI agents, coding agents, and web agents
AI-Assisted Annotation Guide - Using LLMs, active learning, and Solo Mode

Getting Started

Quick Start - Get running in 5 minutes
Installation & Usage - Detailed setup guide
Configuration Reference - Complete config options
Comparison with Other Tools - How Potato compares to alternatives

Annotation Schemas

Choosing the Right Annotation Type - Decision guide for selecting the best schema for your task
Schema Gallery - All annotation types with examples
Instance Display - Display images, video, audio, and text separately from annotation collection
Conditional Logic - Show/hide questions based on prior answers
Image Annotation - Bounding boxes, polygons, and landmarks
Audio Annotation - Audio segmentation with waveform visualization
Video Annotation - Frame-by-frame video labeling
Tiered Annotation - ELAN-style hierarchical multi-tier annotation
Triage - Rapid accept/reject/skip data curation interface
Entity Linking - Link spans to external knowledge bases (Wikidata, UMLS)
Coreference Annotation - Group mentions of the same entity
Conversation Tree Annotation - Annotate hierarchical conversation structures
Format Support - PDF, Word, code, and spreadsheet annotation
Span Linking - Relationship linking between text spans
Soft Label - Probability distribution across labels via constrained sliders
Confidence Annotation - Pair any annotation with an explicit confidence rating
Constant Sum - Allocate a fixed budget of points across categories
Semantic Differential - Bipolar adjective scales measuring connotative meaning
Ranking - Drag-and-drop ordering of items by preference
Range Slider - Dual-thumb slider for selecting an acceptable min-max range
Hierarchical Multi-Label Selection - Select labels from an expandable tree taxonomy
Visual Analog Scale (VAS) - Continuous analog scale for fine-grained magnitude estimation
Extractive QA - SQuAD-style answer span highlighting
Rubric Evaluation - Multi-criteria rubric grid for LLM evaluation
Text Edit / Post-Edit - Inline text editing with diff tracking
Error Span (MQM) - Error annotation with typed severity and quality scoring
Card Sorting - Drag-and-drop grouping of items into categories
Conjoint Analysis - Discrete choice between multi-attribute profiles
Pairwise Comparison - Binary or scale-based A/B comparisons
Multi-Dimensional Pairwise - Compare items on multiple axes simultaneously
Best-Worst Scaling - Select best and worst from tuples
Dialogue Annotation - Multi-turn conversation annotation
Text Annotation - Free-text input and rationale annotation
Event Annotation - N-ary event structures with triggers and arguments
Trajectory Evaluation - Per-step error annotation for agent traces

Workflow & Quality

Annotation Navigation - Navigation tools and status indicators
Task Assignment - Assignment strategies and configuration
Heterogeneous Coverage - Single-annotator default with a multi-annotator overlap sample, adaptive boost, per-annotator quotas, and full IAA reporting
Diversity Ordering - Embedding-based clustering for diverse item presentation
Training Phase - Annotator training and qualification
Quality Control - Attention checks and gold standards
Adjudication - Multi-annotator disagreement resolution
MACE - Multi-Annotator Competence Estimation via variational inference
Iterative BWS - Adaptive Best-Worst Scaling for fine-grained ordinal rankings
Category Assignment - Category-based item assignment
Surveyflow - Pre/post annotation surveys
Annotation Filtering - Filter data based on prior annotations
Survey Instruments - 55 pre-built validated psychological instruments
Memos - Universal annotator notes (instance/span-anchored, private/shared)
Search - Universal FTS5 search; admin search + guarded annotator search-and-claim
Codebook - Universal mutable code set (nested, opt-in per scheme, on-the-fly add)
Cases - Group instances into units of analysis; QDA auto-detect; crosstab integration

Agent Evaluation

Coding Agent Annotation - Evaluate agentic coding systems (Claude Code, SWE-Agent, Aider) with diff rendering, PRM annotation, and code review
Agent Traces - Evaluate AI agent traces and trajectories
Live Agent Interaction - Observe and interact with a live AI agent in real time
Web Agent Annotation - Review and create web agent browsing traces

Solo Mode

Solo Mode - Human-LLM collaborative annotation workflow
Solo Mode Advanced Features - Edge case rules, labeling functions, confidence routing
Solo Mode Developer Guide - Architecture and extension points

AI & Intelligence

AI Support - AI-powered label suggestions
Active Learning - ML-based prioritization
Active Learning Strategies - Query strategies reference (BADGE, BALD, hybrid, cold-start)
ICL Labeling - In-context learning for labeling
Visual AI Support - YOLO and vision LLM support for image/video annotation
Chat Support - LLM-powered sidebar for annotator assistance
Option Highlighting - AI-assisted highlighting of likely annotation options
Embedding Visualization - UMAP-based instance similarity dashboard

Authentication & User Management

Users & Collaboration - User registration, access control, and collaboration
Password Management - Password security, reset flows, database backend, and shared credentials
Passwordless Login - Authentication without passwords
SSO & OAuth Authentication - Google, GitHub, and institutional SSO login

Crowdsourcing

Crowdsourcing Guide - Prolific and MTurk integration
MTurk Integration - Detailed Amazon MTurk setup guide

Administration

Admin Dashboard - Monitoring and management
Behavioral Tracking - User behavior analytics
Annotation History - Tracking annotation changes

Data & Output

Data Format - Input and output data formats
Export Formats - Export to COCO, YOLO, CoNLL, and more
HuggingFace Hub Export - Push annotations to HuggingFace Hub
HuggingFace Datasets Integration - Load annotations as DatasetDict or DataFrame
Remote Data Sources - Load data from S3, Google Drive, Dropbox, URLs, and databases
Data Directory - Load data from a directory with optional live watching

UI & Customization

UI Configuration - Interface customization
Layout Customization - Custom CSS layouts and styling
Form Layout - Grid layout, column spanning, styling, and alignment
Multilingual - Localization and RTL support

Integrations

Webhooks - Outgoing webhook notifications for annotation events
HuggingFace Spaces - Deploy Potato on HuggingFace Spaces
LangChain Integration - Send LangChain agent traces to Potato

Tools & Utilities

Preview CLI - Preview configs without running server
Migration CLI - Upgrade v1 configs to v2
Debugging Guide - Debug flags and troubleshooting
Simulator - Annotation simulation tool
API Reference - REST API endpoints documentation

Productivity Features

Productivity - Tooltips, shortcuts, and highlights

Release Notes

v2.4.4 - Span Annotation Fixes & UX Improvements
v2.4.3 - Coding Agent Annotation, Localization & Stability
v2.4.1 - Bug Fixes
v2.4.0 - Agent Evaluation, AI-Assisted Annotation & Enterprise Integration
v2.3.0 - Solo Mode, Agent Workflows & Security Hardening
v2.2.0 - Comprehensive Annotation & Export Platform
v2.1.0 - Adjudication & Multi-Modal Annotation
v2.0.0 - Backend Refactor

Contributing

Contributing Guide - How to contribute to Potato

Quick Links

Task	Documentation
Set up a basic annotation task	Quick Start
Choose an annotation type	Schema Gallery
Display images/video with radio buttons	Instance Display
Show/hide questions based on answers	Conditional Logic
Annotate PDFs, Word docs, or code	Format Support
Set up SSO/OAuth login	SSO & OAuth Authentication
Reset a user's password	Password Management
Use a database for user storage	Password Management
Configure for MTurk	MTurk Integration
Configure for Prolific	Crowdsourcing Guide
Monitor annotation progress	Admin Dashboard
Add AI suggestions	AI Support
Set up quality control	Quality Control
Present items diversely	Diversity Ordering
Debug configuration issues	Debugging Guide
Create custom visual layouts	Layout Customization
Rapidly filter/triage data	Triage
Link entities to knowledge bases	Entity Linking
Annotate coreference chains	Coreference Annotation
Annotate conversation trees	Conversation Tree Annotation
Navigate efficiently through items	Annotation Navigation
Evaluate AI agent traces	Agent Traces
Use Solo Mode for collaborative annotation	Solo Mode
Export annotations to Parquet	Export Formats
Export to COCO/YOLO/CoNLL	Export Formats
Push annotations to HuggingFace Hub	HuggingFace Hub Export
Deploy on HuggingFace Spaces	HuggingFace Spaces
Set up webhook notifications	Webhooks
Use LLM chat assistant for annotators	Chat Support
Evaluate coding agents	Coding Agent Annotation
Review web agent traces	Web Agent Annotation
Load data from S3/GDrive/Dropbox	Remote Data Sources
Use pre-built survey instruments	Survey Instruments
Set up multilingual interface	Multilingual
Resolve annotator disagreements	Adjudication
Browse the REST API	API Reference

Example Projects

Ready-to-use example configurations are available in the examples/ directory:

# Run a simple radio button example
python potato/flask_server.py start examples/classification/single-choice/config.yaml -p 8000

# Run a sophisticated custom-layout example (content moderation here; dialogue QA and medical review are also available)
python potato/flask_server.py start examples/custom-layouts/content-moderation/config.yaml -p 8000

The examples directory contains further configurations, including:

classification/ - Classification annotation examples (radio, checkbox, likert, etc.)
span/ - Span annotation examples (NER, linking, coreference, etc.)
image/ - Image annotation examples
video/ - Video annotation examples
audio/ - Audio annotation examples
advanced/ - Advanced features (conditional logic, quality control, etc.)
agent-traces/ - Agent trace evaluation examples (RAG, GUI agents, comparisons)
custom-layouts/ - Sophisticated custom layout examples
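
Both commands above follow the same pattern: `python potato/flask_server.py start <config> -p <port>`. The sketch below lists the available example configs and prints a launch command for each. It assumes that every example directory contains a file named config.yaml (as the two examples above do) and that you run it from the repository root. The helper names `find_example_configs` and `launch_command` are hypothetical and are not part of Potato.

```python
from pathlib import Path
import shlex


def find_example_configs(examples_dir="examples"):
    """Return every config.yaml below examples_dir, sorted by path.

    Assumption: each example keeps its configuration in a file named
    config.yaml, as the two documented commands suggest.
    """
    root = Path(examples_dir)
    if not root.is_dir():
        return []
    return sorted(root.rglob("config.yaml"))


def launch_command(config_path, port=8000):
    """Build the launch command shown above for a single config file."""
    args = [
        "python",
        "potato/flask_server.py",
        "start",
        Path(config_path).as_posix(),
        "-p",
        str(port),
    ]
    # shlex.join quotes any argument that contains spaces or shell metacharacters.
    return shlex.join(args)


def print_launch_commands(examples_dir="examples", port=8000):
    """Print one launch command per discovered example config."""
    for config in find_example_configs(examples_dir):
        print(launch_command(config, port))
```

Calling `print_launch_commands()` from the repository root prints one command per example, which you can copy into a terminal. Building the command as an argument list and quoting it with `shlex.join` keeps paths that contain spaces intact.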