Automation Rules
The automation engine closes the production → eval loop: a programmable
filter → sampling rate → actions pipeline runs over every item entering Potato
— whether loaded from data_files or ingested at runtime via the
trace webhook / tracing SDK.
Each matching rule can route the item to the annotation queue, curate it into an
eval dataset, run an
evaluator, fire an outbound webhook, or notify annotators.
Enabling

    automation:
      enabled: true
      rules:
        - name: route-errors
          when: {field: status, in: [error, failed]}
          sample_rate: 1.0  # 0.0–1.0 (default 1.0 = every match)
          actions:
            - {type: add_to_queue, priority: 100, reason: "Agent errored"}
            - {type: add_to_dataset, dataset: errors-to-fix}
            - {type: run_evaluator, evaluator: trajectory_match}
            - {type: fire_webhook, url: "https://example.com/hook"}
            - {type: notify, message: "New error trace"}

Rules
A rule fires for an item when both of the following hold:
- `when` matches: the shared condition grammar (same as triage) supports `equals`, `in`, `contains`, `exists`, `lt`/`lte`/`gt`/`gte`, and dotted field paths (`metadata.score`). A list of conditions is AND-ed; an empty or absent `when` matches everything.
- `sample_rate` selects it: sampling is deterministic on a hash of `(item id, rule name)`, so re-processing the same item yields the same decision (idempotent, replay-safe). `1.0` = always, `0.0` = never.
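
The replay-safe sampling described above can be sketched as follows. This is a minimal illustration and not Potato's actual implementation: the use of SHA-256, the `item_id:rule_name` key format, and the helper name `is_sampled` are all assumptions made for the example.

```python
import hashlib


def is_sampled(item_id: str, rule_name: str, sample_rate: float) -> bool:
    """Decide deterministically whether a rule samples an item.

    Hypothetical helper: the hash function and key format are assumptions.
    """
    if sample_rate >= 1.0:
        return True   # 1.0 = always
    if sample_rate <= 0.0:
        return False  # 0.0 = never
    key = f"{item_id}:{rule_name}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Turn the first 8 bytes of the digest into a fraction between 0 and 1.
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate


# The same (item, rule) pair always gets the same decision, so replaying
# an item cannot flip it in or out of the sample.
first = is_sampled("run-1", "route-errors", 0.5)
second = is_sampled("run-1", "route-errors", 0.5)
print(first == second)  # True
```

Hashing the rule name together with the item id means two rules with the same `sample_rate` select different, independent subsets of items.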
Common fields on an ingested trace:
`metadata.source` (`webhook`/`langsmith`/`langfuse`), `task_description`, plus any top-level fields your payload includes that survive normalization.
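
To make the condition grammar concrete, here is a small sketch of how a `when` clause might be evaluated against a trace. It is an illustration under stated assumptions, not the engine's code: the function names are hypothetical, `contains` is assumed to mean substring or list membership, and every operator other than `exists` is assumed not to match a missing field.

```python
import operator
from typing import Any

_MISSING = object()  # sentinel for "field not present"
_COMPARATORS = {"lt": operator.lt, "lte": operator.le,
                "gt": operator.gt, "gte": operator.ge}


def resolve(item: dict, path: str) -> Any:
    """Follow a dotted path such as 'metadata.score' through nested dicts."""
    current: Any = item
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def matches_condition(item: dict, cond: dict) -> bool:
    """Evaluate a single {field: ..., <operator>: ...} condition."""
    value = resolve(item, cond["field"])
    if "exists" in cond:
        return (value is not _MISSING) == bool(cond["exists"])
    if value is _MISSING:
        return False  # assumption: other operators never match a missing field
    try:
        if "equals" in cond:
            return value == cond["equals"]
        if "in" in cond:
            return value in cond["in"]
        if "contains" in cond:
            return cond["contains"] in value
        for name, compare in _COMPARATORS.items():
            if name in cond:
                return compare(value, cond[name])
    except TypeError:
        return False  # incomparable types count as "no match"
    return False


def matches(item: dict, when) -> bool:
    """A list of conditions is AND-ed; an empty or absent `when` matches."""
    if not when:
        return True
    conditions = when if isinstance(when, list) else [when]
    return all(matches_condition(item, c) for c in conditions)


trace = {"status": "error", "task_description": "buy milk",
         "metadata": {"source": "webhook", "score": 0.2}}
print(matches(trace, {"field": "status", "in": ["error", "failed"]}))  # True
print(matches(trace, [{"field": "metadata.source", "equals": "webhook"},
                      {"field": "metadata.score", "gte": 0.5}]))       # False
```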
Actions
| Action | When it runs | Effect |
|---|---|---|
| `add_to_queue` | inline (fast) | Boost the item's triage priority so the priority assignment strategy surfaces it. Params: `priority`, `reason`. |
| `add_to_dataset` | inline (fast) | Append the item as an example to a dataset (created if absent). Params: `dataset`. |
| `notify` | inline (fast) | Notify connected annotators via SSE. Params: `message`. |
| `run_evaluator` | background worker | Score the item with an evaluator; the score is stored on the item (`metadata.automation_eval`). Params: `evaluator`, `params`. |
| `fire_webhook` | background worker | POST `{rule, item_id, item_data}` to an external URL. Params: `url`, `headers`. |
Fast actions run inline in the ingestion path (cheap, in-process). Heavy
actions (run_evaluator, fire_webhook) are dispatched to a background worker
so ingestion never blocks. Every action records an outcome; failures are caught
and logged as error outcomes — automation never breaks ingestion.
Ordering note: actions within a rule run in listed order, but heavy actions complete asynchronously, so a
`fire_webhook` may finish after a later inline action. (Mirrors LangSmith's per-rule scheduling caveat.)
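
The inline/background split and the outcome recording can be pictured with a small standard-library sketch. It is only an illustration of the pattern: the queue-plus-thread design, the outcome record fields, and the `simulate_failure` key used to force an error are assumptions, not Potato's internals.

```python
import queue
import threading

# Action types the docs describe as heavy; everything else runs inline.
HEAVY = {"run_evaluator", "fire_webhook"}

outcomes = []                     # one record per executed action
outcomes_lock = threading.Lock()  # outcomes is written from two threads
jobs = queue.Queue()              # heavy actions waiting for the worker


def execute(action, item):
    """Stand-in for real action logic (hypothetical)."""
    if action.get("simulate_failure"):
        raise RuntimeError("simulated failure")


def run_and_record(rule_name, action, item):
    """Run one action and always record an outcome, even on failure."""
    try:
        execute(action, item)
        record = {"status": "ok", "error": None}
    except Exception as exc:  # failures become error outcomes
        record = {"status": "error", "error": str(exc)}
    record.update(rule=rule_name, action=action["type"], item_id=item["id"])
    with outcomes_lock:
        outcomes.append(record)


def worker():
    """Drain the job queue until a None sentinel arrives."""
    while True:
        job = jobs.get()
        try:
            if job is None:
                return
            run_and_record(*job)
        finally:
            jobs.task_done()


def apply_rule(rule, item):
    """Dispatch actions in listed order; heavy ones are only enqueued."""
    for action in rule["actions"]:
        if action["type"] in HEAVY:
            jobs.put((rule["name"], action, item))
        else:
            run_and_record(rule["name"], action, item)


threading.Thread(target=worker, daemon=True).start()

rule = {
    "name": "route-errors",
    "actions": [
        {"type": "fire_webhook", "simulate_failure": True},
        {"type": "notify", "message": "New error trace"},
    ],
}
apply_rule(rule, {"id": "run-1"})
jobs.join()     # wait for background work before inspecting outcomes
jobs.put(None)  # stop the worker
for outcome in outcomes:
    print(outcome["action"], outcome["status"])
```

Although `fire_webhook` is listed first, its outcome may be recorded after the `notify` outcome, because the worker thread runs independently of the ingestion path. The simulated failure ends up as an `error` outcome rather than an exception in `apply_rule`.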
Inspecting
The admin dashboard links to Automation (/admin/automation), showing
configured rules, activity counters, and recent action outcomes. JSON API:
| Path | Returns |
|---|---|
| `GET /admin/automation/status` | rules + counters + per-action breakdown |
| `GET /admin/automation/outcomes?limit=N` | recent action outcomes |
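
For scripted checks, the two endpoints can be read with the Python standard library. The sketch below makes several assumptions: the helper names are hypothetical, the server address is whatever your deployment uses, and the admin routes are assumed to be reachable without extra authentication headers (add them if your setup requires any). The response schema is not specified here, so the helpers simply return the parsed JSON.

```python
import json
import urllib.parse
import urllib.request


def automation_url(base_url: str, path: str, **params) -> str:
    """Build an admin API URL, e.g. .../admin/automation/outcomes?limit=20."""
    url = base_url.rstrip("/") + path
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return url


def get_json(url: str, opener=urllib.request.urlopen):
    """Fetch a URL and parse the body as JSON.

    `opener` is injectable so the function can be exercised without a server.
    """
    with opener(url) as response:
        return json.load(response)
```

With a server running on port 8000, `get_json(automation_url("http://localhost:8000", "/admin/automation/outcomes", limit=20))` would return the most recent outcomes as parsed JSON.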
Example
examples/agent-traces/automation-loop/ is a runnable demo:

    python potato/flask_server.py start examples/agent-traces/automation-loop/config.yaml -p 8000
    curl -X POST http://localhost:8000/api/traces/webhook -H "Content-Type: application/json" \
      -d '{"id":"run-1","task_description":"buy milk","status":"error","steps":[{"action_type":"click"}]}'

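
The same request can be built from Python with the standard library. This is a sketch equivalent to the `curl` call above; the helper name `build_trace_request` is hypothetical, and the send step is left commented out because it needs the demo server to be running.

```python
import json
import urllib.request


def build_trace_request(base_url: str, trace: dict) -> urllib.request.Request:
    """Build a POST to the trace webhook carrying `trace` as a JSON body."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/traces/webhook",
        data=json.dumps(trace).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


trace = {
    "id": "run-1",
    "task_description": "buy milk",
    "status": "error",  # matches the route-errors rule's `when` clause
    "steps": [{"action_type": "click"}],
}
request = build_trace_request("http://localhost:8000", trace)

# Send it once the demo server is running:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```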
Related
- Datasets & Experiments: `add_to_dataset` targets
- Programmatic Evaluators: `run_evaluator`
- Triage Queue: shares the condition grammar
- Tracing SDK / Agent traces: sources of incoming items