Annotation History and Timestamps
Potato provides comprehensive tracking of all annotation actions with fine-grained timestamp metadata. This enables performance analysis, quality assurance, and detailed audit trails.
Overview
The annotation history system tracks:
- Every annotation action: Label selections, span annotations, text inputs
- Precise timestamps: Server and client-side timestamps
- Action metadata: User, instance, schema, old/new values
- Performance metrics: Processing times, action rates
- Suspicious activity: Unusually fast or burst activity patterns
Features
Action Tracking
Every annotation change is recorded as an AnnotationAction with:
| Field | Description |
|---|---|
| `action_id` | Unique UUID for each action |
| `timestamp` | Server-side timestamp |
| `client_timestamp` | Browser-side timestamp (if available) |
| `user_id` | User who performed the action |
| `instance_id` | Instance being annotated |
| `action_type` | Type of action performed |
| `schema_name` | Annotation schema name |
| `label_name` | Specific label within the schema |
| `old_value` | Previous value (for updates/deletes) |
| `new_value` | New value (for adds/updates) |
| `span_data` | Span details for span annotations |
| `session_id` | Browser session identifier |
| `server_processing_time_ms` | Server processing time in milliseconds |
| `metadata` | Additional context (browser info, etc.) |
Action Types
The system tracks these action types:
- `add_label`: New label selection
- `update_label`: Label value changed
- `delete_label`: Label removed
- `add_span`: New span annotation created
- `update_span`: Span annotation modified
- `delete_span`: Span annotation removed
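When consuming history records in your own scripts, it can help to treat the six action types as a closed set so that typos fail loudly. The sketch below is not part of Potato's API: `ActionType` and `is_span_action` are hypothetical names introduced here for illustration.

```python
from enum import Enum


class ActionType(str, Enum):
    """Hypothetical enum mirroring the action types listed above."""

    ADD_LABEL = "add_label"
    UPDATE_LABEL = "update_label"
    DELETE_LABEL = "delete_label"
    ADD_SPAN = "add_span"
    UPDATE_SPAN = "update_span"
    DELETE_SPAN = "delete_span"


def is_span_action(action_type: str) -> bool:
    """Return True for span actions; raise ValueError for unknown types."""
    return ActionType(action_type).name.endswith("_SPAN")
```

Looking up the enum by value raises `ValueError` for any string outside the list, which is a cheap way to validate exported records before analysis.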
Configuration
Annotation history tracking is enabled by default. No additional configuration is required.
Accessing History Data
Annotation history is saved with user state and can be accessed via the admin dashboard or API.
Performance Metrics
The system calculates performance metrics from action history:
from potato.annotation_history import AnnotationHistoryManager
# Get metrics for a list of actions
metrics = AnnotationHistoryManager.calculate_performance_metrics(actions)
# Returns:
{
    'total_actions': 150,
    'average_action_time_ms': 45.2,
    'fastest_action_time_ms': 12,
    'slowest_action_time_ms': 234,
    'actions_per_minute': 8.5,
    'total_processing_time_ms': 6780
}
Metrics Explained
| Metric | Description |
|---|---|
| `total_actions` | Total number of annotation actions |
| `average_action_time_ms` | Mean server processing time |
| `fastest_action_time_ms` | Minimum processing time |
| `slowest_action_time_ms` | Maximum processing time |
| `actions_per_minute` | Rate of annotation activity |
| `total_processing_time_ms` | Sum of all processing times |
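To make the definitions above concrete, here is a minimal sketch that derives the same six values from exported action dictionaries. It is not Potato's implementation: `summarize_actions` is a hypothetical helper, and the way `actions_per_minute` is computed (action count divided by the span between the first and last timestamp) is an assumption.

```python
from datetime import datetime


def summarize_actions(records):
    """Compute metrics like those above from a list of action dicts.

    Assumes every record has an ISO 8601 'timestamp' and a numeric
    'server_processing_time_ms'.
    """
    if not records:
        return {}
    times = [r["server_processing_time_ms"] for r in records]
    stamps = sorted(datetime.fromisoformat(r["timestamp"]) for r in records)
    # Assumed definition: elapsed time between first and last action.
    minutes = (stamps[-1] - stamps[0]).total_seconds() / 60
    return {
        "total_actions": len(records),
        "average_action_time_ms": sum(times) / len(times),
        "fastest_action_time_ms": min(times),
        "slowest_action_time_ms": max(times),
        "actions_per_minute": len(records) / minutes if minutes > 0 else 0.0,
        "total_processing_time_ms": sum(times),
    }


sample = [
    {"timestamp": "2024-01-15T10:30:00", "server_processing_time_ms": 20},
    {"timestamp": "2024-01-15T10:31:00", "server_processing_time_ms": 40},
    {"timestamp": "2024-01-15T10:32:00", "server_processing_time_ms": 60},
]
summary = summarize_actions(sample)
```

Note that the time-based metrics here describe server processing, not how long an annotator spent deciding on a label.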
Suspicious Activity Detection
The system can detect potentially problematic annotation patterns:
from potato.annotation_history import AnnotationHistoryManager
# Analyze actions for suspicious patterns
analysis = AnnotationHistoryManager.detect_suspicious_activity(
    actions,
    fast_threshold_ms=500,  # Actions faster than this are flagged
    burst_threshold_seconds=2  # Actions closer than this are flagged
)
# Returns:
{
    'suspicious_actions': [...],
    'fast_actions_count': 5,
    'burst_actions_count': 12,
    'fast_actions_percentage': 3.3,
    'burst_actions_percentage': 8.0,
    'suspicious_score': 15.2,
    'suspicious_level': 'Low'
}
Suspicious Levels
| Score | Level | Interpretation |
|---|---|---|
| 0-10 | Normal | Typical annotation behavior |
| 10-30 | Low | Some fast actions, likely acceptable |
| 30-60 | Medium | Notable pattern, may warrant review |
| 60-80 | High | Concerning pattern, review recommended |
| 80-100 | Very High | Likely quality issue, immediate review |
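The bands above share their boundary values (10, 30, 60, 80), so any lookup has to pick a side. The sketch below shows one way to map a score to a level in your own reporting code. It assumes each band includes its lower bound and excludes its upper bound, which the table does not specify, and `level_for_score` is a hypothetical helper rather than a Potato function.

```python
def level_for_score(score: float) -> str:
    """Map a 0-100 suspicious score to a level name.

    Assumption: a score exactly on a boundary belongs to the higher band,
    so 10 is 'Low' and 80 is 'Very High'.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    bands = [(10, "Normal"), (30, "Low"), (60, "Medium"), (80, "High")]
    for upper, name in bands:
        if score < upper:
            return name
    return "Very High"
```

Under this assumption the example score of 15.2 shown earlier maps to `Low`, matching the sample output.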
Detection Parameters
| Parameter | Default | Description |
|---|---|---|
| `fast_threshold_ms` | 500 | Actions faster than this are flagged |
| `burst_threshold_seconds` | 2 | Actions closer together than this are flagged |
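To illustrate how the two thresholds could interact with the data, the following sketch counts flagged actions in a small hand-built sample. It rests on two assumptions that this page does not confirm: that "fast" compares each action's processing time against `fast_threshold_ms`, and that "burst" compares the gap between consecutive timestamps against `burst_threshold_seconds`. `count_flags` is a hypothetical helper, not the library's detector.

```python
from datetime import datetime, timedelta


def count_flags(records, fast_threshold_ms=500, burst_threshold_seconds=2):
    """Count fast and burst actions in (timestamp, processing_ms) pairs."""
    ordered = sorted(records, key=lambda r: r[0])
    # Assumed rule: an action is fast if its processing time is under the threshold.
    fast = sum(1 for _, ms in ordered if ms is not None and ms < fast_threshold_ms)
    # Assumed rule: an action is a burst if it follows the previous one too closely.
    burst = sum(
        1
        for (prev, _), (curr, _) in zip(ordered, ordered[1:])
        if (curr - prev).total_seconds() < burst_threshold_seconds
    )
    return {"fast_actions_count": fast, "burst_actions_count": burst}


start = datetime(2024, 1, 15, 10, 30, 0)
records = [
    (start, 45),
    (start + timedelta(seconds=1), 600),   # 1 s after the previous action
    (start + timedelta(seconds=10), 700),
    (start + timedelta(seconds=11), 30),   # 1 s after the previous action
]
flags = count_flags(records)
```

Lowering either threshold flags fewer actions, so tasks with short, simple items may need stricter (smaller) values to avoid flagging normal behavior.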
API Reference
AnnotationAction
from datetime import datetime
from potato.annotation_history import AnnotationAction
# Create an action
action = AnnotationAction(
    action_id="uuid-here",
    timestamp=datetime.now(),
    user_id="annotator1",
    instance_id="doc_001",
    action_type="add_label",
    schema_name="sentiment",
    label_name="positive",
    old_value=None,
    new_value=True,
    span_data=None,
    session_id="session_123",
    client_timestamp=None,
    server_processing_time_ms=45,
    metadata={"browser": "Chrome"}
)
# Serialize to dictionary
data = action.to_dict()
# Deserialize from dictionary
action = AnnotationAction.from_dict(data)
AnnotationHistoryManager
from datetime import datetime
from potato.annotation_history import AnnotationHistoryManager
# Create a new action with current timestamp
action = AnnotationHistoryManager.create_action(
    user_id="annotator1",
    instance_id="doc_001",
    action_type="add_label",
    schema_name="sentiment",
    label_name="positive",
    old_value=None,
    new_value=True
)
# Filter actions by time range
filtered = AnnotationHistoryManager.get_actions_by_time_range(
    actions,
    start_time=datetime(2024, 1, 1),
    end_time=datetime(2024, 1, 31)
)
# Filter actions by instance
instance_actions = AnnotationHistoryManager.get_actions_by_instance(
    actions,
    instance_id="doc_001"
)
# Filter actions by type
label_actions = AnnotationHistoryManager.get_actions_by_type(
    actions,
    action_type="add_label"
)
# Calculate performance metrics
metrics = AnnotationHistoryManager.calculate_performance_metrics(actions)
# Detect suspicious activity
analysis = AnnotationHistoryManager.detect_suspicious_activity(actions)
Use Cases
Quality Assurance
Monitor annotator behavior for quality issues:
# Check for suspiciously fast annotators.
# get_all_users(), get_user_actions(), and flag_for_review() are placeholders
# for your own helpers.
for user_id in get_all_users():
    user_actions = get_user_actions(user_id)
    analysis = AnnotationHistoryManager.detect_suspicious_activity(user_actions)
    if analysis['suspicious_level'] in ['High', 'Very High']:
        flag_for_review(user_id, analysis)
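The loop above relies on helpers such as `get_user_actions` that are not defined on this page. If you already hold a flat list of actions, one possible substitute is to group them by `user_id` first. The sketch below uses `SimpleNamespace` stand-ins so it runs without Potato installed; `group_by_user` is a hypothetical helper, and real code would pass `AnnotationAction` objects instead.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_user(actions):
    """Group action objects by their user_id attribute, preserving order."""
    grouped = defaultdict(list)
    for action in actions:
        grouped[action.user_id].append(action)
    return dict(grouped)


# Stand-in objects for illustration only.
demo_actions = [
    SimpleNamespace(user_id="annotator1", action_type="add_label"),
    SimpleNamespace(user_id="annotator2", action_type="add_span"),
    SimpleNamespace(user_id="annotator1", action_type="update_label"),
]
by_user = group_by_user(demo_actions)
```

Each value in `by_user` can then be passed to `detect_suspicious_activity` in place of the `get_user_actions(user_id)` call.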
Performance Analysis
Identify annotation bottlenecks:
# Find slow schemas
from collections import defaultdict
schema_times = defaultdict(list)
for action in all_actions:
    schema_times[action.schema_name].append(action.server_processing_time_ms)
for schema, times in schema_times.items():
    avg_time = sum(times) / len(times)
    print(f"{schema}: {avg_time:.1f}ms average")
Audit Trail
Track changes for regulatory compliance:
import json
# Get complete history for an instance
instance_actions = AnnotationHistoryManager.get_actions_by_instance(
    all_actions, "doc_001"
)
# Export for audit
audit_log = [action.to_dict() for action in instance_actions]
with open("audit_doc_001.json", "w") as f:
    json.dump(audit_log, f, indent=2)
Time Analysis
Understand annotation timing patterns:
# Actions by hour of day
from collections import Counter
hours = Counter(action.timestamp.hour for action in all_actions)
print("Peak annotation hours:", hours.most_common(5))
Data Storage
Annotation history is stored in the user state files:
output/
  annotations/
    user_state_annotator1.json  # Includes action history
    user_state_annotator2.json
Export Format
Actions are serialized with ISO 8601 timestamps:
{
  "action_id": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2024-01-15T10:30:45.123456",
  "user_id": "annotator1",
  "instance_id": "doc_001",
  "action_type": "add_label",
  "schema_name": "sentiment",
  "label_name": "positive",
  "old_value": null,
  "new_value": true,
  "span_data": null,
  "session_id": "session_abc123",
  "client_timestamp": "2024-01-15T10:30:45.100000",
  "server_processing_time_ms": 23,
  "metadata": {"browser": "Chrome 120"}
}
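Because the timestamps are ISO 8601 strings, an exported record can be read back with the Python standard library alone. The sketch below parses a trimmed copy of the record above and computes the gap between the server and client timestamps. `client_server_offset_ms` is a hypothetical helper, and it assumes both timestamps use the same timezone convention.

```python
import json
from datetime import datetime

raw = """{
  "action_id": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2024-01-15T10:30:45.123456",
  "client_timestamp": "2024-01-15T10:30:45.100000",
  "server_processing_time_ms": 23
}"""


def client_server_offset_ms(record):
    """Return server minus client time in ms, or None without a client timestamp."""
    if not record.get("client_timestamp"):
        return None
    server = datetime.fromisoformat(record["timestamp"])
    client = datetime.fromisoformat(record["client_timestamp"])
    return (server - client).total_seconds() * 1000


record = json.loads(raw)
offset_ms = client_server_offset_ms(record)
```

A consistently large offset for one annotator may point to clock skew on their machine rather than slow network delivery, so treat the value as a rough indicator.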
Best Practices
- Regular monitoring: Check suspicious activity reports periodically
- Threshold tuning: Adjust detection thresholds based on your task complexity
- Export backups: Regularly export history for long-term storage
- Privacy compliance: Consider data retention policies for timestamps
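For threshold tuning, one possible starting point is to look at how quickly your annotators actually work and set `burst_threshold_seconds` near a low percentile of the observed gaps between actions. The sketch below computes a nearest-rank percentile over those gaps. `gap_percentile` is a hypothetical helper, and the choice of the 5th percentile is an assumption, not a recommendation from Potato.

```python
import math
from datetime import datetime, timedelta


def gap_percentile(timestamps, percentile=5):
    """Nearest-rank percentile of gaps (seconds) between consecutive timestamps."""
    ordered = sorted(timestamps)
    gaps = sorted((b - a).total_seconds() for a, b in zip(ordered, ordered[1:]))
    if not gaps:
        return None
    rank = max(1, math.ceil(percentile / 100 * len(gaps)))
    return gaps[rank - 1]


base = datetime(2024, 1, 15, 10, 0, 0)
stamps = [base + timedelta(seconds=s) for s in (0, 1, 4, 10, 20)]
low_gap = gap_percentile(stamps, percentile=5)
median_gap = gap_percentile(stamps, percentile=50)
```

Computing this per task type, rather than across the whole project, keeps quick tasks from dragging down the threshold for slower ones.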
Related Documentation
- Admin Dashboard - View annotation statistics
- Annotator Stats - Per-annotator metrics
- User and Collaboration - User management