Annotating Interactions
The Dashboard Training tab lets you browse every LLM interaction Wilson has made, inspect the full prompts and responses, rate quality, and create preference pairs for DPO training.
Opening the Training Tab
Section titled “Opening the Training Tab”wilson # Dashboard starts at http://localhost:3141/dashboard # Opens in your browserClick the Training tab in the navigation bar.
Training Stats
Section titled “Training Stats”The stats panel at the top shows:
| Stat | Description |
|---|---|
| Total | Total LLM interactions recorded |
| Annotated | How many have been rated or labeled |
| SFT Ready | Interactions rated 4+ stars (eligible for SFT export) |
| DPO Pairs | Number of chosen/rejected preference pairs |
Interaction Browser
Section titled “Interaction Browser”The main table lists all recorded interactions with:
- ID — Database row ID
- Run — First 8 characters of the run UUID
- Type —
agent,summarize,relevance, etc. - Model — Which LLM was used
- Tokens — Total token count
- Rating — Star rating if annotated
- Time — When the call was made
Filters
Section titled “Filters”Use the dropdowns above the table to filter by:
- Type — Show only agent calls, summarize calls, etc.
- Annotated — Show only annotated or unannotated interactions
- Rating — Filter by minimum star rating
Interaction Detail
Section titled “Interaction Detail”Click any row to expand its full detail panel:
- System Prompt — The complete system prompt sent to the model
- User Prompt — The user’s query or iteration prompt with tool results
- Response — The model’s full text response
- Tool Calls — JSON of any tool calls the model requested
- Tool Results — Output of each tool execution
This gives you complete visibility into what the model saw and what it produced.
Rating Interactions
Section titled “Rating Interactions”Rate each interaction from 1 to 5 stars:
| Rating | Meaning | Training Use |
|---|---|---|
| 1 star | Bad — wrong answer, hallucination, off-topic | Excluded from SFT, candidate for DPO “rejected” |
| 2 stars | Poor — partially correct but significant issues | Excluded from SFT |
| 3 stars | Acceptable — correct but could be better | Excluded from SFT by default |
| 4 stars | Good — correct and well-structured | Included in SFT export |
| 5 stars | Excellent — ideal response to learn from | Included in SFT export |
Click the star icons in the annotation panel to set the rating. The default SFT export threshold is 4 stars — only high-quality interactions become training data.
Preference Pairs (DPO)
Section titled “Preference Pairs (DPO)”Direct Preference Optimization (DPO) training requires pairs: a chosen response and a rejected response for the same prompt.
Creating a Pair
Section titled “Creating a Pair”- Find two interactions with the same or similar user prompt but different responses
- Open the better response, set Preference to
Chosen - Enter a Pair ID (any string, e.g.,
groceries-1) - Open the worse response, set Preference to
Rejected - Enter the same Pair ID
The pair is now linked. When you export DPO training data, both interactions are combined into a single training example with the prompt, chosen response, and rejected response.
Pair ID Tips
Section titled “Pair ID Tips”- Use descriptive pair IDs:
categorize-groceries-1,spending-review-2 - Each pair needs exactly one
chosenand onerejectedinteraction with the same pair ID - You can create pairs across different models — useful for comparing a cloud model’s response against a local model’s response
Saving Annotations
Section titled “Saving Annotations”Click Save to persist the annotation. Annotations are stored in the interaction_annotations table and survive across sessions.
Saving is an upsert — if an annotation already exists for the interaction, it’s replaced with the new values.
API Endpoints
Section titled “API Endpoints”You can also annotate programmatically via the dashboard API:
List Interactions
Section titled “List Interactions”# All interactions (paginated)curl http://localhost:3141/api/interactions?limit=50&offset=0
# Filter by call typecurl http://localhost:3141/api/interactions?callType=agent
# Only unannotatedcurl http://localhost:3141/api/interactions?annotated=falseGet Interaction Detail
Section titled “Get Interaction Detail”curl http://localhost:3141/api/interactions/42Returns the interaction with its tool results and annotations.
Get All Interactions in a Run
Section titled “Get All Interactions in a Run”curl http://localhost:3141/api/runs/abc-123-uuidSave Annotation
Section titled “Save Annotation”curl -X POST http://localhost:3141/api/interactions/42/annotate \ -H "Content-Type: application/json" \ -d '{"rating": 5, "preference": "chosen", "pairId": "pair-1", "notes": "Great categorization"}'Annotation Stats
Section titled “Annotation Stats”curl http://localhost:3141/api/annotations/statsReturns total interactions, annotated count, rating distribution, DPO pair count, and SFT-ready count.