Data Preparation
Preparing your conversation data correctly ensures smooth imports and better evaluation results. This guide covers best practices for structuring your data.
Data Structure Guidelines
The simplest valid TurnWise file:
{
"conversations" : [
{
"messages" : [
{ "role" : "user" , "content" : "Hello" },
{ "role" : "assistant" , "content" : "Hi!" }
]
}
]
}
For better evaluation results, include more context:
{
"conversations" : [
{
"name" : "Customer Support - Order Inquiry" ,
"description" : "Customer asking about order status" ,
"meta" : {
"customer_id" : "cust_123" ,
"session_id" : "sess_456" ,
"timestamp" : "2024-01-20T14:30:00Z"
},
"agents" : [
{
"name" : "Support Agent" ,
"description" : "Customer support agent" ,
"tools" : {
"lookup_order" : {
"name" : "lookup_order" ,
"description" : "Look up order details" ,
"parameters" : {
"order_id" : "string"
}
}
}
}
],
"messages" : [
{
"role" : "system" ,
"content" : "You are a helpful customer service agent."
},
{
"role" : "user" ,
"content" : "Where is my order?"
},
{
"role" : "assistant" ,
"content" : "I can help you track your order. What's your order ID?" ,
"steps" : [
{
"model_name" : "gpt-4" ,
"thinking" : "Customer wants order status. I need their order ID." ,
"output_content" : "I can help you track your order. What's your order ID?"
}
]
}
]
}
]
}
Field Guidelines
Conversations
Required : messages array
Recommended :
name: Human-readable identifier
description: Context about the conversation
meta: Custom metadata (customer IDs, timestamps, tags)
Optional but Valuable :
agents: Agent definitions with tools (enables tool evaluation)
Messages
Required :
role: One of user, assistant, system, tool
Recommended :
content: Message text content
Optional but Valuable :
steps: Reasoning steps (enables step-level evaluation)
meta: Message metadata (timestamps, latency, etc.)
Include content even with steps - Content provides quick reference without parsing steps.
Steps
Required : At least one content field (thinking, tool_call, tool_result, output_content)
Recommended :
model_name: Model used for this step
agent_name: Agent that executed this step (for multi-agent systems)
Content Fields :
thinking: Model’s reasoning
tool_call: Tool invocation details
tool_result: Tool execution result
output_content: Final text output
output_structured: Structured output (JSON)
When to Use Steps
Use steps when you want to evaluate:
Reasoning Quality Evaluate if the agent’s thinking is sound
Tool Selection Check if the right tools were used
Parameter Accuracy Verify tool arguments are correct
Error Handling See how agent responds to failures
Example : If your agent uses tools, include steps:
{
"role" : "assistant" ,
"content" : "Your order has shipped!" ,
"steps" : [
{
"thinking" : "Need to look up order status" ,
"tool_call" : {
"name" : "lookup_order" ,
"arguments" : { "order_id" : "ORD-123" }
},
"tool_result" : {
"status" : "shipped" ,
"tracking" : "1Z999..."
}
},
{
"output_content" : "Your order has shipped!"
}
]
}
When to Define Agents
Define agents when:
Tool Usage : Your agent uses tools
Multi-Agent Systems : Multiple agents in one conversation
Tool Evaluation : You want to evaluate tool selection/usage
Documentation : Self-documenting data
Example :
{
"agents" : [
{
"name" : "Support Agent" ,
"description" : "Handles customer inquiries" ,
"tools" : {
"lookup_order" : {
"name" : "lookup_order" ,
"description" : "Look up order details" ,
"parameters" : {
"order_id" : "string"
}
},
"process_refund" : {
"name" : "process_refund" ,
"description" : "Process refund" ,
"parameters" : {
"order_id" : "string" ,
"amount" : "number"
}
}
}
}
]
}
Use meta fields to store custom data:
{
"conversations" : [
{
"meta" : {
"customer_id" : "cust_123" ,
"session_id" : "sess_456" ,
"channel" : "web" ,
"priority" : "high" ,
"tags" : [ "refund" , "escalation" ]
}
}
]
}
{
"messages" : [
{
"role" : "assistant" ,
"content" : "Hello!" ,
"meta" : {
"timestamp" : "2024-01-20T14:30:00Z" ,
"latency_ms" : 150 ,
"model" : "gpt-4" ,
"tokens_used" : 50
}
}
]
}
Metadata is preserved - Store any custom data you need. It won’t affect evaluations unless referenced in custom metrics.
Handling Large Datasets
Splitting Large Files
If you have thousands of conversations:
Split by Date : Group by time period
Split by Category : Group by conversation type
Split by Size : Keep files under 10MB
Batch Import
Import in batches:
# Import batch 1
curl -X POST "http://localhost:8000/datasets/1/import" \
-F "file=@batch1.json"
# Import batch 2
curl -X POST "http://localhost:8000/datasets/1/import" \
-F "file=@batch2.json"
Compress JSON : Remove unnecessary whitespace
Omit Optional Fields : If not needed, omit them
Message Order : Message order is automatically inferred from array position
Data Quality Checklist
Before importing, verify:
Common Patterns
Pattern 1: Simple Chat
{
"conversations" : [
{
"messages" : [
{ "role" : "user" , "content" : "Hello" },
{ "role" : "assistant" , "content" : "Hi!" }
]
}
]
}
Pattern 2: With System Message
{
"conversations" : [
{
"messages" : [
{ "role" : "system" , "content" : "You are helpful." },
{ "role" : "user" , "content" : "Hello" },
{ "role" : "assistant" , "content" : "Hi!" }
]
}
]
}
Pattern 3: With Steps
{
"conversations" : [
{
"messages" : [
{
"role" : "assistant" ,
"content" : "I can help!" ,
"steps" : [
{
"thinking" : "User needs help" ,
"output_content" : "I can help!"
}
]
}
]
}
]
}
{
"conversations" : [
{
"agents" : [
{
"name" : "Agent" ,
"tools" : {
"lookup" : {
"name" : "lookup" ,
"description" : "Lookup tool" ,
"parameters" : { "id" : "string" }
}
}
}
],
"messages" : [
{
"role" : "assistant" ,
"content" : "Found it!" ,
"steps" : [
{
"tool_call" : { "name" : "lookup" , "arguments" : { "id" : "123" } },
"tool_result" : { "result" : "found" }
},
{
"output_content" : "Found it!"
}
]
}
]
}
]
}
Next Steps