Data Preparation

Preparing your conversation data correctly ensures smooth imports and better evaluation results. This guide covers best practices for structuring your data.

Data Structure Guidelines

Minimal Valid Format

The simplest valid TurnWise file:

{
  "conversations": [
    {
      "messages": [
        { "role": "user", "content": "Hello" },
        { "role": "assistant", "content": "Hi!" }
      ]
    }
  ]
}
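If your conversations currently live in code as simple role/content pairs, the minimal format can be generated rather than written by hand. The following is a minimal sketch; `build_dataset` is a hypothetical helper name and not part of TurnWise.

```python
import json


def build_dataset(conversations):
    """Wrap lists of (role, content) pairs in the minimal file structure.

    `conversations` is a list of conversations; each conversation is a
    list of (role, content) tuples in chronological order.
    """
    return {
        "conversations": [
            {
                "messages": [
                    {"role": role, "content": content}
                    for role, content in turns
                ]
            }
            for turns in conversations
        ]
    }


dataset = build_dataset([[("user", "Hello"), ("assistant", "Hi!")]])
print(json.dumps(dataset, indent=2))
```

The printed JSON has the same structure as the minimal file above and can be saved to disk for import.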

Recommended Format

For better evaluation results, include more context:

{
  "conversations": [
    {
      "name": "Customer Support - Order Inquiry",
      "description": "Customer asking about order status",
      "meta": {
        "customer_id": "cust_123",
        "session_id": "sess_456",
        "timestamp": "2024-01-20T14:30:00Z"
      },
      "agents": [
        {
          "name": "Support Agent",
          "description": "Customer support agent",
          "tools": {
            "lookup_order": {
              "name": "lookup_order",
              "description": "Look up order details",
              "parameters": {
                "order_id": "string"
              }
            }
          }
        }
      ],
      "messages": [
        {
          "role": "system",
          "content": "You are a helpful customer service agent."
        },
        {
          "role": "user",
          "content": "Where is my order?"
        },
        {
          "role": "assistant",
          "content": "I can help you track your order. What's your order ID?",
          "steps": [
            {
              "model_name": "gpt-4",
              "thinking": "Customer wants order status. I need their order ID.",
              "output_content": "I can help you track your order. What's your order ID?"
            }
          ]
        }
      ]
    }
  ]
}

Field Guidelines

Conversations

Required:

messages: Array of messages

Recommended:

name: Human-readable identifier
description: Context about the conversation
meta: Custom metadata (customer IDs, timestamps, tags)

Optional but Valuable:

agents: Agent definitions with tools (enables tool evaluation)

Messages

Required:

role: One of user, assistant, system, tool

Recommended:

content: Message text content

Optional but Valuable:

steps: Reasoning steps (enables step-level evaluation)
meta: Message metadata (timestamps, latency, etc.)

Include content even when a message has steps: content gives a quick reference to the final reply without parsing the steps.
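If your export only records steps, one way to follow this advice is to copy the final text output into content before importing. The sketch below assumes the last non-empty output_content is the reply shown to the user; that choice and the helper name `fill_content_from_steps` are illustrative assumptions, not TurnWise behavior.

```python
def fill_content_from_steps(message):
    """Return the message with `content` filled from its steps if missing.

    Assumption: the last step that has a non-empty `output_content`
    holds the final reply. The input dict is not modified.
    """
    if message.get("content"):
        return message
    outputs = [
        step["output_content"]
        for step in message.get("steps", [])
        if step.get("output_content")
    ]
    if not outputs:
        return message  # nothing to copy; leave the message as it is
    return {**message, "content": outputs[-1]}


message = {
    "role": "assistant",
    "steps": [
        {"thinking": "User needs help"},
        {"output_content": "I can help!"},
    ],
}
print(fill_content_from_steps(message)["content"])
```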

Steps

Required: At least one content field (thinking, tool_call, tool_result, output_content)

Recommended:

model_name: Model used for this step
agent_name: Agent that executed this step (for multi-agent systems)

Content Fields:

thinking: Model’s reasoning
tool_call: Tool invocation details
tool_result: Tool execution result
output_content: Final text output
output_structured: Structured output (JSON)
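The "at least one content field" rule is easy to check before import. The sketch below counts only the four fields named in the requirement; whether output_structured alone satisfies it is not stated here, so it is left out. Treating empty strings and empty objects as missing is also an assumption, and both function names are hypothetical.

```python
# Fields named in the "at least one content field" requirement.
REQUIRED_ONE_OF = ("thinking", "tool_call", "tool_result", "output_content")


def step_has_content(step):
    """True if the step carries at least one non-empty content field."""
    return any(
        step.get(field) not in (None, "", {}, [])
        for field in REQUIRED_ONE_OF
    )


def empty_steps(message):
    """Return the indexes of steps in a message that carry no content."""
    return [
        index
        for index, step in enumerate(message.get("steps", []))
        if not step_has_content(step)
    ]


message = {
    "role": "assistant",
    "steps": [
        {"output_content": "ok"},
        {"model_name": "gpt-4"},  # metadata only, no content field
    ],
}
print(empty_steps(message))
```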

When to Use Steps

Use steps when you want to evaluate:

Reasoning Quality: Evaluate whether the agent's thinking is sound
Tool Selection: Check whether the right tools were used
Parameter Accuracy: Verify that tool arguments are correct
Error Handling: See how the agent responds to failures

Example: If your agent uses tools, include steps:

{
  "role": "assistant",
  "content": "Your order has shipped!",
  "steps": [
    {
      "thinking": "Need to look up order status",
      "tool_call": {
        "name": "lookup_order",
        "arguments": { "order_id": "ORD-123" }
      },
      "tool_result": {
        "status": "shipped",
        "tracking": "1Z999..."
      }
    },
    {
      "output_content": "Your order has shipped!"
    }
  ]
}

When to Define Agents

Define agents when:

Tool Usage: Your agent uses tools
Multi-Agent Systems: Multiple agents take part in one conversation
Tool Evaluation: You want to evaluate tool selection and usage
Documentation: You want the data to be self-documenting

Example:

{
  "agents": [
    {
      "name": "Support Agent",
      "description": "Handles customer inquiries",
      "tools": {
        "lookup_order": {
          "name": "lookup_order",
          "description": "Look up order details",
          "parameters": {
            "order_id": "string"
          }
        },
        "process_refund": {
          "name": "process_refund",
          "description": "Process refund",
          "parameters": {
            "order_id": "string",
            "amount": "number"
          }
        }
      }
    }
  ]
}
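Once agents are defined, you can check that every tool_call in a conversation refers to a defined tool and passes only declared parameters. This is a minimal sketch: it pools the tools of all agents in the conversation and ignores agent_name, which is a simplification, and `undefined_tool_calls` is a hypothetical helper.

```python
def undefined_tool_calls(conversation):
    """Return (message_index, step_index, problem) tuples for tool calls
    that are not backed by an agent definition in the conversation."""
    # Simplification: pool tools from every agent, ignoring agent_name.
    tools = {}
    for agent in conversation.get("agents", []):
        tools.update(agent.get("tools", {}))

    problems = []
    for m_idx, message in enumerate(conversation.get("messages", [])):
        for s_idx, step in enumerate(message.get("steps", [])):
            call = step.get("tool_call")
            if not call:
                continue
            name = call.get("name")
            if name not in tools:
                problems.append((m_idx, s_idx, f"unknown tool: {name}"))
                continue
            declared = set(tools[name].get("parameters") or {})
            extra = set(call.get("arguments") or {}) - declared
            if extra:
                problems.append(
                    (m_idx, s_idx,
                     f"unexpected arguments for {name}: {sorted(extra)}")
                )
    return problems
```

An empty list means every tool call matched a definition. The sketch compares names only and does not check argument types.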

Metadata Usage

Use meta fields to store custom data:

Conversation Metadata

{
  "conversations": [
    {
      "meta": {
        "customer_id": "cust_123",
        "session_id": "sess_456",
        "channel": "web",
        "priority": "high",
        "tags": ["refund", "escalation"]
      }
    }
  ]
}

Message Metadata

{
  "messages": [
    {
      "role": "assistant",
      "content": "Hello!",
      "meta": {
        "timestamp": "2024-01-20T14:30:00Z",
        "latency_ms": 150,
        "model": "gpt-4",
        "tokens_used": 50
      }
    }
  ]
}

Metadata is preserved: store any custom data you need. It does not affect evaluations unless a custom metric references it.

Handling Large Datasets

Splitting Large Files

If you have thousands of conversations, split them into smaller files:

Split by Date: Group by time period
Split by Category: Group by conversation type
Split by Size: Keep files under 10 MB
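Splitting by size can be automated. The sketch below greedily packs conversations into batches whose minified JSON stays under a byte budget; the 10 MB default mirrors the guideline above, and `split_by_size` is a hypothetical helper. A single conversation larger than the budget still ends up in a batch of its own.

```python
import json

# Size of the empty wrapper that surrounds every batch.
WRAPPER_BYTES = len(b'{"conversations":[]}')


def split_by_size(conversations, max_bytes=10 * 1024 * 1024):
    """Group conversations into batch objects, each ready to be written
    as one file of at most `max_bytes` bytes when minified."""
    batches, current, current_size = [], [], 0
    budget = max_bytes - WRAPPER_BYTES
    for conv in conversations:
        encoded = json.dumps(conv, separators=(",", ":")).encode("utf-8")
        size = len(encoded) + 1  # +1 for the separating comma
        if current and current_size + size > budget:
            batches.append({"conversations": current})
            current, current_size = [], 0
        current.append(conv)
        current_size += size
    if current:
        batches.append({"conversations": current})
    return batches
```

Each returned batch is a complete root object, so it can be written with `json.dump` and imported on its own.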

Batch Import

Import in batches:

# Import batch 1
curl -X POST "http://localhost:8000/datasets/1/import" \
  -F "file=@batch1.json"

# Import batch 2
curl -X POST "http://localhost:8000/datasets/1/import" \
  -F "file=@batch2.json"
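With many batch files, the same curl call can be driven from a script. The sketch below builds the command shown above for each file and runs it with `subprocess`; the default dataset ID and base URL are copied from the example, while the function names, the `batch*.json` pattern, and the added `--fail` flag (which makes curl exit with an error on HTTP failures) are assumptions.

```python
import subprocess
from pathlib import Path


def build_import_command(batch_file, dataset_id=1,
                         base_url="http://localhost:8000"):
    """Return the curl argument list that uploads one batch file."""
    return [
        "curl", "--fail", "-X", "POST",
        f"{base_url}/datasets/{dataset_id}/import",
        "-F", f"file=@{batch_file}",
    ]


def import_batches(batch_dir, pattern="batch*.json", **kwargs):
    """Upload every matching file in name order, stopping at the first failure."""
    for batch_file in sorted(Path(batch_dir).glob(pattern)):
        subprocess.run(build_import_command(batch_file, **kwargs), check=True)


print(" ".join(build_import_command("batch1.json")))
```

Files are sorted by name, so `batch10.json` is uploaded before `batch2.json`; zero-pad the numbers if upload order matters.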

Performance Tips

Compress JSON: Remove unnecessary whitespace (minify) to reduce file size
Omit Optional Fields: Leave out optional fields you do not need
Message Order: Message order is inferred automatically from array position
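Removing whitespace takes only a few lines with Python's standard `json` module. This is a minimal sketch; `minify` is a hypothetical helper name.

```python
import json


def minify(text):
    """Re-serialize JSON text without insignificant whitespace."""
    return json.dumps(
        json.loads(text),
        separators=(",", ":"),  # no spaces after commas or colons
        ensure_ascii=False,     # keep non-ASCII characters as they are
    )


pretty = """{
  "conversations": [
    { "messages": [] }
  ]
}"""
print(minify(pretty))
```

Because the text is parsed first, this step also catches JSON syntax errors before the file is uploaded.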

Data Quality Checklist

Before importing, verify:

Valid JSON syntax
Root object has conversations array
Each conversation has messages array
Each message has role
Roles are valid (user, assistant, system, tool)
Messages are in chronological order (order is inferred from array position)
Steps have at least one content field
Tool calls match agent definitions (if agents defined)
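The structural items in this checklist can be verified with a short script before uploading. The sketch below covers JSON syntax, the conversations array, the messages array, and roles; it does not check steps or tool calls, and it is not the importer's own validation. `check_dataset` is a hypothetical helper.

```python
import json

VALID_ROLES = {"user", "assistant", "system", "tool"}


def check_dataset(text):
    """Return a list of human-readable problems; empty means all checks passed."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]

    if not isinstance(data, dict) or not isinstance(data.get("conversations"), list):
        return ["root object must have a conversations array"]

    errors = []
    for c_idx, conv in enumerate(data["conversations"]):
        messages = conv.get("messages") if isinstance(conv, dict) else None
        if not isinstance(messages, list):
            errors.append(f"conversations[{c_idx}]: missing messages array")
            continue
        for m_idx, msg in enumerate(messages):
            where = f"conversations[{c_idx}].messages[{m_idx}]"
            role = msg.get("role") if isinstance(msg, dict) else None
            if role is None:
                errors.append(f"{where}: missing role")
            elif not isinstance(role, str) or role not in VALID_ROLES:
                errors.append(f"{where}: invalid role {role!r}")
    return errors
```

For example, `check_dataset(open("batch1.json").read())` returns an empty list when the file passes these checks.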

Common Patterns

Pattern 1: Simple Chat

{
  "conversations": [
    {
      "messages": [
        { "role": "user", "content": "Hello" },
        { "role": "assistant", "content": "Hi!" }
      ]
    }
  ]
}

Pattern 2: With System Message

{
  "conversations": [
    {
      "messages": [
        { "role": "system", "content": "You are helpful." },
        { "role": "user", "content": "Hello" },
        { "role": "assistant", "content": "Hi!" }
      ]
    }
  ]
}

Pattern 3: With Steps

{
  "conversations": [
    {
      "messages": [
        {
          "role": "assistant",
          "content": "I can help!",
          "steps": [
            {
              "thinking": "User needs help",
              "output_content": "I can help!"
            }
          ]
        }
      ]
    }
  ]
}

Pattern 4: With Tools

{
  "conversations": [
    {
      "agents": [
        {
          "name": "Agent",
          "tools": {
            "lookup": {
              "name": "lookup",
              "description": "Lookup tool",
              "parameters": { "id": "string" }
            }
          }
        }
      ],
      "messages": [
        {
          "role": "assistant",
          "content": "Found it!",
          "steps": [
            {
              "tool_call": { "name": "lookup", "arguments": { "id": "123" } },
              "tool_result": { "result": "found" }
            },
            {
              "output_content": "Found it!"
            }
          ]
        }
      ]
    }
  ]
}

Next Steps

Setup Guide

Data Format
