Quickstart Guide

This guide will help you import your first dataset and run your first evaluation.

Prerequisites

  • A TurnWise account (sign up here)
  • Conversation data in JSON format

Step 1: Prepare Your Data

Create a JSON file with your conversations. Here’s the minimal format:

{
  "conversations": [
    {
      "name": "My First Conversation",
      "messages": [
        {
          "role": "user",
          "content": "Hello, I need help"
        },
        {
          "role": "assistant",
          "content": "Hi! I'd be happy to help. What do you need?"
        }
      ]
    }
  ]
}

Every message needs a role (user, assistant, system, or tool) and content. Message order is inferred automatically from array position.
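As a sketch, you can generate a file in this format with a few lines of Python (the filename `conversations.json` is just an example):

```python
import json

# Build the minimal TurnWise import structure shown above.
data = {
    "conversations": [
        {
            "name": "My First Conversation",
            "messages": [
                {"role": "user", "content": "Hello, I need help"},
                {"role": "assistant", "content": "Hi! I'd be happy to help. What do you need?"},
            ],
        }
    ]
}

# Write it to disk, ready for import.
with open("conversations.json", "w", encoding="utf-8") as f:
    json.dump(data, f, indent=2)
```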

Step 2: Create a Dataset

  1. Go to the Datasets page
  2. Click “New Dataset” in the sidebar
  3. Enter a name and optional description
  4. Click “Create”

Step 3: Import Conversations

  1. Open your new dataset
  2. Click the “Import” button in the header
  3. Drag and drop your JSON file or click to browse
  4. Click “Import”
If your data has validation errors, TurnWise will use AI to analyze your format and suggest how to transform it.
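If you prefer to catch common problems before uploading, a local pre-check is straightforward. This is only a sketch based on the format described above (required role and content fields); the real importer may enforce additional rules:

```python
ALLOWED_ROLES = {"user", "assistant", "system", "tool"}

def validate(data: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the file looks importable."""
    errors = []
    conversations = data.get("conversations")
    if not isinstance(conversations, list):
        return ['top-level "conversations" must be a list']
    for ci, conv in enumerate(conversations):
        for mi, msg in enumerate(conv.get("messages", [])):
            if msg.get("role") not in ALLOWED_ROLES:
                errors.append(f"conversation {ci}, message {mi}: invalid role {msg.get('role')!r}")
            if "content" not in msg:
                errors.append(f"conversation {ci}, message {mi}: missing content")
    return errors
```

Run it on the parsed JSON before importing; any returned strings point at the offending conversation and message indices.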

Step 4: Add an Evaluation Metric

  1. Click “Add Column” to create a new metric
  2. Describe what you want to evaluate (e.g., “Is the response helpful?”)
  3. TurnWise will generate a metric configuration
  4. Review and save the metric

Step 5: Run Evaluations

  1. Click “Run All” to evaluate all conversations
  2. Wait for evaluations to complete
  3. Review results in the data table

Using the Python SDK

Prefer to run evaluations programmatically? Use the TurnWise Python SDK:

pip install turnwise-sdk

from turnwise import TurnWiseClient, Metric, EvaluationLevel, OutputType

client = TurnWiseClient(
    turnwise_api_key="tw_xxx",
    openrouter_api_key="sk-or-xxx"
)

metric = Metric(
    name="Helpfulness",
    prompt="Evaluate: @CURRENT_MESSAGE.output",
    evaluation_level=EvaluationLevel.MESSAGE,
    output_type=OutputType.PROGRESS,
)

# `evaluate` is a coroutine: call it from an async function,
# or from a notebook/REPL that supports top-level `await`.
results = await client.evaluate(dataset_id=1, metric=metric)

See the Python SDK Guide for complete documentation.

What’s Next?