Pipeline Executions
Pipeline executions track when and how evaluations were run. Understanding executions helps you manage evaluation history, debug issues, and analyze performance.
What Are Pipeline Executions?
A pipeline execution represents a single run of an evaluation metric:
- When: Timestamp of execution
- What: Which metric was evaluated
- Where: Which dataset and entities
- Result: Evaluation results
- Status: Success or failure
Execution Structure
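The exact shape of an execution record depends on your TurnWise version. The sketch below is illustrative only; every field name is an assumption, not the documented schema.

```python
# Illustrative execution record -- field names are assumptions, not the documented schema.
execution = {
    "id": "exec_01HXYZ",              # unique execution identifier
    "pipeline_id": "pipe_relevance",  # which metric was evaluated
    "dataset_id": "ds_support_logs",  # which dataset was used
    "status": "completed",            # pending | processing | completed | failed | cancelled
    "started_at": "2024-05-01T12:00:00Z",
    "completed_at": "2024-05-01T12:03:41Z",
    "stats": {
        "total": 250,                 # entities evaluated
        "successful": 246,
        "failed": 4,
        "duration_seconds": 221,
    },
}
```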
Execution Lifecycle
States
- Pending: Execution created but not started
- Processing: Currently running
- Completed: Finished successfully
- Failed: Encountered errors
- Cancelled: Stopped by the user
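If you track executions in your own tooling, these states map naturally onto an enum. This is a client-side convenience using assumed state strings, not part of TurnWise itself:

```python
from enum import Enum

class ExecutionStatus(str, Enum):
    """Client-side mirror of the execution states above (string values are assumed)."""
    PENDING = "pending"        # created but not started
    PROCESSING = "processing"  # currently running
    COMPLETED = "completed"    # finished successfully
    FAILED = "failed"          # encountered errors
    CANCELLED = "cancelled"    # stopped by the user

# Terminal states: once reached, no further transitions are expected.
TERMINAL_STATES = {ExecutionStatus.COMPLETED, ExecutionStatus.FAILED, ExecutionStatus.CANCELLED}
```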
Viewing Executions
Via UI
- Open Dataset
- Click “Executions” Tab
- View Execution List:
  - See all executions for this dataset
  - Filter by metric, status, or date
  - Sort by various columns
Via API
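A minimal sketch using Python's requests library. The base URL, route, query parameters, and auth scheme are assumptions; check your TurnWise API reference for the actual endpoints.

```python
import requests

BASE_URL = "https://your-turnwise-host/api"  # assumed base URL

# Hypothetical route and filters -- consult your TurnWise API reference.
resp = requests.get(
    f"{BASE_URL}/datasets/ds_support_logs/executions",
    params={"status": "completed", "limit": 20},
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    timeout=30,
)
resp.raise_for_status()
for execution in resp.json():
    print(execution["id"], execution["status"], execution["started_at"])
```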
Execution Details
Each execution includes:
Basic Info
- ID: Unique execution identifier
- Pipeline: Which metric was evaluated
- Dataset: Which dataset
- Status: Current lifecycle state (see States above)
- Timestamps: Started, completed times
Statistics
- Total Evaluations: Number of entities evaluated
- Successful: Number that succeeded
- Failed: Number that failed
- Duration: Total execution time
Results
- Individual Results: Each conversation/message/step result
- Aggregated Results: Summary statistics
- Errors: Any failures with details
Execution History
TurnWise maintains a history of all executions.
Why History Matters
- Track Changes: See how metrics perform over time
- Debug Issues: Identify when problems occurred
- Compare Results: Compare different evaluation runs
- Audit Trail: Complete record of evaluations
Viewing History
- Open Dataset
- Click “Executions” Tab
- Browse History:
  - See all past executions
  - Filter by date range
  - Search by metric name
Re-Running Evaluations
When to Re-Run
- Metric Updated: Prompt or configuration changed
- Data Updated: Conversations modified
- Failed Executions: Retry failed evaluations
- Model Changed: Using different LLM
How to Re-Run
Via UI
- Select Execution
- Click “Re-Run”
- Confirm
- Monitor Progress
Via API
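A sketch under the same assumptions as above; the re-run route and the only_failed option are hypothetical.

```python
import requests

BASE_URL = "https://your-turnwise-host/api"  # assumed base URL

# Hypothetical re-run route -- the actual endpoint may differ.
resp = requests.post(
    f"{BASE_URL}/executions/exec_01HXYZ/rerun",
    json={"only_failed": True},  # assumed option: retry only failed evaluations
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    timeout=30,
)
resp.raise_for_status()
print("New execution:", resp.json()["id"])
```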
Execution Comparison
Compare executions to see:
- Metric Changes: How results changed
- Performance: Execution time differences
- Accuracy: Success rate changes
Comparing Results
- Select Two Executions
- Click “Compare”
- View Differences:
  - Side-by-side comparison
  - Highlighted changes
  - Statistical analysis
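You can also diff two executions client-side. The sketch below compares aggregated results using the hypothetical field names from earlier:

```python
def compare_aggregates(old: dict, new: dict) -> dict:
    """Return new-minus-old deltas for shared numeric fields."""
    shared = old.keys() & new.keys()
    return {
        key: new[key] - old[key]
        for key in sorted(shared)
        if isinstance(old[key], (int, float)) and isinstance(new[key], (int, float))
    }

old_run = {"mean_score": 0.78, "pass_rate": 0.90, "duration_seconds": 310}
new_run = {"mean_score": 0.84, "pass_rate": 0.94, "duration_seconds": 221}
print(compare_aggregates(old_run, new_run))
# e.g. {'duration_seconds': -89, 'mean_score': 0.06..., 'pass_rate': 0.04...}
```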
Execution Metadata
Executions store metadata, which is useful for:
- Debugging: Understand execution context
- Analysis: Filter by execution parameters
- Auditing: Track who ran what
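What metadata is stored depends on your setup; the example below is illustrative, with assumed keys:

```python
# Illustrative metadata -- keys are assumptions, not the documented schema.
metadata = {
    "triggered_by": "alice@example.com",  # auditing: who ran it
    "trigger": "manual",                  # e.g. manual, scheduled, api
    "model": "gpt-4o",                    # which LLM was used
    "prompt_version": "v3",               # metric configuration at run time
    "notes": "Re-run after prompt fix",   # free-form debugging context
}
```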
Execution Results
Individual Results
Each evaluation produces a result for the conversation, message, or step evaluated.
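A hypothetical individual result, with assumed field names:

```python
# Illustrative individual result -- field names are assumptions.
result = {
    "entity_id": "conv_042",  # the conversation, message, or step evaluated
    "score": 0.85,
    "passed": True,
    "reasoning": "The response directly addresses the user's question.",
    "error": None,            # populated when the evaluation failed
}
```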
Aggregated Results
Summary statistics computed across all individual results.
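A hypothetical aggregated summary, again with assumed field names:

```python
# Illustrative aggregated results -- field names are assumptions.
aggregated = {
    "total": 250,
    "successful": 246,
    "failed": 4,
    "mean_score": 0.81,
    "pass_rate": 0.94,
}
```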
Exporting Results
Via UI
- Select Execution
- Click “Export”
- Choose Format:
  - CSV
  - JSON
  - Excel
Via API
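A sketch under the same assumptions as the earlier API examples; the export route and format parameter are hypothetical.

```python
import requests

BASE_URL = "https://your-turnwise-host/api"  # assumed base URL

# Hypothetical export route -- format values are assumed (csv | json | xlsx).
resp = requests.get(
    f"{BASE_URL}/executions/exec_01HXYZ/export",
    params={"format": "csv"},
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    timeout=60,
)
resp.raise_for_status()
with open("execution_results.csv", "wb") as f:
    f.write(resp.content)
```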
Execution Performance
Monitoring Performance
Track execution metrics:
- Duration: How long it took
- Throughput: Evaluations per second
- Success Rate: Percentage successful
- Cost: LLM API costs
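Throughput and success rate can be derived from the execution stats. A small helper, assuming the hypothetical stats fields sketched earlier:

```python
def performance_summary(stats: dict) -> dict:
    """Derive throughput and success rate from execution stats (assumed fields)."""
    duration = stats["duration_seconds"]
    total = stats["total"]
    return {
        "throughput_per_second": total / duration if duration else 0.0,
        "success_rate": stats["successful"] / total if total else 0.0,
    }

print(performance_summary({"total": 250, "successful": 246, "duration_seconds": 221}))
# {'throughput_per_second': 1.131..., 'success_rate': 0.984}
```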
Optimizing Performance
- Use Async Mode: Enable concurrent execution to run evaluations in parallel (see the sketch below)
- Batch Size: Tune batch sizes to balance throughput against LLM rate limits
- Model Selection: Use faster models when accuracy allows
- Monitor Resources: Track performance metrics to catch slowdowns early
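How these options are set depends on how you trigger executions. The sketch below assumes a hypothetical run endpoint that accepts mode and batch_size options:

```python
import requests

BASE_URL = "https://your-turnwise-host/api"  # assumed base URL

# Hypothetical run route -- "mode" and "batch_size" options are assumptions.
resp = requests.post(
    f"{BASE_URL}/pipelines/pipe_relevance/run",
    json={
        "dataset_id": "ds_support_logs",
        "mode": "async",   # run evaluations concurrently
        "batch_size": 10,  # tune against your LLM rate limits
    },
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    timeout=30,
)
resp.raise_for_status()
```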
Troubleshooting Executions
Failed Executions
Check:
- Error messages in execution details (see the API sketch below)
- LLM API status
- Data validity
- Prompt correctness
Fixes:
- Retry the execution
- Fix data issues
- Update the prompt
- Check API keys
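To pull error details programmatically, a sketch under the same assumed API as earlier; the results route and failed-only filter are hypothetical:

```python
import requests

BASE_URL = "https://your-turnwise-host/api"  # assumed base URL

# Hypothetical results route with an assumed failed-only filter.
resp = requests.get(
    f"{BASE_URL}/executions/exec_01HXYZ/results",
    params={"status": "failed"},
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    timeout=30,
)
resp.raise_for_status()
for r in resp.json():
    print(r["entity_id"], "->", r["error"])
```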
Slow Executions
Check:
- Execution mode (sync vs async)
- Model selection
- Batch size
- Network latency
Fixes:
- Enable async mode
- Use a faster model
- Increase the batch size
- Check network connectivity
Best Practices
- Review History: Regularly review execution history to catch regressions early
- Monitor Performance: Track duration, throughput, and success rate over time
- Export Results: Export important results for reporting and backup
- Document Changes: Note why executions were re-run