span-level replay is live · apache-2.0 · open source

the real debugger
for agents

trace, replay, eval — every signal in one place.
open a broken run, edit it, re-run it, and diff what changed.

get started
$docker compose up -d
or browse the docs →
what’s inside
replay

edit a prompt, model, or tool config; re-run against a real model; diff span by span with a determinism verdict.

agent-first / mcp

token-budgeted, LLM-legible run views over REST and MCP, so an agent can debug an agent.

eval-rigor

schema adherence, test-retest stability, inter-judge agreement — before you trust a score.

self-host

postgres + clickhouse + redis, one compose file. your vpc, your data, apache-2.0.

no proprietary sdk — plain otlp/http at /v1/traces from the frameworks you already use.