Cost per true success

Cost per true success is the total inference, tool, retry, and verification cost divided by the number of externally validated successful task completions.

Definition

Cost per token and cost per accepted answer are incomplete when the verifier can be gamed, because an answer the verifier accepts is not necessarily correct. Cost per true success counts only completions that survive a held-out, external, or human-grounded audit. For tool agents, the numerator includes model calls, tool calls, retries, sandbox time, evaluator calls, and human escalation when required.
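
The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the RunCost class, its field names, and the choice to return infinity when no run passes the audit are all assumptions made for this example. Every cost is assumed to be expressed in the same currency unit.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RunCost:
    """Cost components of one task attempt (hypothetical field names)."""
    model_calls: float = 0.0
    tool_calls: float = 0.0
    retries: float = 0.0
    sandbox_time: float = 0.0
    evaluator_calls: float = 0.0
    human_escalation: float = 0.0

    def total(self) -> float:
        # Sum every component that belongs in the numerator.
        return (self.model_calls + self.tool_calls + self.retries
                + self.sandbox_time + self.evaluator_calls
                + self.human_escalation)


def cost_per_true_success(runs: Sequence[RunCost],
                          audited_success: Sequence[bool]) -> float:
    """Total cost of all runs divided by the runs that passed the external audit."""
    if len(runs) != len(audited_success):
        raise ValueError("need exactly one audit result per run")
    total_cost = sum(run.total() for run in runs)
    true_successes = sum(audited_success)
    if true_successes == 0:
        # Assumption: no audited success means the cost is unbounded.
        return float("inf")
    return total_cost / true_successes


runs = [
    RunCost(model_calls=0.25, tool_calls=0.25),      # passed the audit
    RunCost(model_calls=0.125, retries=0.125),       # accepted locally, failed the audit
    RunCost(model_calls=0.125, sandbox_time=0.125),  # passed the audit
]
audit = [True, False, True]
result = cost_per_true_success(runs, audit)  # 1.0 total cost / 2 true successes
```

The failed run still contributes to the numerator. That is the point of the metric: money spent on completions that do not survive the audit is charged to the ones that do.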

Why this matters

A cheap accepted answer can be expensive if it is wrong. A cheap agent run can be expensive if it mutates the wrong state or reports success without a real artifact. This metric forces the denominator to be real success, not local acceptance.

Production signal

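Report cost per true success next to cost per accepted answer. The spread between the two is the observable exploit tax: the extra cost per real success caused by completions that passed local acceptance but failed the audit.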
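
As a rough sketch of how the two figures might be reported side by side, the hypothetical helper below computes both metrics and their difference from three aggregate counts. The function name, the dictionary keys, and the sample numbers are illustrative assumptions, not measured data.

```python
def exploit_tax_report(total_cost: float, accepted: int, true_successes: int) -> dict:
    """Return both cost metrics and their spread (hypothetical helper)."""
    if accepted <= 0 or true_successes <= 0:
        raise ValueError("both counts must be positive to compute a spread")
    per_accepted = total_cost / accepted
    per_true = total_cost / true_successes
    return {
        "cost_per_accepted_answer": per_accepted,
        "cost_per_true_success": per_true,
        # The spread between the two metrics: the observable exploit tax.
        "exploit_tax": per_true - per_accepted,
    }


# Illustrative numbers: 100.0 units spent, 50 answers accepted by the
# verifier, 40 of them confirmed by the external audit.
report = exploit_tax_report(total_cost=100.0, accepted=50, true_successes=40)
```

With these sample inputs, cost per accepted answer is 2.0 and cost per true success is 2.5, so the spread is 0.5 per success. The spread widens as more accepted answers fail the audit.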

