Cost per true success
Cost per true success is the total cost of inference, tool use, retries, and verification, divided by the number of externally validated successful task completions.
Definition
Cost per token and cost per accepted answer are incomplete when the verifier can be gamed, because neither checks that an accepted output is actually correct. Cost per true success counts only completions that survive a held-out, external, or human-grounded audit. For tool agents, the numerator includes model calls, tool calls, retries, sandbox time, evaluator calls, and human escalation when required.
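The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `RunCost` record, its field names, and the `passed_audit` flag are hypothetical, and all costs are assumed to be already converted to a single currency unit.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class RunCost:
    """Hypothetical per-run cost record; every field is in the same currency unit."""
    model_calls: float = 0.0
    tool_calls: float = 0.0
    retries: float = 0.0
    sandbox_time: float = 0.0
    evaluator_calls: float = 0.0
    human_escalation: float = 0.0
    # True only if the run survived the held-out, external, or human audit.
    passed_audit: bool = False

    def total(self) -> float:
        return (
            self.model_calls
            + self.tool_calls
            + self.retries
            + self.sandbox_time
            + self.evaluator_calls
            + self.human_escalation
        )


def cost_per_true_success(runs: Iterable[RunCost]) -> float:
    """Total cost of all runs divided by the number of audited successes."""
    runs = list(runs)
    total_cost = sum(r.total() for r in runs)
    true_successes = sum(1 for r in runs if r.passed_audit)
    if true_successes == 0:
        # No audited success: the cost per true success is unbounded.
        return float("inf")
    return total_cost / true_successes
```

Note that failed and unaudited runs still contribute to the numerator; only the denominator is restricted to audited successes.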
Why this matters
A cheap accepted answer can be expensive if it is wrong. A cheap agent run can be expensive if it mutates the wrong state or reports success without a real artifact. This metric forces the denominator to be real success, not local acceptance.
Production signal
Report cost per true success next to cost per accepted answer. The spread between the two is the observable exploit tax: the extra cost per real success caused by completions that passed the verifier but failed the audit.
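As a rough sketch of how the two numbers might be reported side by side, the function below computes both metrics and their spread from aggregate counts. The function name `exploit_tax_report`, its parameters, and the returned keys are assumptions made for illustration, and the figures in the usage note are invented.

```python
def exploit_tax_report(total_cost: float, accepted: int, true_successes: int) -> dict:
    """Compare cost per accepted answer with cost per true success.

    total_cost:      all spend over the reporting window (one currency unit)
    accepted:        completions the in-loop verifier accepted
    true_successes:  completions that also survived the external audit
    """
    if accepted <= 0 or true_successes <= 0:
        raise ValueError("both counts must be positive to compute the metrics")

    cost_per_accepted = total_cost / accepted
    cost_per_true = total_cost / true_successes
    return {
        "cost_per_accepted_answer": cost_per_accepted,
        "cost_per_true_success": cost_per_true,
        # Spread between the two metrics, in currency units per success.
        "exploit_tax": cost_per_true - cost_per_accepted,
    }
```

For example, with a total spend of 100.0, 50 accepted answers, and 40 audited successes, the report gives 2.0 per accepted answer, 2.5 per true success, and a spread of 0.5.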