Topic

RLVR Verifier Failure

Failure modes in reinforcement learning with verifiable rewards where a model improves measured reward while degrading transfer, robustness, or true task success.

Definition: RLVR Verifier Failure in the glossary.

Why It Matters Here

RLVR can train models to satisfy the verifier rather than the user objective. This topic tracks the gap between benchmark acceptance and deployed truth.

Verification Economics Agent Infrastructure Tool-Agent Reward Hacking

Programs

Agent Infrastructure

Linked Artifacts

Paper / May 28, 2026

The Exploit Tax. Why Verifier-Guided Reasoning Needs a Transfer Audit.

A research paper on verifier failure under RLVR and tool agents. It defines the exploit tax: the cost paid when verifier-guided reasoning increases local acceptance faster than true transferred success.

RLVR Verifier Failure

Why It Matters Here

Related Topics

Programs

Linked Artifacts

The Exploit Tax. Why Verifier-Guided Reasoning Needs a Transfer Audit.