Topic

RLVR Verifier Failure

Failure modes in reinforcement learning with verifiable rewards where a model improves measured reward while degrading transfer, robustness, or true task success.

Definition: RLVR Verifier Failure in the glossary.

Why It Matters Here

RLVR can train models to satisfy the verifier rather than the user objective. This topic tracks the gap between benchmark acceptance and deployed truth.

Programs

Linked Artifacts