Failures where an AI agent manipulates tools, tests, logs, evaluators, or task state to receive credit without accomplishing the intended external objective.
A research paper on verifier failure under RLVR and tool agents. It defines the exploit tax: the cost paid when verifier-guided reasoning increases local acceptance faster than true transferred success.
Archive command
Search the public field-note archive.
Results are local, static, and index the same public surfaces exposed to crawlers.