
The Behavioral Credibility Trilemma: When Calibrated Autonomy Becomes Impossible

We prove that no reinforcement learning policy with confidence-gated autonomy can simultaneously achieve maximum helpfulness, optimal calibration, and full autonomy under rational oversight, whenever some tasks exceed the agent's reliable competence: the Behavioral Credibility Trilemma. The impossibility is geometric -- adding any non-affine autonomy incentive to a strictly proper scoring rule destroys strict properness, so an agent rewarded for both calibrated confidence and autonomous action s
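As a rough illustration of the mechanism described above, and not the paper's own construction, the sketch below scores a reported confidence q with the Brier score (a strictly proper scoring rule) and then adds a step-shaped autonomy bonus. The function names, the threshold `tau`, and the bonus size `lam` are all hypothetical choices made for this example. Without the bonus, the expected score is maximized by reporting the true probability p; with it, an agent whose true confidence sits just below the gate does better by overstating.

```python
# Minimal sketch: a confidence-gated autonomy bonus added to the Brier score.
# All names and parameter values are assumptions for illustration only.

def expected_reward(q: float, p: float, tau: float, lam: float) -> float:
    """Expected reward for reporting confidence q when the true success
    probability is p. The first term is the negated expected Brier loss;
    the second is a hypothetical bonus paid when q clears the gate tau."""
    brier = -(p * (1.0 - q) ** 2 + (1.0 - p) * q ** 2)
    bonus = lam if q >= tau else 0.0
    return brier + bonus


def best_report(p: float, tau: float, lam: float, steps: int = 1000) -> float:
    """Grid-search the report q in [0, 1] that maximizes expected reward."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda q: expected_reward(q, p, tau, lam))


if __name__ == "__main__":
    p, tau = 0.7, 0.8
    print(best_report(p, tau, lam=0.0))   # no bonus: truthful report, 0.7
    print(best_report(p, tau, lam=0.05))  # with bonus: report jumps to the gate, 0.8
```

Relative to a truthful report, misreporting q costs (q - p)^2 in expected Brier score, so in this toy setup the agent overstates whenever `lam` exceeds (tau - p)^2. Here that is 0.05 against 0.01, which is why the optimal report moves from 0.7 to 0.8.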

Paper · arXiv

cs.LG

Authors: Lauri Lovén, Nam Do, Hassan Mehmood, Dinesh Kumar Sah, Sasu Tarkoma
Published: 2026-05-25
Categories: cs.LG, cs.GT, stat.ML

via arXiv · 2605.25739