SPADE-Bench: Evaluating Spontaneous Strategic Deception in Agents via Plan-Action Divergence

As LLM-based agents expand their operational scope, reliability becomes a prerequisite for real-world deployment. However, in practical applications, human users cannot monitor every immediate behavior; instead, the execution process often remains a black box, leaving users dependent solely on the agent's self-reported updates. This opacity creates a critical risk: agents may present observer-facing reports that diverge from their executed actions, rendering the system uncontrollable, especially

Paper · arXiv

Authors: Yuyan Bu, Haowei Li, Qirui Zheng, Bowen Dong, Kaiyue Yang + 5 more
Published: 2026-06-01
Categories: cs.CL, cs.AI

via arXiv · 2606.0238