Harness-1: Reinforcement Learning for Search Agents with State-Externalizing Harnesses

Search agents are often trained as policies over growing transcripts: the model must decide how to search while also remembering what it has seen, which evidence is useful, which constraints remain open, and which claims have actually been checked. We argue that this formulation puts too much routine state management inside the policy: reinforcement learning is forced to optimize both semantic search decisions and recoverable bookkeeping that the environment can maintain more reliably. We introd

Paper · arXiv

cs.AI

Authors: Pengcheng Jiang, Zhiyi Shi, Kelly Hong, Xueqiang Xu, Jiashuo Sun + 3 more
Published: 2026-06-01
Categories: cs.AI, cs.CL, cs.IR

via arXiv · 2606.02373