Taiji: Pareto Optimal Policy Optimization with Semantics-IDs Trade-off for Industrial LLM-Enhanced Recommendation

Scaling recommender systems via large language models (LLMs) has become a prominent trend in the industry. However, aligning the LLM's semantic space with the recommender's ID space via post-training (e.g., SFT and RL) remains challenging. Existing LLM4Rec paradigms are bottlenecked by two main issues: (1) the difficulty of measuring and improving chain-of-thought (CoT) quality in open-domain recommendation during SFT, and (2) the neglect of the trade-off between LLM semantic rewards and recomme

Paper · arXiv

Authors: Yuecheng Li, Zeyu Song, Jing Yao, Chi Lu, Peng Jiang + 1 more
Published: 2026-06-02
Categories: cs.IR, cs.AI, cs.CL

via arXiv · 2606.03866