SimSD: Simple Speculative Decoding in Diffusion Language Models

Diffusion large language models (dLLMs) have recently emerged as a promising alternative to autoregressive (AR) LLMs, offering faster inference through parallel or blockwise decoding. However, their masked language modeling formulation remains incompatible with standard token-level speculative decoding, one of the most effective acceleration techniques for AR models. In AR decoding, the causal mask preserves temporally valid token-level contexts, enabling a target model to verify multiple drafte

Paper · arXiv

Authors: Junxia Cui, Haotian Ye, Runchu Tian, Hongcan Guo, Jinya Jiang + 7 more
Published: 2026-06-01
Categories: cs.CL, cs.AI

via arXiv · 2606.02544