Channel-wise Vector Quantization

We present Channel-wise Vector Quantization (CVQ), a novel image tokenization paradigm that replaces patch-wise tokens with channel-wise tokens. Unlike conventional vector quantization, which assigns a discrete token to each patch feature vector, CVQ quantizes each channel of the feature map. This formulation represents an image as discrete levels of visual details, rather than as a grid of spatial patches. Based on CVQ, we introduce a new visual autoregressive framework with "next-channel predi
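
The contrast between patch-wise and channel-wise tokens can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that each channel's flattened spatial map is the vector being quantized, that assignment is by nearest neighbor under Euclidean distance, and that the codebooks are random. The function names, shapes, and codebook sizes are all hypothetical.

```python
import numpy as np

def nearest_code(vectors, codebook):
    """Return, for each row of `vectors`, the index of the nearest codebook row."""
    # Squared Euclidean distance between every vector and every code: (N, K).
    dists = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

def patch_wise_tokens(feat, codebook):
    """Conventional VQ: one token per spatial position.

    feat: (C, H, W) feature map; codebook: (K, C).
    """
    c, h, w = feat.shape
    vectors = feat.reshape(c, h * w).T  # (H*W, C): one C-dim vector per patch
    return nearest_code(vectors, codebook).reshape(h, w)

def channel_wise_tokens(feat, codebook):
    """Channel-wise VQ (assumed form): one token per channel.

    feat: (C, H, W) feature map; codebook: (K, H*W).
    """
    c, h, w = feat.shape
    vectors = feat.reshape(c, h * w)  # (C, H*W): one spatial map per channel
    return nearest_code(vectors, codebook)

# Hypothetical sizes: 8 channels, a 4x4 feature grid, 16 codes per codebook.
rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 4, 4))
patch_codebook = rng.normal(size=(16, 8))
channel_codebook = rng.normal(size=(16, 16))

patch_ids = patch_wise_tokens(feat, patch_codebook)        # grid of tokens
channel_ids = channel_wise_tokens(feat, channel_codebook)  # sequence of tokens
```

With these sizes, `patch_ids` is a 4x4 grid with one token per spatial position, while `channel_ids` is a length-8 sequence with one token per channel. A sequence of that second kind is what a next-channel autoregressive model would consume.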

Paper · arXiv

Authors: Wei Song, Tianhang Wang, Yitong Chen, Tong Zhang, Zuxuan Wu + 3 more
Published: 2026-05-25
Categories: cs.CV, cs.AI

via arXiv · 2605.26089