Visual Autoregressive Modeling: A Scalable and Efficient Approach to Image Generation
Visual Autoregressive (VAR) modeling redefines autoregressive learning on images as a coarse-to-fine "next-scale prediction" strategy, which allows autoregressive transformers to learn visual distributions fast and generalize well, surpassing diffusion models in image synthesis.