Feature-encoder-free ImageNet training
Develop a training procedure for Drifting Models that succeeds on ImageNet 256×256 without relying on an external feature encoder. The core requirement is a kernel or representation that measures sample similarity directly in the generator's output space (either the SD‑VAE latent space or pixel space), so that the drifting field remains effective without auxiliary features.
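One starting point for the investigation is a kernel applied directly to flattened output-space vectors. The sketch below is a hypothetical illustration, not the paper's method: it uses an RBF kernel with a median-heuristic bandwidth on stand-in "latent" vectors, and a kernel-weighted attraction toward real samples as a crude proxy for a drifting field. The function names, the choice of RBF kernel, and the toy data are all assumptions introduced here for illustration.

```python
import numpy as np

def rbf_kernel(x, y, sigma):
    """Pairwise Gaussian kernel between flattened samples (n, d) x (m, d)."""
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def median_bandwidth(x):
    """Median heuristic for sigma, computed in the output space itself."""
    sq_dists = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    off_diag = sq_dists[~np.eye(len(x), dtype=bool)]
    return np.sqrt(np.median(off_diag) / 2.0 + 1e-12)

# Toy stand-ins for SD-VAE latents of generated and real samples.
rng = np.random.default_rng(0)
gen = rng.normal(0.0, 1.0, size=(8, 16))
real = rng.normal(0.5, 1.0, size=(8, 16))

sigma = median_bandwidth(np.concatenate([gen, real]))
K = rbf_kernel(gen, real, sigma)

# Kernel-weighted attraction of each generated sample toward the real
# samples: a crude proxy for a drifting field acting in output space.
weights = K / K.sum(axis=1, keepdims=True)
drift = weights @ real - gen
```

Whether such a raw output-space kernel can describe similarity well enough is exactly the open question: the quoted paper reports that it could not make this work on ImageNet, even with the VAE latent space.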
References
On the other hand, we report that we were unable to make our method work on ImageNet without a feature encoder. In this case, the kernel may fail to effectively describe similarity, even in the presence of a latent VAE. We leave further study of this limitation for future work.
— Generative Modeling via Drifting
(2602.04770 - Deng et al., 4 Feb 2026) in ImageNet Experiments — Feature Space for Drifting