Artificial intelligence models have advanced significantly in recent years, particularly in their capacity to generate images, sounds, and videos. Among these models, diffusion models such as Stable Diffusion and DALL-E 2 have delivered some of the strongest results. That performance comes at a cost: diffusion models generate samples through many iterative steps, so they demand substantially more compute than single-step generative models such as generative adversarial networks (GANs).
To address this issue, a team of researchers at OpenAI has proposed a new family of models called “Consistency Models” that can achieve state-of-the-art performance in single-step sample generation without adversarial training. Consistency Models are designed to be computationally efficient while still taking advantage of the iterative refinement process of diffusion models, making them a promising way to reduce the computational load of generative AI models.
Consistency Models are built on the probability flow (PF) ordinary differential equation (ODE) of continuous-time diffusion models. The model is trained to map any point at any time step to the starting point of its trajectory, which makes it possible to generate high-quality output in a single step. Its outputs are trained to be consistent for points on the same trajectory, hence the name of the model family.
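The mapping described above can be made concrete on a toy problem where the trajectory has a closed form. The following is a minimal sketch, not the researchers' implementation. It assumes one-dimensional Gaussian data N(MU, S^2), an ODE of the form dx/dt = -t * score(x, t), and the noise range [EPS, T]. All names and constants (`MU`, `S`, `EPS`, `T`, `consistency_fn`, `point_on_trajectory`) are illustrative assumptions. Under these assumptions, every trajectory satisfies x(t) - MU proportional to sqrt(S^2 + t^2), so the map to the start of the trajectory can be written down exactly instead of being learned.

```python
import math
import random
import statistics

MU, S = 1.0, 0.5        # toy data distribution N(MU, S^2) (assumed)
EPS, T = 0.002, 80.0    # smallest and largest noise levels (assumed values)


def consistency_fn(x, t):
    """Map a point (x, t) on a toy trajectory to its start at time EPS."""
    scale = math.sqrt(S**2 + EPS**2) / math.sqrt(S**2 + t**2)
    return MU + (x - MU) * scale


def point_on_trajectory(x_T, t):
    """Position at time t of the trajectory that passes through x_T at time T."""
    return MU + (x_T - MU) * math.sqrt(S**2 + t**2) / math.sqrt(S**2 + T**2)


random.seed(0)

# Single-step generation: draw noise at time T, apply the map once.
noise = [random.gauss(MU, math.sqrt(S**2 + T**2)) for _ in range(20_000)]
samples = [consistency_fn(x, T) for x in noise]

print("sample mean:", statistics.fmean(samples))   # should be close to MU
print("sample std: ", statistics.stdev(samples))   # should be close to S

# Self-consistency: two points on the same trajectory give the same output.
x_T = noise[0]
x_mid = point_on_trajectory(x_T, 10.0)
print(consistency_fn(x_T, T), consistency_fn(x_mid, 10.0))
```

In a real consistency model the closed-form `consistency_fn` is replaced by a neural network, and the self-consistency check in the last lines becomes the training objective.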
The researchers present two ways to train consistency models: distillation and training in isolation. The first approach uses a numerical ODE solver and a pre-trained diffusion model to obtain pairs of adjacent points on the PF ODE trajectory and efficiently distill the diffusion model into a consistency model. The second approach learns without relying on a pre-trained diffusion model, which establishes consistency models as an independent family of generative models.
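The two ways of obtaining pairs of adjacent points can be sketched on a toy problem. This is a hedged illustration, not the researchers' code. It assumes one-dimensional Gaussian data, an ODE of the form dx/dt = -t * score(x, t), a single Euler step as the numerical solver, and a squared-error loss. The exact score of the toy distribution stands in for a pre-trained diffusion model, and every name (`score`, `distillation_pair`, `isolation_pair`, `consistency_loss`, `exact_fn`, `identity_fn`) is hypothetical.

```python
import math
import random

MU, S = 1.0, 0.5   # toy data distribution N(MU, S^2) (assumed)
EPS = 0.002        # smallest noise level (assumed)


def score(x, t):
    """Stand-in for a pre-trained diffusion model: exact score of N(MU, S^2 + t^2)."""
    return -(x - MU) / (S**2 + t**2)


def distillation_pair(x0, t_next, t_cur, rng):
    """Noise the data to t_next, then take one Euler step of the PF ODE back to t_cur."""
    x_next = x0 + t_next * rng.gauss(0.0, 1.0)
    x_cur = x_next - (t_cur - t_next) * t_next * score(x_next, t_next)
    return x_next, x_cur


def isolation_pair(x0, t_next, t_cur, rng):
    """No pre-trained model: reuse one noise draw at both noise levels."""
    z = rng.gauss(0.0, 1.0)
    return x0 + t_next * z, x0 + t_cur * z


def consistency_loss(f, pairs, t_next, t_cur):
    """Mean squared gap between the outputs of f at the two adjacent points."""
    return sum((f(a, t_next) - f(b, t_cur)) ** 2 for a, b in pairs) / len(pairs)


def exact_fn(x, t):
    """Closed-form map to the trajectory start for this toy problem."""
    return MU + (x - MU) * math.sqrt(S**2 + EPS**2) / math.sqrt(S**2 + t**2)


def identity_fn(x, t):
    """A deliberately poor candidate that ignores the trajectory."""
    return x


rng = random.Random(0)
t_cur, t_next = 1.0, 1.05
data = [rng.gauss(MU, S) for _ in range(5000)]
pairs = [distillation_pair(x0, t_next, t_cur, rng) for x0 in data]

loss_exact = consistency_loss(exact_fn, pairs, t_next, t_cur)
loss_identity = consistency_loss(identity_fn, pairs, t_next, t_cur)
print("loss of exact map:   ", loss_exact)
print("loss of identity map:", loss_identity)
```

The exact map yields a loss near zero on the solver-generated pairs, limited only by the Euler step error, while the identity map does not. In actual training, `f` would be a neural network, the output at the second point would come from a slowly updated copy of that network, and the loss would be averaged over many noise levels.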
Consistency Models were tested on real image datasets, including CIFAR-10, ImageNet 64×64, LSUN Bedroom 256×256, and LSUN Cat 256×256. In the experiments, consistency models obtained by distillation achieved state-of-the-art FID scores of 3.55 on CIFAR-10 and 6.20 on ImageNet 64×64 for one-step generation. Consistency models trained in isolation also outperformed existing one-step, non-adversarial generative models.
Although Consistency Models are still in their early stages and cannot yet be directly compared with diffusion models, the work shows that OpenAI is actively researching next-generation approaches to generative AI. The models offer exciting prospects for cross-pollinating ideas and methods across AI research disciplines.