Google has introduced Imagen, a neural network that generates images from text. Imagen is very similar to DALL-E 2, the artificial intelligence developed by OpenAI that also generates images from a text description.
However, there are several differences between the two models, such as the level of detail in the result and the efficiency with which the image is created.
Imagen works in much the same way as DALL-E 2. The AI converts a short text prompt into a highly detailed image that matches the description. The possible combinations are almost unlimited, and in most cases DALL-E 2 has managed to produce an image very close to what was asked for. Now Google says it has closed some of the gaps in the OpenAI tool and can generate images that humans prefer.
The AI first produces a 64 x 64 pixel image, which is then upscaled to 1024 x 1024 pixels, the same resolution as DALL-E 2. This upscaling approach reduces the computing power required and allows images to be generated in a few seconds.
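To give a feel for the numbers, here is a minimal Python sketch of such a cascade. It is only an illustration: the real model relies on learned upscaling networks, whereas this sketch substitutes plain nearest-neighbor pixel repetition, and the split into two 4x stages (64 to 256 to 1024) is an assumption made for the example. The names `upscale_nearest` and `cascade` are hypothetical.

```python
def upscale_nearest(image, factor):
    """Upscale a 2D grid of pixel values by repeating each pixel factor x factor times.

    Stand-in for a learned super-resolution model (assumption for illustration).
    """
    out = []
    for row in image:
        # Repeat every pixel horizontally.
        wide = [px for px in row for _ in range(factor)]
        # Repeat the widened row vertically.
        out.extend(list(wide) for _ in range(factor))
    return out


def cascade(image, factors=(4, 4)):
    """Apply the upscaling stages one after another (assumed: two 4x stages)."""
    for factor in factors:
        image = upscale_nearest(image, factor)
    return image


# A synthetic 64 x 64 grayscale "image" standing in for the low-resolution output.
base = [[(x + y) % 256 for x in range(64)] for y in range(64)]
final = cascade(base)

base_pixels = len(base) * len(base[0])
final_pixels = len(final) * len(final[0])
print(f"{len(base)} x {len(base[0])} -> {len(final)} x {len(final[0])}")
print(f"pixel count grows by a factor of {final_pixels // base_pixels}")
```

The point of the sketch is the ratio: the expensive text-to-image step only has to produce 4,096 pixels, while the final image has 1,048,576, which is 256 times as many.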
Google claims that its AI delivers results with a much more precise level of detail than other systems. To prove this, the company created a benchmark called DrawBench, which compares Imagen with similar AI models, such as VQ-GAN+CLIP, Latent Diffusion Models, and even DALL-E 2. The results were presented “side by side” so that “human evaluators” could compare them and choose the most realistic.
These evaluators, according to Google, concluded that the images generated by Imagen have higher quality and better “image-text alignment” than those of the other models. However, OpenAI’s neural network is ahead of Google’s in one respect: it is already a full-fledged product, albeit in closed beta, and people use it for everyday tasks and entertainment.
Unfortunately, Google remains concerned about the potential misuse of this AI, a concern that also applies to DALL-E 2, and for this reason it has decided not to make it available to users for the time being. It is not yet clear when Google will open Imagen to those who wish to use it.