Google Imagen Touted for Photorealistic Interpretation of Text-to-Image Captions


Text-to-image tech is one of the “next big things” on the horizon for both AI and digital media.

A person holding a brush and painting on paper. Photo by LOGAN WEAVER | @LGNWVR

And Google’s effort, Imagen, is apparently posting some impressive results.

The results are photorealistic, if reports are to be trusted, and we tend to put a lot of weight on what Engadget has to say.

Imagen is the company’s response to DALL-E, though it is only for internal use thus far. It is apparently distinguished from DALL-E by what is being called “more realistic” image generation. Imagen relies on tons of training data, some of which could lead to unwanted results, which is why the product isn’t yet ready for public use.

Engadget quotes the research paper on the subject, which notes:

“While this approach has enabled rapid algorithmic advances in recent years, datasets of this nature often reflect social stereotypes, oppressive viewpoints, and derogatory, or otherwise harmful, associations to marginalized identity groups…While a subset of our training data was filtered to removed noise and undesirable content, such as pornographic imagery and toxic language, we also utilized LAION-400M dataset, which is known to contain a wide range of inappropriate content including pornographic imagery, racist slurs and harmful social stereotypes.”

There is a small sample that is acceptable for public demonstration, however, and you can check it out at this link. In addition to generating your own text-to-image illustrations from preset phrases, you can browse some curated examples of Imagen’s output. Also, don’t forget to check out our article on DALL-E from some time ago.

Of course, if you have any thoughts on text-to-image technology or can give us some insights into how it all works, let us know in the comments.

