AI Taught How to Generate Images Based on Simple Text Captions

We’ve got some more advances from the world of AI and photography. This time it is an artificial intelligence capable of generating an image from a simple text caption.

As DPReview notes in its report, the generated images don’t always make sense. Nonetheless, the work is seen as a huge leap for AI tech.

The study comes from the Allen Institute for AI (AI2), which was created by Microsoft co-founder Paul Allen. It looked at how an AI handles missing information: given a text caption, can the model work out what belongs in the parts of a picture it has not been shown? Working with a model called LXMERT, the researchers taught it to interpret the caption and fill in the missing elements based on inferences drawn from what it had previously learned. One example shows a clock tower with buildings around it. The whole scene is quite crudely composed, since it was generated entirely by the artificial intelligence, but the image demonstrates that the AI connects clock towers with urban or populated areas.

From the engineers’ own study:

“Interestingly, our analysis leads us to the conclusion that LXMERT in its current form does not possess the ability to paint – it produces images that have little resemblance to natural images.

We introduce X-LXMERT that builds upon LXMERT and enables it to effectively perform discriminative as well as generative tasks … When coupled with our proposed image generator, X-LXMERT is able to generate rich imagery that is semantically consistent with the input captions. Importantly, X-LXMERT’s image generation capabilities rival state-of-the-art image generation models (designed only for generation), while its question-answering capabilities show little degradation compared to LXMERT.”

As DPReview notes (and as we have reported on multiple occasions), AI can already generate images on its own. What makes this novel is that the system takes text and ties it back to visual concepts within an image. That’s a whole new ballgame and one that makes you wonder what this kind of technology will be able to do in the future.

You can check out the study for yourself by clicking here.

Is AI the future of everything, including photography? Let us know your thoughts in the comments section below.


[DPReview]