AI Taught How to Generate Images Based on Simple Text Captions


We’ve got some more advances from the world of AI and photography. This time, it is an artificial intelligence capable of generating an image from a simple text caption.

Photo by Kevin Ku from Pexels.

As the DPReview article notes, these images don’t always make sense. Nonetheless, the work is seen as a huge leap for AI tech.

The study comes from the Allen Institute for AI (AI2), which was created by Microsoft co-founder Paul Allen. It looked at how the AI would handle missing information, or interpret text, to work out the missing element in a photo. Rather than a text-only system like OpenAI’s GPT-3, the researchers built on LXMERT, a model that works with both images and text, and taught it to interpret data and extrapolate from inferences drawn from what it had already learned. One example shows a clock tower with buildings around it. All of them are quite crudely composed, as they were generated by the artificial intelligence itself, but the image demonstrates the AI’s connection of clock towers with urban or populated areas.

From the engineers’ own study:

“Interestingly, our analysis leads us to the conclusion that LXMERT in its current form does not possess the ability to paint – it produces images that have little resemblance to natural images.

We introduce X-LXMERT that builds upon LXMERT and enables it to effectively perform discriminative as well as generative tasks … When coupled with our proposed image generator, X-LXMERT is able to generate rich imagery that is semantically consistent with the input captions. Importantly, X-LXMERT’s image generation capabilities rival state-of-the-art image generation models (designed only for generation), while its question-answering capabilities show little degradation compared to LXMERT.”

As DPReview notes (and as we have reported on multiple occasions), AI can already generate images on its own. What makes this novel is that it takes text and relates it back to concepts within an image. That’s a whole new ballgame, and one that makes you wonder what this kind of technology will be able to do in the future.

You can check out the study for yourself by clicking here.

Is AI the future of everything, including photography? Let us know your thoughts in the comments section below.

Be sure to check out some of our other photography news on Light Stalking by clicking this link right here.


What We Recommend to Improve Your Photography Fast

It's possible to improve your photography skills a great deal, and very fast, by learning some fundamentals. Consider this the 80:20 rule of photography, where 80% of the improvements will come from 20% of the learnable skills. Those fundamentals include camera craft, composition, understanding light and mastering post-production. Here are the premium guides we recommend.

  1. Easy DSLR – Friend of Light Stalking Ken Schultz has developed this course over several years, and it remains the single best source for mastering your camera by identifying the main things that are holding you back.
  2. Understanding Composition – As one of the core elements of a good photograph, getting your head around composition is essential. Photzy's guide to the subject is an excellent introduction. Their follow-up on Advanced Composition is also well worth a read.
  3. Understanding Light – Also by Photzy, the other essential part of photography is covered in this epic guide and followed up in Understanding Light, Part 2. This is fundamental stuff that every photographer should aim to master.
  4. 5 Minute Magic Lightroom Workflow – Understanding post-production is one of the keys to photographs that you will be proud of. This short course by one of the best in the business will show you how an award-winning photographer does it.

About Author


Kehl is our staff photography news writer and has over a decade of experience in online media and publishing.
