Exploring AI imagery
A quick dive into the rapidly emerging field of visuals imagined by machines.
For years, engineers have been developing models that can better understand the world and generate new outputs from what they have learned. In the last few years this work has taken a giant leap forward in two key areas.
First is the ability to “understand” textual input. Using massive datasets, such as the full history of public tweets on Twitter, engineers have trained new neural network systems. These systems have become remarkably good at taking an input and responding in a way that feels “human”: the context is right, and the replies read as if the system actually makes sense of the world.
Second is the ability to generate new, unique outputs. Given a prompt, these new AI models can compose a new story or produce an original visual that matches what they were asked for.
All of this, of course, is not necessarily original creativity, but rather a synthesis, a complex composite, of a massive set of training data — say, every painting ever made by every famous artist.
This is what lets me ask an AI to make an image of Seattle in the style of Salvador Dalí.
And in seconds I have a new, original image. The model understood my prompt and drew on its huge training library to generate a new image that matched what I was asking for.
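For the curious, here is a minimal sketch of what that request looks like in code, assuming the OpenAI Python SDK (version 1.0 or later) and an API key in your environment; the model name is my assumption, and the demo above could just as easily be done through a web interface.

```python
# A minimal sketch: one prompt in, one image URL out.
# Assumes openai>=1.0 and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

response = client.images.generate(
    model="dall-e-2",  # assumed model; any available image model works
    prompt="The Seattle skyline in the style of Salvador Dalí",
    n=1,
    size="1024x1024",
)
print(response.data[0].url)  # URL of the generated image
```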
I can tweak the prompt and add more direction about the type of image I want, perhaps more realism, and a few seconds later I have another new visual.
This opens up whole new worlds for creativity, and raises questions about how to use it responsibly.
If you want to play around, MidJourney, OpenAI, and Stable Diffusion are three very powerful options that just about anyone can jump into and start experimenting with.
The technology is being made accessible to builders too. In just a few hours I was able to use the OpenAI API to build a simple avatar generator. The hard work was all done for me; all I had to do was provide a user experience for configuring the parameters and pass them along to the AI system to create the image.
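To give a sense of how little glue code that takes, here is a hypothetical version of such a generator, again using the OpenAI Python SDK; the parameter names (style, mood, palette), the prompt template, and the model choice are illustrative assumptions, not the actual options my tool exposed.

```python
# A hypothetical avatar generator: the app only needs to collect a few
# user choices, fold them into a prompt, and hand the prompt to the API.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def generate_avatar(style: str, mood: str, palette: str) -> str:
    # These parameter names are illustrative; a real generator would
    # expose whatever options make sense for its users.
    prompt = (
        f"A portrait avatar with a {mood} expression, "
        f"{palette} color palette, in the style of {style}"
    )
    response = client.images.generate(
        model="dall-e-2",  # assumed model; swap in whichever is available
        prompt=prompt,
        n=1,
        size="512x512",
    )
    return response.data[0].url  # the caller can download or display this


# The "user experience" can be as thin as three dropdowns feeding this call:
print(generate_avatar(style="watercolor", mood="cheerful", palette="pastel"))
```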
For storytellers, this allows the creation of imagery to accompany their written words. Writers will be able to illustrate their own stories, collaborating directly with an AI to produce the visuals they imagine as they write.
For musicians, it means you can create imagery for videos and live performances that matches the elements of a song. For example, I took the lyrics of one of my favorite Paul Simon songs and in 15 minutes created a rich set of visuals to accompany each line. Click on the tweet below to see all the images in the thread.
For marketers, it provides fast, low-cost, original imagery for blog posts, social media, and other channels where visuals are needed, without having to worry about copyright infringement. This post is a good example: all of its imagery was produced by AI.
For artists, it can act as a powerful new tool for creation and ideation. I use it extensively in my own art practice to quickly imagine things using words, phrases, and pictures of my art. The outputs are often surprising, completely unique interpretations of my style, providing me with thousands of new ideas to explore.
But what about the ethics and consequences of this technology?
It’s going to dramatically change so many fields. In many ways I view it as a primary disruptor: a technology so transformative that it reshapes society, like electricity, the automobile, or the printing press. And it’s here now; the genie is out of the bottle and isn’t going back in.
Will it make artists obsolete?
What are the ethics behind a computer creating new artworks based on the style of a human?
Will it increase the volume and ability to micro-target misinformation and propaganda?
Will it bring about a whole new job market of people who know how to use these tools in reliable, consistent ways?