I’ve been incredibly impressed with AI image generation, in how far it has come and how quickly advances are being made. It is one of those things that 5 years ago I would have been pretty confident that it wouldn’t be possible and now there seems to be significant breakthroughs almost weekly. If you don’t know much about AI image generation, hopefully this post will help you understand why it is so impressive and likely to cause huge disruptions in design.
As a quick background, AI image generation takes a written description and turns it into an image. The AI doesn’t find an image, it creates one based on being trained looking at millions of images and effectively learning patterns that fit the description. The results can be anywhere from photo-realistic to fantasy artwork or classical painting styles… almost anything. There are several projects available for pretty much anyone to try for free, including Midjourney, Stable Diffusion, and DALL-E 2. I am using DiffusionBee that runs from my MacBook, even without a network connection (i.e. it isn’t cheating and pulling images off of the Internet). Oh, and image generation is fast…. from a few seconds to about a minute.
If it isn’t obvious why this is amazing and likely to be massively disruptive, imagine you need a movie poster, blog images, magazine photos, book illustrations, a logo, or anything else that used to require a human to design. The computer is getting pretty good and can generate thousands of options in the time it takes a human to generate one. It can actually be faster to generate a brand new, totally unique image rather that search the web or look through stock photography options. For example, the robot image featured at the top of this posting was generated on Midjourney with about 5 minutes of effort.
As a practical example, recently Sticker Mule had one of their great 50 stickers for $19 deals and I wanted to create a version of the Banksy robot I use for my blog header, but I wanted a new style, something unique to me. However, I am not an artist so coming up with anything that would look good was unlikely. Then I remembered my new friends the robot overlords, and thought I would see if they could help me.
One of the cool things about AI image generation is you can seed it with an image to help give some structure to the end result. The first thing I did is take the original Banksy robot, remove the background and spray paint from the hand, and fix the feet a bit. This didn’t need to be perfect of even good, as it is just used by the AI for inspiration, for lack of a better word.
I loaded up DiffusionBee, included my robot image and simply asked it to render hundreds of images with the prompt “robot sticker”. And then I ate dinner. When I came back to my computer, I had a large gallery of images that were what I wanted… inspired by the original but all very different. Importantly, they looked like stickers!
If you look through the gallery, above, you can see the robot stickers have similarity in structure to the original, but there is huge variation on many dimensions. In some cases legs are segmented, sometimes solid metal. Heads can be square, round, or even… I’m not sure what shape it is. The drawing style varies from child-like art to abstract. And again, these are all new, unique images created by AI.
The biggest challenge I had was trying to pick just one… so many were very cool. I finally decided and it took about another 5 minutes to clean up the image a little to prepare it for stickerification.
When I looked at my contact sheet, my army of robots, it reminded me of my early days developing video games… if I had wanted to make a game where each player gets a unique avatar, it would have taken months to have somebody create these images. Today it takes hours.
I mentioned that things are progressing quickly… in the last few months we’ve gone from decent images to beautiful images to generating HD video from text prompts. It isn’t hard to imagine that, in the not-too-distant future, it will be possible to create a full length feature film from the comfort of your living room, and that the quality will be comparable to modern Hollywood blockbusters. The next Steven Spielberg or Quentin Tarantino won’t be gated by needing $100 million in backing to make their film, the barriers will be significantly smaller. AI has the potential to eliminate some creative professions, but it also has the ability to unlock opportunities for many others.
What are your thoughts? Is AI image generation an empowering technology that will democratize creative expression, a horrible development that will put designers out of work, or do you just welcome our new robot overlords? Leave a comment, below!
One Reply to “Robots Building Robots: AI Image Generation”
Whenever ‘disruption’ is mentioned I dream of a world where a machine learning framework plays a slightly off-brand guitar solo.