Can AI help design our mascot?

Background

We ordered some designs on a well-known freelancer site.

For reference, we added instructions to the orders: "A kind, helpful looking elf with a santa hat", "i'd like it to be in the style of a mascot logo or combination logo", "it should be cute", etc. We also attached a few images of various robots, and I drew a very rough sketch.

Results from humans

Results from "AI"

This is going to change the way graphic design is done. It will probably take only a few years, since most of the components already exist. All that's missing is a nice service that puts them together.

How did we do it?

We started experimenting with CLIP and BigGAN through Ryan Murdock's "Big Sleep". We tested what kind of images we could get by asking it to draw a green robot rocket with big cute eyes. At first the results were not very promising, but then something clicked: when we nudged it with multiple different instructions, it no longer got stuck at a certain spot and produced an endless stream of robots.

Around 02:00 I realized that it would probably be able to understand what I was after!

Phase 1

A frenzy of generation attempts and various methods followed from all three of us. We added reference images for the model to look at and also wrote some instructions such as "a rocket-robot" and "a robot with two cute eyes". We realized this was a viable method to get inspiration for our logo.
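
To make the idea of steering with several instructions and reference images at once a bit more concrete, here is a toy sketch. It is not Big Sleep's actual code. It assumes that every instruction and every reference image has already been turned into an embedding vector (CLIP's job in the real setup), and the 2-D vectors and the names `cosine` and `combined_loss` are made up for illustration. The point is only that averaging over several targets rewards an image that sits between them.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def combined_loss(image_emb, target_embs, weights=None):
    """Weighted mean of negative similarities to all targets.

    Lower is better: the image matches the targets more closely.
    """
    if weights is None:
        weights = [1.0] * len(target_embs)
    total = sum(w * -cosine(image_emb, t) for w, t in zip(weights, target_embs))
    return total / sum(weights)


# Hypothetical 2-D stand-ins for embeddings (real ones have hundreds of dimensions).
rocket = [1.0, 0.0]      # e.g. the text "a rocket-robot"
cute_eyes = [0.0, 1.0]   # e.g. the text "a robot with two cute eyes"
targets = [rocket, cute_eyes]

only_rocket = [1.0, 0.0]  # an image that satisfies just one instruction
blend = [1.0, 1.0]        # an image that sits between both instructions

loss_single = combined_loss(only_rocket, targets)
loss_blend = combined_loss(blend, targets)
print(loss_blend < loss_single)
```

In this toy setup the blended image gets the lower loss, which is one plausible reading of why several instructions kept the generator from settling on a single spot.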

What is creativity? What is design?

I'd argue that compression is a big part of it. Take cubism. The real world exists, cubes exist. Compress them and a creative piece is born. If there is something neural networks and transformers are good at, it is compression. So if creativity is partly compression, we should see evidence of "creativity" in these experiments: searching and mixing between styles and objects to reach a goal.

Let's have a look at Filip's meme-moment. LEMON DWARF!

Filip had multiple instructions, such as "a robot with big eyes", an image of a robot, and another image of a dwarf. Then once in a while the instruction "A lemon" is added. So now the model is forced out of its normal struggle to output a mix of a robot and a dwarf, and has to add a lemon into the mix. The result is a LEMON DWARF ROBOT!
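
The "once in a while" part can be sketched as a simple prompt schedule. This is only an illustration, not Filip's actual setup: the function name `active_prompts`, the interval of every fifth step, and the `<image: ...>` placeholder strings are all assumptions.

```python
def active_prompts(step, base_prompts, occasional_prompt, every=5):
    """Return the prompts to optimize against at this step.

    The base prompts are always active; the occasional prompt joins
    them on every `every`-th step (the interval is an assumption).
    """
    prompts = list(base_prompts)
    if step % every == 0:
        prompts.append(occasional_prompt)
    return prompts


# Placeholder strings stand in for the two reference images.
base = ["a robot with big eyes", "<image: robot>", "<image: dwarf>"]

# Steps 1..10: the lemon shows up on steps 5 and 10 only.
schedule = [active_prompts(step, base, "A lemon", every=5) for step in range(1, 11)]
lemon_steps = [step for step, prompts in zip(range(1, 11), schedule) if "A lemon" in prompts]
print(lemon_steps)
```

Because the lemon is only present on some steps, the image keeps being pulled back towards the robot and the dwarf in between, so the lemon ends up mixed in rather than taking over.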

Even if this is not true "AI creativity", the lines are definitely getting blurrier. The model doing the drawing was not trained on dwarfs or robots. It's just mixing its ability to draw a cassette player, a car and a dog to create something different, in between those objects.

Phase 2

After generating over 200,000 various cute robots we wanted something different. I created a new method for the interaction between BigGAN and CLIP to get different results. We switched around the reference images and started experimenting a bit, creating reference images ourselves.

Here we used the old instructions and reference images but added "minimalism" as an instruction. With the new method the results changed a bit.

Can we make our current logo more awesome?

Let's add these two images to the model along with some other instructions, such as an astronaut's helmet and a big visor, among other things.

I'd say it works very, very well. Scary well. This is a fantastic tool for creativity.

Sometimes it's hit or miss:

Thought experiment:

We "show" it the images from above and some names of our favourite artists and this stunning image is generated.

In the end, we ask ourselves the final question: at what stage of all this did the creativity happen?

Was it the invention of artificial neural nets?
Was it the vast resources of images and text available online?
Was it DALL-E or VAEs?
Was it transformers and vision transformers?
Was it when Alec Radford et al designed CLIP?
Was it when Ryan Murdock put it all together?
Was it when we prompted it with images and texts and gave it a purpose?

Summary

One problem remains

How can we possibly pick a favourite among everything that is generated? We did not really solve the problem of designing our logo. We've made it even more difficult for ourselves by discovering an almost endless stream of robot designs.

The future of infinite content is approaching.

I recently did some fooling around with controlled generation in a different way:

Having some fun with @dribnet notebooks based on @RiversHaveWings and others. I noticed from older experiments with @advadnoun first notebooks, where you start seems to be key if you want some real control over output. #pixelart #DeepLearning
1/x
🧵🎞️⬇️ pic.twitter.com/NPV8yKRJZi
— Viktor Alm (@ViktorAlm) August 18, 2021

Thanks to:

________________________

BigGAN

Large Scale GAN Training for High Fidelity Natural Image Synthesis

Andrew Brock, Jeff Donahue, Karen Simonyan

https://arxiv.org/abs/1809.11096

________________________

CLIP

Learning Transferable Visual Models From Natural Language Supervision

Alec Radford, Jong Wook Kim et al.

https://cdn.openai.com/papers/Learning_Transferable_Visual_Models_From_Natural_Language_Supervision.pdf

_________________________

DALL-E

Zero-Shot Text-to-Image Generation

Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever

https://arxiv.org/abs/2102.12092

___________________________

Methods inspired by and developed from Ryan Murdock's initial method (@advadnoun)