Dalí + WALL·E = DALL·E
OpenAI released another incredible demo last week: DALL·E, a 12-billion-parameter version of GPT-3 trained to generate images from text descriptions.
Last year, GPT-3 showed that we can use language to instruct a neural network to generate text, and now DALL·E uses the same approach to manipulate visual concepts through language. Some of the things it can do:
- Modify attributes of an object (give a brick the texture of a porcupine),
- Draw multiple objects (draw a penguin wearing a hat, gloves, and shirt),
- Visualize perspective (zoom in or rotate around an object),
- And a bunch of other stuff, like creating illustrations, doing zero-shot visual reasoning, and applying geographic and temporal knowledge.
You should definitely go check out the demos (scroll down the page and click on the black boxes). And please let me know about any and all startup ideas you come up with for DALL·E!