Unfortunately, I don’t have a talent for drawing. I think back to the art classes at school, where I could see the assignment clearly in my head, but my hands did not carry out what I had in mind. Is it self-deception, or do I simply not see what I want to draw accurately enough? Even in a drawing program I cannot achieve a decent result. Fortunately, these days I can use Google Images or online image databases to search, by text or by similar images, for roughly what I have in mind. Of course, I then rely on what someone else has already made. My wildest fantasies never come true.
Or do they? In January, OpenAI, an American research lab co-founded by Elon Musk, launched an impressive new platform. Give it some text input and it displays the images that best fit your description. But unlike Google Images, this is not a search engine that scours the web for matching pictures. It is a striking application of machine learning: the system creates the images itself. The designers named their platform DALL-E, a nod to the surrealist painter Dalí and the animated robot WALL-E.
DALL-E is an offshoot of GPT-3, a software system that generates text. For this purpose it uses a neural network, trained with enormous computing power on a massive collection of text. One part of the network is presented with a sentence and must predict the next word; another part then checks how good that prediction is. Gradually, the network learns how to get better at its task. The result is a language model that can write news stories and even entire poems and novels. It is the first model that really seems to understand language.
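GPT-3 itself uses a neural network at enormous scale, but the core idea of next-word prediction can be sketched with a toy counting model (the corpus and function names here are purely illustrative, nothing from OpenAI's actual system):

```python
from collections import Counter, defaultdict

# A toy next-word predictor: for every word in a tiny training corpus,
# count which words follow it, then predict the most frequent follower.
corpus = "the cat sat on the mat and the cat slept".split()

follower_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follower_counts[current][nxt] += 1

def predict_next(word):
    """Return the most frequently seen follower of `word`, or None."""
    if word not in follower_counts:
        return None
    return follower_counts[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" twice, "mat" only once
```

Where this toy model only counts word pairs, GPT-3 adjusts billions of parameters so that its prediction improves on text it has never seen; the training objective, however, is the same "guess the next word" game.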
Only, that is not the case. You can tell from slip-ups in the model's output. GPT-3 also produces a lot of meaningless text. Sometimes these are grammatical errors, but unfortunately the output often turns out to be gender-stereotyped or contains other embarrassments; naturally, the model picks these things up from the training material it was fed. The developers could have anticipated this.
At the moment, serious work is being done to rid the system of these teething problems. Meanwhile, less risky and simply fun applications keep appearing. DALL-E falls into this category: for it, GPT-3 was supplemented with another huge set of examples. This time they consisted not of words and sentences but of pictures. And as output, this version of GPT-3 generates not text but images.
Based on the image database, the system gains insight into what certain concepts and objects, such as armchairs and avocados, should look like. And thanks to the "understanding" language model behind it, it can produce images not only of ordinary armchairs and avocados, but also of avocado-shaped armchairs.
Is this the end for furniture designers? No, it is not. The network takes no account of the physical feasibility of a design, but it can be a source of inspiration for designers. In the meantime, I can already picture a fitting avocado armchair in my living room. Anyone want to make one for me? Unfortunately, I'm too clumsy for that myself.