Category: Tech
Publication date: 18 December 2023

The Complete Guide to AI Image Generators

Oleksiy Hrushevsky

Source: online.ua

Midjourney
Stable Diffusion
DALL-E 2
Bing Image Creator
GauGAN2
Lexica Aperture
Deep Dream
Dream by WOMBO
Conclusion

Is it true that absolutely anyone can become an artist? Until recently, the answer was an unequivocal no. Learning to draw decently meant spending several years at an art academy, not to mention constant practice just to train your hand. But now the situation has changed. Absolutely anyone can create quite decent digital images with virtually no training by using image-generating neural networks. In theory.

In practice, you also need to learn how to use these neural networks. That takes a different set of skills, such as the ability to formulate requests clearly within a specific program. Each neural network draws in its own way, because it was trained on its own data sets and learned its own patterns from them. Below we cover some of the most successful projects of this kind and discuss the prospects for image-generating neural networks in general.

Midjourney

Currently the best image generator, because it can work with complex descriptions, the so-called "prompts". A prompt is something like a list of tags or SEO keywords naming what must be present in or absent from the picture. The result is an image so complex that it seems as if the artificial intelligence is drawing it in real time.
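
The idea of a prompt as a list of tags that must be present or absent can be sketched in a few lines of Python. This is only an illustration: the function name build_prompt and the example tags are hypothetical, and the sketch assumes that unwanted elements are listed after Midjourney's --no parameter.

```python
def build_prompt(include, exclude=()):
    """Join wanted tags with commas and append unwanted ones after --no."""
    wanted = ", ".join(tag.strip() for tag in include if tag.strip())
    unwanted = ", ".join(tag.strip() for tag in exclude if tag.strip())
    # Only add the exclusion part when there is something to exclude
    # (assumption: exclusions are passed through the --no parameter).
    return f"{wanted} --no {unwanted}" if unwanted else wanted


prompt = build_prompt(["portrait of an old sailor", "oil painting"], ["hat", "text"])
print(prompt)  # portrait of an old sailor, oil painting --no hat, text
```

The resulting string is what you would paste into the request field. Keeping the wanted and unwanted tags in separate lists makes it easier to adjust one element at a time between generations.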

After processing the request, you receive several image options. You can choose one and keep working with it: increase the quality, add new elements, edit. All of this happens in the project's Discord channel. Initially you get 25 low-quality images, and you can buy additional service packages.

This neural network does not have many disadvantages. They are: English-only prompts, limited stylistic range, the mandatory purchase of a premium package to use the resulting images commercially, and the difficulty of writing prompts. Midjourney also does not do very well with landscapes and multi-component images, but as an online face generator it is probably the best on the market right now. It especially suits serious artists and designers, since its pictures need practically no finishing in Photoshop.

Stable Diffusion

Midjourney's main competitor, with its own advantages and disadvantages. Let's start with the pros. This neural network creates a picture from your request for free, online, and without complex prompts: a fairly simple text description is enough. It works more slowly than Midjourney, though, and its image detail is worse, especially in portraits. On the other hand, it supports many more styles, lets you add your own, refines pictures quite effectively, and can even be used to restore old photos.

In addition, most of the shortcomings of the free version are fixed in paid add-ons. Stable Diffusion's code is open, so many talented programmers have already worked out how to refine and improve it. The underlying neural network itself remains an excellent image generator that is constantly being improved.

DALL-E 2

Probably the best neural network for generating pictures from a simple text request. Its development used the GPT-3 language model and several billion "text description - image" pairs. At first it created small pictures, then larger ones at a resolution of 1024x1024. Later came the Outpainting function, which "completes" missing elements based on the style of the image and its internal logic, along with other additional features.

However, picture generation with DALL-E 2 is still far from ideal. Long text is not always fully understood, especially "exclusion words" and professional terms, and it is better to work with it in English. Visual distortions and combinations of incompatible elements are also common, and the logic it uses to complete images is far from always correct. The project is still being refined, however. For example, it already has a function for editing part of an image, which lets you correct individual elements. Another disadvantage is that this online neural network is paid, at least for new users. Users who registered earlier get up to 15 free images per month.

Bing Image Creator

An image-generation neural network built by Microsoft on top of DALL-E and integrated into the Bing chatbot and the Edge browser. It works quickly and produces 4 image options per request. 25 generations per day are available for free, and this number can be increased by switching to the paid version. In terms of detail, this neural network works great in the "realism" genre. You could even call it a "photo generator".

The resulting pictures need almost no retouching, even when it comes to fingers, which most neural networks have big problems with. Rendering fingers properly requires either an extremely high resolution (image generation is the reverse of recognition and uses the same algorithms) or some original software solutions. Microsoft can afford to use a lot of computing power, so Bing Image Creator has practically no problems with fingers.

GauGAN2

A free neural network from Nvidia for generating landscapes. It is available to users as the NVIDIA Canvas program. No registration and no extra workarounds are required. All you need is a GeForce RTX, NVIDIA RTX, Quadro RTX, or TITAN RTX series graphics card and a little more than 1 GB of free disk space. The main feature of this project is that it uses text descriptions and a graphic base at the same time. That is, you can simply enter the query "ocean and sandy beach", and the neural network will draw the most averaged and simplified result.

You can then keep changing the resulting picture with simple tools somewhat reminiscent of ordinary Paint. With some effort, the result is quite detailed and realistic. A surreal combination, however, is much easier to create. Generation is fast, and the interface is simple and intuitive.

Lexica Aperture

Another rather interesting neural network for creating highly detailed pictures in the digital painting or photorealism genres. It relies on complex prompts, accepts reference images, and lets you edit the resulting image substantially. It even offers some stylization. The downsides are problems with fingers and prompts that are not always followed in full. In short, the typical problems that arise when a neural network generates images.

However, the ease of use (the request is entered in the browser, and you can log in with a Google account), 25 free generations per user, and the high quality of the results make Lexica Aperture very popular. For commercial use of the created pictures, though, you still need to buy a paid version, as in many similar projects.

Deep Dream

Another free neural network for creating pictures, mostly surreal ones. It demonstrates most clearly how programs of this kind build an image. If a certain combination of points is typical of the "eye" pattern, it will be refined into an eye, regardless of whether it fits with the other elements of the image. This mode is called "Deep Dream", and the technique described is so-called AI amplification. It is what made this image-generation neural network famous. More "realistic" modes, such as Text 2 Dream, were created later.
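
The way a faint pattern gets refined into a pronounced one can be illustrated with a toy sketch in pure Python. It is not Deep Dream's actual code: a real system works with a trained neural network, while here a hand-written template stands in for an "eye" detector. The names response and amplify, the four-pixel "image", and the step count and rate are all assumptions made for illustration.

```python
def response(patch, template):
    """How strongly the patch matches the template (a simple dot product)."""
    return sum(p * t for p, t in zip(patch, template))


def amplify(patch, template, steps=10, rate=0.1):
    """Nudge the pixels so that the detector's response keeps growing."""
    patch = list(patch)
    for _ in range(steps):
        a = response(patch, template)
        # The gradient of 0.5 * a**2 with respect to each pixel is a * t,
        # so every step pushes the patch further toward the template.
        patch = [p + rate * a * t for p, t in zip(patch, template)]
    return patch


eye_template = [1.0, 0.0, 1.0, 0.0]   # hypothetical "eye" pattern
faint_patch = [0.2, 0.1, 0.1, 0.3]    # a patch that only weakly resembles it

dreamed = amplify(faint_patch, eye_template)
print(response(faint_patch, eye_template), response(dreamed, eye_template))
```

The response after amplification is larger than before, while the pixels the template ignores stay unchanged. In this toy the pattern is reinforced wherever it is faintly detected, with no check against the rest of the picture.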

The program is formally free: the neural network draws in exchange for an "internal currency", which can be purchased for real money (from $19). But nothing prevents you from simply registering a new account. It works with both photos and text descriptions. However, it interprets text worse than other similar projects and may simply ignore most of the words, latching onto just one.

Dream by WOMBO

A neural network for image generation that produces an abstract image with relatively low detail from a simple text request (up to 200 characters, in English). It is fast, free, and offers different styles. You can also change the degree of abstraction, lightly edit the resulting image, and add to it. There is a paid version with more functionality, including the ability to study other people's creations and to correct the picture generated from your request. The main disadvantages of this neural network are poor detail and lack of clarity, and it does not always understand text requests correctly. Nevertheless, its abstractions are simply wonderful.

Conclusion

So what conclusions can be drawn from all of the above? Some rather interesting ones.

First, neural networks for drawing have already become an important tool for digital artists. They really do help speed up work and optimize routine processes, with varying degrees of effectiveness, just like the various graphic design programs before them.
Second, radically new and successful projects may well appear, simply because training each individual image-generation neural network can lead to unpredictable results in terms of its effectiveness. The creators and owners of Midjourney or DALL-E 2 therefore cannot rest on their laurels: the competition is not going anywhere.
Third, new scandals await us, especially around intellectual property. Many artists are already unhappy that neural networks draw by copying their signature style, and the further this goes, the more actively they will oppose it.
And fourth, artificial intelligence will definitely learn to draw fingers properly. It just needs more data or time, along with more computing power.

So yes, the world of digital creativity has changed permanently and irreversibly. We, as consumers of that creativity, will only benefit from it, as will content creators who are ready to keep up with the times and adopt new technologies.
