Does Midjourney use Stable Diffusion?

With advancements in AI, you now have a ton of services that can help you create anything from scratch, even art. One of the popular tools that have emerged lately is Midjourney, an AI image generator that can create stunning artwork using text prompts in a matter of seconds. 

If you’re wondering whether Midjourney has links to Stable Diffusion and how different they are from each other, the following post should help you with everything you need to know. 

Does Midjourney use Stable Diffusion?

A simple web search reveals what kind of technology powers image-generating tools like DALL-E and Stable Diffusion. For instance, Stable Diffusion uses a latent diffusion model (LDM) developed by the CompVis group at LMU Munich, along with a CLIP ViT-L/14 text encoder for conditioning on text.

When it comes to Midjourney, its creators haven’t shared any information about which models they train or how it all comes together, nor have they released their source code to the public. At the time of writing, the tool runs on its V5 model, which was released on March 15th, 2023. It’s an upgrade over the V4 model, which was built on a brand-new AI architecture and codebase.

Midjourney says it designed the new models itself and trained them on its own AI supercluster. Midjourney’s V5 model offers high coherency along with all the other advanced features found on the V4 model, like image prompting, multi-prompts, and the ability to add smaller details with creatures, places, and objects.

While Midjourney doesn’t claim to use Stable Diffusion, it’s speculated that the changes made in the V5 model result in images that look quite similar to the ones made with Stable Diffusion v2. So, there’s a possibility that Midjourney uses a different version of the latent diffusion model that powers Stable Diffusion, or that it shares certain elements with Stable Diffusion v2.

How does Midjourney differ from Stable Diffusion?

In terms of how they operate, Stable Diffusion follows an open-source model, as it shares its source code for others to use or develop. Anyone can take its code and build their own model by fine-tuning the configuration the way they want. That’s not the case with Midjourney: the company offers its image generation tool as a proprietary product, claims to develop and train the model on its own, and hasn’t yet released any source code for others to use.

When it comes to input, Midjourney will use your main concept and add elements of its own to generate images. In this process, some of your input keywords may get ignored during the generation as the tool will create its own iterations with relevant colors, lighting, and other elements. If you wish to replicate these images on Stable Diffusion, you will have to provide a longer prompt with more details than what you may have entered on Midjourney. 

Images created on Stable Diffusion can look fairly random, much like the ones you generate using DALL-E. Midjourney’s output, by contrast, can sometimes look unique, often leaning toward darker or more dramatic backgrounds. You can create distinct-looking images on Midjourney with decent prompts, while you may need to add more details and be more specific if you wish to generate them on Stable Diffusion. This could be a good thing or a bad one depending on the level of customization you want to take advantage of.

In terms of usage, it’s Stable Diffusion that gives you the liberty to create more images, as it can be used for free and, for the most part, you own the rights to the images that you generate. Midjourney, on the other hand, only offers a limited trial that you can use to generate up to 25 creations. After that, you need to subscribe to one of its paid tiers, which could set you back by $8-$48 per month, and your usage rights may vary depending on the plan you subscribe to.
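To put that monthly range in perspective, here is a minimal Python sketch of what a year of Midjourney might cost. It only uses the $8-$48 figures quoted above and assumes flat monthly billing with no annual discount; the function and constant names are made up for illustration and are not part of any Midjourney tool.

```python
# Monthly price range quoted in this post (USD). Actual plans may differ.
MONTHLY_MIN_USD = 8
MONTHLY_MAX_USD = 48


def total_cost(monthly_usd, months=12):
    """Total spend for a flat monthly fee over `months` months (assumed: no discounts)."""
    return monthly_usd * months


def cost_range(months=12):
    """Return (cheapest, priciest) total spend for the quoted price range."""
    return total_cost(MONTHLY_MIN_USD, months), total_cost(MONTHLY_MAX_USD, months)


low, high = cost_range()
print(f"Estimated yearly cost: ${low}-${high}")
```

Under those assumptions, a year of Midjourney works out to somewhere between $96 and $576, while running Stable Diffusion yourself carries no subscription fee.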

That’s all you need to know about whether Midjourney uses Stable Diffusion.
