MidJourney: FAQ
Welcome to the FAQ. While likely to split eventually into multiple pages, this is a consolidation of some content onto the wiki to be easily editable and searchable to start! Note this is heavily in need of organization :D. The majority of this content is from clarinet (@whatnostop#6700), Tally (@Tallath#0627), and shambibble (@shambibble#7257) (moved with permission).
General Questions
These questions are more general, or don't quite fit elsewhere :).
What is the “Three Basket Problem”?
The Three Basket Problem:
“There are three baskets. The first one is filled with blueberries, the second one is filled with apples, the last one is filled with strawberries.”
MJ can’t currently compose this collage. MJ does not currently support grammatical notions of direct objects or prepositional phrases with much reliability. MJ does not at this time support addressable objects, so pronouns and grammatical references (like “the first basket” or “it is”) are also unreliable. Bottom line: you might be able to get three baskets, but the current version of MJ does not support sorting the fruits. We have seen exactly one prompt get anywhere close, and it was not reproducible. First prize is currently held by user @shambibble of #prompt-craft (GO RATE IT UP!): https://www.midjourney.com/app/jobs/99f9fa16-66af-4ae0-9bb0-92bb1ef85810/
CLEAR PICTURES & YOU
👋To explain, let me make up some numbers (these are entirely made-up numbers):
- imagine it takes (making this up) 📦1000 GPU units to render a good picture
- one /imagine command, producing a grid, applies about 📦2 GPU units
- when you upscale one of the grids, you've applied another 📦2 GPU units
- when you veeroll a grid selection, you've applied 📦2 GPU units
if it takes applying 📦1000 GPU units to render a clear picture, how many 📦2-unit rolls would you need? About 500! 🦉🍭
👋 Once a Midjourney artist has a robust prompt in place (which is its own complex project), there's a good chance they will roll it many times to get it "baked". 👩‍🍳🧑🏼‍🍳👨🏽‍🍳
⚠️All this said: Your mileage may vary. 🤔
Prompts are 🎲combinatorial, 💭language-based, 🦋chaotic.
🦈There's always a chance for your rolls to jump the shark. 🦈
(TLDR: Get a good prompt, then bake it good.)
(P.S. This may not be technically accurate but it's still usefully descriptive. Maybe someone can fix the technical accuracy.)
Why do people put artstation, 8k, 4k, etc...
MidJourney (MJ) is currently trained on hundreds of millions (soon to be billions) of images. These images are scraped from the internet, and the text used to describe them is also pulled from the same page. When someone puts artstation or octane render in a prompt, they're trying to push MJ to use styles similar to images found with those tags or descriptions. Try it out with other websites such as pixiv, deviantart, quixel, etsy, etc. and see how it changes the result!
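For example, compare these two (illustrative prompts of our own):
/imagine a mountain fortress
/imagine a mountain fortress, artstation, octane render, 8k
The second will usually drift toward the polished concept-art look of the images that carry those tags in the training data.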
What does "unreal engine" mean? (or "octane render" or "redshift" or "unity engine" or "cryengine" etc)[edit | edit source]
Each of these is a game or rendering engine, and including it is intended to influence Midjourney to output something similar to the images in its training data that are tagged with the same values. This is just like saying "dog" to influence Midjourney to output more dog-like images (because it's going to output images that look more like "dog" to it based on its training data).
There's a much more complete list available of terms here, with examples: https://rexwang8.github.io/resource/ai/teapot (click on the "Art Websites and Game Renderers" tab).
Note that this doesn't always do what users expect. It doesn't make Midjourney USE these engines, and remember that the images tagged with these terms in the training data depict ALL SORTS of different things. It does seem to help in many cases to achieve more of the 3d-rendered effects of these engines (which may be what you want).
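A quick way to see the effect for yourself (an illustrative prompt of our own):
/imagine a cozy cottage interior, unreal engine
Expect more of a glossy, volumetric-lit 3d-render feel than the plain prompt would give, not an actual Unreal Engine scene.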
Beyond the Manual
This section focuses on more specific details beyond what the online docs offer.
How do I find my ‘random seed’ value/number?
There are two ways to find the seed number.
- If you are working with a current composition and it’s on the screen in front of you, you can react to it with an ✉️ envelope and wait. A moment later the bot will send you a display that includes the seed.
- If you are trying to find the seed from a prior creation, you will need to copy the job ID from the website details […] menu, then use the /show command with that ID, and then react to that display with the envelope.
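Putting that together, the flow for an old creation looks something like this (the job ID here is a made-up placeholder):
/show 00000000-aaaa-bbbb-cccc-000000000000
then react to the display the bot posts with the ✉️ envelope, and it will reply with the job details, including the seed.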
What does a ‘start’ or ‘seed’ image really do?
When you provide a start image (URL) to MJ, it runs its AI image recognition process against the image, and produces a language prompt (just like ours), which it then prepends to whatever language prompt YOU give it.
MJ then uses the default weight, or the weight you provided with --iw, to process both its MJ-created language prompt AND your human-created language prompt together.
This translation of image-to-language-prompt is why feeding MJ seed images behaves nothing like a Photoshop filter: MJ picks up the subject matter and concepts, i.e. nouns - verbs - adjectives - anything that might surface as a word in a language prompt.
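For example (the URL and weight are our own placeholders):
/imagine https://example.com/rusty-robot.jpg a rusty robot tending a garden --iw 0.75
MJ will describe the linked image to itself in words, prepend that to your text, and blend the two according to --iw, rather than applying any kind of filter to your photo.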
Is there any way to make MJ interpret my prompt with complete accuracy?
We have found only one 100% accurate prompt:

No, but seriously: natural language is your best bet, we’ve found; Midjourney is striving to understand “correctly written English.” Since it does not quite understand it, you’ll have luck with strings of comma-separated values and little or no grammar. But since it understands grammar weakly and unpredictably, you often increase your chances by including it.
Variation, Upscale, Uplight, Stop? What’s the difference?
- What we call a “veeroll” creates a variation on the selected composition.
- An upScale (“yooroll”) pursues the same composition but pushes it a little further along in its rendering, typically increasing the “richness” of details.
- --uplight also works on the same composition but uses a finessed “lighter touch” on the rendering, so it simplifies the details.
- --stop N is like manually pulling the handbrake on the render process at N%, no finesse.
This is something else that you’ll get a sense of after experimenting a few times. You can experiment endlessly in relax mode without using your valuable fast minutes.
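A couple of illustrative commands to experiment with (the prompt is our own invention):
/imagine a misty harbor at dawn --stop 80
/imagine a misty harbor at dawn --uplight
The first pulls the handbrake at 80% and leaves a softer, less-rendered grid; the second asks for the lighter-touch rendering when you upscale from that grid.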
Multiprompting and You
What is a multiprompt?
a multiprompt :: the answer to your question
Very funny, but really.
Multiprompts (called weighting in the official docs) set two independent targets for the diffuser to match. Every time it renders a scene, the AI is constantly looking back and forth between what it has and what (it thinks) you want. This just means that when it decides which direction to go in, it will take all of those prompts into account. (You can weight each prompt with a number after the ::; see the official docs for details.)
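For example (the weights here are our own illustration):
/imagine cyberpunk street market::2 soft watercolor::1
gives the diffuser two independent targets to satisfy at once, leaning twice as hard toward the street market as toward the watercolor treatment.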
Okay, but how is that different from just putting two things in a prompt?
Single prompts, no matter how many things you add with commas, are going to point the AI in a single direction that's roughly an average of all of the stuff you specified, and you can't focus it on any one thing. So it'll get your awesome character concept "trending on artstation" and "detailed and realistic" and then forget to draw an arm.
Then it will notice it forgot to draw an arm, look at your prompt again, realize that "trending on artstation" and "detailed and realistic" are six words while "man" is one word (and it's only missing one little piece anyway, how important can that be?), and happily go on dotting your armless character with finely textured skin pores.
So if I want two specific things, I should multiprompt each of them?
Not necessarily! Many people think this, and you sometimes get lucky, but doing "one red cube :: one blue cube" is not any more efficient, because as long as there is at least one cube (or more) on screen, and parts of them are red and blue, the diffuser will look at each side and go "hell yeah, think I nailed it." It's more likely to merge the multiprompted concepts than separate them.
Instead, prompt with something like this "two segregated cubes :: a red cube and a blue cube". Now you've given a second direction that essentially tells the AI to double-check both the quantity and the arrangement. You've reinforced at least one concept in both prompts ("cube") so both diffuser directions can start with common ground. And most importantly, we WANT it to merge those two concepts, because merging "segregated" with "red and blue" cashes out to "I should group those red and blue pixels" https://www.midjourney.com/app/jobs/c41b22db-5352-4acd-9fb6-2b7ddbf042bf/
You should not think of multi-prompts as discrete subjects in your composition. Think of them more like overlays, or emphasis. If you want two subjects, go for "two [subjects] :: information about them" and that way when it merges the subject with the details it will hopefully at least get you over the hump of vrolling for something that isn't a Tuvix. I generally have one "subject" prompt with minimal, key information, and then one "style" prompt that has all of the style, details, framing, etc.
And keep in mind, once you start going past more than 2-3 multiprompts, you'll just re-create the same issues you have with single prompts.
Okay, so you can make red, green, and blue fruit baskets. What else?
Text! If you want to shorten your time vrolling for legible text, don't put "legible text" in your prompt like a chump. Instead, do something like this:
"TEXT" :: details about your scene or logo or poster or whatever with "TEXT"
Reinforcing the text itself on both sides of a multi-prompt has two advantages: it gives the AI a "guide" to help steer it through vrolls and hopefully load the dice for you, and it also lets you say things ABOUT the text (like it's on a credit card held by a particular woman you image prompted) while reducing the chance that those instructions get rendered AS text (which will often happen in single prompts), since you've now told it where the emphasis goes. https://www.midjourney.com/app/jobs/80d33ba8-cf46-48c7-b129-9098778a6cba/
How do multiprompt weights work?
So here's how it works (excuse some simplification):
dog:: cat:: = dog (once), cat (once), averaged
dog::2 cat::2 = dog dog (twice now) cat cat (twice now), averaged
dog::4 cat::1 = dog dog dog dog cat, averaged
dog::1 cat::3 = dog cat cat cat, averaged
You can play with these values to influence how they render:
Something::1 (include Something at normal weight)
Something::0.5 (include Something lightly)
Something::-1 (eliminate Something)
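A back-of-the-envelope way to read those weights (our own simplification of the "averaged" model above): the numbers are relative, so dog::4 cat::1 gives dog about 4/(4+1) = 80% of the pull and cat the remaining 20%, and as far as we can tell dog::2 cat::2 behaves just like dog:: cat::, because only the ratio matters.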
How do I control my v-rolls?
If you aren't using an image prompt, you can pretend for a minute that --iw stands for Intact.
/imagine A sphere in a forest --c 10 --iw 4
Intact can be helpful if you want to lock in (or keep intact) some characteristics of your grid selection once you've found something you like.
Higher values of --iw tell Midge to keep some unknowable stuff about your grid selection intact during v-rolls. It literally tells Midge to use the grid selection as a heavily weighted image prompt. (Shhh. That's because "Intact" is just --iw without an image prompt, and when you're not using an image prompt, your grid selection image is the only one in play.)
Two key uses we know about:
🦋#1 - Temper the chaos. Higher values of --c create increasingly random compositions for your grid. Higher values of --iw tell Midge to use the selected image from the grid as a heavily-weighted image prompt, resulting in a grid of more compositions like it when you veeroll. Using a combination of values like --c 10 --iw 4 tells Midge you want a lot of choices on your first grid, but then you want it to "settle down" when you make a v-roll choice from that grid.
🦥#2 - Slow down v-roll degradation. If you suffer from prompts that "look worse" with each v-roll, using a value of --iw could help Midge allow the grid selection you've chosen to overpower the prompt you wrote, which is (unfortunately) in play as a cause for the trend. If Midge got something right, a high value of so-called "Intact" may keep that rightness intact as you evolve your creation. (It may keep wrongness too.)
✨As usual, AI is magic. ✨ The only way to figure out what this means for your workflow is to try it. Let us know what you discover!
There are TWO categories of punctuation: Functional and Noisy.
1️⃣ Functional: Three strings are functional.
Only THREE strings matter programmatically: double-hyphens, double-colons, and spaces.
- ☑️ Double-hyphens delimit parameters: --aspect 9:16
- ☑️ Double-colons delimit numbers (positive and negative) for weights: ::-0.5
- ☑️ Spaces are used as the de facto separator between tokens, so they are in effect special noise.
2️⃣ Noisy: Other punctuation adds "intriguing noise”.
All punctuation adds what we will call intriguing noise.
Comma-noise and hyphen-noise are helpful noise for troubleshooting.
When troubleshooting, correct use of commas is recommended to help grouping, just like it does in ordinary writing: Ornate shadowed massive sentient vs. Ornate, shadowed, massive, sentient.
When troubleshooting, hyphenation is recommended to increase the relationship between tokens: Antique brass candlestick vs. Brass-antique-candlestick.
What does --ar mean?
(or --aspect or --v or --version or --hd or --no or --stop or --uplight or --seed or --sameseed or --stylize or --s or --quality or --q or --video or --iw or anything else in the midjourney bot commands for that matter)
Read the official docs! https://midjourney.gitbook.io/docs/imagine-parameters
Does stop affect upscale?
No! The upscale pass itself will generate all of its detail as it normally does. But if stopping early made the base grid image look different, the upscale will still build on what the grid produced, so it can come out as a very different upscale.
I can't open anything from MidJourney and I'm running Avast Antivirus. What do I do?
Instructions for Avast users to add Midjourney as a "safe" destination: Open Avast > go to Menu > Settings > General > Exceptions > [Add Exception] button > enter "www.midjourney.com"
Can I include multiple "--no" parameters?
See MidJourney: Can I use multiple --no parameters
Fast and Relax
Standard members have the option to switch to a slower mode using /relax. This won't eat up your fast hours, but images will take 2-10 minutes to generate; you can have 3 concurrently running jobs and 10 items in your queue. It's a fantastic way to preserve those fast hours. (Note: fast hours reset every month and do not roll over, so make sure you take advantage of them! They're great for max upscales!)
What do the +'s between prompts on the website mean?
Those are multiprompts: the :: operator.
Prompt Troubleshooting
This section focuses on things you might ask as you are trying to "fix" an image you've asked for.
General Prompt Troubleshooting
- Try synonyms or alternate phrases. Trash can not working? Try waste basket, etc.
- Playing with multiprompts and weights can help ensure a certain aspect of the image appears, and can help color specific objects (see /help).
- You're not limited to referencing only one artist; see whose styles you can combine!
- Emojis are valid and make fantastic results.
Why is it so hard to get specific compositional arrangements?
Conjecture: MJ relies on the "art direction" of its sources to decide how to arrange things for you. How does that play out? It means there are places in your prompt where the sourcing noticeably influences your composition:
- Direct objects: The dog barks at the ball.
- Prepositional phrases: A cat climbs up a curtain.
- Pronouns: It glows in his hand.
- Subject references: The second basket is full of apples.
MJ will source "dog, barks, ball" and find the most common compositions that meet these criteria. It might not be 'barking at' (your language) but ONE of the grid selections may eventually land there or near there.
ACTION: To improve your chances, your job is to [1] select words with maximum specificity ('lounging' is more specific than 'lying down', 'dalmatian' is more specific than 'white dog with black spots'), [2] use grammatically correct language, and then [3] work with MJ through grid selections to bring it incrementally closer to your vision.
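To make that concrete, compare these two (both prompts are our own illustrations):
/imagine a white dog with black spots lying down on a couch
/imagine a dalmatian lounging on a couch
The second gives MJ's sourcing far less room to wander, because every word is maximally specific.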
Can you direct complex concepts with specific composition and multiple subjects?
Well, you're in for a lot of dice-rolling... but one way to slightly shorten it may be an image prompt mockup, fed alongside your text, adjusting the image weights as necessary. With this you can try to align Midjourney's classifiers with its generators and "push through" a concept.
Midjourney parses images at a max of 256x256, so you don't need to be a Photoshop wizard or anything; a crude collage of the things you want can be sufficient, whether it's built from previous Midjourney renders or even just stock clip art you grabbed online.
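A sketch of that workflow (the URL and weight here are our own placeholders, not from the original answer):
/imagine https://example.com/crude-collage.png two knights dueling on a stone bridge at sunset --iw 2
The collage proposes the arrangement, the text names the subjects, and nudging --iw up or down controls how strongly the mockup steers the result.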
HOW DO I RECREATE ONE OF MY IMAGES WITH A SMALL CHANGE?
For those of you who know the term PRIMARY KEY from relational databases...
$string of your prompt + the $integer of your seed = primary key for the image output
If you've made an image of a red bird on a white background and want to make it a red bird on a black background, and you change just that one word and roll the prompt again with its $seed, you will get almost the same picture but with the new color. If you change the aspect ratio, watch for MJ to possibly completely reinterpret the prompt for the new canvas size.
TLDR: If you use the EXACT $string + $seed again, you've used a primary key to recreate a close approximation of your original composition. But NEVER pixel-to-pixel exactly the same.
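For example (the seed value is a made-up placeholder):
/imagine a red bird on a white background --seed 1234
/imagine a red bird on a black background --seed 1234
The second roll should land very close to the first composition, with the new background color, but never pixel-to-pixel identical.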
Can I edit an existing prompt?
If you want to change the prompt, aspect ratio, etc., you'll need to start over. There's currently no way to change the --ar, --stop, text, image prompt, etc. after an image has been generated.
I put an image in and it's not what I expected
Image prompts are for inspiration only and will not apply a style filter to the image. You can try to influence the final image more with input images by adjusting the --iw X image weight.
My face has weird rings around the eyes
It's possible MJ is attempting to add glasses; you can try using --no glasses (or glasses::-.5) to reduce the chance MJ creates them. Tired of creepy grins? Try --no teeth. This is useful in many other situations as well, such as when you have birds in your landscapes that come out all wonky: just add --no birds.
My faces are weird
Eyes and faces are hard! Don't give up! In addition to some of the previous tips, try out symmetric eyes or airbrushed.
I have too many Heads
Check your aspect ratio! Try out 2:3 or 4:5; you can also try --no double faces.
My image doesn't look right!
👋🏽 TLDR; If your image doesn't look quite right, it could be because you didn't give the image enough time to bake in the GPU oven, or because you didn't provide a clear prompt.
Through testing, we've discovered at least two common reasons why images might look unclear.
1️⃣ The prompt is ambiguous or broken.
- 🧑‍🤝‍🧑 For portraits: Describe the role and/or context of the person you want to see. For example, athletic firefighter will work better than a man wearing a fireman outfit. You can use phrases like athletic, full-figured, androgynous, tomboy, etc. You can also name a famous model or celebrity as a starting point, e.g. like David Spade with long red hair. If you don't do any of these things, you may see Midge substitute a body type based on other cues in the prompt. In fact, if you are ambiguous about the gender of your subject (such as saying that the model wears a woman's swimsuit or a ballgown without explicitly saying it is a woman wearing it), then Midge may select a model of any gender.
- 🖼️ For compositions: Just like for portraits, verbose description is weaker than the right vocabulary. For example, a tree on its side knocked down by the wind is less effective than using the term directly from forestry, a windblown tree. You will very likely need to search the internet and use a thesaurus to get where you're going.
2️⃣ The composition needs more GPU time. If you've only used /imagine once and stopped at the first grid, your composition has not had very long to bake in the GPU oven. Bake it a little longer by making grid selections to guide it toward the look you're chasing, or, if you dislike all the options, reroll the whole grid. If, after 2-3 rolls, you see that NOTHING looks right, it's time to analyze and edit the text of your prompt and start again. Consider, as well, using higher values of --quality.
👋🏻TLDR; If you want your model to look a certain way, make sure your prompt makes use of specific roles and contexts, names a roughly corresponding model or celebrity, and gives Midge GPU time to render the look you want. And hey, you can extrapolate this lesson to almost everything you do in Midjourney.
🤿 DEEPER DIVE https://bit.ly/Clarinet-Prompt-Troubleshooting
Specific Prompt Questions
This section is less on troubleshooting, and more "prompt clip lite". For more detailed prompt clips, see MidJourney: Prompt Clips.
HOW TO GET A FULL BODY PORTRAIT
You need three things to get a full body portrait:
- an aspect ratio tall enough to account for a full body, which means something like 1152x2048, 9:16, 5:9, 1:2
- A source of poses that includes full body examples, which means, for example, adding "stock photography" to the prompt
- Details for MJ to add to the whole figure. If you mention just her shirt, she might not have pants or shoes. It is best to drag the camera from head to toe, touching every part you want MJ to render with a detail; e.g., as soon as you mention shoes, MJ knows it has to show you the whole figure. (See the example below.)
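Putting all three together, a sketch (the wording is our own illustration):
/imagine full body portrait of a hiker, wide-brim hat, red flannel shirt, canvas trousers, leather boots, stock photography --ar 9:16
The tall ratio leaves room for the whole body, "stock photography" supplies full-body poses, and the head-to-toe details (down to the boots) tell MJ to render the entire figure.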
Is there a good guide to camera control lingo I can use in MJ?
We don't know which of these terms MJ understands but we think you should experiment and report back!
https://www.studiobinder.com/blog/ultimate-guide-to-camera-shots/
How can I add legible text to my composition?
Your mileage may vary, but here are the four elements to rendering text that we think might be necessary.

The best chance of creating a full weapon
- Use an aspect ratio that suits the most common orientation of the weapon. Swords are vertical, rifles are horizontal.
- Google the weapon in question and find its specific terms. Don't say "bow" - say "recurve bow". Don't say "sword in a lake" - say "Excalibur".
- Find artists and other style cues that correspond to the weapon you're after. What media, games, movies, comics, artists, genres, etc represent your weapon well? Include these in your prompt. Some of them might work. Others will be dead weight.
- Finally, if you want an action pose, then VERB your weapon. Do not say "an orc with a sword" - say "a Warhammer 40k orc fighting the wind with a broadsword" or "a 16th century samurai warrior striking a wooden dummy with Excalibur"
How to check first if Midjourney even understands your sourcing reference
TLDR: /imagine something you'll recognize as being in that style.
- You want to say in the style of Ren & Stimpy (for example) but you don't know if Midjourney will understand that.
- You think about something that appears commonly in that style. For example, something that appears often in Ren & Stimpy is a cartoon chihuahua (that's Ren himself).
- You do this simple test: /imagine a cartoon chihuahua in the style of Ren & Stimpy
- If the output looks like it's adopted the style you named, you're golden. If it appears generic with lots of orange and teal colors, you're looking at Midjourney 'defaults', which is an error message meaning NOT FOUND.
Tips for anime
- Reference an artist if you can; MJ is great at picking up styles.
- Using --uplight and --stop x can help reduce noise and smooth out an image.
- Fluff words such as pretty, cute, beautiful, etc. make a drastic difference.
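Putting those tips together (an illustrative prompt of our own; swap in whichever artist or studio you like):
/imagine beautiful anime girl with silver hair, cherry blossoms, in the style of Studio Ghibli --uplight --stop 90
The fluff words and style reference do the heavy lifting, while --uplight and --stop 90 keep the result smooth and less noisy.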