
Text-to-Image Generation

Text-to-image models create still images from a text description — no source media required. Use them for concept art, storyboard frames, thumbnail options, social media assets, or any visual that does not exist yet.

In Premiere Pro, go to Window > Extensions > modelBridge.ai.

Open the model selector and search for a text-to-image model. Some options to start with:

Model      | Strengths                          | Typical Cost
FLUX 2 Pro | Photorealistic, prompt-accurate    | ~$0.04–0.06 per image
FLUX 2 Dev | Fast, lower cost                   | ~$0.02–0.04 per image
Recraft v3 | Strong typography, design-oriented | ~$0.04 per image

Use the Image Gen filter chip to show only text-to-image models.

Describe the image you want. Text-to-image models respond well to specific, layered prompts.

Good prompt example:

“Close-up portrait of a weathered fisherman, golden hour side lighting, shallow depth of field, film grain, Kodak Portra 400 look”

Weak prompt example:

“Man outside”

Tips for effective prompts:

  • Describe composition — “close-up”, “wide establishing shot”, “overhead flat lay”
  • Specify lighting — “harsh midday sun”, “neon backlight”, “soft window light”
  • Include style references — “35mm film”, “editorial photography”, “oil painting”
  • Mention mood — “melancholic”, “vibrant”, “clinical”
  • Be specific about subjects — “a black Labrador retriever on a mossy forest trail” rather than “a dog in nature”
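The layering idea in these tips can be sketched in a few lines of Python. This is only an illustration for drafting prompts outside the panel: `PromptParts` and `build_prompt` are hypothetical names, not part of modelBridge.ai, and the comma-separated ordering is an assumption rather than a requirement of any model.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """Hypothetical container for the prompt layers listed in the tips."""
    subject: str
    composition: str = ""
    lighting: str = ""
    style: str = ""
    mood: str = ""


def build_prompt(parts: PromptParts) -> str:
    """Join the non-empty layers into one comma-separated prompt string."""
    layers = [parts.subject, parts.composition, parts.lighting, parts.style, parts.mood]
    # Skip any layer that was left blank so the prompt has no stray commas.
    return ", ".join(layer.strip() for layer in layers if layer.strip())


prompt = build_prompt(
    PromptParts(
        subject="a black Labrador retriever on a mossy forest trail",
        composition="wide establishing shot",
        lighting="soft window light",
        style="35mm film",
        mood="melancholic",
    )
)
print(prompt)
```

Keeping each layer in its own field makes it easy to change one aspect, such as the lighting, while holding the rest of the prompt fixed between generations.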

Adjust the settings below the prompt field:

  • Resolution / aspect ratio — match your project dimensions or intended use (16:9 for timeline, 1:1 for social, 9:16 for vertical)
  • Number of images — some models can generate multiple variations in a single request
  • Guidance scale — controls how closely the model follows your prompt (higher = more literal, lower = more creative)
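If you need to work out pixel dimensions for a given aspect ratio yourself, the arithmetic looks roughly like the sketch below. The function name, the `long_side` default, and the rounding to a multiple of 8 are all assumptions made for illustration. Individual models may accept only specific presets, so treat the output as a starting point and check the options the panel actually offers.

```python
def dimensions_for(aspect: str, long_side: int = 1920, multiple: int = 8) -> tuple[int, int]:
    """Return (width, height) for an aspect ratio string such as "16:9".

    The longer edge is set to `long_side`; both edges are then snapped to the
    nearest multiple of `multiple` (an assumed constraint, not a documented one).
    """
    w_ratio, h_ratio = (int(part) for part in aspect.split(":"))
    if w_ratio >= h_ratio:
        width, height = long_side, long_side * h_ratio / w_ratio
    else:
        width, height = long_side * w_ratio / h_ratio, long_side

    def snap(value: float) -> int:
        # Round to the nearest allowed size, never below one step.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


print(dimensions_for("16:9"))                  # (1920, 1080) for a timeline frame
print(dimensions_for("1:1", long_side=1024))   # (1024, 1024) for a square social post
print(dimensions_for("9:16"))                  # (1080, 1920) for vertical video
```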

Click the sparkle button below the prompt field to have AI rewrite your prompt for better results. This is useful when you know what you want but are not sure how to describe it in terms the model responds to. The optimization costs approximately $0.01.

The cost badge next to the Generate button updates live as you change parameters. Most text-to-image generations cost under $0.10. Resolution and number of images are the main cost drivers.
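For rough budgeting before you open the panel, a back-of-the-envelope estimate can be sketched as follows. It uses the approximate per-image ranges from the model table and the approximate $0.01 prompt optimization cost. It assumes cost scales linearly with the number of images and ignores resolution, which is a simplification. The cost badge remains the authoritative figure, and `estimate_cost` is a hypothetical helper, not part of the extension.

```python
# Approximate per-image price ranges in USD, copied from the model table.
PRICE_RANGES_USD = {
    "FLUX 2 Pro": (0.04, 0.06),
    "FLUX 2 Dev": (0.02, 0.04),
    "Recraft v3": (0.04, 0.04),
}

PROMPT_OPTIMIZATION_USD = 0.01  # approximate cost of the sparkle-button rewrite


def estimate_cost(model: str, num_images: int = 1, optimize_prompt: bool = False) -> tuple[float, float]:
    """Return a (low, high) estimate in USD, assuming linear scaling per image."""
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    low, high = PRICE_RANGES_USD[model]
    extra = PROMPT_OPTIMIZATION_USD if optimize_prompt else 0.0
    return round(low * num_images + extra, 2), round(high * num_images + extra, 2)


print(estimate_cost("FLUX 2 Pro", num_images=4))                        # (0.16, 0.24)
print(estimate_cost("FLUX 2 Dev", num_images=2, optimize_prompt=True))  # (0.05, 0.09)
```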

Click Generate. Text-to-image models are typically fast — most results arrive in 5–30 seconds.

The progress stages:

  1. Submitting your request
  2. Queued (waiting for GPU)
  3. Generating
  4. Downloading the result
  5. Importing to Premiere Pro
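The five stages always occur in this order, so they can be modeled as a simple ordered enumeration. The sketch below is illustrative only: `Stage` and `progress_label` are hypothetical names and do not reflect how the extension is implemented internally.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Hypothetical model of the progress stages, in the order listed above."""
    SUBMITTING = 1
    QUEUED = 2
    GENERATING = 3
    DOWNLOADING = 4
    IMPORTING = 5


LABELS = {
    Stage.SUBMITTING: "Submitting your request",
    Stage.QUEUED: "Queued (waiting for GPU)",
    Stage.GENERATING: "Generating",
    Stage.DOWNLOADING: "Downloading the result",
    Stage.IMPORTING: "Importing to Premiere Pro",
}


def progress_label(stage: Stage) -> str:
    """Format a stage as 'Step N of 5: <label>'."""
    return f"Step {stage.value} of {len(Stage)}: {LABELS[stage]}"


for stage in Stage:
    print(progress_label(stage))
```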

The generated image appears in the Source Monitor for review. From there:

  • Import to Timeline — places the image on the timeline at the playhead position
  • Save to Project Bin — imports without timeline placement
  • Generate again — try a different prompt or different model for comparison

Text-to-image models do not need a clip selected on the timeline. The media card will show that no selection is needed, and the Generate button will be active as long as you have entered a prompt.

Once imported, a generated image is a standard Premiere Pro asset. You can:

  • Set its duration on the timeline
  • Apply transitions and effects
  • Use it as a source for image-to-video generation (animate it with AI)
  • Export it as a still frame

If the result does not match your prompt, keep in mind that AI models interpret prompts differently. Try rephrasing, adding more detail, or switching to a different model. FLUX models tend to follow prompts more literally; other models may take more creative liberty.

If the image looks soft or lacks detail, check the resolution setting. Some models default to lower resolutions for speed. Increasing resolution improves quality but may also increase cost.

Make sure the aspect ratio parameter matches your intended use. Generating a 1:1 image when you need 16:9 wastes a generation. Set this before clicking Generate.