Decision aid for marketing, communications & IT

AI models for image, video and audio: comparison matrix 2026

Generative media models are procurement-ready in 2026, but the risks differ from those of coding tools: output rights, provenance obligations and licence pitfalls decide the outcome, not benchmarks.

This matrix compares 28 image, video and audio models plus one model-agnostic workflow tool against the criteria that actually derail adoption in Switzerland and the EU: may the outputs be used commercially? Is there IP indemnification? Can the model run on-prem, and does the licence even apply in the EU?

Models & tools compared: 29
Media categories: Image · Video · Audio
Status: 25.06.2026 · editorial assessment

01 · Context

Three media, one procurement problem.

What changed substantially from 2025 to 2026, and why the selection criteria differ from LLMs.

Image: multimodal models like GPT-Image-2 and Google's Nano Banana Pro have retired the old knock-out criterion of illegible text in images. Among open weights, FLUX.2 klein (Apache 2.0, from Germany) and Qwen-Image make commercial self-hosting comparatively licence-clear for the first time, while the terms of Tencent's Hunyuan licences exclude use in the EU, UK and South Korea (check the licence case by case).

Video: the biggest break is OpenAI's retreat. Per OpenAI, the Sora 2 API is set to sunset (communicated date: 24 Sep 2026; as of June 2026, subject to change). Google Veo 3.1 (SynthID, EU region Frankfurt), Runway and Adobe take over the enterprise field; on-prem, Alibaba Wan 2.2 and Lightricks LTX-2 have become viable for pilot and production scenarios.

Audio: in music the legal situation is shifting. After the first label settlements (Warner ↔ Suno), licensed models with C2PA watermarks are gaining ground, while Sony and UMG keep litigating or negotiating; a landmark decision in the case against Suno is expected in summer 2026 (summary-judgment hearing). Voice is shifting to real-time agents. GDPR-compliant on-prem transcription has become technically straightforward with NVIDIA's open models, but data protection, operations and model governance still need case-by-case review.

02 ·

Media models: rights, provenance, on-prem

The dimensions on which media-AI procurement in Switzerland and the EU fails or stalls.

Editorial assessment, as of June 2026. Rating legend: positive / uncritical · with caveats · review critically
| Model | Vendor · origin | Type | Licence | Output rights | Provenance | Status | On-prem / hardware | Price (indicative) | DACH risk |
Image – generation & design
| FLUX.2 family | Black Forest Labs · Germany | | klein-4B: Apache 2.0 · dev: FLUX Non-Commercial | klein & API free · dev needs paid licence | API: yes · : no | 11/25 · klein 01/26 | Yes · dev ~80 GB · klein from consumer GPU | API from $0.03/MP · weights free | EU vendor · check licence tiers |
| Stable Diffusion 3.5 | Stability AI · UK | | Community License | free below revenue threshold | unclear | 3.5 · 10/2024 | Yes · consumer–high-end GPU | weights free · API credits | Getty: UK lost 11/25, US ongoing |
| GPT-Image-2 | OpenAI · USA | API | | free · | () | 04/2026 | No | $0.006–0.21/image | |
| Nano Banana Pro | Google (Gemini 3 Pro Image) · USA | API | | free · on Vertex AI | SynthID (always on) | GA 05/2026 · 4K preview | No | $0.13/2K image · Imagen 4 from $0.02 | |
| Firefly Image 5 | Adobe · USA | API | | free · full | () | 03/2026 | No · custom models possible | CC Pro $69.99/mo · API credits | comparatively low |
| Midjourney V8.1 | Midjourney · USA | API | SaaS subscription | >$1M revenue: Pro/Mega plan required | No | V8.1 · 06/2026 | No | Basic $10 / Standard $30 / Pro $60 / Mega $120 /mo | Disney/Universal lawsuit · no |
| Recraft V4 / V4 Pro | Recraft · USA | API | | free on paid · brand/vector focus | No | 05/2026 | No | $0.04 · Pro $0.25/image | young vendor |
| Qwen-Image-2512 | Alibaba · China | | | free · no thresholds | No () | 12/2025 | Yes · 20B, quantised from ~24 GB | weights free | · opaque training data |
| HunyuanImage 3.0 | Tencent · China | | Hunyuan Community License | EU/UK/South Korea use excluded by licence | No | 09/2025 | (Yes · 80B , ≥3×80 GB) | weights free (outside EU) | Licence excludes EU/UK/South Korea |
Video – generation
| Veo 3.1 | Google DeepMind · USA | API | | free · on Vertex AI | SynthID | 01/2026 · 4K | No · EU region Frankfurt | $0.05–0.75/s | US vendor · EU region available |
| Sora 2 | OpenAI · USA | API | | only until sunset | + Watermark | EOL 24.09.2026 | No | $0.10–0.70/s | service being discontinued |
| Runway Gen-4.5 | Runway · USA | API | | free on paid · enterprise contracts | | 12/2025 | No | ~$0.25/s · plans from $12/mo | credit costs scale fast |
| Kling 3.0 | Kuaishou · China | API | | free on paid · no on free tier | | 02/2026 · 4K | No | $0.08–0.17/s | CN data routing · check |
| Seedance 2.0 | ByteDance · China | API | | free on paid | | global since 04/2026 (BytePlus/fal) | No | ~$0.24/s (fal, 720p) | CN vendor · check /data routing |
| Firefly Video | Adobe · USA | API | | free · , licensed data | () | 2026 | No | Premium $199.99/mo (first-party unlim., promo to 08/26) | comparatively low |
| Luma Ray3 | Luma AI · USA | API | | free from paid plans | no default | Ray3 · 4K/HDR | No | Plus $30 / Pro $90 / Ultra $300 /mo | smaller vendor |
| Wan 2.2 | Alibaba · China | | | free · no royalties | No () | 2.2 · open (2025) | Yes · 14B, from 24 GB ( from 16 GB) | weights free | · own governance needed |
| LTX-2 | Lightricks · Israel | | Community License (<$10M free) | free below threshold | optional | 01/2026 · 4K + audio | Yes · from RTX-4090 class | weights free · API via fal/Replicate | mind the threshold contractually |
| HunyuanVideo 1.5 | Tencent · China | | Hunyuan Community License | EU/UK/South Korea use excluded by licence | No | 11/2025 | (Yes · RTX 4090) | weights free (outside EU) | Licence excludes EU/UK/South Korea |
Audio – voice, music & transcription
| Eleven v3 | ElevenLabs · USA/UK | API | | free from Starter · music (indie deals: Merlin/Kobalt) | music: · : classifier | v3 · 2026 | No · EU data residency | Starter $6 / Creator $22 / Pro $99 / Scale $299 / Business $990 /mo | compliance · |
| GPT-Realtime-2 | OpenAI · USA | API | | free (API terms) | No | 05/2026 | No · Azure EU possible | $32/$64 per 1M audio tokens | |
| Cartesia Sonic 3.5 | Cartesia · USA | API | | free from paid plans | No | ~90 ms latency | private deployments (enterprise) | ~$35/1M characters | young vendor |
| Azure AI Speech | Microsoft · USA | | | free · gated by approval | available | HD-Update 03/2026 | Partial (containers) · EU regions | $15–22/1M characters | strong DACH compliance story |
| Fish Audio S2 | Fish Audio (OpenAudio) | | Research License | commercial ONLY with licence agreement | No | open since 03/2026 | Yes · 1 GPU | weights free (non-commercial) | licence ambiguity · cloning |
| Kokoro 82M | Hexgrad / Community | | | free | No | #1 -Arena 01/26 | Yes · CPU/small GPU (~300 MB) | free · hosting costs in cents | no enterprise support |
| NVIDIA Canary / Parakeet | NVIDIA · USA · STT | | CC-BY-4.0 | transcripts freely usable | | v2/v3 · 04/2026 | Yes · 1 GPU · NIM support | weights free · AI Enterprise plan optional | ideal on-prem transcription |
| Suno v5.5 | Suno · USA · Music | API | | from Pro plan · Warner deal, lawsuits open | -Watermark | licensed gen. 03/2026 | No | Pro $10 / Premier $30 /mo | UMG/Sony lawsuits ongoing |
| Stable Audio 3.0 | Stability AI · UK · Music | | Open weights (Small/Med) + API | free · fully licensed training data | unclear | 3.0 · 05/2026 | Yes · open weights (Small/Med) | usage-based · enterprise custom | comparatively licence-safe music option |
Workflow & orchestration – model-agnostic
| ComfyUI | Comfy Org · open source | Workflow tool | GPL-3.0 | depends on the loaded model | depends on the model | $30M round 04/2026 · Cloud + API nodes | Yes · 1 GPU (model-dependent) | free (GPL) · Comfy Cloud optional | check GPL copyleft when embedding |

  • Prices indicative: vendor list prices, before volume discounts.
  • Compiled editorially: as of 25 Jun 2026; points that could not be fully verified are rated conservatively.
  • Not legal advice: have licence and rights questions reviewed individually.
  • ComfyUI: listed as a model-agnostic workflow tool; output rights and provenance follow the model loaded into it.
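
Because several video models in the matrix are priced per second of output, a rough budget follows from simple multiplication. The sketch below uses the indicative per-second ranges from the matrix; the campaign figures (20 clips of 8 seconds, 5 takes per clip) and all helper names are hypothetical assumptions for illustration. Real invoices depend on resolution, tier and volume discounts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerSecondPrice:
    """Indicative list-price range in USD per generated second (from the matrix)."""
    model: str
    low_usd: float
    high_usd: float


# Values copied from the "Price (indicative)" column; not binding vendor quotes.
PRICES = [
    PerSecondPrice("Veo 3.1", 0.05, 0.75),
    PerSecondPrice("Sora 2", 0.10, 0.70),
    PerSecondPrice("Kling 3.0", 0.08, 0.17),
]


def campaign_cost(price: PerSecondPrice, clip_seconds: int, clips: int,
                  takes_per_clip: int) -> tuple[float, float]:
    """Return the (low, high) cost in USD for all generated seconds, rejects included."""
    total_seconds = clip_seconds * clips * takes_per_clip
    return (round(total_seconds * price.low_usd, 2),
            round(total_seconds * price.high_usd, 2))


if __name__ == "__main__":
    # Hypothetical campaign: 20 clips, 8 s each, 5 takes per usable clip.
    for p in PRICES:
        low, high = campaign_cost(p, clip_seconds=8, clips=20, takes_per_clip=5)
        print(f"{p.model}: ${low:,.2f} to ${high:,.2f}")
```

The number of takes per usable clip is usually the dominant factor, which is why the matrix flags that credit costs scale fast.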

Information in this market changes frequently and at short notice; provided without warranty and without claim to completeness. For binding terms, always consult the official vendor page.
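
As a minimal sketch of how the matrix can be screened, the snippet below encodes a handful of rows and filters them by two of the hard criteria: on-prem availability and licence exclusions by region. The field names, the region codes and the `shortlist` helper are assumptions introduced here, not part of any vendor tooling; the row values are taken from the matrix and carry the same editorial caveats.

```python
from dataclasses import dataclass

# Regions excluded by the Hunyuan Community License according to the matrix.
HUNYUAN_EXCLUDED = frozenset({"EU", "UK", "KR"})


@dataclass(frozen=True)
class MatrixRow:
    """A reduced matrix row: only the two screening criteria used below."""
    model: str
    on_prem: bool
    excluded_regions: frozenset = frozenset()


ROWS = [
    MatrixRow("FLUX.2 family", on_prem=True),
    MatrixRow("HunyuanImage 3.0", on_prem=True, excluded_regions=HUNYUAN_EXCLUDED),
    MatrixRow("Veo 3.1", on_prem=False),
    MatrixRow("Wan 2.2", on_prem=True),
    MatrixRow("HunyuanVideo 1.5", on_prem=True, excluded_regions=HUNYUAN_EXCLUDED),
    MatrixRow("NVIDIA Canary / Parakeet", on_prem=True),
]


def shortlist(rows: list[MatrixRow], region: str, require_on_prem: bool) -> list[str]:
    """Keep models whose licence does not exclude the region and that meet the on-prem need."""
    return [
        row.model
        for row in rows
        if region not in row.excluded_regions
        and (row.on_prem or not require_on_prem)
    ]


if __name__ == "__main__":
    print(shortlist(ROWS, region="EU", require_on_prem=True))
```

A shortlist produced this way is only a first pass: licence tiers (for example FLUX.2 dev versus klein) and output rights still need to be checked per model.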


03 · Context

Tooling around the models.

Beyond the models there is an ecosystem of interfaces and platforms, deliberately kept out of the matrix because they are not a model decision in themselves.

Workflow interfaces

ComfyUI is the de-facto standard for node-based media pipelines (hence the only tool in the matrix). Alternatives such as Invoke or SwarmUI can be easier for some teams, but the model question stays the same.

Inference platforms

fal.ai, Replicate and the like host open models behind an API: a fast start without your own GPUs, but data flows through the platform vendor. They are no answer to on-prem requirements, but often the best option for prototypes.

Suite integrations

Adobe, Canva and increasingly DAM systems embed models directly into creative workflows. That is convenient for teams, but the rights and provenance questions from the matrix apply unchanged.

05 · Key terms explained

The most important technical terms from this comparison, explained neutrally.

Content Credentials
Adobe's implementation of the C2PA standard: visible provenance labels on images and videos that show who created and edited what, with which tool.
IP indemnity
The vendor indemnifies you against third-party claims if generated content infringes someone else's rights. A key procurement criterion for media AI.
voice cloning
Replicating a real person's voice from short audio samples. Legally sensitive: consent and EU AI Act transparency duties apply.
Open-Weights
The model weights are freely available, so the model can run on your own hardware. Not necessarily open source in the strict sense; check the licence in detail.
indemnity
The vendor assumes liability and defence costs if third-party claims arise from generated content.
watermark
Marks embedded in AI content, visible or invisible (e.g. SynthID). They help with labelling duties; depending on the vendor, it may not be possible to switch them off.
SynthID
Google's invisible, robust watermark for AI-generated images, video and audio; always on for Google models and machine-verifiable.
AI Act
EU AI regulation: governs transparency and labelling duties for AI-generated content and deepfakes. In practice also relevant for Swiss vendors with EU customers.
GDPR
EU General Data Protection Regulation. Relevant for Swiss companies alongside the FADP as soon as data of EU persons is processed.
C2PA
Open industry standard (Adobe, Microsoft, Google et al.) for cryptographically signed provenance data in media files; the technical basis for provenance duties.
GGUF
File format for quantised AI models; quantisation cuts memory needs substantially and lets large models run on smaller hardware.
ARR
Annual recurring revenue. Some «free» licences apply only below an ARR threshold.
self-host
Running the software/model on your own infrastructure instead of the vendor's: full data control, but also the full operational burden.
TTS
Text-to-speech synthesis. Enterprise criteria: voice quality, latency, languages, cloning controls.
US Cloud Act
US law that can oblige American providers to hand data to US authorities, even when servers are located in Europe. Relevant for Swiss data-protection assessments.
MoE
Mixture of Experts: a model architecture where only part of the model is active for each token. High capability at lower running cost.
Proprietary
Closed licence: usage only via the provider, no insight into model or code, no self-hosting.
Apache 2.0
Permissive open-source licence that allows commercial use and includes a patent grant.
CN origin
Chinese vendor: uncritical when self-hosting the open weights; Chinese data law applies when using the vendor's API.
Hybrid
Flexible deployment: cloud, your own private cloud, or fully in your own data centre.
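
Revenue thresholds such as the ones noted in the matrix (LTX-2: free below $10M; Midjourney: Pro/Mega plan required above $1M revenue) can be checked with a few lines of arithmetic. The following is a sketch under stated assumptions: the function names, the growth projection and the example revenue figures are hypothetical, and how each licence actually defines revenue (ARR, total revenue, group level) must be taken from the licence text itself.

```python
from typing import Optional

# Thresholds as listed in the matrix; the exact definitions live in the licence texts.
LTX2_FREE_BELOW_USD = 10_000_000
MIDJOURNEY_PRO_ABOVE_USD = 1_000_000


def ltx2_free_licence_applies(annual_revenue_usd: float) -> bool:
    """True while revenue is strictly below the free-use threshold."""
    return annual_revenue_usd < LTX2_FREE_BELOW_USD


def midjourney_needs_pro_or_mega(annual_revenue_usd: float) -> bool:
    """True once revenue is strictly above the plan threshold."""
    return annual_revenue_usd > MIDJOURNEY_PRO_ABOVE_USD


def years_until_threshold(annual_revenue_usd: float, yearly_growth: float,
                          threshold_usd: float, horizon_years: int = 10) -> Optional[int]:
    """Project constant yearly growth; return the first year at or above the threshold."""
    revenue = annual_revenue_usd
    for year in range(horizon_years + 1):
        if revenue >= threshold_usd:
            return year
        revenue *= 1 + yearly_growth
    return None  # threshold not reached within the planning horizon


if __name__ == "__main__":
    # Hypothetical company: $6M revenue today, growing 30 % per year.
    print(ltx2_free_licence_applies(6_000_000))
    print(years_until_threshold(6_000_000, 0.30, LTX2_FREE_BELOW_USD))
```

A projection like this helps decide whether a threshold needs to be addressed contractually before a pilot goes into production.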