Decision aid for marketing, communications & IT
AI models for image, video and audio comparison matrix 2026
Generative media models are procurement-ready in 2026, but the risks differ from those of coding tools: output rights, provenance obligations and licence pitfalls decide the outcome, not benchmarks.
This matrix compares 28 image, video and audio models plus one model-agnostic workflow tool against the criteria that actually derail adoption in Switzerland and the EU: may the outputs be used commercially? Is there IP indemnification? Can the model run on-prem, and does the licence even apply in the EU?
01 · Context
Three media, one procurement problem.
What changed substantially from 2025 to 2026, and why the selection criteria differ from LLMs.
Image: multimodal models like GPT-Image-2 and Google's Nano Banana Pro have removed unreadable in-image text as a knockout criterion. Among open weights, FLUX.2 klein (Apache 2.0, from Germany) and Qwen-Image make commercial self-hosting comparatively licence-clear for the first time, while Tencent's Hunyuan licence terms exclude use in the EU, UK and South Korea (check the licence case by case).
Video: the biggest break is OpenAI's retreat. According to OpenAI, the Sora 2 API is set to sunset (communicated date: 24 Sep 2026; as of June 2026, subject to change). Google Veo 3.1 (SynthID, EU region Frankfurt), Runway and Adobe take over the enterprise field; on-prem, Alibaba Wan 2.2 and Lightricks LTX-2 have become viable for pilot and production scenarios.
Audio: in music the legal situation is shifting. After the first label settlements (Warner ↔ Suno), licensed models with C2PA watermarks are gaining ground, while Sony and UMG keep litigating or negotiating; a landmark decision in the case against Suno is expected in summer 2026 (a summary-judgment hearing). Voice is shifting to real-time agents. GDPR-compliant on-prem transcription has become technically straightforward with NVIDIA's open models, but data protection, operations and model governance still need case-by-case review.
02 ·
Media models: rights, provenance, on-prem
The dimensions on which media-AI procurement in Switzerland and the EU fails or stalls.
| Model | Type | Licence | Output rights | Provenance | Status | On-prem / hardware | Price (indicative) | DACH risk |
|---|---|---|---|---|---|---|---|---|
| Image – generation & design | | | | | | | | |
| FLUX.2 family (Black Forest Labs · Germany) | | klein-4B: Apache 2.0 · dev: FLUX Non-Commercial | klein & API free · dev needs paid licence | API: yes · : no | 11/25 · klein 01/26 | Yes · dev ~80 GB · klein from consumer GPU | API from $0.03/MP · weights free | EU vendor · check licence tiers |
| Stable Diffusion 3.5 (Stability AI · UK) | | Community License | free below revenue threshold | unclear | 3.5 · 10/2024 | Yes · consumer–high-end GPU | weights free · API credits | Getty: UK lost 11/25, US ongoing |
| GPT-Image-2 (OpenAI · USA) | API | | free · () | | 04/2026 | No | $0.006–0.21/image | |
| Nano Banana Pro (Google, Gemini 3 Pro Image · USA) | API | | free · on Vertex AI | SynthID (always on) | GA 05/2026 · 4K preview | No | $0.13/2K image · Imagen 4 from $0.02 | |
| Firefly Image 5 (Adobe · USA) | API | | free · full | () | 03/2026 | No · custom models possible | CC Pro $69.99/mo · API credits | comparatively low |
| Midjourney V8.1 (Midjourney · USA) | API | SaaS subscription | >$1M revenue: Pro/Mega plan required | No | V8.1 · 06/2026 | No | Basic $10 / Standard $30 / Pro $60 / Mega $120 /mo | Disney/Universal lawsuit · no |
| Recraft V4 / V4 Pro (Recraft · USA) | API | | free on paid · brand/vector focus | No | 05/2026 | No | $0.04 · Pro $0.25/image | young vendor |
| Qwen-Image-2512 (Alibaba · China) | | | free · no thresholds | No () | 12/2025 | Yes · 20B, quantised from ~24 GB | weights free | · opaque training data |
| HunyuanImage 3.0 (Tencent · China) | | Hunyuan Community License | EU/UK/South Korea use excluded by licence | No | 09/2025 | (Yes · 80B MoE, ≥3×80 GB) | weights free (outside EU) | Licence excludes EU/UK/South Korea |
| Video – generation | | | | | | | | |
| Veo 3.1 (Google DeepMind · USA) | API | | free · on Vertex AI | SynthID | 01/2026 · 4K | No · EU region Frankfurt | $0.05–0.75/s | US vendor · EU region available |
| Sora 2 (OpenAI · USA) | API | | only until sunset | + Watermark | EOL 24 Sep 2026 | No | $0.10–0.70/s | service being discontinued |
| Runway Gen-4.5 (Runway · USA) | API | | free on paid · enterprise contracts | | 12/2025 | No | ~$0.25/s · plans from $12/mo | credit costs scale fast |
| Kling 3.0 (Kuaishou · China) | API | | free on paid · no | on free tier | 02/2026 · 4K | No | $0.08–0.17/s | CN data routing · check |
| Seedance 2.0 (ByteDance · China) | API | | free on paid | | global since 04/2026 (BytePlus/fal) | No | ~$0.24/s (fal, 720p) | CN vendor · check /data routing |
| Firefly Video (Adobe · USA) | API | | free · , licensed data | () | 2026 | No | Premium $199.99/mo (first-party unlim., promo to 08/26) | comparatively low |
| Luma Ray3 (Luma AI · USA) | API | | free from paid plans | no default | Ray3 · 4K/HDR | No | Plus $30 / Pro $90 / Ultra $300 /mo | smaller vendor |
| Wan 2.2 (Alibaba · China) | | | free · no royalties | No () | 2.2 · open (2025) | Yes · 14B, from 24 GB (GGUF from 16 GB) | weights free | · own governance needed |
| LTX-2 (Lightricks · Israel) | | Community License (<$10M free) | free below threshold | optional | 01/2026 · 4K + audio | Yes · from RTX-4090 class | weights free · API via fal/Replicate | mind the threshold contractually |
| HunyuanVideo 1.5 (Tencent · China) | | Hunyuan Community License | EU/UK/South Korea use excluded by licence | No | 11/2025 | (Yes · RTX 4090) | weights free (outside EU) | Licence excludes EU/UK/South Korea |
| Audio – voice, music & transcription | | | | | | | | |
| Eleven v3 (ElevenLabs · USA/UK) | API | | free from Starter · music (indie deals: Merlin/Kobalt) | music: · : classifier | v3 · 2026 | No · EU data residency | Starter $6 / Creator $22 / Pro $99 / Scale $299 / Business $990 /mo | compliance · |
| GPT-Realtime-2 (OpenAI · USA) | API | | free (API terms) | No | 05/2026 | No · Azure EU possible | $32/$64 per 1M audio tokens | |
| Cartesia Sonic 3.5 (Cartesia · USA) | API | | free from paid plans | No | ~90 ms latency | private deployments (enterprise) | ~$35/1M characters | young vendor |
| Azure AI Speech (Microsoft · USA) | | | free · gated by approval | available | HD update 03/2026 | Partial (containers) · EU regions | $15–22/1M characters | strong DACH compliance story |
| Fish Audio S2 (Fish Audio / OpenAudio) | | Research License | commercial ONLY with licence agreement | No | open since 03/2026 | Yes · 1 GPU | weights free (non-commercial) | licence ambiguity · cloning |
| Kokoro 82M (Hexgrad / Community) | | | free | No | #1 TTS-Arena 01/26 | Yes · CPU/small GPU (~300 MB) | free · hosting costs in cents | no enterprise support |
| NVIDIA Canary / Parakeet (NVIDIA · USA · STT) | | CC-BY-4.0 | transcripts freely usable | – | v2/v3 · 04/2026 | Yes · 1 GPU · NIM support | weights free · AI Enterprise plan optional | ideal on-prem transcription |
| Suno v5.5 (Suno · USA · Music) | API | | from Pro plan · Warner deal, lawsuits open | C2PA-Watermark | licensed gen. 03/2026 | No | Pro $10 / Premier $30 /mo | UMG/Sony lawsuits ongoing |
| Stable Audio 3.0 (Stability AI · UK · Music) | | Open weights (Small/Med) + API | free · fully licensed training data | unclear | 3.0 · 05/2026 | Yes · open weights (Small/Med) | usage-based · enterprise custom | comparatively licence-safe music option |
| Workflow & orchestration – model-agnostic | | | | | | | | |
| ComfyUI (Comfy Org · open source) | Workflow tool | GPL-3.0 | depends on the loaded model | depends on the model | $30M round 04/2026 · Cloud + API nodes | Yes · 1 GPU (model-dependent) | free (GPL) · Comfy Cloud optional | check GPL copyleft when embedding |
- → Prices indicative: vendor list prices, before volume discounts.
- → Compiled editorially: as of 25 Jun 2026; points that could not be fully verified are rated conservatively.
- ⚠ Not legal advice: have licence and rights questions reviewed individually.
- → ComfyUI: listed as a model-agnostic workflow tool; output rights and provenance follow the model loaded into it.
Information in this market changes frequently and at short notice; provided without warranty and without claim to completeness. For binding terms, always consult the official vendor page.
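The screening questions behind the matrix (does the licence apply in the EU, can the model run on-prem, is there a revenue threshold) can be sketched as a simple filter. This is a minimal illustration and not a legal check: the `ModelEntry` structure, the `screen` and `screen_all` functions and the example company figures are hypothetical, and the three rows are transcribed by hand from the matrix.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelEntry:
    """One hand-transcribed matrix row (hypothetical structure)."""
    name: str
    on_prem: bool
    excluded_regions: frozenset = frozenset()
    # Threshold below which the licence is free, as stated in the matrix.
    # None means no threshold was recorded for this row.
    free_below_usd: Optional[int] = None


# Values copied from the matrix; verify against the vendor licence before use.
ENTRIES = [
    ModelEntry("LTX-2", on_prem=True, free_below_usd=10_000_000),
    ModelEntry("HunyuanVideo 1.5", on_prem=True,
               excluded_regions=frozenset({"EU", "UK", "South Korea"})),
    ModelEntry("Veo 3.1", on_prem=False),
]


def screen(entry, region, annual_revenue_usd, need_on_prem):
    """Return a list of issues to raise with legal or procurement."""
    issues = []
    if region in entry.excluded_regions:
        issues.append(f"licence excludes use in {region}")
    if need_on_prem and not entry.on_prem:
        issues.append("no on-prem deployment")
    threshold = entry.free_below_usd
    if threshold is not None and annual_revenue_usd >= threshold:
        issues.append("above free-licence threshold")
    return issues


def screen_all(region, annual_revenue_usd, need_on_prem):
    """Screen every transcribed row and map model name to its issues."""
    return {entry.name: screen(entry, region, annual_revenue_usd, need_on_prem)
            for entry in ENTRIES}
```

Calling `screen_all("EU", 25_000_000, need_on_prem=True)` flags each of the three rows for a different reason. An empty list only means that none of these three coarse checks fired; it says nothing about output rights or indemnification.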
03 · Context
Tooling around the models.
Beyond the models there is an ecosystem of interfaces and platforms, deliberately kept out of the matrix because they are not a model decision in themselves.
Workflow interfaces
ComfyUI is the de-facto standard for node-based media pipelines (hence the only tool in the matrix). Alternatives such as Invoke or SwarmUI can be easier for some teams, but the model question stays the same.
Inference platforms
fal.ai, Replicate and the like host open models behind an API: a fast start without your own GPUs, but data flows through the platform vendor. They are no answer to on-prem requirements, but often the best option for prototypes.
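For a prototype on a hosted API, the per-second list prices in the matrix translate into a budget range with simple arithmetic. The following is a minimal sketch, assuming billing is strictly per generated second and ignoring failed generations, platform fees and volume discounts. The function name `budget_range` and the clip counts are hypothetical; the price figures are the indicative ones from the matrix.

```python
# Indicative USD per generated second, copied from the matrix as (low, high).
PRICE_PER_SECOND = {
    "Veo 3.1": (0.05, 0.75),
    "Kling 3.0": (0.08, 0.17),
    "Seedance 2.0 (fal, 720p)": (0.24, 0.24),
}


def budget_range(model, clips, seconds_per_clip, takes_per_clip=1):
    """Return the (low, high) USD cost for a batch of clips.

    takes_per_clip models regenerations: every take is assumed
    to be billed in full.
    """
    low, high = PRICE_PER_SECOND[model]
    billed_seconds = clips * seconds_per_clip * takes_per_clip
    return round(low * billed_seconds, 2), round(high * billed_seconds, 2)


# 20 clips of 8 seconds, three takes each = 480 billed seconds.
low, high = budget_range("Veo 3.1", clips=20, seconds_per_clip=8, takes_per_clip=3)
print(f"Veo 3.1: ${low:.2f} to ${high:.2f}")
```

The width of the resulting range is the useful signal here: with a price span as wide as the one listed for Veo 3.1, resolution and quality tier matter more for the budget than the choice of vendor.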
Suite integrations
Adobe, Canva and increasingly DAM systems embed models directly into creative workflows. This is convenient for teams, but the rights and provenance questions from the matrix apply unchanged.
05 · Key terms explained
The most important technical terms from this comparison, explained neutrally.
- Content Credentials: Adobe's implementation of the C2PA standard. Visible provenance labels on images and videos show who created and edited what, and with which tool.
- Copyright Shield: OpenAI's commitment to defend business and API customers against copyright claims arising from generated content and to cover the costs.
- IP indemnity: The vendor indemnifies you against third-party claims if generated content infringes someone else's rights. A key procurement criterion for media AI.
- voice cloning: Replicating a real person's voice from short audio samples. Legally sensitive: consent and EU AI Act transparency duties apply.
- Open-Weights: The model weights are freely available, so the model can run on your own hardware. Not necessarily open source in the strict sense; check the licence in detail.
- indemnity: The vendor assumes liability and defence costs if third-party claims arise from generated content.
- watermark: Watermarks in AI content, visible or invisible (e.g. SynthID). They help with labelling duties; depending on the vendor, they cannot always be switched off.
- SynthID: Google's invisible, robust watermark for AI-generated images, video and audio. Always on for Google models and machine-verifiable.
- AI Act: EU AI regulation that governs transparency and labelling duties for AI-generated content and deepfakes. In practice also relevant for Swiss vendors with EU customers.
- GDPR: EU General Data Protection Regulation. Relevant for Swiss companies alongside the FADP as soon as data of EU persons is processed.
- C2PA: Open industry standard (Adobe, Microsoft, Google et al.) for cryptographically signed provenance data in media files. The technical basis for provenance duties.
- GGUF: Compressed format for quantised AI models. It cuts memory needs substantially and lets large models run on smaller hardware.
- ARR: Annual recurring revenue. Some «free» licences only apply below an ARR threshold.
- self-host: Running the software or model on your own infrastructure instead of the vendor's. Full data control, full operational burden.
- TTS: Text-to-speech synthesis. Enterprise criteria: voice quality, latency, languages, cloning controls.
- US Cloud Act: US law that can oblige American providers to hand data to US authorities, even when servers are located in Europe. Relevant for Swiss data-protection assessments.
- MoE: Mixture of Experts, a model architecture in which only part of the model is active per request. High capability at lower running cost.
- Proprietary: Closed licence. Usage only via the provider, no insight into model or code, no self-hosting.
- Apache 2.0: Permissive open-source licence that allows commercial use and includes a patent-protection clause.
- CN origin: Chinese vendor. Unproblematic when self-hosting the open weights; Chinese data law applies when using the vendor's API.
- Hybrid: Flexible deployment: cloud, your own private cloud, or fully in your own data centre.
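The GGUF entry above notes that quantisation cuts memory needs. The effect can be approximated with back-of-the-envelope arithmetic: memory for the weights is roughly the parameter count times the bits per weight, divided by eight. The sketch below assumes that weights dominate memory and ignores activations, additional encoders and runtime overhead, so the result is a lower bound rather than a hardware recommendation. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# A 20B-parameter model, the size listed for Qwen-Image-2512 in the matrix.
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{weight_memory_gb(20, bits):.0f} GB")
```

For the 20B example this gives about 40 GB at 16 bits, 20 GB at 8 bits and 10 GB at 4 bits. The matrix does not state which quantisation level its "~24 GB" figure refers to, so treat the comparison as orientation only.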