Google unveils Gemini Omni Flash, a multimodal video generator with 10‑second clips, avatars and default SynthID watermark, rolling out to AI Plus subscribers
Google introduced Gemini Omni Flash, the first model in DeepMind’s new Omni family, at the I/O 2026 developer conference on Tuesday. It is available immediately to AI Plus, Pro and Ultra subscribers in the Gemini app and Google Flow, and free of charge in YouTube Shorts and the YouTube Create app [1]. The model can generate and edit video from any mix of image, audio, video and text inputs, with edits handled conversationally so that characters, physics and scene context persist across turns [1].
Omni’s chief architect, Koray Kavukcuoglu, framed the launch as a step toward “creating anything from any input,” emphasizing three claims: a better grasp of physical forces such as gravity and fluid dynamics; deeper world knowledge that links language, imagery and meaning; and a conversational editing layer that avoids the drift seen in earlier video models [1]. Demo prompts ranged from clay‑mation protein‑folding explainers to chain‑reaction physics tracks, showcasing the model’s ability to blend multimodal cues into coherent short clips.
The rollout includes a digital‑avatar feature that lets users record their voice and likeness to generate videos that look and sound like them, though Google says the voice‑editing capability is still being tested for responsible use [1]. All output carries Google’s SynthID watermark by default, enabling verification through the Gemini app, Chrome or Search, and aligning with the C2PA standard adopted earlier by OpenAI [1].
The Flash tier caps clips at 10 seconds, a deployment choice rather than a technical limit, which leaves it behind OpenAI’s Sora, which permits up to 60 seconds [1]. Google has not disclosed pricing, compute costs, or benchmark scores against competing models such as ByteDance’s Seedance, leaving analysts to wonder whether Omni represents a new product category or a tighter integration of existing frontier‑video capabilities [1].
The next milestone is the API release for developers and enterprise customers, expected in the coming weeks, when cost structures and longer‑clip limits should become clearer. Until then, the industry will watch how Omni’s multimodal, conversational workflow reshapes video creation and whether its watermark and avatar safeguards address growing concerns over deep‑fake misuse.
AI-assisted synthesis by the TrendWatcher Editorial Desk · sourced from 3 outlets · Jun 15, 2026