Meet Gemma 4, our most intelligent open models to date. Purpose-built for advanced reasoning and agentic workflows, Gemma 4 delivers an unprecedented level of intelligence-per-parameter.
In this video, Olivier Lacombe (Group Product Manager, Google DeepMind) introduces the Gemma 4 model family. We also showcase multiple demos of Gemma 4 running on a wide variety of hardware, from mobile phones to laptops.
Resources:
Announcing Gemma 4 → https://goo.gle/gemma-4
Gemma 4 documentation → https://g
Hi, my name is Olivier, and I'm a group product manager on the Gemma team. Since we launched our first models, the developer community has absolutely blown us away: over 400 million downloads and more than 100,000 variants. You've built a vibrant ecosystem around Gemma, and we could not be more grateful. We've listened very closely to what you wanted next, and today we are thrilled
to announce Gemma 4. Built from the same world-class research and technology behind Gemini 3, Gemma 4 is our family of open models, designed to run directly on the hardware you own: phones, laptops, and desktops. For the first time ever, we are releasing Gemma under an open-source Apache 2.0 license. Gemma 4 is built for the agentic era. It can handle complex logic, multi-step planning, and agentic workflows, making efficient use of tokens for its level of intelligence. The bigger model performs well with a context window of up to a quarter million tokens, allowing you to analyze entire code bases or run long multi-turn agentic sessions. It features native support for tool use, allowing you to build agents
that plan and act on your behalf. Let's break down the model family. First, a 26B Mixture-of-Experts model and a 31B dense model. These provide frontier intelligence directly on your personal computer: you can run state-of-the-art local reasoning and coding pipelines without needing to upload data outside an environment you control. The 26B MoE, with 3.8B activated parameters, is exceptionally fast, while the 31B dense model is optimized for output quality. Then, we have our effective 2B
and effective 4B models. Engineered for maximum memory efficiency, these models bring a whole new level of intelligence to mobile and IoT devices. With combined audio and vision support for real-time processing, they can see and hear the world, all while natively supporting over 140 languages. Now, let's test our effective 2B on multilingual and agentic tasks.
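A quick aside on what "running on the hardware you own" means in practice: weight storage is roughly parameter count times bytes per parameter. A rough sketch using the model sizes named above (a simplification: quantization overhead, KV cache, and activation memory are ignored, so treat these as lower bounds):

```python
def weight_gib(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight storage in GiB: parameters * bytes per parameter."""
    return params_billion * 1e9 * bytes_per_param / 2**30

# bfloat16 stores 2 bytes per parameter; 4-bit quantization stores ~0.5.
for name, params in [("26B MoE", 26), ("31B dense", 31), ("4B", 4), ("2B", 2)]:
    print(f"{name}: ~{weight_gib(params, 2):.1f} GiB bf16, "
          f"~{weight_gib(params, 0.5):.1f} GiB 4-bit")
```

Note that the MoE's 3.8B activated parameters reduce compute per token, not storage: all 26B parameters still need to be resident in memory.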
Hé Gemma, est-ce que tu peux trouver un restaurant français à San Francisco? [Hey Gemma, can you find a French restaurant in San Francisco?] Please reply to me in English. Amazing, we have a winner. As open models become more central to enterprise infrastructure, security is paramount.
Developed by Google DeepMind, Gemma 4 undergoes the same rigorous security protocols as our proprietary models, giving enterprises and developers a trusted foundation to build on. We want you to be able to use Gemma 4 with the tools you already know and love. You can download the weights
and start experimenting today. We cannot wait to see what you create next.
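If you want a starting point for experimenting with tool use, here is a minimal sketch of the plan-act loop that agentic models drive: the model emits a structured tool call, the runtime executes it, and the result goes back to the model. The JSON call format, the `find_restaurant` tool, and the stub standing in for a real Gemma 4 call are all illustrative assumptions, not Gemma's actual API:

```python
import json

# Hypothetical tool registry; in a real agent these would be live APIs.
TOOLS = {
    "find_restaurant": lambda cuisine, city: f"Chez Demo ({cuisine}, {city})",
}

def model_stub(prompt: str) -> str:
    """Stand-in for a real Gemma 4 call that returns a structured tool call."""
    return json.dumps({
        "tool": "find_restaurant",
        "args": {"cuisine": "French", "city": "San Francisco"},
    })

def run_agent_step(prompt: str) -> str:
    """One plan-act cycle: ask the model which tool to use, then execute it."""
    call = json.loads(model_stub(prompt))
    return TOOLS[call["tool"]](**call["args"])

print(run_agent_step("Find a French restaurant in San Francisco"))
```

In a full agent, the tool result would be appended to the conversation and the loop repeated until the model emits a final answer instead of another tool call.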