Google’s new Gemma 3 AI models are fast, frugal, and ready for phones

Google’s AI efforts are synonymous with Gemini, which has become an integral part of its most popular products, from the Workspace software suite to its hardware. However, the company has also been releasing open-source AI models under the Gemma label for over a year now.

Today, Google revealed its third-generation open-source AI models with some impressive claims in tow. The Gemma 3 models come in four variants — 1 billion, 4 billion, 12 billion, and 27 billion parameters — and are designed to run on devices ranging from smartphones to beefy workstations.

Ready for mobile devices

Google says Gemma 3 is the world’s best single-accelerator model, which means it can run on a single GPU or TPU instead of requiring a whole cluster. Theoretically, that means a Gemma 3 model could run natively on the Tensor Processing Unit (TPU) inside Pixel smartphones, just the way the Gemini Nano model already runs locally on those phones.

The biggest advantage of Gemma 3 over the Gemini family of AI models is that, since it’s open-source, developers can package and ship it inside mobile apps and desktop software according to their unique requirements. Another crucial benefit is language coverage: Gemma 3 offers out-of-the-box support for more than 35 languages and pre-trained support for over 140.
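
To give a rough sense of what that packaging can look like in practice, here is a minimal sketch of loading a small Gemma 3 checkpoint with the Hugging Face transformers library. The model ID and generation settings are illustrative assumptions, not an official recipe; check the model card on Hugging Face for exact identifiers and license terms.

```python
# Minimal sketch: running a small Gemma 3 checkpoint locally with Hugging Face transformers.
# The model ID "google/gemma-3-1b-it" and the settings below are assumptions for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-3-1b-it"  # assumed 1B instruction-tuned variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Summarize why small language models matter for phones."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```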

What’s new in Gemma 3?

And just like the latest Gemini 2.0 series models, Gemma 3 is capable of understanding text, images, and video. In a nutshell, it is multimodal. On the performance side, Gemma 3 is claimed to surpass other popular AI models such as DeepSeek V3, the reasoning-ready OpenAI o3-mini, and Meta’s Llama-405B variant.

Versatile, and ready to deploy

As for input range, Gemma 3 offers a context window of 128,000 tokens. That’s enough to cover a full 200-page book pushed as input. For comparison, the context window of Google’s Gemini 2.0 Flash Lite model stands at a million tokens. In the context of AI models, an average English word is roughly equivalent to 1.3 tokens.
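
To make that book-length claim concrete, here is a quick back-of-the-envelope calculation using the 1.3 tokens-per-word figure above and an assumed 500 words per printed page:

```python
# Back-of-the-envelope check of the 128K-token context claim,
# using the ~1.3 tokens-per-word rule of thumb cited above and
# an assumed ~500 words per printed page.
context_tokens = 128_000
tokens_per_word = 1.3
words_per_page = 500  # assumption for a typical book page

words = context_tokens / tokens_per_word  # ~98,000 words
pages = words / words_per_page            # ~197 pages
print(f"~{words:,.0f} words, roughly {pages:.0f} pages")
```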

Gemma 3 processing visual input. Google

Gemma 3 also supports function calling and structured output, which essentially means it can interact with external tools and datasets and perform tasks like an automated agent. The nearest analogy would be Gemini and how it can seamlessly get work done across platforms such as Gmail or Docs.
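
To illustrate the idea of function calling in general terms, the sketch below shows an app advertising a tool schema, the model replying with a structured call, and the app executing it. The JSON shapes here are assumptions for illustration; the exact format depends on the serving stack you use (for example Vertex AI, Ollama, or a transformers chat template).

```python
# Illustrative sketch of function calling with structured output.
# The schema and reply formats are assumptions, not Gemma 3's wire format.
import json

weather_tool = {
    "name": "get_weather",
    "description": "Look up current weather for a city",
    "parameters": {"city": {"type": "string"}},
}

# Suppose the model, given the tool schema and the prompt
# "What's the weather in Oslo?", returns this structured output:
model_reply = '{"tool": "get_weather", "arguments": {"city": "Oslo"}}'

call = json.loads(model_reply)
if call["tool"] == "get_weather":
    # The app, not the model, actually performs the lookup.
    print(f"Fetching weather for {call['arguments']['city']}...")
```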

The latest open-source AI models from Google can be deployed either locally or through the company’s cloud-based platforms, such as the Vertex AI suite. Gemma 3 models are now available via Google AI Studio, as well as third-party repositories such as Hugging Face, Ollama, and Kaggle.
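
For local deployment specifically, a minimal sketch using the Ollama Python client might look like the following. The model tag is an assumption, so verify the exact name and sizes in the Ollama model library before pulling.

```python
# Minimal sketch of chatting with a locally pulled Gemma 3 model via Ollama.
# Assumes the Ollama daemon is running and a tag such as "gemma3:4b" has
# already been pulled (e.g. `ollama pull gemma3:4b`); the tag is an assumption.
import ollama

response = ollama.chat(
    model="gemma3:4b",  # assumed tag for the 4B variant
    messages=[{"role": "user", "content": "Give me three facts about TPUs."}],
)
print(response["message"]["content"])
```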

Gemma 3 is part of an industry trend in which companies work on large language models (Gemini, in Google’s case) while simultaneously pushing out small language models (SLMs). Microsoft follows a similar strategy with its open-source Phi series of small language models.

Small language models such as Gemma and Phi are extremely resource-efficient, which makes them an ideal choice for running on devices such as smartphones. Moreover, because they offer lower latency, they are particularly well-suited for mobile applications.
