The 'Gemini Robotics On-Device' AI, which runs directly inside robots without an internet connection, has been released, signaling the arrival of faster and more agile robots.
Imagine a robot that must carry out an urgent rescue operation in a factory where the power is out and the internet is down, or in a deep underground facility beyond the reach of any communication signal. Until now, most robots kept their 'brain', the artificial intelligence (AI), in distant, massive computers (the cloud). If the internet connection was cut, they went 'dead' and could do nothing. It was like having your head in Seoul and your body in Busan, with the phone line cut.
But now, an era is opening in which robots can see, judge, and move on their own without the 'lifeline' of the internet. This is thanks to 'Gemini Robotics On-Device', a new AI model announced by Google DeepMind.
Why Is This Important?
Have you ever noticed a delay when calling up the assistant AI on your smartphone? That happens because your voice has to travel over the internet to a distant server and return with an answer. In technical terms, this is called latency.
While a one- or two-second delay is no big deal in casual conversation, even a one-second delay for a robot moving heavy objects or performing precision assembly could lead to a serious accident. 'Gemini Robotics On-Device' avoids this by running the AI directly on a GPU inside the robot's body.
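To make the difference concrete, here is a minimal sketch comparing how many perception-to-action decisions per second each setup allows. The latency figures are illustrative assumptions, not published numbers: a cloud round trip is dominated by network time, while on-device inference is not.

```python
# Hypothetical latency figures for illustration only.
CLOUD_ROUND_TRIP_S = 0.8   # assumed network + server time per decision
ON_DEVICE_INFER_S = 0.05   # assumed local GPU inference time per decision

def decisions_per_second(latency_s: float) -> float:
    """How many perception -> action decisions fit into one second."""
    return 1.0 / latency_s

print(f"cloud:     {decisions_per_second(CLOUD_ROUND_TRIP_S):.2f} decisions/s")
print(f"on-device: {decisions_per_second(ON_DEVICE_INFER_S):.2f} decisions/s")
```

Under these assumed numbers, the local robot can re-evaluate its situation over ten times more often, which is what makes instantaneous reactions possible.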
To use an analogy, if previous robots were like children who had to call and ask, "Mom, where do I put this?" every time, they have now become 'independent adults' who can judge for themselves. This allows robots to keep operating even where the internet connection is unstable or non-existent. Above all, they can react instantaneously, enabling much more agile and safe movements.
Easy to Understand: The Robot’s ‘Eyes, Mouth, and Hands’ Become One
There is a core concept you must know to understand this technology: the VLA (Vision-Language-Action) model.
Simply put, it is like a system where an experienced chef’s ‘eyes,’ ‘brain,’ and ‘hands’ are perfectly connected as one.
- Vision: The robot recognizes the materials and tools in front of it in real-time through its eyes (camera).
- Language: It perfectly understands natural human commands like “Peel the apple and put it on a plate.”
- Action: It immediately performs precise movements, such as moving its arm to pick up the apple and use a knife according to the command.
Previously, these processes either ran separately or required help from the cloud, but Gemini Robotics On-Device handles all of them at once inside the robot. This lets robots exhibit human-like 'dexterity' (the ability to handle objects delicately) and adapt quickly even to tasks they encounter for the first time.
It’s the same principle as us moving our hands immediately based on the knowledge in our heads, rather than calling our parents every time to ask, “How do I peel an apple?”
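The 'eyes, brain, and hands in one' idea above can be sketched as a single control loop. Everything here is hypothetical (the class and function names are invented for illustration, and a real VLA model runs neural inference rather than a stub); the point is simply that one local model maps an image plus an instruction directly to a motor action, with no network call in the loop.

```python
# Minimal sketch of a VLA (Vision-Language-Action) control loop.
# All names (VLAModel, control_loop, etc.) are hypothetical.
from dataclasses import dataclass

@dataclass
class Action:
    joint_deltas: list[float]  # small motor adjustments for this tick

class VLAModel:
    """Stand-in for an on-device VLA model: image + text in, action out."""
    def predict(self, image, instruction: str) -> Action:
        # A real model performs neural inference here; we return a dummy action.
        return Action(joint_deltas=[0.0] * 7)

def control_loop(model, instruction, camera, motors, steps):
    for _ in range(steps):
        frame = camera()                            # Vision: read the camera
        action = model.predict(frame, instruction)  # Language + reasoning: one model call
        motors(action)                              # Action: apply motor commands

# Usage with stub hardware: a fake camera and a log standing in for motors.
log = []
control_loop(VLAModel(), "Peel the apple and put it on a plate.",
             camera=lambda: "frame", motors=log.append, steps=3)
print(len(log))  # one action issued per control tick
```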
Current Status: A Lightweight but Powerful Robotic Brain
Gemini Robotics On-Device was built based on Google's 'Gemma' model. Gemma is an AI model designed to run lightly and quickly within a device, and this robotics version optimizes it for robot control.
The main features of this model can be summarized as follows:
- Operates Without Internet: Runs fully 'cloud-free', with no cloud connection required at all.
- Optimized for Two-Armed Robots: Specialized for 'bi-arm' robots, those with two arms like humans, so both hands can cooperate on complex tasks.
- Versatility: Designed to adapt flexibly to various types of robots and environments, not just robots from a single manufacturer.
- Handles Complex Commands: Processes multi-step commands like "Pick this up, put it in that box, and close the lid" far better than existing on-device models, and also outperforms them on more challenging out-of-distribution tasks.
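One way to picture what "handling a multi-step command" means is to decompose a compound instruction into an ordered plan of sub-steps. The sketch below is purely illustrative: a real VLA model does this implicitly inside the network rather than by splitting text, and the `plan` function here is an invented toy.

```python
# Hypothetical sketch: naively split a compound command into ordered sub-steps.
# A real VLA model infers this structure internally; this just makes the idea concrete.

def plan(instruction: str) -> list[str]:
    """Split a comma-joined compound command into an ordered list of steps."""
    parts = instruction.replace(", and ", ", ").split(", ")
    return [p.strip().rstrip(".") for p in parts if p.strip()]

steps = plan("Pick this up, put it in that box, and close the lid.")
print(steps)  # ['Pick this up', 'put it in that box', 'close the lid']
```

The robot must then carry out each step in order, while watching with its camera to confirm the previous step actually succeeded.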
Currently, this model has been released only to a small number of trusted Google partners and testers, and its performance is being meticulously verified in real-world settings.
What Lies Ahead?
Experts believe this announcement will be a 'game changer' (a significant event that changes the outcome or flow) for the robotics industry. It can simultaneously address the high maintenance costs, communication security concerns, and frustratingly slow reaction speeds that have made companies hesitant to adopt robots.
In the not-too-distant future, we will more frequently see smart robots serving in restaurants that immediately react to a customer's sudden movement to avoid spilling food, or robots silently organizing inventory in the corners of massive warehouses where internet signals don't reach.
This attempt by Google DeepMind will be an important step for AI to move beyond being just text or images on a screen and be reborn as a true ‘companion’ that moves safely and agilely in the same physical space as us. The day when robots are no longer just ‘machines’ but ‘intelligent assistants’ that understand our words and act wisely seems not far away.
References
- Gemini Robotics On-Device brings AI to local robotic devices - AIPulse Lab
- DeepMind’s Gemini Robotics On-Device brings advanced AI to local robots
- Google rolls out new Gemini model that can run on robots locally
- Gemini Robotics On-Device Model Card (PDF)
- Gemini Robotics On-Device: Google Brings AI to Local Robots - Insight Tech Talk
- Google Introduces Gemini Robotics On-Device AI Model, Can Adapt to Different Types of Robots - Google News
- Gemini Robotics On-Device also outperforms other on-device alternatives… - Yalla Development
- Google announces ‘Gemini Robotics On-Device’… - GIGAZINE
- Gemini Robotics On-Device: Robotics AI Autonomy to the… - KingyAI
- Google Launches Gemini Robotics On-Device AI: Robots Go Offline, Stay Smart - Google News