邊說邊畫？Google Gemini 2.0 Flash「原生圖像生成」實驗體驗記

AI Summary

Google Gemini 2.0 Flash 公開了無需額外工具、直接在對話視窗中繪圖與修改的「原生圖像生成」功能，宣告 AI 已正式進入真正的多模態時代。

想像一下，當你正在給孩子講睡前故事時，故事書裡的插圖會隨著你的聲音即時變化。當你說「主角戴上了紅帽子」，畫中孩子的頭上就會出現一頂紅帽子；當你說「突然下起雨來」，背景就會畫出絲絲雨線。

這聽起來是不是很像電影裡的場景？以前需要高度繪圖技術才能實現的魔法，現在已經來到我們身邊。這是因為 Google 在其最新的 AI 模型 Gemini 2.0 Flash 中實驗性地導入了「原生圖像生成與編輯」功能 [Experiment with Gemini 2.0 Flash native image generation - Google Developers Blog]。

為什麼這很重要？

到目前為止，AI 繪圖的方式就像是「翻譯官」和「畫家」坐在不同的房間裡溝通。當我們輸入指令時，理解文本的 AI 會先進行解讀，然後傳一張紙條給隔壁房間的圖像專用 AI 說「請畫出這樣的畫」。在這個過程中，資訊可能會失真，而且最重要的是，很難進行即時對話和修改。

但這次介紹的 原生圖像生成 (Native image generation，指 AI 模型無需額外工具即可直接自行生成圖像的方式) 則是完全不同層次的故事。Gemini 2.0 Flash 在一個巨大的「大腦」中，從一開始就將閱讀寫作能力與理解繪畫能力合而為一 [Gemini 2.5 Flash]。

簡單來說，翻譯官和畫家合體了。為什麼這具有決定性的重要性？答案就是 「脈絡 (Context)」。由於文本和圖像來自同一個大腦，AI 能更精確地將我們說話的微妙語氣反映在畫面中。此外，無需中斷對話，即可實現如「把剛才那張畫裡的雲再畫得蓬鬆一點」之類的即時回饋 [ExploreGemini2.0FlashNativeImageGenerationExperiment]。

輕鬆理解：「一句話就能改畫的時代」

這次更新最令人驚豔的地方是 對話式圖像編輯 (Conversational image editing) 功能 [You can now test Gemini 2.0 Flash’s native image outputGoogle Outpaces OpenAI with Native Image Generation in Gemini …]。

打個比方，如果說之前的圖像 AI 像是向自動販賣機投幣後等待結果，現在則更像是對坐在我身邊的老練設計師開口請求。

例如，一位開發者在生成一個角色圖像後，想讓該角色手裡拿著一杯熱巧克力 [[Experiment with Gemini 2.0 Flash native image generation

Hacker News](https://news.ycombinator.com/item?id=43344685)]。在以前，可能需要重新輸入「拿著巧克力的角色」這種長指令並從頭開始繪製，但現在只需隨口說一句 「在剛才那個角色的手裡放一杯熱可可」 即可。

AI 教育專家 Paul Couvert 對此盛讚道：「現在僅憑自然對話就能基本編輯任何圖像」 [You can now test Gemini 2.0 Flash’s native image outputGoogle Outpaces OpenAI with Native Image Generation in Gemini …]。即使不懂複雜的專業術語或工具使用方法，我們也能像和朋友聊天一樣輕鬆完成設計，這樣的時代已經開啟。

執著的故事大王：一致性的敘事能力

製作童話書時最令人頭痛的時刻是什麼？就是第 1 頁主角的臉和第 2 頁的臉看起來有微妙的不同。然而，Gemini 2.0 Flash 在維持 角色與設定一致性 方面表現卓越。

即使連續生成多張圖像，也能保持主角長相或背景基調的一致 [You can now test Gemini 2.0 Flash’s native image outputGoogle Outpaces OpenAI with Native Image Generation in Gemini …]。這預示著 AI 不僅僅是一個能畫出漂亮圖片的工具，更能成為真正意義上的「視覺故事大王」。

現狀：每個人都能直接使用嗎？

目前該功能處於 實驗階段 (Experimental)，主要先向開發者和企業公開。不過您不必失望，一般使用者也有非常簡單的方法可以體驗這項未來技術：

訪問 Google AI Studio 網站 [How to Use Gemini 2.0 Flash for Image Generation? - Latenode Blog]。
使用 Google 帳號登入後，在右側的模型選擇選單中點擊 「Gemini 2.0 Flash Experimental」 版本 [How to Use Gemini 2.0 Flash for Image Generation? - Latenode Blog]。

目前該功能免費提供，無需額外費用，任何人都可以盡情發揮創意 [[I Tried OutGemini’s NewNativeImageGen Feature, and…

Beebom](https://beebom.com/tried-out-gemini-native-image-gen-feature-and-its-amazing/)]。

專家們有時將 Gemini 2.0 Flash 稱為 「工作馬 (Workhorse，默默耕耘的勞動者)」 AI [Gemini 2.0 Flash: Unleashing Native Image Generation - A Tech …]。這是因為在華麗的外表下，強大的實務處理能力和飛快的速度才是這個模型的真面目。

未來會如何發展？

Google 的目光已經投向了更遙遠的未來。人們對能處理更龐大數據並執行複雜編碼或視覺化任務的 Gemini 3 Flash 模型充滿期待 [Gemini3Flash— Google DeepMind]，而能像人類一樣即時觀看、聆聽並對話的 Gemini 3.1 Flash Live Preview 模型也在準備中 [[Gemini3.1FlashLive Preview

GeminiAPI

Google AI for Developers](https://ai.google.dev/gemini-api/docs/models/gemini-3.1-flash-live-preview)]。

最終，我們將迎來一個可以透過與 AI 對話即時設計遊戲場景，或僅憑一句話就改變應用程式介面的世界。現在，技術問題已不再是「如何操作」，而是轉向「我想想像並表達什麼」。

MindTickleBytes AI 記者的觀點

如果說之前的圖像 AI 只是單向地向我們扔出華麗「成果」的工具，那麼這次 Gemini 的更新則清晰地展示了它將如何與我們「協作」。既然身邊隨時都有一位能精準理解你意圖的畫家，那麼現在我們需要的可能不再是高深的「提示詞 (Prompt)」，而是像孩子般豐富的想像力。

參考資料

Experiment with Gemini 2.0 Flash native image generation - Google Developers Blog
Gemini 2.5 Flash
[Experiment with Gemini 2.0 Flash native image generation Hacker News](https://news.ycombinator.com/item?id=43344685)
Gemini 2.0 Flash Experimental For Incredible Native Image Generation & Editing via AI Studio & API - YouTube
How to Use Gemini 2.0 Flash for Image Generation? - Latenode Blog
Gemini3Flash— Google DeepMind
Google:Gemini2.0FlashExperimentalFree Chat Online - Skywork ai

[I Tried OutGemini’s NewNativeImageGen Feature, and…

Beebom](https://beebom.com/tried-out-gemini-native-image-gen-feature-and-its-amazing/)

ExploreGemini2.0FlashNativeImageGenerationExperiment
ExperimentwithGemini2.0Flashnativeimagegeneration

[Gemini3.1FlashLive Preview

GeminiAPI

Google AI for Developers](https://ai.google.dev/gemini-api/docs/models/gemini-3.1-flash-live-preview)

You can now test Gemini 2.0 Flash’s native image outputGoogle Outpaces OpenAI with Native Image Generation in Gemini …
Gemini 2.0 Flash: Unleashing Native Image Generation - A Tech …

查核摘要

查核聲明數：12
已驗證聲明數：11
結論：通過 (PASS)

Share this article:

測試你的理解

Q1. Gemini 2.0 Flash 的「原生」圖像生成與傳統方式有何不同？

不調用額外的圖像專用 AI，由單一模型同時處理文本與圖像。
無需網路連接，僅在智慧型手機內部運作的方式。
僅限付費使用者使用的獨占功能。

「原生 (Native)」方式意味著在同一個模型中同時完成文本理解與圖像生成。

Q2. 文章中介紹的「對話式圖像編輯」有什麼特點？

必須學習複雜的 Photoshop 技術才能實現。
可以透過自然對話來修改圖像的特定部分。
每次重新生成圖像時都會出現完全不同的畫面。

Paul Couvert 評價道：「基本上可以透過自然對話編輯任何圖像。」

Q3. 目前可以在哪裡免費測試 Gemini 2.0 Flash 的圖像生成功能？

Google 搜尋框
Google AI Studio
Android Play 商店

Google 已向開發者公開此實驗性功能，可透過 Google AI Studio 免費體驗。