Quick Take · 3 min read

3 Observations from Google's Nano Banana Launch

Google's Nano Banana reveals smart marketing tactics, shifts in image editing quality, and implications for the AI arms race.

Yesterday, 26 August, Google officially revealed the identity of the mysterious Nano Banana model: it is Gemini 2.5 Flash Image. Here are three quick takeaways.

A new marketing strategy.

Instead of announcing Gemini 2.5 Flash Image under its own name as usual, Google first opened up a mysterious “nano-banana” model for users to try for free. The model was so good, and the weird name and the secrecy around it sparked so much curiosity, that it spread like wildfire. People kept asking what it really was, and how a “nano” model could be that good. The truth is that it is not so nano; Gemini Flash is considered mid-size. But that is part of marketing done well.

A new age of high-fidelity image and video editing.

Before Nano Banana, we had plenty of image editing tools, but they all shared a problem: they would subtly alter the human face or the background, which defeats the purpose. With Nano Banana, for the first time, every detail stays the same; people cannot spot the difference. That opens up a host of new possibilities, from swapping faces or objects to adding more people into a photo or changing the camera angle. From now on, it seems, the only limit on photo editing is our imagination.
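To give a sense of how simple such an edit is in practice, here is a minimal sketch assuming the google-genai Python SDK and the launch-time preview model name "gemini-2.5-flash-image-preview"; the file names and the prompt are placeholders:

```python
# Minimal image-edit sketch with Gemini 2.5 Flash Image (a.k.a. Nano Banana).
# Assumes: `pip install google-genai pillow` and GEMINI_API_KEY in the environment.
from io import BytesIO

from google import genai
from PIL import Image

client = genai.Client()  # reads GEMINI_API_KEY from the environment

source = Image.open("portrait.jpg")  # placeholder input photo
response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",  # preview name at launch; check the docs
    contents=[
        "Change the background to a beach at sunset; keep the face unchanged.",
        source,
    ],
)

# Edited images come back as inline bytes among the response parts.
for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        Image.open(BytesIO(part.inline_data.data)).save("edited.png")
```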

Now, combine Nano Banana with tools like Veo 3 or Kling 2.1 and you can swap your face into any video out there, and it looks real. That unlocks capabilities for video that were simply not possible before.

The AI arms race

OpenAI has been one of the front-runners in AI image generation, but with Nano Banana it has fallen behind Google. Compared with GPT's native image generation, Nano Banana is better, significantly faster, and roughly 4x cheaper ($0.039 vs. $0.17 per image).
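For a back-of-envelope sense of what that gap means at scale, here is a quick calculation using the per-image prices quoted above (an illustrative assumption; real billing is token-based and varies with image size and quality):

```python
# Rough cost comparison at the quoted per-image prices (illustrative only).
NANO_BANANA = 0.039  # USD per image, Gemini 2.5 Flash Image
GPT_IMAGE = 0.17     # USD per image, GPT native image generation

images = 10_000
print(f"Nano Banana: ${NANO_BANANA * images:,.0f}")    # $390
print(f"GPT image:   ${GPT_IMAGE * images:,.0f}")      # $1,700
print(f"Price ratio: {GPT_IMAGE / NANO_BANANA:.1f}x")  # ~4.4x
```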

I have predicted before that Google is likely to win the AI race within one or two years. TPUs give Google a cost edge, while everyone else pays the Nvidia markup, so competing on price is hard for OpenAI. Fighting on text, image, and video at once will spread them thin. They should consider giving up the image and video fronts and focusing on text generation, where they are among the best. Anthropic did the same: they narrowed down even further, to the code generation market, and are now considered the de facto leader in that niche. Narrowing down and focusing may be the best way forward for OpenAI.
