Google Announces Advanced Gemini 2.0 Models

AIBoom

10 months ago

Google released Gemini 2.0 on Wednesday, calling it its “most capable” AI model suite to date. Now, it’s available to everyone.

In December, the company provided access to developers and trusted testers while integrating some features into its products. However, this marks a full public release.

The suite includes 2.0 Flash, designed for high-volume, high-frequency tasks at scale. Additionally, it features 2.0 Pro Experimental, optimized for coding performance, and 2.0 Flash-Lite, which Google describes as its “most cost-efficient model yet.”

Gemini Flash costs developers 10 cents per million tokens for text, image, and video inputs. Flash-Lite, the more cost-effective option, costs just 0.75 cents for the same. Tokens represent individual units of data the model processes.

Google continues to push AI development as competition intensifies among tech giants and startups. Its aggressive investment in AI agents reflects a broader strategy to lead the evolving market.

Meta, Amazon, Microsoft, OpenAI, and Anthropic are also advancing agentic AI. These models handle complex, multistep tasks independently, reducing the need for step-by-step user guidance.

Advancing AI Agents

Google has been investing in more agentic AI models, designed to understand the world, think ahead, and take action with user supervision. In a December blog post, the company highlighted Gemini 2.0’s advancements in multimodality, including native image and audio output and built-in tool use. Google believes these improvements bring AI closer to its vision of a universal assistant.

Anthropic, backed by Amazon and founded by former OpenAI researchers, is a key competitor in this space. In October, the company announced that its AI agents could operate computers like humans to complete complex tasks.

According to Anthropic, its technology can interpret screens, select buttons, enter text, navigate websites, and execute tasks using real-time internet browsing. “The tool can use computers in basically the same way we do,” said Jared Kaplan, Anthropic’s chief science officer, in an interview with CNBC. He noted that it can handle tasks requiring “tens or even hundreds of steps.”

OpenAI released a similar feature recently called Operator. It will automate tasks such as planning vacations, filling out forms, making restaurant reservations and ordering groceries. The Microsoft-backed startup described Operator as “an agent that can go to the web to perform tasks for you.”

Earlier this week, OpenAI introduced Deep Research, which allows an AI agent to compile complex research reports and analyze questions and topics of the user’s choice. Google in December launched a similar tool of the same name — Deep Research — which acts as a “research assistant, exploring complex topics and compiling reports on your behalf.”