Gemini 1.5: Google’s next-generation AI model is almost ready

Barely two months after launching Gemini, the large language model Google hopes will carry it to the top of the AI industry, the company is already announcing its successor. Google is launching Gemini 1.5 today and making it available to developers and enterprise users, ahead of a full consumer rollout soon. The company has made clear that it is all in on Gemini as a business tool, a personal assistant, and everything in between, and it is pushing hard on that plan.

There are a lot of improvements in Gemini 1.5: Gemini 1.5 Pro, the general-purpose model in Google's lineup, is apparently on par with the high-end Gemini Ultra that the company only recently launched, and it beats Gemini 1.0 Pro on 87 percent of benchmark tests. It was built using an increasingly common technique known as "Mixture of Experts," or MoE, which means the model runs only the relevant portion of its network to process a query, rather than the whole thing every time. (Here's a good explanation of the subject.) That approach should make the model both faster for users and more efficient for Google to run.
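
For a sense of how that works, here is a minimal sketch of top-k expert routing in Python with NumPy. It is purely illustrative: Google has not published Gemini 1.5's architecture, and every dimension, weight, and the choice of two active experts below is an assumption.

```python
import numpy as np

# Illustrative Mixture-of-Experts routing sketch. All sizes and weights
# here are made up; Gemini 1.5's actual architecture is not public.
rng = np.random.default_rng(0)

D_MODEL, D_HIDDEN, N_EXPERTS, TOP_K = 64, 256, 8, 2

# Hypothetical experts: N_EXPERTS independent two-layer feed-forward nets.
W1 = rng.normal(0, 0.02, (N_EXPERTS, D_MODEL, D_HIDDEN))
W2 = rng.normal(0, 0.02, (N_EXPERTS, D_HIDDEN, D_MODEL))
W_router = rng.normal(0, 0.02, (D_MODEL, N_EXPERTS))

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token vector x through only its top-k experts."""
    logits = x @ W_router                  # score every expert
    top = np.argsort(logits)[-TOP_K:]      # keep just the k best
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the chosen k
    out = np.zeros_like(x)
    for w, e in zip(weights, top):
        hidden = np.maximum(x @ W1[e], 0)  # expert e's MLP with ReLU
        out += w * (hidden @ W2[e])        # weighted sum of expert outputs
    return out

token = rng.normal(size=D_MODEL)
print(moe_layer(token).shape)  # (64,) -- only 2 of the 8 experts ran
```

The savings come from that final loop: only TOP_K of the N_EXPERTS networks do any work for a given token, so compute per query scales with k rather than with the full model.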

But there’s one new thing in Gemini 1.5 that has the entire company, starting with CEO Sundar Pichai, particularly excited: Gemini 1.5 has an enormous context window, which means it can handle much larger queries and look at much more information at once. That window is a whopping 1 million tokens, compared to 128,000 for OpenAI’s GPT-4 and 32,000 for the current Gemini Pro. Tokens are a tricky metric to understand (here’s a good overview), so Pichai puts it more simply: “It’s about 10 or 11 hours of video, tens of thousands of lines of code.” The context window means you can ask the AI bot about all of that content at once.
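
To get a feel for why tokens are such a slippery unit, here's a quick sketch using OpenAI's open-source tiktoken tokenizer. Gemini uses its own tokenizer, which Google hasn't released, so the counts below won't match Gemini's exactly; the point is only the relationship between words and tokens.

```python
import tiktoken  # pip install tiktoken

# cl100k_base is the encoding used by OpenAI's GPT-4 models; it stands
# in here purely to illustrate how text maps to tokens.
enc = tiktoken.get_encoding("cl100k_base")

text = "Tokens are a tricky metric to understand."
tokens = enc.encode(text)
print(len(text.split()), "words ->", len(tokens), "tokens")
# Common words map to a single token; rarer words split into several.
```

As a rough rule of thumb, a token works out to about three-quarters of an English word, which is how a million of them stretches to cover hours of video transcripts or tens of thousands of lines of code.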

(Pichai also says that Google researchers are testing a 10 million token context window – that’s about the entire Game of Thrones series at once.)

While explaining this to me, Pichai casually notes that you could fit the entire Lord of the Rings trilogy in that context window. This seems too specific to me, so I ask him: this has already happened, hasn’t it? Someone at Google is already using Gemini to check for continuity errors, trying to untangle the complicated lineages of Middle-earth, and seeing if AI might finally be able to make sense of Tom Bombadil. “I’m sure it has happened,” Pichai says, laughing, “or will happen – one of the two.”

Pichai also thinks the larger context window will be hugely useful for businesses. “This enables use cases where you can add a lot of personal context and information at the moment of the query,” he says. “Think of it as us having dramatically expanded the query window.” He imagines filmmakers uploading an entire film and asking Gemini what reviewers might say, or companies using Gemini to pore over reams of financial data. “I consider it one of the bigger breakthroughs we’ve made,” he says.

For now, Gemini 1.5 will only be available to business users and developers, through Google’s Vertex AI and AI Studio. Ultimately it will replace Gemini 1.0, and the standard version of Gemini Pro – the one available to everyone at gemini.google.com and in the company’s apps – will become 1.5 Pro with a 128,000-token context window. You’ll have to pay extra to reach the million. Google is also still testing the model’s safety and ethical limits, particularly around the new, larger context window.

Google is in a breakneck race to build the best AI tool, as companies around the world try to figure out their own AI strategies – and whether to sign their developer agreements with OpenAI, Google, or someone else. Just this week, OpenAI announced “memory” for ChatGPT, and it appears to be gearing up for a push into web search. So far, Gemini looks impressive, especially for those already inside Google’s ecosystem, but there’s still a lot of work to be done on all sides.

Ultimately, Pichai tells us, all these 1.0s and 1.5s, Pros and Ultras and corporate battles won’t really matter to users. “People will just consume the experiences,” he says. “It’s like using a smartphone without always paying attention to the processor underneath.” But right now, he says, we’re still in the phase where everyone knows the chip in their phone because it matters. “The underlying technology is changing so quickly,” he says. “People do care.”
