Gemini Live First Look: Better Than Talking to Siri, But Worse Than I’d Like

Google launched Gemini Live at its Made By Google event in Mountain View, California, on Tuesday. The feature lets you have a semi-natural, non-typed, spoken conversation with an AI chatbot powered by Google’s latest large language model. TechCrunch was there to test it out firsthand.

Gemini Live is Google’s answer to OpenAI’s Advanced Voice Mode, ChatGPT’s nearly identical feature that’s currently in limited alpha testing. While OpenAI beat Google to the punch by demonstrating the feature first, Google is first to launch the feature for real.

In my experience, these low-latency speech capabilities feel much more natural than texting with ChatGPT or even talking to Siri or Alexa. I found that Gemini Live responded to questions in less than two seconds and was able to change direction fairly quickly when interrupted. Gemini Live isn’t perfect, but it’s the best way to use a phone hands-free that I’ve seen.

How it works

Before speaking to Gemini Live, the feature lets you choose from 10 voices, compared to just three with OpenAI. Google worked with voice actors to create each one. I appreciated the variety and found each one to sound very human.

In one example, a Google product manager verbally asked Gemini Live to find family-friendly wineries near Mountain View that had outdoor areas and playgrounds nearby, so kids could come along. Frankly, that’s a much more complicated task than I’d ask Siri or Google Search to do, but Gemini successfully recommended a place that met the criteria: Cooper-Garrod Vineyards in Saratoga.

That said, Gemini Live leaves a bit to be desired. It seemed to hallucinate a nearby playground called Henry Elementary School Playground that is supposedly “10 minutes” from that vineyard. There are other playgrounds nearby in Saratoga, but the closest Henry Elementary School is over a two-hour drive away. There is a Henry Ford Elementary School in Redwood City, but that’s 30 minutes away.

Google was keen to show how users could interrupt Gemini Live mid-sentence and have the AI quickly change direction. The company says this allows users to control the conversation. In practice, this feature doesn’t work perfectly. At times, Google’s project managers and Gemini Live would talk over each other, and the AI didn’t seem to pick up on what was being said.

Notably, Google doesn’t allow Gemini Live to sing or imitate voices other than the 10 it provides, according to product manager Leland Rechis. The company likely does this to avoid copyright issues. Rechis also said Google isn’t focused on having Gemini Live understand the emotional intonation in a user’s voice, something OpenAI touted during its demo.

Overall, the feature seems like a great way to dive into a topic more naturally than you would with just Google Search. Google notes that Gemini Live is a step toward Project Astra, the fully multimodal AI model the company launched at Google I/O. For now, Gemini Live can only do voice conversations, but Google wants to add real-time video understanding in the future.

Written by Anika Begay
