Gemini Live Review: An Improved Assistant Stymied by Its AI Model

What to know

  • Gemini Live is an impressive voice mode for the Gemini digital assistant, with 10 different voices, quick conversational responses, and chat transcripts.
  • Unfortunately, its responses are stymied by the underlying AI model. Gemini Live’s speech can also be a bit too formal, and its responses feel truncated. 
  • Relying on Gemini Live is foolish. What’s worse is paying $20 for it.

Available via a Gemini Advanced subscription, Gemini Live has been the most talked-about feature since it was unveiled at the Made by Google 2024 event, relegating even the Pixel 9 launch to a mere footnote. But early reviewers, though initially impressed, have not come down in its favor.

So, like most tech reviewers, I decided to take Gemini Live for a spin myself and see what the whole shebang was about. For brevity’s sake, I’m not going to tell you everything I talked about (who’s got the time?). But you’ll get the general idea. 

Gemini Live – An advanced digital assistant handicapped by its AI model

Now, Gemini Live isn’t free, nor do I own a Pixel 9 that comes with a year-long Gemini Advanced subscription for free. So I got a free trial and Gemini Live was available to me immediately, which is neat.

Just like that!

But is the $20 subscription fee for Gemini Live worth it? Let’s find out.

What’s good about Gemini Live?

Gemini Live comes in 10 voices, and you can easily choose yours from Gemini’s settings. But note that Google requires you to set English (United States) as the default to be able to do so, which is a mindless requirement. I mean, there’s a British voice (Capella) right there.

Do mine eyes deceive me?

Either way, there are voices enough for every day of the week, and then some.  

My first impressions of Gemini Live, like everyone else’s, were positive. Considering Google’s stilted, synthetic voices of old, Gemini Live is a breath of fresh air. The voices are, however, a little on the formal side: you won’t hear a lot of umms and ahs (or other interjections). Because of this, and other subliminal reasons, I did find the voices a little dispassionate and held back, presumably so users don’t end up forming emotional bonds, something that OpenAI fears could be the case with ChatGPT’s own Voice Mode, which is still much better.

The responses come quickly, so it actually feels like you’re talking to a friend on a call. But unlike a friend whose stories never end, you can interrupt Gemini anytime. Perhaps you already knew that. But it’s still worth mentioning because you can tell it to buzz off if it starts spouting something you know is incorrect (more on this later).

As soon as you end the conversation, you’ll find the transcript ready and available for you to read. To me, this is one of the best features. It really helps to check out what the conversation looks like in text and share it with others.

Room for improvement

There are things that Gemini Live does well. But it also has a lot of untapped potential.  

Firstly, conversations with Gemini Live are undoubtedly brief. When you ask a question, Gemini Live will answer in as few words as possible, as though it’s busy catering to other people. You won’t find it talking tangentially or spitballing with you, which, many would say, is a good thing. But can all ideas be stated simply and to the point, Occam’s razor notwithstanding?

For instance, I asked it to compare Pegasus (since I was using that voice) with Icarus (both part of Greek myths). Though there are several points of comparison, Gemini Live gave me brief, to-the-point answers. I brought in Hanuman (from Hindu myth) to give it another angle of comparison. And again, no more than a few sentences. Things got frustrating.

After multiple attempts to get it to say more, I asked if there’s a setting that lets me adjust its verbosity. It told me it isn’t capable of changing that, but very authoritatively gave me instructions on how I could do it myself. I foolishly followed them, only to find that no such setting exists.

What? Where?

Which brings me to…

Where it suffers

Gemini’s tendency to make things up and hallucinate hasn’t exactly fostered trust among users. It’s also drawn a lot of flak for its image generation blunders in the past. Unfortunately, even though the modality has changed and the underlying model has been updated to Gemini 1.5 Flash, the issue is still prevalent in Gemini Live.

Although for the most part its responses are based on factual information, every now and then it’ll generate an answer out of thin air.

There’s surely a case to be made for how giving voice to AI inspires more trust among users. And with humanlike voices, it’s much easier to place your trust in it and be swayed by the confidence with which the answers are presented. But if you’re not on your guard, or fact-checking dubious responses, you may find yourself fooled, as was I. 

The technology is developing faster than anyone expected, but chatbots are as prone to hallucinations as ever. So if you continue to rely on it blindly, even knowing AI’s propensity for providing bad information, perhaps it’s not artificial intelligence that you need.

Say it with me: Fool me once, shame on you; fool me twice, shame on me. 

How does Gemini Live compare to ChatGPT’s Advanced Voice Mode?

Now, let’s consider the elephant in the room. How does Gemini Live compare to ChatGPT’s Advanced Voice Mode? Truth be told, Gemini Live just isn’t as verbose, engaging, or entertaining as ChatGPT’s Voice Mode. Although the latter may have been a little too engaging (even flirtatious), and eerily human-like, what with all its pondering sounds and mannerisms, it at least serves as a tool to have fun with. Gemini Live, on the other hand, takes itself too seriously, which may not work in its favor, especially since its responses are handicapped by its AI model.

But perhaps the biggest difference between the two is this: Gemini Live interprets speech as text and then gives its response, while ChatGPT’s Voice Mode processes speech directly.

Verdict

Gemini Live is a fine tool, and a clear step up from the Google Assistant of old. The ability to invoke it from the lock screen is handy, and the 10 voices have enough going for them. But it would be insane to rely on it for anything professional. Personally, I’d sooner donate my money to a charity I don’t know anything about than pay $20 for Gemini Live alone. Fortunately, the Gemini Advanced subscription has other perks.

As things stand, it’s good to view AI, regardless of its modality, as a recovering schizophrenic. It’s getting better, but it’s still prone to relapses. The only difference is that you don’t have a schizophrenic in your pocket, nor will you pay to get one. 

What do you think about Gemini Live? Do you think Google will throttle this glowing review? Let us know in the comments below.
