Comparing Google's Eloquent to Open-source Alternatives

Benchmarking Google’s new Eloquent app against popular open-weight models with 50 transcripts from daily engineering work.

Raw model output (no cleanup) · WER = word error rate · good ≤15% · acceptable ≤35% · poor >35% · Eloquent ran via the app (capture-affected rows tagged); the other three ran directly on the audio files
44 of 44 samples
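The WER figures and rating bands in the legend above can be sketched roughly as follows. This is a minimal illustration, not the page's actual scoring code: the `normalize` rules (lowercase, ASCII only, punctuation other than apostrophes treated as a word break) and the function names `wer` and `rating` are assumptions, since the page does not state how it normalizes text. WER here is the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the target, which is why a long hallucinated output can score above 100%.

```python
import re


def normalize(text: str) -> list[str]:
    """Assumed normalization: lowercase, keep ASCII letters, digits and
    apostrophes, and treat every other character as a word break."""
    return re.sub(r"[^a-z0-9']+", " ", text.lower()).split()


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def rating(score: float) -> str:
    """Rating bands from the legend: good <= 15%, acceptable <= 35%, else poor."""
    if score <= 0.15:
        return "good"
    if score <= 0.35:
        return "acceptable"
    return "poor"


score = wer("Testing out Onit", "Testing out on it.")
print(f"{score:.1%} {rating(score)}")
```

Under these assumptions, the target "Testing out Onit" against the output "Testing out on it." counts one substitution and one insertion over three target words, which lines up with the 66.7% reported for sample 3.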
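Mean WER by model (lower is better)
Eloquent (Google): 60.0% · via app · BlackHole capture
Qwen3-ASR: 18.7% · direct on audio files
Parakeet: 24.4% · direct on audio files
Gemma 3n (E2B): 62.4% · direct · open sibling of Eloquent's engine

Each sample below lists its number and category, the target output, and then the rating, WER, and raw output for Eloquent, Qwen3-ASR, Parakeet, and Gemma 3n (E2B), in that order.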
1 · General
Target: Or if there's some way to do this dynamically so that it's normalized to a standard audio playback range, that would be great too!
Eloquent · Acceptable · WER 33.3% · partly truncated
if there's some way to do this dynamically. standard audio playback. that would be great, too.
Qwen3-ASR · Good · WER 0%
Or if there's some way to do this dynamically so that it's normalized to a standard audio playback range, that would be great too.
Parakeet · Good · WER 8.3%
Or if there's some way to do this dynamically, so that it's no lise to a standard audio playback range that would be great too.
Gemma 3n (E2B) · Acceptable · WER 20.8%
There's some way to do this dynamically so that it's normalized to a standard audio playback range that we'd be good to.
2 · General
Target: Also, in this build, I getting the error message that Onit is trying to record my screen. I've already given it screen recording permissions, so why does this keep showing up?
Eloquent · Poor · WER 61.3% · partly truncated
I. that on and it's trying to work my screen. screen recording permissions. Why does this keep
Qwen3-ASR · Acceptable · WER 32.3%
Also, this building, you get a gear message that on it is trying to record my screen time. I've already given it screen recording permissions. Why does this keep showing up?
Parakeet · Acceptable · WER 25.8%
Also, with this build, I keep getting here message that on if it's trying to record my screen, I've already given it screen recording permissions. Why does this keep showing up?
Gemma 3n (E2B) · Poor · WER 93.5%
I'm sorry, but I can't transcribe the speech in this audio. It appears to be unintelligible.
3 · General
Target: Testing out Onit
Eloquent · Poor · WER 66.7%
testing out on it.
Qwen3-ASR · Poor · WER 66.7%
Testing out on it.
Parakeet · Poor · WER 100%
Passing out on it.
Gemma 3n (E2B) · Poor · WER 66.7%
Testing out on it.
4 · General
Target: I actually think the worst thing was that the dictionary silently failed the first time because the model was still downloading.
Eloquent · Poor · WER 85.7% · truncated
thing was that
Qwen3-ASR · Good · WER 14.3%
I actually think the worst thing was that dictionaries I only failed the first time because the model was still downloading.
Parakeet · Good · WER 9.5%
I actually think the worst thing was that the dictionary sign only failed the first time because the model was still downloading.
Gemma 3n (E2B) · Good · WER 14.3%
I think the worst thing was that the dictionary sign only failed the first time because the model was still downloading.
5 · General
Target: How come we're hiding the microphoneBarArea and suggestionBar when we're in trackpad mode?
Eloquent · Poor · WER 84.6%
hiding the microphone.
Qwen3-ASR · Poor · WER 38.5%
How come we're hiding the microphone bar area and suggestion bar? We're in trackpad mode.
Parakeet · Poor · WER 84.6%
How come we're hiding the microphone bar area and suggestion bar? Where can you track that node?
Gemma 3n (E2B) · Poor · WER 84.6%
How many are hiding in the next bar area and suggestion bar?
6 · General
Target: I agree with his assessment that not a lot of people are going to want to use the ZoomAudioDevice.
Eloquent · Poor · WER 42.1%
Parker this assessment that not a lot of people are going to want to use. use the Zoom audio device.
Qwen3-ASR · Acceptable · WER 21.1%
I agree with this assessment that not a lot of people are going to want to use the Zoom audio device.
Parakeet · Acceptable · WER 31.6%
I heard this assessment that not a lot of people are going to want to use the zoom audio device.
Gemma 3n (E2B) · Acceptable · WER 21.1%
I agree with this assessment that not a lot of people are going to want to use the Zoom audio device.
7 · General
Target: Which backend are you using in the offline test? Are you using coreML or FluidAudio?
Eloquent · Acceptable · WER 26.7% · partly truncated
Which backend are you using in the offline test? Are you using Core ML or fluid audio?
Qwen3-ASR · Acceptable · WER 33.3%
Which backend are you using in the offline test? Are using Core ML or Fluid Audio?
Parakeet · Acceptable · WER 26.7%
Which backend are you using in the offline test are you using Core ML or Fluid Audio?
Gemma 3n (E2B) · Poor · WER 113.3%
I'm sorry, but I am unable to transcribe the speech in this audio as it is currently unavailable.
8 · Numbers
Target: Also in dictations that don't contain the word Onit and don't really contain anything that is close to the word Onit, I am seeing in our debug view that it's detected with scores under -10.
Eloquent · Poor · WER 37.1% · partly truncated
Also in dictations that don't contain the word on it, don't really contain anything that is close to the word on it. I am seeing in our
Qwen3-ASR · Acceptable · WER 17.1%
Also, in dictations that don't contain the word on it, and don't really contain anything that is close to the word on it, I am seeing in our debug view that it's detected with scores under negative ten.
Parakeet · Good · WER 14.3%
Also in dictations that don't contain the word onit and don't really contain anything that are is close to the word on it, I am seeing in our debug view that it's detected with scores under negative ten.
Gemma 3n (E2B) · Acceptable · WER 34.3%
also indications of down contain the word on it and don't really contain anything better. He's close to the word on it. I am seeing in our debug view that it's detected with scores under negative ten.
9 · General
Target: Anyway, I want to investigate the onboarding analytics events.
Eloquent · Poor · WER 88.9%
asking the on-boarding analytics about
Qwen3-ASR · Good · WER 0%
Anyway, I want to investigate the onboarding analytics events.
Parakeet · Good · WER 0%
Anyway, I want to investigate the onboarding analytics events.
Gemma 3n (E2B) · Acceptable · WER 33.3%
I want to investigate the on-boarding analytics events.
10 · General
Target: Does this code base use a Levenshtein distance for the custom dictionary?
Eloquent · Poor · WER 41.7%
This could be a use a little distance for the custom dictionary.
Qwen3-ASR · Acceptable · WER 25%
Does this codebase use the Levenshtein distance for the custom dictionary?
Parakeet · Good · WER 8.3%
Does this code base use a Levenstein distance for the custom dictionary?
Gemma 3n (E2B) · Poor · WER 66.7%
This could be used as a live event thing distance for the custom dictionary.
11 · General
Target: Does this branch contain logic for positioning the correct and teach Onit UI near the pasted text?
Eloquent · Acceptable · WER 23.5% · partly truncated
Does this branch contain logic for positioning the correct and teach on it EY near the patient text.
Qwen3-ASR · Good · WER 11.8%
Does this branch contain logic for positioning the correct and teach on it UI near the pasted text?
Parakeet · Good · WER 11.8%
Does this branch contain logic for positioning the correct and teach on a UI near the pasted text?
Gemma 3n (E2B) · Acceptable · WER 29.4%
Does this branch contain logic for my positioning the correct teach on a YW near the pasted text?
12 · General
Target: Quick edit is enabled, but I think that's going to be unrelated to the fix and teach Onit dialogue.
Eloquent · Good · WER 10.5% · partly truncated
quick edit is enabled, but I think that's going to be unrelated to the fix and teach on it dialogue.
Qwen3-ASR · Acceptable · WER 15.8%
Quick edit is enabled, but I think that's going to be unrelated to the fix and teach on it dialog.
Parakeet · Acceptable · WER 26.3%
A quick edit is enabled, but I think that's gonna be unrelated to the fix and teach on it dialogue.
Gemma 3n (E2B) · Poor · WER 42.1%
I think that's gonna be unrelated to the fix and teach content dialogue.
13 · General
Target: the delete keys would misalign all of the following keys.
Eloquent · Acceptable · WER 20% · partly truncated
But leak keys would misalign all of the following keys.
Qwen3-ASR · Acceptable · WER 20%
Which keys would misalign all of the following keys?
Parakeet · Acceptable · WER 20%
Keys would misalign all of the following keys.
Gemma 3n (E2B) · Poor · WER 100%
I
14 · General
Target: There used to be some page in a wrong word that was like the transcription welcome page that asked you turn on the feature or not.
Eloquent · Poor · WER 61.5% · partly truncated
to be some patient or I'm warning that was like page that asks you to turn. not.
Qwen3-ASR · Acceptable · WER 15.4%
There used to be some page in our homeboarding that was like the transcription welcome page that asked you to turn on the feature or not.
Parakeet · Acceptable · WER 19.2%
There used to be some page in our own boarding that was like the transcription welcome page that asked you to determine the feature or not.
Gemma 3n (E2B) · Poor · WER 50%
Please be some patient or I'm worried that it's like a transcription. Welcome page that asks you to term on the feature or not.
15 · General
Target: I'm able to run the KeyboardEval bundle.
Eloquent · Poor · WER 100% · truncated
(no output)
Qwen3-ASR · Acceptable · WER 28.6%
I'm able to run the keyboard eval bundle.
Parakeet · Poor · WER 57.1%
Mill to run the keyboard eval bundle.
Gemma 3n (E2B) · Poor · WER 157.1%
I'm unable to transcribe the speech as it is not present in the audio.
16 · General
Target: Yeah, so we'd found some examples where the Levenstein distance was not giving good results.
Eloquent · Acceptable · WER 20% · partly truncated
So we found some examples where the distance was not giving good results.
Qwen3-ASR · Acceptable · WER 26.7%
So we found some examples where that same distance was not giving good results.
Parakeet · Acceptable · WER 33.3%
Soviet found some examples where the momentum distance was not given good results.
Gemma 3n (E2B) · Poor · WER 113.3%
I'm sorry, but I am unable to transcribe the speech in this audio as it is currently unavailable.
17 · General
Target: And I suspect this is going to be a case with a lot of dictionary terms that people add.
Eloquent · Good · WER 10.5% · partly truncated
And I suspect this is going to be a case with a lot of terms that people have.
Qwen3-ASR · Good · WER 10.5%
And I suspect this is going to be a case with a lot of t-shirting terms that people add.
Parakeet · Poor · WER 36.8%
And I suspect those are gonna be a case with a lot of t sharing terms that people have.
Gemma 3n (E2B) · Acceptable · WER 31.6%
I suspect this could be a case with a lot of teaching terms that people have
18 · General
Target: So in our Onit example it gets difficult because you often use that in a sentence.
Eloquent · Poor · WER 81.2%
So, in our auditing in the case of go because
Qwen3-ASR · Good · WER 12.5%
So, in our auditing, it gets difficult because you often use that in a sentence.
Parakeet · Acceptable · WER 18.8%
So in our auditive input gets difficult because you often use that in a sentence.
Gemma 3n (E2B) · Poor · WER 162.5%
I'm sorry, I am unable to transcribe the speech in this audio. The audio is too faint and unclear for me to accurately identify the words spoken.
19 · General
Target: We were working on replacing the Levenshtein distance algorithm with a perplexity check.
Eloquent · Poor · WER 38.5% · partly truncated
because we were working on algorithm with a perplexity check.
Qwen3-ASR · Good · WER 0%
We were working on replacing the Levenshtein distance algorithm with a perplexity check.
Parakeet · Good · WER 7.7%
We were working on replacing the Levenstein distance algorithm with a perplexity check.
Gemma 3n (E2B) · Good · WER 7.7%
We are working on replacing the Levenshtein distance algorithm with a perplexity check.
20 · General
Target: We were testing this eval.
Eloquent · Poor · WER 100% · truncated
(no output)
Qwen3-ASR · Poor · WER 40%
We're testing this eval.
Parakeet · Poor · WER 40%
We're testing this eval.
Gemma 3n (E2B) · Poor · WER 60%
I'm testing this eQual.
21 · General
Target: Can you open up all of the examples and what the model decided in a viewer so I can look through all of them?
Eloquent · Poor · WER 45.8%
you open all the examples and I'm not all decided, you know, if you were, so I can look through all of them.
Qwen3-ASR · Acceptable · WER 20.8%
Can you open all the examples and all decided in a viewer, so I can look through all of them?
Parakeet · Acceptable · WER 25%
I can you open all of the examples come home decided in a viewer so I can look through all of them.
Gemma 3n (E2B) · Poor · WER 54.2%
I can't help you open all these examples. I'm not decided, and a few were so I can look through all of them.
22 · Numbers
Target: It looks like in your implementation we’re using 18 as the average.
Eloquent · Poor · WER 61.5% · truncated
Looks like in your implementation.
Qwen3-ASR · Acceptable · WER 30.8%
It's like in your implementation, using 18 as the average.
Parakeet · Acceptable · WER 30.8%
Looks like in your implementation we're using eighteen as the average.
Gemma 3n (E2B) · Poor · WER 53.8%
It looks like in your computation we're in 18th century average.
23 · General
Target: I was surprised that it got the letter I in with wrong.
Eloquent · Acceptable · WER 33.3% · partly truncated
. I was surprised that I got the with wrong.
Qwen3-ASR · Acceptable · WER 16.7%
I was surprised that I got the letter I in width wrong.
Parakeet · Acceptable · WER 16.7%
I was surprised that I got the letter I in width wrong.
Gemma 3n (E2B) · Good · WER 8.3%
I was surprised that I got the letter I in with wrong.
24 · General
Target: Okay, I just tried to run all the sessions. Can you look at the folder and see how many we got through?
Eloquent · Poor · WER 40.9%
Okay, I just tried to. continue the boulder and see how many we got through.
Qwen3-ASR · Good · WER 9.1%
Okay, I just tried to run all the sessions. Can you the folder and see how many we got through?
Parakeet · Good · WER 13.6%
Okay, I just tried to run all the sessions. Can you put a folder and see how many we got through?
Gemma 3n (E2B) · Poor · WER 50%
Okay, just tried to run all the sessions. Can you tell me if you even got through it?
25 · Numbers
Target: Okay, yeah, can do number 1?
Eloquent · Poor · WER 100% · truncated
(no output)
Qwen3-ASR · Acceptable · WER 33.3%
Okay. Yeah. Can you do number one?
Parakeet · Poor · WER 66.7%
Okay, back in unit number one.
Gemma 3n (E2B) · Poor · WER 50%
Okay, can we do number one?
26 · Numbers
Target: Choose 10 more terms. Choose a proxy for each term. Run 40 more examples for each one. And let me know if that threshold works for all of them.
Eloquent · Poor · WER 89.7% · truncated
choose to more terms.
Qwen3-ASR · Good · WER 13.8%
Choose ten more terms. Choose a proxy for each term. Run forty more examples for each of them, and let me know if that threshold works for all of them.
Parakeet · Good · WER 13.8%
We'll choose ten more terms. Choose a proxy for each term. Run forty more examples for each other. And let me know if that threshold works for all of them.
Gemma 3n (E2B) · Poor · WER 44.8%
Just enter more terms. Just probably for each term. 40 marks each will be given. Let me know if that approach works for all of them.
27 · General
Target: Okay, in another branch we are adding a delta for the NLL comparisons.
Eloquent · Poor · WER 92.3%
Okay. You know.
Qwen3-ASR · Acceptable · WER 15.4%
In another branch, we are adding a delta for NLL comparisons.
Parakeet · Acceptable · WER 15.4%
Okay, in another branch we are adding a delta F4 NLL comparisons.
Gemma 3n (E2B) · Poor · WER 46.2%
in another branch we are adding a down the four in a comparisons.
28 · General
Target: The data is usually linearly inseparable, so long as you choose the right proxy.
Eloquent · Poor · WER 100% · truncated
(no output)
Qwen3-ASR · Poor · WER 35.7%
Placement is usually linear and separable, so long as you choose the right proxy.
Parakeet · Acceptable · WER 21.4%
Replacement is usually inseparable, so long as you choose the right proxy.
Gemma 3n (E2B) · Poor · WER 121.4%
I'm sorry, I'm not able to transcribe the speech in this audio. It appears to be unintelligible.
29 · General
Target: The delta changes depending on the word in the proxy.
Eloquent · Poor · WER 100%
. That's that's.
Qwen3-ASR · Good · WER 0%
The delta changes depending on the word in the proxy.
Parakeet · Acceptable · WER 20%
The delta changes depending on the current and the proxy.
Gemma 3n (E2B) · Acceptable · WER 30%
The delta changes depending on the proxy.
30 · General
Target: For example, if we use a bidirectional BART model, we can mask the target word and get an embedding vector.
Eloquent · Poor · WER 100% · truncated
(no output)
Qwen3-ASR · Good · WER 5%
For example, if we use a bidirectional part model, we can mask the target word and get an embedding vector.
Parakeet · Acceptable · WER 35%
For example, if we use a bi-directional mark model, we can mask key target word and ignite vector.
Gemma 3n (E2B) · Acceptable · WER 25%
We can use a bidirectional part model. We can mask the target word and get an embedding vector.
31 · General
Target: Oh sorry, can you do that? And then for all of the Onit examples, show me the replacement words that get generated by the model.
Eloquent · Poor · WER 76% · truncated
replacement words that get generated by
Qwen3-ASR · Good · WER 8%
Oh, sorry. Can you do that? And then, for all of the on it examples, show me the replacement words that get generated by the model.
Parakeet · Good · WER 8%
Oh sorry, can you do that? And then for all up the common examples, show me the replacement words that get generated by the model.
Gemma 3n (E2B) · Acceptable · WER 28%
I'm sorry, can you do that? And the problem of the on-and-example show me the replacement words that get generated by the model.
32 · General
Target: Can you help me brainstorm some things that might be added to the dictionary that are not proper nouns?
Eloquent · Acceptable · WER 26.3% · partly truncated
Can you help me answer on things that might add into the dictionary that are not proper nouns?
Qwen3-ASR · Acceptable · WER 31.6%
Can you help me transfer things in my data to the dictionary that are not proper nouns?
Parakeet · Good · WER 10.5%
Can you help me grant some things that might be added to the dictionary that are not proper now?
Gemma 3n (E2B) · Poor · WER 57.9%
I know my friends are things in the dictionary that are not proper nouns.
33 · General
Target: Can you add latency logging around the CTC model inference?
Eloquent · Poor · WER 70%
Can you have latency live?
Qwen3-ASR · Poor · WER 70%
You had latency loss during the C D Z model inference.
Parakeet · Poor · WER 90%
You had latent life and my message department principles.
Gemma 3n (E2B) · Poor · WER 100%
I can't hear anything.
34 · General
Target: I'm using the app and I'm getting a ton of substitutions for Onit that should not be made.
Eloquent · Poor · WER 50%
using the app and I'm getting a ton of substitute.
Qwen3-ASR · Good · WER 11.1%
I'm using the app, and I'm getting a ton of substitutions for on it that should not be made.
Parakeet · Good · WER 11.1%
When using the app and I'm getting a ton of substitutions for admin that should not be made.
Gemma 3n (E2B) · Good · WER 5.6%
I'm using the app and I'm getting a ton of substitutions for it that should not be made.
35 · General
Target: Can you add a skip button in addition to the yes it’s Onit and no keep as-is?
Eloquent · Poor · WER 100% · truncated
(no output)
Qwen3-ASR · Acceptable · WER 15.8%
Can you add a skip button, in addition to the yes, it's on it and no, keep as is.
Parakeet · Poor · WER 36.8%
Can you add a skip button in addition to the assets on it and no key passes?
Gemma 3n (E2B) · Poor · WER 84.2%
Yes, it's on and I'll keep talking.
36 · General
Target: We were working on implementing something where when you add a new dictionary term, it scans through your history to find places where the term was used.
Eloquent · Good · WER 3.7% · partly truncated
Uh we were working on implementing something where when you add a new dictionary term, it scans through your history to find places where the term was used.
Qwen3-ASR · Good · WER 0%
We were working on implementing something where, when you add a new dictionary term, it scans through your history to find places where the term was used.
Parakeet · Good · WER 3.7%
Uh we were working on implementing something where when you add a new dictionary term, it scans through your history to find places where the term was used.
Gemma 3n (E2B) · Good · WER 0%
we were working on implementing something where when you add a new dictionary term, it scans through your history to find places where the term was used.
37 · Numbers
Target: We got to 97 percent accuracy roughly.
Eloquent · Good · WER 14.3% · partly truncated
We got to 97% accuracy roughly.
Qwen3-ASR · Acceptable · WER 28.6%
We got to ninety-seven percent accuracy, roughly.
Parakeet · Acceptable · WER 28.6%
We got to ninety-seven percent accuracy, roughly.
Gemma 3n (E2B) · Good · WER 14.3%
We got to 97% accuracy roughly.
38 · General
Target: However, for me to understand this, can you explain how did we choose which words to exclude as being autocorrected? Show me the exact function?
Eloquent · Good · WER 12% · partly truncated
However, for me to understand this, can you explain how do we choose which words to exclude as being auto corrected? Show me the exact function.
Qwen3-ASR · Good · WER 4%
However, for me to understand this, can you explain how how did we choose which words to exclude, as being autocorrected? Show me the exact function.
Parakeet · Good · WER 12%
However, for me to understand this, can you explain how we choose which words to exclude as being auto-corrected? Show me the exact function.
Gemma 3n (E2B) · Acceptable · WER 28%
I have no understanding of this. Can you explain how to choose which words to exclude as being autocorrected? Show me the exact function.
39 · General
Target: you run that now and open a viewer so I can see everything that it's flagging?
Eloquent · Good · WER 0%
You run that now and open a viewer so I can see everything that it's flagging.
Qwen3-ASR · Good · WER 0%
You run that now and open a viewer, so I can see everything that it's flagging.
Parakeet · Good · WER 6.2%
You run that now and uh open a viewer so I can see everything that it's flagging.
Gemma 3n (E2B) · Acceptable · WER 31.2%
You run that now in open review so I can see everything that's flagging.
40 · General
Target: I want to evaluate if we can simulate typing data that looks like the actual typing data that we collected in our typing game.
Eloquent · Poor · WER 100%
weekend simulator.
Qwen3-ASR · Good · WER 0%
I want to evaluate if we can simulate typing data that looks like the actual typing data that we collected in our typing game.
Parakeet · Good · WER 0%
I want to evaluate if we can simulate typing data that looks like the actual typing data that we collected in our typing game.
Gemma 3n (E2B) · Poor · WER 91.7%
I can't see any text in the image.
41 · General
Target: Let's look in the CTC model, but then also the main parakeet model too.
Eloquent · Poor · WER 100% · truncated
(no output)
Qwen3-ASR · Acceptable · WER 21.4%
Let's look in the C D Z model, but then also the main Parakeet model too.
Parakeet · Acceptable · WER 28.6%
Let's look in the syntaze model, but then I'll also mean parakeet model too.
Gemma 3n (E2B) · Poor · WER 92.9%
I'm looking at this email with the model with the all the mean perky model two.
42 · General
Target: for option B we want to not always fall back.
Eloquent · Poor · WER 100% · truncated
(no output)
Qwen3-ASR · Good · WER 0%
For option B, we want to not always fall back.
Parakeet · Good · WER 0%
For option B we want to not always fall back.
Gemma 3n (E2B) · Poor · WER 110%
I'm sorry, I can't transcribe the speech because it is unintelligible.
43 · General
Target: Okay, the Onit keyboard is activated.
Eloquent · Poor · WER 100% · truncated
(no output)
Qwen3-ASR · Acceptable · WER 16.7%
Okay, the onec keyboard is activated.
Parakeet · Acceptable · WER 16.7%
Okay, the iconic keyboard is activated.
Gemma 3n (E2B) · Poor · WER 200%
I'm sorry, I am unable to transcribe the speech as it is not audible.
44 · General
Target: pipeline that output only acoustic tokens without any semantic awareness?
Eloquent · Poor · WER 80% · truncated
line that output.
Qwen3-ASR · Acceptable · WER 20%
Bedline that output only acoustic documents without any semantic awareness.
Parakeet · Good · WER 10%
Deadline that output only acoustic tokens without any semantic awareness.
Gemma 3n (E2B) · Poor · WER 160%
I'm sorry, I am unable to transcribe the speech in this audio as it is inaudible.

Onit Dictate is the free alternative to Wispr Flow. 100% local, offline dictation on Mac.

© Onit 2026