Word Error Rate and Speech Recognition


In conversational AI, word error rate (WER) is a key measure of the accuracy of an Automatic Speech Recognition (ASR) system. Simply put, it counts the errors in the transcript produced by the ASR system relative to a reference transcript, and it is the standard metric for reporting accuracy in automatic speech recognition. WER is built from three kinds of error: substitutions, insertions and deletions. Let the original sentence be “Hello there”. Then “Hello” contains a deletion (a word is missing), “Hello their” contains a substitution (a word is replaced by another), and “Hello they are” contains an insertion (an extra word appears, here alongside the substitution of “there” with “they”).

To see how WER is calculated, let the original (reference) sentence be: “My name is Paul and I am an engineer.”

Example 1 (WER 11.11%): “My name is ball and I am an engineer.” One substitution (“Paul” became “ball”) out of 9 reference words: (1/9) × 100 = 11.11%

Example 2 (WER 22.22%): “My name is Paul and I’m an engineer.” One substitution (“I” became “I’m”) and one deletion (“am” is missing) out of 9 reference words: (2/9) × 100 = 22.22%

Word Error Rate = (Substitutions + Deletions + Insertions) / (Number of words in the reference transcript)
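The formula can be sketched in a few lines of Python. This is a minimal illustration, not ODIO's implementation: the function name `word_error_rate` is our own, and it assumes words are separated by whitespace, compared case-insensitively, and that punctuation stays attached to its word. The total number of substitutions, deletions and insertions is found as the word-level edit distance between the two sentences.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a fraction: word-level edit distance / reference length."""
    # Assumption: whitespace tokenisation and case-insensitive comparison.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = fewest edits needed to turn the reference words processed
    # so far into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


reference = "My name is Paul and I am an engineer."
print(f"{word_error_rate(reference, 'My name is ball and I am an engineer.'):.2%}")  # 11.11%
print(f"{word_error_rate(reference, 'My name is Paul and I’m an engineer.'):.2%}")  # 22.22%
```

Because the denominator is the reference length, a transcript with many insertions can score above 100%. For instance, “Hello they are” against “Hello there” has two errors over two reference words.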

ODIO, with an industry-low 8% word error rate, has built its own speech recognition technology from scratch, with language and acoustic models customised to deliver highly accurate speech recognition.

With an accurate ASR system, ODIO promotes contextual understanding by capturing the industry-specific jargon and acronyms that form the basis of important client deals and conversations. This becomes critical because a single word can change the meaning and context of an entire sentence. For instance, a poor ASR system might transcribe the sentence “The clients know about our deal.” as “The clients no about our deal.”, which drastically changes the implication of the conversation.

Privacy often comes at a cost, and the built-in masking feature of ODIO sets it apart from others. Sales deals often include confidential and proprietary information about transactions and client details. Such information could pose a threat if exposed, so it becomes vital to mask the sensitive details of any conversation. For example, our platform masks a one-time password (OTP) such as 1234 as XXXX to protect the privacy of our clients and deals.
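As a rough illustration of this kind of masking, the sketch below replaces the digits of an OTP in a transcript with “X” characters. It is not ODIO's implementation: the function name `mask_otp` and the pattern (the token “OTP”, an optional separator, then 4 to 8 digits) are assumptions made for the example, and this version keeps the “OTP” label and hides only the digits.

```python
import re

# Hypothetical pattern: "OTP", an optional space, colon or hyphen, then 4 to 8 digits.
OTP_PATTERN = re.compile(r"\b(OTP[\s:-]*)(\d{4,8})\b", re.IGNORECASE)


def mask_otp(transcript: str) -> str:
    """Replace the digits of each OTP with the same number of 'X' characters."""
    return OTP_PATTERN.sub(
        lambda match: match.group(1) + "X" * len(match.group(2)),
        transcript,
    )


print(mask_otp("Your OTP-1234 is valid for ten minutes."))
# Your OTP-XXXX is valid for ten minutes.
```

A real system would need to cover many more kinds of sensitive data, such as card numbers and account details, and spoken digits that the ASR system transcribes as words rather than numerals.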

Thank you for reading. For continued insights and in-depth discussions, please follow our blogs at Odio.

One Comment

  1. Lokesh Chawla

    Good to know about the WER. Thanks
