Speech-to-Text Apps and Tools

  • Raghav Jindal
  • July 11, 2022

Our Use Case:

Every D2C brand runs call centers to interact with customers and resolve their queries and complaints about its products and services. Here at Bewgle, we use our NLP tools to analyze these calls and understand what customers are saying and what they want.

As a first step, we transcribe the audio calls to text so that we can analyze what customers are saying about their experience with the brands.

Speech-to-text apps and tools:

Speech-to-text software recognizes spoken language and converts it into text using computational linguistics. It is also known as speech recognition or computer speech recognition. Some applications, tools, and devices can transcribe audio streams in real time to display the text and act on it.

Problems with phone calls:

Phone calls contain a lot of noise and attenuation, and the spoken language itself poses problems. You must first identify the language in which the speakers are conversing. Sometimes people also use keywords from their local language, which makes those words difficult for tools to recognize.

We tried the following speech-to-text apps and tools to transcribe audio recordings, but the results weren’t satisfactory for our use case:

  1. Wav2Vec2
    • Wav2Vec2 is a speech model that accepts a float array corresponding to the raw waveform of the speech signal. The model is trained using connectionist temporal classification (CTC), so its output has to be decoded using Wav2Vec2CTCTokenizer.
    • It was not able to detect a single word, not even a “Hello”. The output we received for the call was: A AA AR A ON LAATER BAT BORTL A AA AAA UA A O AS NO AS WHY O CRIS A  AON AAAA AOA.
    • This text doesn’t make any sense.

  2. Augnito
    • The tool is designed only for medical practitioners to write prescriptions.
    • It is accurate in identifying hard-to-pronounce words like ‘Ophthalmologist’, ‘arrhythmia’ and ‘gonorrhea’.
    • But it failed for our use case in the transcription of audio calls to text.
     
  3. Google Cloud Platform (GCP)
    • The Speech-to-Text API enables developers to convert audio to text in over 125 languages and variants by applying powerful neural network models in an easy-to-use API.
    • GCP was at least able to detect some text phonetically, but it was poor at detecting keywords. The output we received was: namaste madam basket order management Inc head office se naraz Hona totally black last time dobara Ek Bar to mere ko Markar Biryani chale gaye the vahan per abhi aapke Jhooth bolate vahan Jana Nahin Jana Hai Aap Jana Upar Se 14 kilometre kilometre Kaisa Hai Sar aap log bataiye.
     
  4. Transcribe – Speech-to-Text
    • This app is for Mac/iOS devices.

     
  5. Trint
    • It had very low accuracy in transcribing speech to text.
    • There was a lot of junk in the text, but surprisingly it was able to detect some words that even GCP had missed.
    • The output, however, can’t be analyzed further.
     
  6. IBM Watson
    • IBM Watson Speech to Text is a cloud API service that converts spoken audio into written text and can be used within an existing application or within the Watson Assistant.
    • It performed excellently when analyzing speech in English. But in our tests it neither supported nor phonetically detected Hindi or any of the other Indian languages.
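
The CTC decoding step mentioned for Wav2Vec2 in item 1 can be illustrated with a small sketch. This is not the Wav2Vec2CTCTokenizer implementation, only a plain-Python version of the greedy CTC rule it relies on: merge consecutive repeated labels, then drop the blank label. The blank token "<pad>", the word separator "|", and the function name ctc_greedy_decode are assumptions chosen for illustration.

```python
from itertools import groupby


def ctc_greedy_decode(frame_labels, blank="<pad>", word_sep="|"):
    """Collapse per-frame labels into text using the greedy CTC rule.

    `blank` and `word_sep` are assumed symbols for this sketch.
    """
    # Step 1: merge runs of identical consecutive labels into one label.
    collapsed = [label for label, _ in groupby(frame_labels)]
    # Step 2: remove blanks, which only mark boundaries between repeats.
    chars = [label for label in collapsed if label != blank]
    # Step 3: turn the word separator into spaces.
    return "".join(chars).replace(word_sep, " ").strip()


# One label per audio frame, as a CTC model would emit after argmax.
frames = ["H", "H", "<pad>", "E", "E", "<pad>", "L", "L", "<pad>",
          "L", "L", "O", "O", "|"]
print(ctc_greedy_decode(frames))  # HELLO
```

The blank between the two runs of "L" is what lets the decoder keep a genuine double letter. When the acoustic model is unsure on noisy audio, the per-frame labels jump between characters and the same rule yields fragments like the output quoted in item 1.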


Conclusion:

None of these tools was able to transcribe our audio calls accurately, because the calls contain a lot of noise. Some calls are also attenuated, and in some the spoken language is not specified in advance.
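
One way to make "poor at detecting keywords" measurable is to check how many expected keywords appear in a transcript. The sketch below is an illustration, not the evaluation we ran: the function name keyword_recall and the keyword list are hypothetical, and it only handles single-word keywords with exact, case-insensitive matching.

```python
import re


def keyword_recall(transcript, keywords):
    """Return (recall, missed) for single-word keywords in a transcript."""
    # Lowercase and split on word characters so matching ignores case
    # and punctuation.
    words = set(re.findall(r"\w+", transcript.lower()))
    missed = [k for k in keywords if k.lower() not in words]
    found = len(keywords) - len(missed)
    recall = found / len(keywords) if keywords else 0.0
    return recall, missed


# Opening words of the GCP output quoted above; the keywords are made up.
transcript = "namaste madam basket order management Inc head office"
recall, missed = keyword_recall(transcript, ["order", "refund", "delivery"])
print(recall, missed)
```

A score like this is only meaningful when the keyword list comes from a human-checked reference for the same call.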
