What sounds catchy?

  • Malavika
  • August 9, 2019
At Bewgle, we read your customer reviews, so you don’t have to.

About 80–90% of potentially usable business information originates from unstructured data.

Unstructured data comes from many sources:

  • Text (website data, review data, log data; over 500 new websites are created every minute of the day)
  • Videos and audio (over 200 billion HD movies exist worldwide; binge-watching them all would take 37 million years)
  • Images (by 2015, a staggering 1 trillion photos had been captured)

Fun Facts:
1. We have generated more unstructured data in the past 3 years than in the entire history of the human race.
2. Less than 0.5% of the world’s unstructured data is ever analysed (what a shame).

With a mission to derive business value from unstructured data, this blog post aims to shine a light on our unique method of extracting usable information in the form of KeyPhrases.
For extracting contiguous information, generating word ngrams is the industry-standard method. Ngrams are groups of N words that occur one after the other. Since people may express an opinion in 2 words, 3 words, or n words, it is a logically sound approach.

For example:
For the sentence “This camera is very good”
Bigrams are: This camera, camera is, is very, very good
Only one of these 4 bigrams, “very good”, is actually useful.
This is the biggest hurdle when working with ngrams or contiguous text.
The volume of ngrams grows rapidly as the number of reviews increases. For a list of 1000 reviews, generating all ngrams yields over 40000 bigrams or 20000 trigrams.
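
To make the example concrete, here is a minimal Python sketch of contiguous ngram generation. The function name `ngrams` and the whitespace tokenisation are assumptions made for illustration; a real pipeline would use a proper tokenizer.

```python
def ngrams(tokens, n):
    """Return every contiguous window of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Naive whitespace tokenisation, good enough for the example sentence.
tokens = "This camera is very good".split()

bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)

print(bigrams)   # 4 bigrams for a 5-word sentence
print(trigrams)  # 3 trigrams for a 5-word sentence
```

A sentence of k words yields k - n + 1 ngrams of length n, so every extra review adds a batch of mostly uninformative candidates.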

This leads to methods of filtering out useful or information-rich ngrams:

Frequency-based filtering: Some phrases that are often used together do not mean anything significant (‘This is’ or ‘I am’), so this method is often faulty.
Pointwise Mutual Information (PMI): This method uses a measure of association to rank and filter ngrams, but it is also faulty because it doesn’t filter out enough junk.
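
The two filters above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four reviews are invented, tokenisation is a plain whitespace split, and PMI is computed with the textbook definition PMI(x, y) = log2(p(x, y) / (p(x) p(y))) from raw counts without smoothing.

```python
import math
from collections import Counter


def bigram_stats(sentences):
    """Count bigrams and compute a PMI score for each one."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    pmi = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        pmi[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return bigrams, pmi


# Invented toy reviews, for illustration only.
reviews = [
    "this is a very good camera",
    "this is a very good phone",
    "this is not good",
    "battery life is poor",
]

counts, pmi = bigram_stats(reviews)

# Frequency ranking puts a filler phrase on top.
print(counts.most_common(3))

# PMI ranking favours pairs whose words rarely occur apart.
print(sorted(pmi, key=pmi.get, reverse=True)[:3])
```

On this toy data the most frequent bigram is the filler “this is”. PMI ranks “battery life” above it, but PMI is known to reward any rare pair, so typos and one-off word combinations can also score highly on real data.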

Enter Bewgle.

We created our very own pipeline to ingest raw review data, pre-process it to our standards, and extract meaningful phrases of varied lengths that can be used in business logic. Gone are the days of endless Excel files full of ngrams that no one can ever go through. Leveraging Explosion AI’s Sense2Vec repo for pre-processing, and an intelligent rule-based filtration algorithm built on dependency parsing and POS tagging of phrases, we generate highly domain-specific phrases that are information-rich and adequate in volume.
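
The actual filtration rules are not published in this post. Purely to illustrate the general idea of rule-based filtering on POS tags, here is a minimal Python sketch. The pattern set `ALLOWED_PATTERNS`, the function `keep_phrases`, and the hand-assigned Universal POS tags are all hypothetical; a real pipeline would obtain tags and dependencies from a parser rather than from a hard-coded list.

```python
# Hypothetical allow-list of POS tag sequences (Universal POS tags).
ALLOWED_PATTERNS = {
    ("ADV", "ADJ"),          # e.g. "very good"
    ("ADJ", "NOUN"),         # e.g. "good camera"
    ("NOUN", "NOUN"),        # e.g. "battery life"
    ("ADV", "ADJ", "NOUN"),  # e.g. "very good camera"
}


def keep_phrases(tagged_tokens, max_len=3):
    """Keep contiguous phrases whose POS sequence is in ALLOWED_PATTERNS."""
    kept = []
    for n in range(2, max_len + 1):
        for i in range(len(tagged_tokens) - n + 1):
            window = tagged_tokens[i:i + n]
            tags = tuple(tag for _, tag in window)
            if tags in ALLOWED_PATTERNS:
                kept.append(" ".join(word for word, _ in window))
    return kept


# Tags assigned by hand for this example; not the output of a real tagger.
tagged = [
    ("This", "DET"),
    ("camera", "NOUN"),
    ("is", "AUX"),
    ("very", "ADV"),
    ("good", "ADJ"),
]

phrases = keep_phrases(tagged)
print(phrases)  # ['very good']
```

Of the seven candidate bigrams and trigrams in the sentence, only “very good” matches a pattern, which is the kind of reduction a rule-based filter aims for.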

Below is a comparison of the phrases extracted through the previous method and through our KeyPhrase Filtration Algorithm:

Description of dataset: 1009 reviews scraped from a retail banking company website.
Previous method: generation of all contiguous ngrams, followed by filtration based on PMI ranking
Our method: Bewgle KeyPhrase Extraction and Filtration Algorithm

Using this methodology, we have been able to extract not just the interesting snippets but also the adjectives and sentiments associated with them. This enables brands to understand not only which topic was discussed, but also how satisfied customers were with that product attribute.

With our iterative style of progress, this catchy-phrases algorithm has more to come. Stay tuned.

To know more, visit www.bewgle.com

  • Tags:
  • AI
  • Analytics
  • Product
  • Tech