If you have a business, you know how important it is to get online reviews from your clients (good reviews, ideally). Reviewed products or services help other potential clients in their consumer journey. Positive, informative reviews are a very powerful factor in the decision to purchase a product, make a reservation at a restaurant, or book a hotel. Let’s assume you have managed to engage your clients in that interaction. They are reviewing you: what are they writing about?
You are probably reading and analyzing these texts to get to know your clients’ insights. In this blog post, we will explain a two-step solution that allows you to explore your data in a more intelligent way. First, we set up a dashboard to visualize the existing information. Second, we use Natural Language Processing (NLP) to enrich the dashboard: by means of information extraction, we add a search per topic.
A solution to explore customer reviews
We have worked with a collection of more than 3000 reviews about Devour Tours, a tourism company that offers gastronomic tours. Up until now, the company was storing and analyzing these data manually in an Excel sheet (sound familiar?). As such, the database already contains lots of information, provided you know how to make sense of the data by processing and plotting it.
Step 1. A dashboard with search, analytics and visualizations
Excel can be an ally, but as your database grows, it can quickly turn on you. That is why many companies are looking for ways to simplify this “sense-making” step, and they prefer to use optimized search engines and dashboards such as ElasticSearch and Kibana. Kibana “offers a convenient window to the ElasticSearch data”. You can create customizable dashboards which allow you to easily navigate, explore and analyze your data in a single screen. You can use it to visualize counts, timelines, correlations between factors, and so on. Data becomes meaningful at a glance, even without any processing.
So we first transferred Devour Tours’ reviews to an ElasticSearch index. Each review consists of an evaluation score (1 to 5), a date, and the comments made about the tour. We then designed a dashboard in Kibana. We added a timeline of the number of reviews and the percentage of each score, and we zoomed in on the very few negative comments to read them in depth. (More than 96% of Devour Tours’ reviews are 5-point reviews, so there is not a lot to read.)
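To make the transfer step concrete, here is a minimal sketch of how reviews with a score, a date and a comment could be serialized into the newline-delimited JSON body that the ElasticSearch `_bulk` endpoint accepts. The index name, the field names and the two sample reviews are assumptions made up for illustration; they are not taken from the Devour Tours data.

```python
import json
from datetime import date

# Hypothetical index name: the post does not say what the real index is called.
INDEX_NAME = "tour-reviews"


def to_bulk_ndjson(reviews, index_name=INDEX_NAME):
    """Serialize reviews into a newline-delimited JSON body for the _bulk endpoint.

    Each review becomes two lines: an action line naming the target index,
    followed by the document itself.
    """
    lines = []
    for review in reviews:
        if not 1 <= review["score"] <= 5:
            raise ValueError(f"score must be between 1 and 5, got {review['score']}")
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps({
            "score": review["score"],
            "date": review["date"].isoformat(),  # ISO 8601 dates parse as date fields
            "comment": review["comment"],
        }))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


# Invented sample reviews, for illustration only.
sample_reviews = [
    {"score": 5, "date": date(2019, 3, 14), "comment": "Our guide was fantastic and the food was delicious."},
    {"score": 2, "date": date(2019, 4, 2), "comment": "The group was too large."},
]

payload = to_bulk_ndjson(sample_reviews)
print(payload)
```

The resulting string could then be sent to the cluster with any HTTP client or with an official ElasticSearch client; we leave that part out because it depends on your setup.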
Step 2. NLP solution: Add topic information to the reviews
3000 reviews written by different individuals may seem like a random mess in which one has to find meaning. However, clients usually discuss the same aspects of the tours: the quality of the food, the ambience in the bars they visited, what they thought about the guides… That is, although each review is different in wording, order and style, there are recurring topics that give the texts some structure. We wanted to help the company access those patterns faster so that they could make sense of their data in a focused way.
With that in mind, we decided to enrich the data with this more structured information in order to include it in the dashboard. We trained a machine learning model to automatically detect several topics in the comments.
Their own topics
The categories we annotated for were specific to guided tours and adjusted to Devour Tours’ value proposition (guide, food, bars, group, organization, tips, value, recommendation, learning and overall experience). More than 400 comments were hand-labelled to train the model.
All our annotations are done with our own software, Avantopy Annotate. If you want to read more about manual annotation, here is another blog post.
The machine found some topics easier to classify than others. The contents of categories such as guide or bars are more clear-cut. Others, such as overall experience, often refer to several of the other topics and could be treated as a more general category. To take this into account, we decided to assign several topics to a single comment whenever the algorithm was uncertain. In practice, this often led to comments being labelled with both overall experience and guide or food.
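The idea of assigning several topics when the model is in doubt can be sketched as follows. This is only one possible rule, not necessarily the one we used: it assumes the classifier returns one probability per topic, and it keeps every topic whose probability lies within a margin of the best one. The function name, the margin value and the example probabilities are hypothetical.

```python
def assign_topics(probabilities, margin=0.15):
    """Return the most likely topic plus every topic that is nearly as likely.

    `probabilities` maps topic names to scores from a classifier.
    `margin` is an assumed tuning parameter: a topic is kept when its
    score is at most `margin` below the best score.
    """
    best = max(probabilities.values())
    chosen = [topic for topic, p in probabilities.items() if best - p <= margin]
    # Most likely topic first.
    return sorted(chosen, key=lambda topic: probabilities[topic], reverse=True)


# A clear-cut comment: one topic dominates, so only one label is assigned.
clear_cut = {"guide": 0.90, "food": 0.06, "bars": 0.04}
print(assign_topics(clear_cut))

# A doubtful comment: two topics are close, so both labels are kept.
doubtful = {"overall experience": 0.45, "guide": 0.38, "food": 0.10, "value": 0.07}
print(assign_topics(doubtful))
```

A smaller margin makes the labelling stricter (closer to one topic per comment), while a larger one produces more multi-topic comments.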
By adding a topic field to the data, the search engine becomes an even more useful tool to access the reviews. Besides searching the full content, one can now also run analyses that include the topics and visualize them in the dashboard. For instance, one might want to plot the most frequent topics in general, the topics that are most talked about in good or bad reviews, or the evolution of these topics over time.
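As an illustration, the analyses mentioned above could be expressed as ElasticSearch aggregation queries. The sketch below only builds the request bodies as Python dictionaries; the field names `topics`, `score` and `date` are assumptions, and `topics` is assumed to be mapped as a keyword field so that it can be used in a terms aggregation.

```python
def topic_counts_query(max_score=None, size=10):
    """Request body counting the most frequent topics.

    When `max_score` is given, only reviews with a score up to that value
    are counted, e.g. max_score=2 to look at bad reviews only.
    """
    body = {
        "size": 0,  # return only aggregation results, no individual reviews
        "aggs": {"top_topics": {"terms": {"field": "topics", "size": size}}},
    }
    if max_score is not None:
        body["query"] = {"range": {"score": {"lte": max_score}}}
    return body


def topics_over_time_query(interval="month"):
    """Request body counting topics per calendar period."""
    return {
        "size": 0,
        "aggs": {
            "per_period": {
                "date_histogram": {"field": "date", "calendar_interval": interval},
                "aggs": {"top_topics": {"terms": {"field": "topics"}}},
            }
        },
    }


bad_review_topics = topic_counts_query(max_score=2)
monthly_topics = topics_over_time_query()
```

In Kibana the same questions can be answered without writing queries by hand, by building a visualization on the topic field; the request bodies are mainly useful when you want to reuse the numbers outside the dashboard.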
Text-based data such as customer reviews are extremely valuable, but also less straightforward to access than numerical data. In this post, we have explained how we can make texts such as reviews available in a search engine and successfully enrich them through NLP.
A dashboard built on top of a search engine is a great addition to your company’s analytic capabilities. It can provide you with insights on quantitative matters and a structured view on qualitative aspects.
What is next? In order to improve the insights, further NLP analyses such as sentiment analysis can be applied. The ability to distinguish positive from negative comments can be a great way to improve a qualitative analysis. As we add more structured information to the reviews, we can get a more refined picture of their contents.