Juhi Jain

Why accuracy matters in your chatbots


Businesses today are constantly looking for innovative ways to enhance their customer experience, and chatbots have emerged as critical elements in online user interactions. You can find chatbots everywhere, from banking services to food-ordering applications. But the mere existence of a chatbot is not enough to ensure successful online interactions that increase customer satisfaction and solve customers' issues.

Here are some examples of chatbots performing poorly and, to say the least, failing to enhance the customer experience. The examples come from the chatbot of a private Indian bank and from a website-building tool.

What went wrong? Why were the chatbots not able to understand what the customer wanted? This article will walk you through the reasons behind these pitfalls. And don't think that these pitfalls are rare: in fact, 73% of chatbot conversations are not satisfying for customers, do not answer their questions, or do not solve their problems (Source).


Before diving right into the technicalities of why chatbots sometimes fail, we need to understand the primary component in the functionality of a chatbot: NLU or Natural Language Understanding.

What is NLU (Natural Language Understanding)?

NLU is AI-driven software that determines what a sentence sent to the chatbot by the customer is about overall, and extracts the relevant information within that sentence.

The technical term for “what the sentence is overall about” is Intent Classification, and we speak about Entity Recognition when we “extract relevant information within that sentence”.
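To make the two terms concrete, here is a minimal sketch of what an NLU result for a single sentence might look like. The class names, the intent label `transfer_money`, the entity types and the example sentence are all hypothetical and chosen for illustration only; real NLU services each define their own response format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    """A piece of relevant information extracted from the sentence."""
    type: str   # hypothetical entity type, e.g. "amount"
    value: str  # the text span that was recognised


@dataclass
class ParseResult:
    """What an NLU model returns for one sentence (illustrative shape only)."""
    text: str                 # the sentence sent by the customer
    intent: str               # Intent Classification: what the sentence is about overall
    entities: List[Entity] = field(default_factory=list)  # Entity Recognition


# Hypothetical example for a banking chatbot.
result = ParseResult(
    text="Transfer 500 rupees to my savings account",
    intent="transfer_money",
    entities=[
        Entity(type="amount", value="500 rupees"),
        Entity(type="account_type", value="savings"),
    ],
)

print(result.intent)
for entity in result.entities:
    print(entity.type, "->", entity.value)
```

When a chatbot fails to understand a customer, either the `intent` is wrong or one of the `entities` is missing or wrong.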

NLU models are most commonly used to automate customer engagement through chatbots. They are also used to flag hate speech or abusive comments on social media and gaming platforms, and to analyze how your customers react to products, events or campaigns through sentiment analysis.

So why do some chatbots fail to understand the customer's intent and fail to detect entities? NLU is AI-driven software, and the most important point about AI software is that it does not always get everything correct. If you search through the Photos on your iPhone, you will see that not all pictures of your parents are shown when you type "mum and dad" in the search field. Or perhaps you were one of the lucky people who got unintentional freebies at Amazon Fresh when its cameras were not able to detect all of the items you put in your bag.

Whether it's searching through Photos on your iPhone or placing items in your bag at Amazon Fresh, AI software is behind these automation features, and it is never correct 100% of the time. In fact, not even humans are correct 100% of the time…

So wouldn't it be great to minimise the instances when the chatbot does not understand what the customer wants? Of course it would, and that's what the AI research community is striving for: making our AI software tools correct more often in their outputs, whether that's a classification, a prediction or anything in between.

For an objective evaluation of how often any AI software is correct, researchers have come up with multiple metrics, and one that is used very often is accuracy.

What is accuracy?

Accuracy measures the percentage of correct answers or outputs of the AI software in relation to how many inputs have been given to the AI software in total.

In other words, we count how many input utterances are given to the AI software and how many of those were classified correctly, and divide the latter by the former. For chatbots, accuracy is measured on intents, which means that we want to know the percentage of correctly classified sentences or phrases.

Let's walk through this with an example: we take 100 utterances (or sentences) that the NLU model has not seen before as part of its training set. We, as humans, know the meaning behind each of these 100 utterances. Now we process one utterance after the other with our NLU model and check what meaning the model assigns to each. When the NLU model assigns the same meaning as we humans do, we say that the classification was correct: we add 1 to the number of correctly classified utterances and, of course, 1 to the number of processed utterances. If the NLU model did not return the same meaning, or intent, as we humans did, we add 1 only to the number of processed utterances, not to the number of correctly classified ones.

This results in a simple metric that gives you the percentage of correctly classified input utterances out of a set of utterances that has never been seen by the AI software before. We call this set a test set.

In mathematical terms,

Accuracy = (Correctly classified test utterances) / (Total number of test utterances)
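The counting procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the evaluation code of any particular NLU product; the function name `accuracy`, the intent labels and the five-utterance test set are assumptions made up for the example.

```python
def accuracy(human_intents, model_intents):
    """Share of test utterances where the model's intent matches the human label."""
    if len(human_intents) != len(model_intents):
        raise ValueError("Both lists must cover the same test utterances")
    if not human_intents:
        raise ValueError("The test set must not be empty")

    # Add 1 for every utterance where the model agrees with the human label.
    correct = sum(
        1 for human, model in zip(human_intents, model_intents) if human == model
    )
    # Divide correctly classified utterances by all processed utterances.
    return correct / len(human_intents)


# Hypothetical test set of five utterances (intent labels are made up).
human_labels = ["check_balance", "block_card", "check_balance", "open_account", "block_card"]
model_output = ["check_balance", "block_card", "open_account", "open_account", "block_card"]

score = accuracy(human_labels, model_output)
print(f"Accuracy: {score:.0%}")  # 4 of 5 utterances match, so 80%
```

The test set must consist of utterances the model did not see during training; otherwise the score overstates how well the chatbot will cope with real customer messages.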

The need for accurate NLU models

Now that we have understood what accuracy actually is, let's dive into the benefits of very accurate NLU models. Here's how an accurate NLU model, and thus an accurate chatbot, can add value to your business:

1. Overall user experience

An accurate NLU model enables the chatbot to understand and interpret user input more accurately. This means that the chatbot is better able to respond to user queries and requests, leading to a more satisfactory user experience because users’ queries are answered correctly.

2. Reduced errors

An accurate NLU model reduces the number of errors made by the chatbot, which can help to improve the overall quality and reliability of the chatbot. This leads to a higher percentage of automated customer conversations, which reduces costs for customer support departments.

3. Enhanced personalization

With an accurate NLU model, chatbots can be programmed to provide personalized recommendations and advice to customers based on their previous interactions and purchase history. This can enable the chatbot to understand and respond to user input in a more personalized and context-aware manner, improving overall customer satisfaction.

4. Customer retention

When a chatbot conversation has proven to be successful in the past, users are more likely to return to your platform.

5. Increase the number of automatically handled customer queries

When launching a new chatbot, you may want to start with only a few customer queries that can be handled automatically by the chatbot. Over time, however, you can increase the accuracy on messages that fall under one of the query categories (intents) that the chatbot handles automatically. Once you have reached a high level of accuracy on the existing query categories, you can add more complex categories to the chatbot's capabilities. This results in fewer customer interactions with agents, which again saves costs.

How much businesses can save with a chatbot

Let us now walk you through a couple of example calculations of how much cost can be saved in different parts of the world, based on average salaries for customer support agents.

The total cost includes the cost of a chatbot, assuming that the average customer conversation includes six question-answer interactions (called NLU parse requests) and that one NLU parse request costs 0.001 USD. In the tables below, we show the percentage of customer conversations/queries that are automatically responded to and solved by a chatbot. The abbreviation "pm" stands for "per month".

Saudi Arabia


  • Cost per month per customer support agent: $1700

  • Conversations per month per agent: 2000



  • Cost per month per customer support agent: $970

  • Conversations per month per agent: 2000



  • Cost per month per customer support agent: $210

  • Conversations per month per agent: 2000



  • Cost per month per customer support agent: $2040

  • Conversations per month per agent: 2000
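The arithmetic behind such a comparison can be sketched as follows, using the figures stated above (six NLU parse requests per conversation at 0.001 USD each, 2000 conversations per agent per month) and the Saudi Arabia agent cost of $1700. The monthly volume of 100,000 conversations and the 60% automation rate are hypothetical inputs chosen for illustration, and the sketch assumes that agent cost scales proportionally with the number of conversations and that every conversation passes through the chatbot first.

```python
# Figures stated in the article.
PARSE_REQUESTS_PER_CONVERSATION = 6
COST_PER_PARSE_REQUEST_USD = 0.001
CONVERSATIONS_PER_AGENT_PM = 2000


def agent_only_cost(conversations_pm, agent_cost_pm):
    """Monthly cost when agents handle every conversation.

    Assumption: agent cost is pro-rated, so fractional agents are allowed.
    """
    return conversations_pm / CONVERSATIONS_PER_AGENT_PM * agent_cost_pm


def cost_with_chatbot(conversations_pm, agent_cost_pm, automation_rate):
    """Monthly cost when the chatbot solves a share of the conversations.

    Assumption: every conversation goes through the chatbot first, so the
    NLU cost applies to all conversations, not only the automated ones.
    """
    if not 0.0 <= automation_rate <= 1.0:
        raise ValueError("automation_rate must be between 0 and 1")
    chatbot_cost = (
        conversations_pm * PARSE_REQUESTS_PER_CONVERSATION * COST_PER_PARSE_REQUEST_USD
    )
    escalated_to_agents = conversations_pm * (1.0 - automation_rate)
    return chatbot_cost + agent_only_cost(escalated_to_agents, agent_cost_pm)


# Hypothetical scenario: 100,000 conversations pm, 60% solved by the chatbot.
volume_pm = 100_000
saudi_agent_cost_pm = 1700  # from the Saudi Arabia figures above

baseline = agent_only_cost(volume_pm, saudi_agent_cost_pm)
with_bot = cost_with_chatbot(volume_pm, saudi_agent_cost_pm, automation_rate=0.6)
saving = 1 - with_bot / baseline

print(f"Agents only:  {baseline:,.0f} USD pm")
print(f"With chatbot: {with_bot:,.0f} USD pm")
print(f"Saving:       {saving:.1%}")
```

Because a conversation costs well under a cent on the chatbot side and 0.85 USD on the agent side in this scenario, the saving tracks the automation rate closely, which is why accuracy on the automated intents matters so much.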

To conclude

The importance of accuracy in a chatbot cannot be overstated. A chatbot with high accuracy is integral to a positive user experience and is more likely to be used frequently. The above calculations show how, with chatbot support, businesses can reduce costs by more than 50%!

Request a demo with our team and let us help you cut costs with the most accurate NLU models supporting nearly 100 languages.

We compared NeuralSpace’s intent classification accuracy with IBM Watson, Google Cloud’s Dialogflow and Amazon Lex. See the results here.
