Why Hello Ebbot chose NeuralSpace as their NLU provider
Introduction to Hello Ebbot
Hello Ebbot is a Stockholm-based early-stage company that started operating in 2018. Specializing in Conversational AI in Scandinavian languages, they continue to develop their popular chatbot Ebbot using the latest technology. Their customers range from electronics retailers (NetOnNet) to municipalities across Sweden (for example, Varbergs Kommun). Other prominent customers include Byggmax and Oresunds Kraft. Like many other Conversational AI companies, Hello Ebbot used a natural language understanding (NLU) engine from Google Cloud, AWS, or Microsoft Azure in their bot-building platform.
Introduction to NeuralSpace
NeuralSpace is a Natural Language Processing (NLP) company specializing in local, or low-resource, languages. The NeuralSpace Platform is a one-stop solution for developers who need NLP features in more than 80 languages. It does not require any machine learning expertise: a handful of data is enough to train a custom model with the click of a button. The NeuralSpace Platform is modular, and each of its Apps, from NLU to Speech-to-Text and Transliteration, can be used as a standalone product and even installed on a private cloud if required.
Software development companies often ask the following questions when choosing an NLU provider for their solution. We answer them below:
How much data is enough to get started?
This depends on two factors: (1) the initial accuracy level that you expect your chatbot to have when used by consumers; and (2) how many different intents the chatbot should be able to classify. Generally speaking, we recommend that developers start with about 10 examples per intent to go live, which gives the chatbot an accuracy level of approximately 80% in most cases. Developers can then build up their training data set with actual user inputs, annotate them, retrain the chatbot, and quickly achieve accuracy levels above 90%. If consumers use language that contains technical phrases or domain-specific jargon, we recommend starting with at least twice as many examples for the initial training.
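The rule of thumb above can be turned into a quick back-of-the-envelope calculation. The following is only a minimal sketch: the function name and its parameters are hypothetical and not part of the NeuralSpace Platform, and the figures (10 examples per intent, doubled for jargon-heavy domains) are taken directly from the recommendation above.

```python
def initial_examples_needed(num_intents: int,
                            domain_jargon: bool = False,
                            per_intent: int = 10) -> int:
    """Estimate the size of a starting training set.

    Hypothetical helper: 10 examples per intent as a baseline,
    doubled when users are expected to use domain-specific jargon.
    """
    if num_intents < 0 or per_intent < 0:
        raise ValueError("counts must be non-negative")
    multiplier = 2 if domain_jargon else 1
    return num_intents * per_intent * multiplier


# A bot with 12 intents in everyday language:
print(initial_examples_needed(12))                      # 120
# The same bot in a jargon-heavy domain:
print(initial_examples_needed(12, domain_jargon=True))  # 240
```

These numbers are lower bounds for going live, not targets; the data set is expected to grow afterwards from real user input.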
What would be the lifecycle of a model?
Once developers have prepared their data and trained a model, it can be connected directly to any product or application via REST APIs. Live consumer input is then collected and saved so that it can be used to retrain the model with more data. NeuralSpace has built a separate dashboard for this feedback process that allows chatbot developers to quickly annotate data and retrain their models. We generally recommend retraining the model at least once a month, although the schedule is ultimately up to the developers. Because the NeuralSpace Platform has an integrated version-control system, retraining carries no risk: a retrained model can be assessed for accuracy and tested manually before it is deployed in the developer’s product or application.
How does the NeuralSpace Platform perform so well in low-resource languages?
Low-resource languages are languages that have a limited number of training examples available compared to English, Spanish, French, or German. To put this in numbers, many low-resource languages have only 1% of the data, or even less, that is available for the language with the most examples, which is usually English. Hence, pre-trained models are scarce, and advanced deep learning architectures like GPT-3, which require huge amounts of data to train, are usually not usable with low-resource languages. NeuralSpace has developed proprietary deep learning architectures that are used in the NeuralSpace Platform for all of its 80+ languages. These model architectures have been developed and refined over numerous iterations by benchmarking them on diverse data sets. In addition, synonyms, regex, and lookup tables are used as entity extractors, which makes predicting intents from user input more robust. The NeuralSpace Platform also has data-enhancing features such as data augmentation (called Augmentation), machine translation, and transliteration, which help developers either to expand their training data easily, by up to ten times, or to translate their training data into a high-resource language like English.
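The train, collect feedback, retrain, and deploy cycle described above can be sketched as follows. This is an illustration under stated assumptions, not the NeuralSpace API: `ModelVersion`, `VersionedModel`, and `accuracy` are hypothetical names, and the "model" is a toy exact-match lookup standing in for a real NLU model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Example = Tuple[str, str]  # (utterance, intent)


@dataclass
class ModelVersion:
    number: int
    lookup: Dict[str, str]

    def predict(self, text: str) -> Optional[str]:
        # Toy stand-in for a real model: exact match on normalized text.
        return self.lookup.get(text.strip().lower())


@dataclass
class VersionedModel:
    versions: List[ModelVersion] = field(default_factory=list)
    deployed: Optional[int] = None  # version number currently live

    def train(self, examples: List[Example]) -> ModelVersion:
        # Every training run creates a new version; nothing is overwritten.
        lookup = {text.strip().lower(): intent for text, intent in examples}
        version = ModelVersion(number=len(self.versions) + 1, lookup=lookup)
        self.versions.append(version)
        return version

    def deploy(self, number: int) -> None:
        if not 1 <= number <= len(self.versions):
            raise ValueError(f"unknown version {number}")
        self.deployed = number


def accuracy(version: ModelVersion, test_set: List[Example]) -> float:
    if not test_set:
        return 0.0
    hits = sum(version.predict(text) == intent for text, intent in test_set)
    return hits / len(test_set)


seed = [("where is my order", "track_order"), ("opening hours", "store_hours")]
feedback = [("when do you open", "store_hours")]  # annotated live input
test_set = seed + feedback

model = VersionedModel()
v1 = model.train(seed)
model.deploy(v1.number)

# Retrain with the feedback, but only promote the new version if it is
# at least as accurate as the one currently deployed.
v2 = model.train(seed + feedback)
if accuracy(v2, test_set) >= accuracy(v1, test_set):
    model.deploy(v2.number)
```

Because each training run produces a separate version, the previously deployed version stays available while the candidate is evaluated, which is the property that makes frequent retraining safe.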
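To make the role of synonyms, regex, and lookup tables as entity extractors more concrete, here is a minimal sketch. It does not reflect how the NeuralSpace Platform implements them; the order-number pattern, the product list, the synonym map, and the function name are all hypothetical examples.

```python
import re
from typing import Dict, List

# Hypothetical order-number format: two capital letters and six digits.
ORDER_ID = re.compile(r"\b[A-Z]{2}\d{6}\b")
# Lookup table of known product names (illustrative values).
PRODUCTS = {"laptop", "headphones", "tv"}
# Synonyms mapped to the canonical value in the lookup table.
SYNONYMS = {"television": "tv", "notebook": "laptop"}


def extract_entities(text: str) -> List[Dict[str, str]]:
    entities: List[Dict[str, str]] = []
    # Regex extractor: pattern-shaped entities such as IDs.
    for match in ORDER_ID.finditer(text):
        entities.append({"type": "order_id", "value": match.group()})
    # Synonym + lookup extractor: normalize each token, then check the table.
    for token in re.findall(r"\w+", text.lower()):
        canonical = SYNONYMS.get(token, token)
        if canonical in PRODUCTS:
            entities.append({"type": "product", "value": canonical})
    return entities


result = extract_entities("My television from order AB123456 is broken")
print(result)
# [{'type': 'order_id', 'value': 'AB123456'}, {'type': 'product', 'value': 'tv'}]
```

Rule-based extractors like these need no training data at all, which is one reason they are a useful complement to a learned model when examples are scarce.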
Are data shared between customers of the NeuralSpace Platform?
Data are not shared between customers of the NeuralSpace Platform. Every customer receives a copy of a pre-trained model that possesses some level of intrinsic knowledge about the language and can handle a given set of entities. Training this model to recognize intents is very efficient because that intrinsic knowledge is automatically leveraged for any intent that the chatbot developer adds. This allows developers not only to achieve high accuracy with a small number of training examples but also to train, deploy, and test in a time-efficient manner. Although AutoNLP is known for requiring more training time, developers should not have to wait longer than 15 minutes for most of their data sets.
Are personal data used for predicting?
Consumers’ personal data are not used in pre-training any models offered on the NeuralSpace Platform. However, developers should upload personal data to make the predictions of their private copy of a pre-trained model more accurate and customized to their use cases. This process is called fine-tuning, and NeuralSpace’s AutoNLP can do it within a few minutes. Again, only 10 examples per intent are sufficient to start the fine-tuning process and go live. A good practice is to add the chatbot users’ actual input as further training data using NeuralSpace’s feedback feature.
Where are data stored physically?
Data are stored in a private cloud hosted by Hello Ebbot in a data center in the EU, operated by OVH Cloud. This means that none of Hello Ebbot’s customer data ever leaves their premises and they have maximum control over their data.
Does the NeuralSpace Platform comply with GDPR?
As the NeuralSpace Platform is hosted in Hello Ebbot’s private cloud, Hello Ebbot is responsible for storing their customer data securely and in accordance with GDPR guidelines. Neither NeuralSpace nor anyone else has the right to extract any personal data from Hello Ebbot’s private cloud.
Read more about NeuralSpace’s partnership with Hello Ebbot here.