AutoNLP and AutoMLOps at NeuralSpace
Introduction to NLP
Natural language processing (NLP) is a well-established subfield of artificial intelligence that has had tremendous success in recent years. Its applications have multiplied, both in innovation and in consumer adoption; personal voice assistants and chatbots are prime examples among many others. The field has roots in linguistics and has been around for more than 50 years. With NLP, text or voice input is transformed into a computer-readable format so that computers can process natural language in a way that approximates how humans understand it.
NLP allows computers to mimic and understand our natural, spoken language. The field sits at the intersection of computational linguistics and statistical machine learning. Language is incredibly complex, especially spoken language: sarcasm, context, emotions, slang, insider phrases and the meanings that link them are particularly difficult for machines to understand and analyze. Industry experts say that although NLP has grown since its beginning, implementing it successfully remains among the biggest challenges in Big Data.
NLP involves a number of modeling approaches, from transformer-based deep learning to conditional random fields, and any of them can turn out to perform best on a given labeled data set. However, it is never obvious beforehand which approach will give the best accuracy on that particular data set. For example, if we build a model to detect topics such as misogynist language, racial injustice or other hate-speech-related categories, developing an accurate model can require hundreds of trials and experiments across different approaches before the most accurate one is found.
To address these challenges, NeuralSpace came up with its proprietary, language-agnostic AutoNLP: an algorithm that figures out which training pipeline, method, features, loss function and other hyperparameters will give the most accurate results on your unique data set. We built AutoNLP to be highly data-efficient, because creating data sets is itself a challenging task. For example, a model trained on a data set of 1000 examples usually achieves 95% accuracy and takes only about 5 minutes to train.
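To make the idea of searching over pipelines and hyperparameters concrete, here is a minimal sketch of a random search that keeps the best-scoring configuration. It is not NeuralSpace's proprietary algorithm: the search space, the function names and the toy scoring function are all assumptions made for illustration. In practice, `evaluate` would train a model with the given configuration and return its validation accuracy.

```python
import random
from typing import Callable, Dict, List, Tuple

Config = Dict[str, object]

# Hypothetical search space; names and values are illustrative only.
SEARCH_SPACE: Dict[str, List[object]] = {
    "model": ["transformer", "crf"],
    "learning_rate": [1e-5, 1e-4, 1e-3],
    "loss": ["cross_entropy", "focal"],
}


def sample_config(rng: random.Random) -> Config:
    """Draw one value per hyperparameter, uniformly at random."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}


def search(
    evaluate: Callable[[Config], float], n_trials: int, seed: int = 0
) -> Tuple[Config, float]:
    """Run n_trials random trials and return the best (config, score) pair."""
    if n_trials < 1:
        raise ValueError("n_trials must be at least 1")
    rng = random.Random(seed)
    best_config: Config = {}
    best_score = float("-inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_evaluate(config: Config) -> float:
    """Stand-in for 'train, then measure validation accuracy' (made-up numbers)."""
    score = 0.5
    if config["model"] == "transformer":
        score += 0.2
    if config["learning_rate"] == 1e-4:
        score += 0.1
    return score


if __name__ == "__main__":
    config, score = search(toy_evaluate, n_trials=50)
    print(config, score)
```

Random search is only one possible strategy; it is used here because it is short and easy to verify, not because the source says AutoNLP works this way.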
The NeuralSpace Platform consists of various Apps that each solve a specific language automation task, from language understanding, entity recognition, machine translation, transliteration and data augmentation to speech to text and text to speech. Currently, only the language understanding App can be trained with AutoNLP, but soon all the Apps will support training with AutoNLP.
When software developers click the Train with AutoNLP button after preparing their data set, several processes run automatically behind the scenes, with no further interaction required from the user. First, the NeuralSpace Platform selects the appropriate model architecture based on the language you have selected; multilingual models are supported too. Selecting the architecture includes extracting features from a pre-trained model for the selected language, which has been trained on millions of generic sentences.
After these features are extracted, they are passed on to a custom model that is unique to each user. This custom model is also based on transformers and is fine-tuned on the examples provided by the user. This allows the full model (pre-trained + custom) to be trained accurately on small amounts of data in a short time.
AutoNLP also controls the training duration, so that models do not overfit on the given, possibly small, data sets and training stops once no significant improvement can be achieved. This saves users the time and money that unnecessary training would consume. Whenever a model is trained, the NeuralSpace Platform also lets users configure how many parallel training jobs they would like to start.
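Stopping training once improvements level off is commonly done with patience-based early stopping. The sketch below shows that general technique on a list of per-epoch validation scores; the function name and the `patience` and `min_delta` parameters are assumptions for illustration, and the source does not say which stopping rule AutoNLP actually uses.

```python
from typing import Iterable, Tuple


def early_stopping_epoch(
    val_scores: Iterable[float], patience: int = 3, min_delta: float = 0.001
) -> Tuple[int, int]:
    """Return (last_epoch_run, best_epoch) for a stream of validation scores.

    Training stops after `patience` consecutive epochs in which the score
    fails to beat the best score so far by more than `min_delta`.
    """
    best_score = float("-inf")
    best_epoch = -1
    last_epoch = -1
    epochs_without_improvement = 0
    for epoch, score in enumerate(val_scores):
        last_epoch = epoch
        if score > best_score + min_delta:
            # Significant improvement: remember it and reset the counter.
            best_score = score
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # no significant improvement for `patience` epochs
    return last_epoch, best_epoch


if __name__ == "__main__":
    scores = [0.60, 0.72, 0.80, 0.805, 0.8052, 0.8049, 0.8051, 0.90]
    print(early_stopping_epoch(scores))
```

With the sample scores, training stops after the epoch at index 6 and the checkpoint from index 3 is the one to keep; the later score of 0.90 is never reached, which is the usual trade-off of this rule.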
When training concurrently, users can train up to five models at the same time, and additional examples may be uploaded in the meantime. With NeuralSpace's version control system in place, users can easily benchmark multiple versions of a model that have been trained on more data. Training on NeuralSpace is also optimized to use specific machines in order to deliver results faster.
In detail, if the data set is small, training is queued into a pool of pre-defined workers; if the data set is large, training happens on a dedicated VM whose computing resources are not shared with other users. For data sets with up to 10 intents, we recommend training at least three models in parallel when there are fewer than 50 examples per intent, because small data sets generally show high variance. With more than 50 examples per intent, the variance is smaller and one model should be sufficient.
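The recommendation above can be written down as a small helper. This is only a sketch: the function name is hypothetical, and three details are assumptions because the text does not specify them. It uses the intent with the fewest examples as the deciding value, it treats exactly 50 examples as "sufficient", and it falls back to the conservative choice of three models when there are more than 10 intents.

```python
from typing import Dict

SMALL_DATA_THRESHOLD = 50   # examples per intent, from the recommendation
MAX_INTENTS_COVERED = 10    # the recommendation only covers up to 10 intents
CONSERVATIVE_CHOICE = 3     # "at least three models in parallel"


def recommended_parallel_models(examples_per_intent: Dict[str, int]) -> int:
    """Suggest how many models to train in parallel for a data set."""
    if not examples_per_intent:
        raise ValueError("the data set must contain at least one intent")
    if len(examples_per_intent) > MAX_INTENTS_COVERED:
        # Assumption: outside the stated range, stay conservative.
        return CONSERVATIVE_CHOICE
    # Assumption: the sparsest intent decides whether the data set is "small".
    smallest = min(examples_per_intent.values())
    if smallest < SMALL_DATA_THRESHOLD:
        return CONSERVATIVE_CHOICE
    return 1


if __name__ == "__main__":
    print(recommended_parallel_models({"greet": 20, "goodbye": 80}))
    print(recommended_parallel_models({"greet": 60, "goodbye": 80}))
```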
Now the question is: how can we deploy the model so that it can serve thousands of API requests every minute?
Once you click the Deploy button, AutoMLOps encapsulates your trained model in a production-ready infrastructure that lets it process a large number of requests per second with only a linear increase in infrastructure. In short, AutoMLOps gives your models scalability and availability, and it seamlessly manages all operations, from allocating compute resources to scaling the model.
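As a rough illustration of what linear scaling means for capacity planning, the number of replicas grows in proportion to the request rate. The helper below is a back-of-the-envelope sketch; the function name and the per-replica throughput figures are made up for the example and are not NeuralSpace measurements.

```python
import math


def replicas_needed(
    target_rps: float, per_replica_rps: float, min_replicas: int = 1
) -> int:
    """Replicas required to serve target_rps if each replica handles per_replica_rps."""
    if per_replica_rps <= 0:
        raise ValueError("per_replica_rps must be positive")
    if target_rps < 0:
        raise ValueError("target_rps must not be negative")
    # Round up, because a fraction of a replica cannot be deployed.
    return max(min_replicas, math.ceil(target_rps / per_replica_rps))


if __name__ == "__main__":
    # 3000 requests per minute is 50 requests per second.
    print(replicas_needed(target_rps=3000 / 60, per_replica_rps=20))
```

Doubling the target rate roughly doubles the replica count, which is the linear relationship described above.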
There are several things you need in order to deploy a language model in production. For any software application, it is common to host multiple instances (or replicas) of the same algorithm so that the application can handle high traffic loads and reduce failure rates. NeuralSpace's AutoMLOps ensures that your model is available 24/7/365 and can process large numbers of API requests per second. When deploying, developers have the option to select multiple replicas, which jointly keep latency low whenever an end user calls your model. In the rare case that one replica fails, developers have others to fall back on.
This is achieved by serving each model replica in an autoscaling environment. Each replica also uses our core pre-trained models, so if the requests to your replica increase considerably, our backend services scale automatically. Each replica is deployed as an independent model with its own service manager, and during parsing our middleware calls the replicas directly so that latency stays at a minimum. If a replica fails or crashes, which is very rare, a replacement replica is spun up automatically and is live in less than ten seconds. Finally, developers can easily integrate deployed replicas into any web or software application using REST APIs in almost any programming language.
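The fallback behaviour described above can be pictured as a dispatcher that rotates through replicas and moves on to the next one when a call fails. This is a simplified, in-process sketch, not NeuralSpace's middleware: `ReplicaPool`, `AllReplicasFailed` and the two example replicas are hypothetical names, and real replicas would be remote services rather than Python functions.

```python
from typing import Callable, Dict, List, Sequence

Replica = Callable[[str], Dict[str, str]]


class AllReplicasFailed(RuntimeError):
    """Raised when no replica could answer a request."""


class ReplicaPool:
    """Round-robin dispatch with failover across model replicas."""

    def __init__(self, replicas: Sequence[Replica]) -> None:
        if not replicas:
            raise ValueError("at least one replica is required")
        self._replicas: List[Replica] = list(replicas)
        self._next = 0  # index of the replica that gets the next request

    def parse(self, text: str) -> Dict[str, str]:
        count = len(self._replicas)
        start = self._next
        self._next = (start + 1) % count  # rotate to spread the load
        last_error: Exception = RuntimeError("no replica was tried")
        for offset in range(count):
            replica = self._replicas[(start + offset) % count]
            try:
                return replica(text)
            except Exception as exc:  # a failed replica: try the next one
                last_error = exc
        raise AllReplicasFailed(f"all {count} replicas failed") from last_error


def healthy_replica(text: str) -> Dict[str, str]:
    """Toy replica that always answers (the intent label is made up)."""
    return {"text": text, "intent": "greet"}


def broken_replica(text: str) -> Dict[str, str]:
    """Toy replica that simulates a crash."""
    raise ConnectionError("replica down")


if __name__ == "__main__":
    pool = ReplicaPool([broken_replica, healthy_replica])
    print(pool.parse("hello"))  # served by the healthy replica
```

A production setup would also remove or replace failing replicas instead of retrying them on every request, which is the role the automatic spin-up plays in the description above.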
The NeuralSpace Platform is live, so try it out for yourself! Early sign-ups get $500 worth of credits. What are you waiting for?
Join the NeuralSpace Slack Community to connect with us, receive updates and discuss topics in NLP for low-resource languages with fellow developers and researchers.
Check out our Documentation to read more about the NeuralSpace Platform and its different Apps.