NeuralSpace Release Notes: What’s new in Version 1.4.0?
This release brings two brand-new speech AI services along with several major Platform updates. Scroll down to catch up!
With this release, we launch our Speaker Identification service. This service automatically identifies the number of speakers in an audio file and defines which parts of the audio belong to which speaker.
This is a perfect tool to transcribe meetings, videos with multiple speakers, phone calls, etc.
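To make the idea concrete, here is a minimal Python sketch of what you might do with speaker-labelled segments once you have them. The segment format (a list of entries with "speaker", "start" and "end" in seconds) and the helper names are assumptions made for illustration, not the actual response format of the service.

```python
from collections import defaultdict

# Hypothetical segment format, assumed for illustration only:
# one entry per speaker turn, with start/end times in seconds.
segments = [
    {"speaker": "speaker_0", "start": 0.0, "end": 4.5},
    {"speaker": "speaker_1", "start": 4.5, "end": 9.0},
    {"speaker": "speaker_0", "start": 9.0, "end": 12.0},
]


def talk_time_per_speaker(segments):
    """Sum the duration of every segment, grouped by speaker label."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    return dict(totals)


def speaker_at(segments, t):
    """Return the speaker label active at time t, or None if nobody is speaking."""
    for seg in segments:
        if seg["start"] <= t < seg["end"]:
            return seg["speaker"]
    return None


totals = talk_time_per_speaker(segments)
print("speakers found:", len(totals))
print("talk time (s):", totals)
print("speaker at 5.0s:", speaker_at(segments, 5.0))
```

A lookup like speaker_at is what lets you attach a speaker label to each timestamped word of a transcript.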
Our new Voice Extraction service smoothly separates a speaker's voice from the background noise in an audio file. This helps improve the quality of transcriptions, especially when there is a lot of background noise.
This is a perfect tool for auto-overdubbing videos. With a service like this, you will never have to worry about the background noise present in the video. You can extract the voice and the background audio separately and then overlay your overdub audio on the background audio.
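The overlay step can be sketched in a few lines of Python. This is only an illustration of sample-wise mixing, assuming both tracks are already decoded to floating-point samples in the range -1.0 to 1.0 at the same sample rate; the function name and the gain parameter are hypothetical and not part of the platform.

```python
def overlay(background, overdub, overdub_gain=1.0):
    """Mix two mono tracks sample by sample.

    Assumes both inputs are sequences of floats in [-1.0, 1.0] at the same
    sample rate. The shorter track is padded with silence, and the sum is
    clipped so it stays inside the valid range.
    """
    length = max(len(background), len(overdub))
    mixed = []
    for i in range(length):
        b = background[i] if i < len(background) else 0.0
        o = overdub[i] if i < len(overdub) else 0.0
        sample = b + overdub_gain * o
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed


# Tiny made-up example: three background samples, two overdub samples.
print(overlay([0.25, 0.5, -0.5], [0.25, 0.75]))
```

Hard clipping is the simplest way to keep the result in range; a real pipeline would more likely lower the gain or apply a limiter to avoid audible distortion.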
File management system
Files can now be shared across different services. For example, a file you upload for Transcription can also be used for Voice Extraction through the unique file ID you receive when you upload it.
A new analytics page is now available for Language Understanding and Entity Recognition. It tells you which intents and entities are performing well and which ones aren't. Along with a detailed classification report, you get an interactive confusion matrix for both services, which shows how the trained model performs on each class.
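If you are new to these metrics, the following Python sketch shows how a confusion matrix and a per-intent classification report are computed in general. It uses made-up intent labels and is not the platform's own implementation; all names are chosen for illustration.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(y_true, y_pred))


def classification_report(y_true, y_pred):
    """Compute precision, recall and F1 for every label."""
    counts = confusion_counts(y_true, y_pred)
    labels = sorted(set(y_true) | set(y_pred))
    report = {}
    for label in labels:
        tp = counts[(label, label)]
        # Predicted as this label, but the true label was something else.
        fp = sum(counts[(other, label)] for other in labels if other != label)
        # Truly this label, but predicted as something else.
        fn = sum(counts[(label, other)] for other in labels if other != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Made-up example: true intents vs. intents predicted by a model.
y_true = ["greet", "greet", "order", "order", "order", "cancel"]
y_pred = ["greet", "order", "order", "order", "cancel", "cancel"]

for label, scores in classification_report(y_true, y_pred).items():
    print(label, scores)
```

Low precision for an intent means the model predicts it too eagerly; low recall means it misses real examples of that intent. The off-diagonal cells of the confusion matrix tell you which intents are being mixed up.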
You can now register a webhook on the platform and get live updates on all asynchronous tasks, e.g., model status while it is training, or file status during transcription, voice extraction, or speaker identification. You can also get the status of batch TTS requests. This way you will never have to poll the status API again and again to perform a subsequent task based on its status.
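As a rough sketch of the receiving side, the Python snippet below accepts webhook calls with the standard library and triggers a follow-up step once a task reaches a final state. The payload fields ("task_id", "status"), the status values and the port are assumptions made for illustration; check the Documentation for the actual payload format.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical final states; the real status values may differ.
FINAL_STATES = {"completed", "failed"}


def handle_event(event, on_done):
    """Call on_done(task_id, status) once a task reaches a final state.

    Returns True if the callback was invoked, False otherwise.
    """
    if not isinstance(event, dict):
        return False
    if event.get("status") in FINAL_STATES:
        on_done(event.get("task_id"), event["status"])
        return True
    return False


finished_tasks = []  # filled in by the webhook handler below


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            event = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self.send_response(400)
            self.end_headers()
            return
        handle_event(event, lambda task_id, status:
                     finished_tasks.append((task_id, status)))
        self.send_response(204)  # acknowledge receipt without a body
        self.end_headers()


def serve(port=8000):
    """Start the receiver; blocks until interrupted."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling serve() starts the receiver. Keeping the decision logic in handle_event, separate from the HTTP handler, makes it easy to test without starting a server.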
We have expanded our Language Support to 74 languages spoken across Asia, Africa, the Middle East and Europe. More languages and domains will be added soon. Feel free to reach out to us if you have any preferences.
Our Text-to-Speech service covers over 40 languages and more than 200 AI voices! More languages and voices will be added soon. Feel free to reach out to us if you have any preferences.
In the next release, we aim to introduce a brand new AutoNLP pipeline that runs faster and has amazing results to offer.
Speech-to-Text models will be able to adapt to your custom vocabulary using only textual data.
Text-to-Speech models with the ability to accurately clone celebrity voices.
Speaker identification using speaker samples: you can upload 30-second audio files for specific speakers to identify them by name.
If you haven’t yet, sign up on the NeuralSpace Platform to try it out for yourself! Get started with $200 worth of credits.
Be sure to check out our Documentation to read more about the NeuralSpace Platform and its different services.