Juhi Jain

Transliteration Overview


Introduction


Transliteration converts a word from one alphabet to another based on how it sounds. For instance, you can type a Hindi word on a Latin keypad the way you would pronounce it, and transliteration converts it into the Hindi (Devanagari) script.
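To make the idea concrete, here is a minimal sketch of character-by-character transliteration from Cyrillic to Latin. The mapping table and the function name `toy_transliterate` are illustrative assumptions and cover only a handful of letters. This is not how NeuralSpace's models work; they are trained models, whereas this is a plain lookup table.

```python
# Toy rule table: a few Cyrillic letters mapped to Latin spellings.
# Illustrative assumption only; not NeuralSpace's model or a standard scheme.
CYRILLIC_TO_LATIN = {
    "п": "p", "р": "r", "и": "i", "в": "v", "е": "e", "т": "t",
    "м": "m", "о": "o", "с": "s", "к": "k", "а": "a",
}


def toy_transliterate(word: str) -> str:
    """Lowercase the word, map each character through the table,
    and leave characters that are not in the table unchanged."""
    return "".join(CYRILLIC_TO_LATIN.get(ch, ch) for ch in word.lower())


print(toy_transliterate("привет"))  # privet
print(toy_transliterate("Москва"))  # moskva
```

A lookup table like this breaks down quickly for scripts where one sound maps to several letters or where context changes the spelling, which is why learned models are typically used in practice.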


With NeuralSpace’s Transliteration service, you can create content in over 40 language pairs using your Latin keypad. Or, vice versa, write your content in one of the 20+ languages in its native alphabet and transliterate it into the Latin (meaning English) alphabet at the click of a button.


We recently built a Transliteration Twitter bot using our Transliteration service. To use it, reply to the tweet that you want to transliterate with “@NeuralSpace Transliterate” and the result appears as a reply to that tweet in less than 5 seconds. Below is a quick demo. Try it out!



Features

  • Off-the-shelf Models: Use our pre-trained production-ready state-of-the-art models through APIs and integrate them in any application or software product

  • Language Support: Over 40 language pairs are supported including Arabic dialects and 12 Indian languages

  • High Throughput: Get a response in less than 400 milliseconds and scale automatically to match your requests per second load

  • Train Your Own with AutoNLP (coming soon): Improve or customize existing transliteration models with your own data using AutoNLP

  • Accelerate Dataset Creation with our Data Studio (coming soon): Equipped with handy utility tools, our Data Studio is an in-browser text editor for creating datasets

  • Easy to Integrate and Scale (coming soon): Scale or replicate your deployed models for higher availability and throughput and integrate them with your application through REST APIs

AI Modeling Life Cycle

Just like other services on NeuralSpace, Transliteration takes care of the entire AI modeling life cycle, which consists of:

  • Dataset preparation

  • Model training

  • Model deployment

  • Feedback loop

You can upload your existing datasets or create your own in the Data Studio, NeuralSpace’s data preparation and annotation tool, which is designed to make dataset creation and modification much faster.


Training a custom Transliteration model using AutoNLP is as easy as clicking on the Train with AutoNLP button once your dataset is uploaded and prepared in the Data Studio. Training completes in a couple of minutes, after which you can place your model in production. NeuralSpace’s in-house developed AutoMLOps feature allows you to use your custom-trained models with throughput rates of up to 30 requests per second. Just click on the Deploy button next to the trained model that achieved the best performance and let AutoMLOps handle the rest for you.


Once deployed, test your models on the Test model page and review their output on the Feedback page. The Feedback page lets you browse through everything that has passed through your models, and you can directly add sentences that were transliterated incorrectly back to your dataset. This starts a feedback-driven learning cycle: retrain your models on the updated dataset to keep them up to date.
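The feedback cycle described above can be sketched as a small data-handling step: keep only the reviewed outputs that were corrected and turn them into new training pairs. The `FeedbackRecord` class, its field names, and the sample data below are hypothetical and are not part of the NeuralSpace platform.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FeedbackRecord:
    """One logged request (hypothetical structure)."""
    source: str                        # text sent to the model
    prediction: str                    # what the model returned
    correction: Optional[str] = None   # set by a reviewer if the prediction was wrong


def corrections_to_examples(records: List[FeedbackRecord]) -> List[Tuple[str, str]]:
    """Return (source, corrected target) pairs for records a reviewer corrected."""
    return [
        (r.source, r.correction)
        for r in records
        if r.correction is not None and r.correction != r.prediction
    ]


# Hand-written sample data for illustration.
records = [
    FeedbackRecord("namaste", "नमसते", "नमस्ते"),   # reviewed and corrected
    FeedbackRecord("dhanyavad", "धन्यवाद"),         # accepted as-is
]
new_examples = corrections_to_examples(records)
print(len(new_examples))  # 1
```

The resulting pairs would be appended to the dataset before the next training run.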


Use Cases


Chatbot Use Cases


- Integrating with Google Maps, Spotify, and other APIs

If you have tried integrating Google Maps or Spotify APIs with your chatbots in, e.g., Tamil, Arabic, Chinese or Greek, you know they rarely produce acceptable results.

With Transliteration, you can extract entities like names, addresses, songs, etc. in your local language and convert them into the Latin (or English) alphabet to fully utilize APIs from Google Maps, Spotify and others.
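As a rough sketch of this flow, the snippet below attaches a Latin-alphabet query string to each extracted entity before it would be passed to a third-party API. The entity format, the `latinize_entities` helper, and the lookup-table transliterator are all assumptions made for illustration. In a real integration the `transliterate` callable would wrap a call to a transliteration service.

```python
from typing import Callable, Dict, List


def latinize_entities(
    entities: List[Dict[str, str]],
    transliterate: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Return copies of the entities with an added Latin-alphabet 'query' field."""
    return [{**e, "query": transliterate(e["value"])} for e in entities]


# Stand-in transliterator backed by a hand-written table (hypothetical data).
DEMO_TABLE = {"Αθήνα": "Athina", "Θεσσαλονίκη": "Thessaloniki"}


def demo_transliterate(text: str) -> str:
    """Look the text up in the demo table; return it unchanged if unknown."""
    return DEMO_TABLE.get(text, text)


entities = [{"type": "city", "value": "Αθήνα"}]
latinized = latinize_entities(entities, demo_transliterate)
print(latinized[0]["query"])  # Athina
```

Keeping the original `value` alongside the Latin `query` lets the chatbot reply in the user's script while querying the external API in the Latin alphabet.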


- Creating NLU Data for Chatbots

A common practice is to create a dataset in English and translate it to another language using Google Translate or similar APIs. While this can work well for simple FAQ chatbots, contextual chatbots require well-structured, meaningful data in the local languages. Transliteration-powered typing tools can help accelerate the dataset creation process for chatbots in languages that don’t use the Latin (or English) alphabet.


Accessibility Use Cases


- Low-Cost Content Accessibility in Multiple Languages

Internationalization and translation can be extremely expensive when your databases are in English and keep growing with dynamic content like songs, albums, or addresses. Transliteration can be a handy tool to convert such content on the fly into languages that don’t use the Latin (or English) alphabet. The converted content can be indexed using popular frameworks like Elasticsearch or similar tools and instantly made accessible in multiple languages. This can potentially save thousands of dollars on manual translation efforts.
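The indexing idea can be illustrated with a small in-memory lookup that stores each title under both its original and its transliterated form. This is only a sketch: the `build_bilingual_index` function and the hand-written demo table are assumptions, and a production setup would use a search engine rather than a Python dictionary.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Set


def build_bilingual_index(
    titles: Iterable[str],
    transliterate: Callable[[str], str],
) -> Dict[str, Set[str]]:
    """Map both the lowercased title and its transliteration to the original title."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for title in titles:
        index[title.lower()].add(title)
        index[transliterate(title)].add(title)
    return dict(index)


# Stand-in transliterator backed by a hand-written table (hypothetical data).
DEMO_TABLE = {"Yesterday": "यस्टरडे"}


def demo_transliterate(text: str) -> str:
    """Look the text up in the demo table; return it unchanged if unknown."""
    return DEMO_TABLE.get(text, text)


index = build_bilingual_index(["Yesterday"], demo_transliterate)
print(index[DEMO_TABLE["Yesterday"]])  # {'Yesterday'}
```

A user searching in the local script then finds the same record as a user searching in English.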


- Content Creation

Something as simple as typing can be extremely challenging for languages spoken in the Middle East, Africa, India and South East Asia. With governments and financial institutions mandating inclusion, and content creators being more used to the Latin keypad, a tool like Transliteration makes it much easier to create local-language content.

Language Support


  • English -> Hindi, Hindi -> English

  • English -> Arabic, Arabic -> English

  • English -> Bengali, Bengali -> English

  • English -> Gujarati, Gujarati -> English

  • English -> Kannada, Kannada -> English

  • English -> Malayalam, Malayalam -> English

  • English -> Marathi, Marathi -> English

  • English -> Punjabi, Punjabi -> English

  • English -> Sinhala, Sinhala -> English

  • English -> Tamil, Tamil -> English

  • English -> Telugu, Telugu -> English

  • English -> Urdu, Urdu -> English

  • English -> Bulgarian, Bulgarian -> English

  • English -> Greek, Greek -> English

  • English -> Armenian, Armenian -> English

  • English -> Georgian, Georgian -> English

  • English -> Macedonian, Macedonian -> English

  • English -> Mongolian, Mongolian -> English

  • English -> Russian, Russian -> English

  • English -> Serbian, Serbian -> English

  • English -> Ukrainian, Ukrainian -> English



 


Check out our Getting Started guide to learn how to use NeuralSpace’s Transliteration service.


The NeuralSpace Platform is live. Try it out for yourself! Sign up to get $200 worth of credits. What are you waiting for?


Join the NeuralSpace Slack Community to connect with us, ask questions and collaborate on exciting projects with other community members. Also, receive updates and discuss topics in NLP for low-resource languages with fellow developers and researchers.


Check out our Documentation to read more about the NeuralSpace Platform and its different services.


Happy NLP!


