  • Juhi Jain

Augmentation at NeuralSpace



Introduction


Any language processing task requires data, and the more data, the better. We all wish we could generate data by magic, but getting more data is rarely easy: it takes a lot of manual effort and, most importantly, a lot of money.


What if you could generate semantically similar data from existing data and use it to train your AI models? That would make things so much easier, right?


That is exactly the idea behind Augmentation.


What is Augmentation?


Augmentation is the method of synthesizing new data from existing data. Given a sentence, Augmentation can generate up to ten sentences while keeping the intent of the original sentence intact.


Augmentation can make Conversational AI more robust. Conversational AI is commonly trained on user utterances, that is, the different ways in which a user might say the same thing. In such cases, you can use Augmentation to grow your dataset up to 10x, which can improve the accuracy of your Conversational AI model.
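
As a rough illustration of growing a dataset this way, here is a minimal Python sketch. It assumes you already have some function that returns augmented sentences for an input sentence. The names `expand_dataset` and `fake_augment` are hypothetical and are not part of the NeuralSpace API; `fake_augment` is only a stand-in that returns canned variants so the example runs offline.

```python
from typing import Callable, Dict, List


def expand_dataset(
    examples: Dict[str, List[str]],
    augment: Callable[[str], List[str]],
    max_per_utterance: int = 10,
) -> Dict[str, List[str]]:
    """Return a copy of `examples` with augmented utterances added per intent."""
    expanded: Dict[str, List[str]] = {}
    for intent, utterances in examples.items():
        seen = set()  # lower-cased texts already kept for this intent
        merged: List[str] = []
        for utterance in utterances:
            # Keep the original first, then at most `max_per_utterance` variants.
            candidates = [utterance] + augment(utterance)[:max_per_utterance]
            for text in candidates:
                key = text.strip().lower()
                if key and key not in seen:
                    seen.add(key)
                    merged.append(text.strip())
        expanded[intent] = merged
    return expanded


def fake_augment(sentence: str) -> List[str]:
    # Hypothetical stand-in for a real augmentation call.
    return [f"{sentence} please", f"could you {sentence}"]


training_data = {"book_flight": ["book a flight", "reserve a plane ticket"]}
augmented = expand_dataset(training_data, fake_augment)
print(augmented["book_flight"])
```

Every variant stays under the intent label of the sentence it came from, which is what lets the extra sentences be used directly as training data. Duplicates are dropped so that repeated variants do not skew the dataset.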


You can also get creative and use Augmentation to generate synthetic text data for other use cases.



Let us take an example:


Suppose you enter the text “this is very cool”. Augmentation increases the amount of data by adding slightly modified copies of the existing data while keeping the intent of the original sentence. Here, the augmented results might look like this:


  • This is pretty cool

  • This is super cool

  • This is really cool

  • This is so cool
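
To make the idea of "slightly modified copies" concrete, the toy Python sketch below produces similar variants by swapping one intensifier word for another. This is a deliberately simple rule-based substitution, not the model behind NeuralSpace's Augmentation; the word list and the function name `swap_intensifier` are assumptions made for illustration.

```python
import re
from typing import List

# Small hand-picked list of interchangeable intensifiers (an assumption).
INTENSIFIERS = ["very", "pretty", "super", "really", "so"]


def swap_intensifier(sentence: str) -> List[str]:
    """Return copies of `sentence` with its first intensifier replaced."""
    pattern = re.compile(r"\b(" + "|".join(INTENSIFIERS) + r")\b", re.IGNORECASE)
    match = pattern.search(sentence)
    if match is None:
        return []  # nothing to swap, so no variants
    original = match.group(1).lower()
    variants = []
    for word in INTENSIFIERS:
        if word != original:
            variants.append(sentence[: match.start()] + word + sentence[match.end():])
    return variants


print(swap_intensifier("this is very cool"))
```

A rule like this only covers one narrow pattern and one language, which is the gap a model-based augmentation service is meant to fill.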


Features of NeuralSpace’s Augmentation


  • State-of-the-art Models: Use our state-of-the-art augmentation models through APIs and integrate them into any application

  • Easy to Use: Simply pass your text through the API, and get up to 10 augmented sentences

  • Language Support: Over 100 languages supported
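
For a feel of what passing text through an API can look like, here is a hedged Python sketch that only builds the HTTP request and does not send it. The URL, the header layout, and the JSON field names (`text`, `language`, `count`) are placeholders invented for this example; the real endpoint and parameters are in the official NeuralSpace documentation.

```python
import json
import urllib.request

# Hypothetical values: replace with the endpoint and key from the official docs.
API_URL = "https://example.invalid/augmentation"
API_KEY = "YOUR_API_KEY"


def build_augmentation_request(text, language="en", count=10):
    """Build (but do not send) a POST request asking for `count` variants."""
    # The post states that up to 10 augmented sentences are returned.
    if not 1 <= count <= 10:
        raise ValueError("count must be between 1 and 10")
    payload = {"text": text, "language": language, "count": count}  # assumed field names
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": API_KEY},
        method="POST",
    )


request = build_augmentation_request("this is very cool", language="en", count=5)
print(request.get_method(), request.full_url)
print(json.loads(request.data))
# Sending it would be: urllib.request.urlopen(request)
```

The `language` value would be one of the codes from the Language Support list, such as `hi` for Hindi.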



Language Support


Assamese (as)

Bengali (bn)

Gujarati (gu)

Hindi (hi)

Kannada (kn)

Malayalam (ml)

Marathi (mr)

Nepali (ne)

Odia (Oriya) (or)

Punjabi (pa)

Sindhi (sd)

Sinhala (Sinhalese) (si)

Burmese (Myanmar) (my)

Cebuano (ceb)

Hmong (hmn)

Indonesian (id)

Javanese (jv)

Khmer (km)

Malay (ms)

Lao (lo)

Sundanese (su)

Tagalog (Filipino) (tl)

Thai (th)

Urdu (ur)

Vietnamese (vi)

Arabic (ar)

Hebrew (he)

Pashto (ps)

Persian (fa)

Uighur (ug)

Turkmen (tk)

Armenian (hy)

Azerbaijani (az)

Chinese (Simplified) (zh-CN)

Chinese (Traditional) (zh-TW)

Georgian (ka)

Japanese (ja)

Kazakh (kk)

Korean (ko)

Kurdish (ku)

Kyrgyz (ky)

Mongolian (mn)

Russian (ru)

Tajik (tg)

Tatar (tt)

Uzbek (uz)

Afrikaans (af)

Amharic (am)

French (fr)

Hausa (ha)

Igbo (ig)

Kinyarwanda (rw)

Malagasy (mg)

Nyanja (Chichewa) (ny)

Sesotho (st)

Shona (sn)

Somali (so)

Swahili (sw)

Xhosa (xh)

Yoruba (yo)

Zulu (zu)

Albanian (sq)

Aragonese (an)

Bashkir (ba)

Basque (eu)

Belarusian (be)

Bosnian (bs)

Breton (br)

Bulgarian (bg)

Catalan (ca)

Chechen (ce)

Chuvash (cv)

Corsican (co)

Croatian (hr)

Czech (cs)

Danish (da)

Dutch (nl)

English (en)

Esperanto (eo)

Estonian (et)

Finnish (fi)

Frisian (fy)

Galician (gl)

German (de)

Greek (el)

Hungarian (hu)

Icelandic (is)

Irish (ga)

Italian (it)

Latin (la)

Latvian (lv)

Lithuanian (lt)

Luxembourgish (lb)

Macedonian (mk)

Maltese (mt)

Norwegian Bokmål (nb)

Occitan (oc)

Polish (pl)

Portuguese (pt)

Romanian (ro)

Scots Gaelic (gd)

Serbian (sr)

Slovak (sk)

Slovenian (sl)

Spanish (es)

Swedish (sv)

Turkish (tr)

Ukrainian (uk)

Welsh (cy)

Yiddish (yi)

Haitian (ht)

Hawaiian (haw)

Samoan (sm)

Maori (mi)


 


Check out our Getting Started guide to learn how to use Augmentation.


The NeuralSpace Platform is live. Try it out for yourself!

Early sign-ups get $500 worth of credits — what are you waiting for?


Join the NeuralSpace Slack Community to connect with us, ask questions, and collaborate on exciting projects with other community members. You can also receive updates and discuss topics in NLP for low-resource languages with fellow developers and researchers.

