Odia

A Catalog for Odia Language NLP Resources

The purpose of this catalog is to provide a one-stop solution for researchers looking for Odia NLP resources. This is a collective effort, and any contribution to enriching Odia NLP resources is welcome. All contributors are listed on the CONTRIBUTOR list.

TDIL: It contains language applications, resources, and tools for Indian languages including Odia. For Odia these include a terminology application, a language search engine, a wordnet, an English-Odia parallel text corpus, English-Odia machine-assisted translation, text-to-speech software, and many more.
OdiEnCorp 2.0: This dataset contains 97K English-Odia parallel sentences and serves the WAT2020 Odia-English machine translation task.
OdiEnCorp 1.0: This dataset contains 30K English-Odia parallel sentences.
OPUS Corpus: It contains parallel sentences of other languages with Odia. The data are domain-specific and noisy.
IndoWordnet Parallel Corpus: Parallel corpora mined from IndoWordNet glosses and/or examples for Indian-Indian language pairs (6.3 million segments, 18 languages including Odia).
PMIndia: Parallel corpus for English-Indian language pairs mined from the Mann ki Baat speeches of the PM of India. It contains 38K English-Odia parallel sentences.
CVIT PIB: Parallel corpus for English-Indian language pairs mined from the Press Information Bureau website of India. It contains 60K English-Odia parallel sentences.
EMILLE Corpus: It contains fourteen monolingual corpora for Indian languages including Odia.
OdiEnCorp 1.0: This dataset contains 221K Odia sentences.
AI4Bharat-IndicNLP Corpus: The text corpus is not available now (it will be available later). It used 3.5M Odia sentences to build the embedding. Vocabulary frequency files are available.
OSCAR Corpus: It contains around 300K Odia sentences.
IndoWordNet: Wordnet for Indian languages including Odia.
Indian Language Corpora Initiative: It contains parallel annotated corpora in 12 Indian languages including Odia (tourism and health domains).
Odia-Santali Dialect Detection Corpus: This corpus contains text data of Odia and Santali written in Odia script.
Language Model: Pretrained Odia language model.
BertOdia: BERT-based Odia language model.
FastText (CommonCrawl + Wikipedia): Pretrained word vectors (CommonCrawl + Wikipedia). Trained on Common Crawl and Wikipedia using fastText.
FastText (Wikipedia): Pretrained word vectors (Wikipedia). Select the language "oriya" from the model list.
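
The FastText word vectors listed in this catalog are commonly distributed in a plain-text `.vec` format: a header line with the vocabulary size and dimension, then one word per line followed by its vector components. The following is a minimal sketch of reading such a file and comparing two words, assuming that format. The helper names `load_vec` and `cosine` are hypothetical, and the three sample vectors are made-up toy values, not entries from the real models.

```python
import io
import math
from typing import Dict, List, TextIO


def load_vec(handle: TextIO, limit: int = 50000) -> Dict[str, List[float]]:
    """Read up to `limit` word vectors from a fastText-style .vec text stream."""
    header = handle.readline().split()
    dim = int(header[1])  # header line is "<vocab_size> <dimension>"
    vectors: Dict[str, List[float]] = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip rows that do not match the declared dimension
        vectors[parts[0]] = [float(x) for x in parts[1:]]
        if len(vectors) >= limit:
            break
    return vectors


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy data standing in for a real .vec file (values are invented).
sample = io.StringIO("3 2\nଘର 1.0 0.0\nଗୃହ 0.9 0.1\nପାଣି 0.0 1.0\n")
vecs = load_vec(sample)
similar = cosine(vecs["ଘର"], vecs["ଗୃହ"])
different = cosine(vecs["ଘର"], vecs["ପାଣି"])
print(similar > different)
```

For a downloaded model, the same function can be given a file opened with `open(path, encoding="utf-8")`, where `path` points at the decompressed `.vec` file. The `limit` argument keeps memory use bounded, since the full vocabularies are large.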
