Title

Collecting SMT Language Model Training Data for Low Source Language

Author

Mamtily Nighmat and Izumi Yamamoto

Citation

Vol. 17  No. 11  pp. 103-107

Abstract

Statistical machine translation (SMT) systems rely fundamentally on parallel corpora [1]. Unlike the rule-based machine translation (RBMT) approach, the capability of an SMT system depends almost entirely on the size of its corpus. Corpus quality has also become key to building a better SMT system. In this work, parallel corpora [2] in three languages were translated into Uyghur one by one, manually evaluated, and applied as training data for a Uyghur language model. In conclusion, parallel corpora from languages with a different grammatical structure are compared with those from languages with a similar structure.

Keywords

Machine Translation, SMT, Parallel Corpus.

URL

http://paper.ijcsns.org/07_book/201711/20171113.pdf