An Investigative Design Based Statistical Approach for Determining Bangla Sentence Validity


Md. Riazur Rahman, Md. Tarek Habib, Md. Sadekur Rahman, Shaon Bhatta Shuvo, Mohammad Shorif Uddin


Vol. 16  No. 11  pp. 30-37


Automatic grammatical verification of sentences is an essential task in natural language processing, yet resources for this task in Bangla are scarce. To address this gap, this paper presents a new n-gram based statistical approach for checking the syntactic and semantic correctness of Bangla sentences. The proposed method employs a probabilistic language model built from n-gram frequency counts, combining standard n-gram statistics with appropriate smoothing and an advanced backoff language model, to decide whether a given Bangla sentence is valid. A new Bangla corpus of 10 million words is used to train the model. The system was tested on both valid and invalid sentences collected separately from the training corpus. In detecting correct and incorrect sentences, the proposed system achieved 82% precision and 81% recall, outperforming existing systems.
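
As a rough illustration of how an n-gram model with smoothing and backoff can separate valid from invalid sentences, the following minimal Python sketch scores a tokenized sentence with a bigram model. It is not the authors' implementation: the abstract does not state the n-gram order, the smoothing method, the backoff scheme, or the decision rule, so the sketch assumes bigrams, add-one smoothing on unigrams, "stupid backoff" with a fixed weight alpha, and a hand-picked threshold. The function names train, score, and is_valid and the transliterated toy corpus are hypothetical.

```python
import math
from collections import Counter

def train(sentences):
    """Count unigrams and bigrams over tokenized sentences padded with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def score(tokens, unigrams, bigrams, alpha=0.4):
    """Average per-token log score (a relative score, not a normalized probability)."""
    total = sum(unigrams.values())
    vocab = len(unigrams) + 1  # one extra slot reserved for unseen words
    padded = ["<s>"] + tokens + ["</s>"]
    log_score = 0.0
    for prev, word in zip(padded, padded[1:]):
        if bigrams[(prev, word)] > 0:
            # Seen bigram: relative frequency estimate.
            p = bigrams[(prev, word)] / unigrams[prev]
        else:
            # Unseen bigram: back off to an add-one smoothed unigram, scaled by alpha.
            p = alpha * (unigrams[word] + 1) / (total + vocab)
        log_score += math.log(p)
    return log_score / (len(padded) - 1)

def is_valid(tokens, unigrams, bigrams, threshold):
    """Assumed decision rule: accept the sentence if its score reaches the threshold."""
    return score(tokens, unigrams, bigrams) >= threshold

# Toy transliterated corpus, for illustration only.
corpus = [["ami", "bhat", "khai"], ["tumi", "bhat", "khao"], ["ami", "boi", "pori"]]
unigrams, bigrams = train(corpus)
good = score(["ami", "bhat", "khai"], unigrams, bigrams)
bad = score(["khai", "bhat", "ami"], unigrams, bigrams)
```

In this toy setting the word order seen in training receives a higher score than the scrambled order, so a threshold placed between the two scores separates them. In practice the threshold would presumably be tuned on held-out valid and invalid sentences.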


sentence validity detection, natural language processing, n-gram, smoothing, backoff strategy, language model