Title
|
An Investigative Design Based Statistical Approach for Determining Bangla Sentence Validity
|
Author
|
Md. Riazur Rahman, Md. Tarek Habib, Md. Sadekur Rahman, Shaon Bhatta Shuvo, Mohammad Shorif Uddin
|
Citation
|
Vol. 16 No. 11 pp. 30-37
|
Abstract
|
Automatic grammatical verification of sentences is an essential task in natural language processing, yet resources for such tasks in Bangla are scarce. To address this, the paper presents a new n-gram-based statistical approach for checking the syntactic and semantic correctness of Bangla sentences. The proposed method uses a probabilistic language model built on n-gram frequency counts, combining standard n-gram statistics with appropriate smoothing and an advanced backoff language model to determine the validity of any Bangla sentence. A new Bangla corpus of 10 million words is used to train the model. The system was tested on both valid and invalid sentences collected separately from the training corpus. In detecting correct and incorrect sentences, the proposed system achieved 82% precision and 81% recall, outperforming existing systems.
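The general idea of scoring a sentence with a smoothed, backed-off n-gram model can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the model is limited to bigrams, the backoff rule is the simple "stupid backoff" scheme with add-one smoothing on unigrams (the abstract does not say which smoothing or backoff method the authors use), the toy corpus is in English, and the names `train`, `bigram_score`, `sentence_log_score`, `is_valid`, and the `threshold` value are hypothetical.

```python
import math
from collections import Counter

# Toy training corpus (hypothetical; the paper trains on a 10-million-word Bangla corpus).
CORPUS = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat saw the dog",
]

def train(sentences):
    """Count unigrams and bigrams, adding sentence boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_score(prev, word, unigrams, bigrams, alpha=0.4):
    """Relative bigram frequency if the bigram was seen; otherwise back off
    to an add-one-smoothed unigram estimate discounted by alpha."""
    if bigrams[(prev, word)] > 0:
        return bigrams[(prev, word)] / unigrams[prev]
    total = sum(unigrams.values())
    vocab = len(unigrams) + 1  # one extra slot for unseen words
    return alpha * (unigrams[word] + 1) / (total + vocab)

def sentence_log_score(sentence, unigrams, bigrams):
    """Average log score per bigram, so sentence length does not dominate."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    pairs = list(zip(tokens, tokens[1:]))
    total = sum(math.log(bigram_score(p, w, unigrams, bigrams)) for p, w in pairs)
    return total / len(pairs)

def is_valid(sentence, unigrams, bigrams, threshold=-2.0):
    """Flag a sentence as valid when its average log score clears a
    threshold (the threshold is a hypothetical, hand-picked value)."""
    return sentence_log_score(sentence, unigrams, bigrams) >= threshold

unigrams, bigrams = train(CORPUS)
for candidate in ["the cat sat on the mat", "mat the on sat cat the"]:
    print(candidate, "->", is_valid(candidate, unigrams, bigrams))
```

In this sketch a well-ordered sentence is built from bigrams seen in training and scores high, while a scrambled one falls back to discounted unigram estimates at every position and scores low. A real system would tune the threshold on held-out valid and invalid sentences and would likely use higher-order n-grams.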
|
Keywords
|
sentence validity detection, natural language processing, n-gram, smoothing, backoff strategy, language model
|
URL
|
http://paper.ijcsns.org/07_book/201611/20161106.pdf
|
|