Keywords Extraction from the Text of Holy Quran Using Linguistic and Heuristic Rules


Fahad Mazaed Alotaibi, Shakeel Ahmad


Vol. 19  No. 2  pp. 82-87


The digital age has transformed the way information is extracted from structured and unstructured textual data, yet turning that data into useful, actionable knowledge remains a major challenge for researchers. Natural Language Processing (NLP) is a popular and challenging research area of computer science and artificial intelligence. One key NLP task is the extraction of keywords from source text, which underpins effective search and the understanding of actionable insights. Beyond search, keywords are used for checking the relevance of source documents, labeling document clusters, text summarization, text categorization, information filtering, and building a Vector Space Model (VSM) for training machine learning (ML) models. Search engines also rely on sophisticated algorithms that combine keywords with other optimization parameters. The aim of this research is to extract keywords from the text of the Holy Quran using linguistic and heuristic rules. The complexity of an NLP task varies across natural languages. Arabic is a morphologically rich language in which significant syntactic relations are found at the word level. The proposed method is based on possessive and genitive expressions as well as other important syntactic structures. In the first phase, the model generates unigram, bigram, and trigram expressions; in the second phase, keywords are identified from these expressions. The extracted keywords are useful for search engines as well as for clustering documents and generating summaries of their concepts.
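The two-phase flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`ngrams`, `candidate_expressions`, `rank_candidates`) are hypothetical, tokenization is assumed to be plain whitespace splitting, and the second phase uses a simple frequency count with an optional stopword filter as a stand-in for the paper's linguistic and heuristic rules, which the abstract does not specify.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def candidate_expressions(text, max_n=3):
    """Phase 1 (sketch): generate unigram, bigram, and trigram candidates.

    Assumption: whitespace tokenization. Real Arabic text would normally
    need diacritic handling and morphological analysis first.
    """
    tokens = text.split()
    return {n: ngrams(tokens, n) for n in range(1, max_n + 1)}


def rank_candidates(candidates, stopwords=frozenset()):
    """Phase 2 (placeholder): rank candidates by frequency.

    Assumption: this frequency filter only stands in for the paper's
    possessive/genitive rules, which are not given in the abstract.
    Candidates that start or end with a stopword are skipped.
    """
    counts = Counter()
    for grams in candidates.values():
        for gram in grams:
            if gram[0] in stopwords or gram[-1] in stopwords:
                continue
            counts[gram] += 1
    return counts.most_common()


# Example input: the Basmala, split into four tokens.
verse = "بسم الله الرحمن الرحيم"
candidates = candidate_expressions(verse)
ranked = rank_candidates(candidates)
```

For a four-token input, the first phase yields four unigrams, three bigrams, and two trigrams. In a fuller system, the second phase would replace the frequency count with rules that test each candidate for the targeted syntactic structures.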


Natural Language Processing (NLP), Machine Learning (ML), Vector Space Model (VSM)