Title

Feature Extraction based Text Classification using K-Nearest Neighbor Algorithm

Author

Muhammad Azam, Tanvir Ahmed, Fahad Sabah, Muhammad Iftikhar Hussain

Citation

Vol. 18, No. 12, pp. 95-101

Abstract

The number of scientific publications has been increasing enormously, and with this increase the classification of scientific publications is becoming a challenging task. The core objective of this research is to analyze the performance of classification algorithms using a Scopus dataset. In text classification, extracting features from documents and classifying documents with the extracted features are the major issues that reduce the performance of different algorithms. In this paper, the performance of classification algorithms such as Naïve Bayes (NB) and K-Nearest Neighbor (K-NN) improved when Bayesian boosting and bagging were applied. The selected classification algorithms were evaluated on 10K documents from Scopus using the F-measure, and comparison matrices were produced to estimate accuracy, precision, and recall for the NB and K-NN classifiers. Further, data preprocessing and cleaning steps are applied to the selected dataset, and class imbalance issues are analyzed to increase the performance of text classification algorithms. Experimental results showed an improvement of over 7% using K-NN, which performed better than NB.
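
As a rough illustration of the kind of pipeline the abstract describes, the following minimal sketch classifies short texts with K-NN over term-frequency features and scores the result with precision, recall, and F-measure. It is not the authors' RapidMiner workflow: the tokenizer, the cosine similarity measure, the function names, and the toy documents are all assumptions made for illustration, and no Scopus data, boosting, or bagging is involved.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple feature extractor)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train, text, k=3):
    """Majority vote among the k training documents most similar to `text`."""
    query = Counter(tokenize(text))
    ranked = sorted(
        train,
        key=lambda item: cosine(query, Counter(tokenize(item[0]))),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F-measure (F1) for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy, invented data: (document text, class label). Not taken from Scopus.
train = [
    ("neural network training deep learning", "cs"),
    ("graph algorithm complexity sorting", "cs"),
    ("database query index optimization", "cs"),
    ("protein gene expression cell", "bio"),
    ("cell membrane enzyme protein", "bio"),
    ("gene mutation dna sequencing", "bio"),
]
test = [("deep learning network", "cs"), ("protein enzyme cell", "bio")]

y_true = [label for _, label in test]
y_pred = [knn_predict(train, text, k=3) for text, _ in test]
scores = precision_recall_f1(y_true, y_pred, positive="cs")
```

Here `scores` holds the precision, recall, and F-measure for the hypothetical "cs" class. A realistic experiment would use a much larger labeled corpus, a held-out test split, and stronger features such as TF-IDF weighting.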

Keywords

K-NN, naïve Bayes, text classification, RapidMiner, feature extraction

URL

http://paper.ijcsns.org/07_book/201812/20181213.pdf