Title

Determining Feature-Size for Text to Numeric Conversion based on BOW and TF-IDF

Author

Hasan J. Alyamani

Citation

Vol. 22, No. 1, pp. 283-287

Abstract

Machine learning is the most popular method used in data science. Data is growing not only in numeric form but also as text. Most supervised and unsupervised machine learning algorithms operate on numeric data, so text data must first be converted into numeric form. Many techniques exist for this conversion, and researchers are often unsure which technique is best in which situation. In the proposed work, BOW (Bag-of-Words) and TF-IDF (Term Frequency-Inverse Document Frequency) are studied with different feature sizes to determine the best method. Experimental results on text data show that both TF-IDF and BOW provide better performance when the number of features is in the range of 100 to 150.
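
As a rough illustration of the two conversions named in the abstract, the following minimal sketch turns a toy corpus into BOW and TF-IDF vectors with a capped number of features. It is not the paper's implementation: the whitespace tokenizer, the helper names (tokenize, build_vocabulary, bow_vectors, tfidf_vectors), the max_features parameter, the toy documents, and the smoothed IDF formula ln((1 + n) / (1 + df)) + 1 are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase, split on whitespace.
    return text.lower().split()


def build_vocabulary(docs, max_features):
    # Keep the max_features most frequent terms in the corpus.
    counts = Counter(tok for doc in docs for tok in tokenize(doc))
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:max_features]]


def bow_vectors(docs, vocab):
    # Bag-of-Words: one raw term count per vocabulary entry.
    vectors = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        vectors.append([counts[term] for term in vocab])
    return vectors


def tfidf_vectors(docs, vocab):
    # TF-IDF: raw term count scaled by a smoothed inverse document frequency.
    # The smoothing and the lack of vector normalization are assumptions.
    n = len(docs)
    bow = bow_vectors(docs, vocab)
    df = [sum(1 for row in bow if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log((1 + n) / (1 + d)) + 1 for d in df]
    return [[tf * w for tf, w in zip(row, idf)] for row in bow]


# Toy corpus, purely illustrative.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]

vocab = build_vocabulary(docs, max_features=4)
bow = bow_vectors(docs, vocab)
tfidf = tfidf_vectors(docs, vocab)

print(vocab)
print(bow)
print(tfidf)
```

In this sketch, max_features plays the role of the feature size: raising or lowering it changes the length of every document vector, which is the quantity the abstract reports varying.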

Keywords

Machine Learning, Supervised and Unsupervised Learning, TF-IDF, BOW

URL

http://paper.ijcsns.org/07_book/202201/20220139.pdf