Title

Evaluation of Similarity Analysis of Newspaper Article Using Natural Language Processing

Author

Ayako Ohshiro, Takeo Okazaki, Takashi Kano, and Shinichiro Ueda

Citation

Vol. 24  No. 6  pp. 1-7

Abstract

Comparing text features involves evaluating the "similarity" between texts, and it is crucial to choose an appropriate similarity measure when doing so. This study utilized various techniques to assess the similarities between newspaper articles, including deep learning and a previously proposed method: a combination of Pointwise Mutual Information (PMI) and Word Pair Matching (WPM), denoted as PMI+WPM. For performance comparison, law data from medical research in Japan were utilized as validation data in evaluating the PMI+WPM method. The comparative analysis revealed that the distribution of similarities in text data varies depending on the evaluation technique and genre. For newspaper data, non-deep learning methods demonstrated better similarity evaluation accuracy than deep learning methods. Additionally, evaluating similarities in law data is more challenging than in newspaper articles. Although deep learning is the prevalent method for evaluating textual similarities, this study demonstrates that non-deep learning methods can be effective for Japanese-language texts.
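For readers unfamiliar with two of the non-deep-learning measures named in this record, the following is a minimal sketch of Pointwise Mutual Information and the Simpson (overlap) coefficient in their standard textbook forms. It is not the paper's PMI+WPM method: Word Pair Matching is not described in the abstract and is not implemented here. The function names, the document-level co-occurrence counting, and the toy English tokens are all assumptions made for illustration, and the input is assumed to be already tokenized (Japanese text would first need a morphological analyzer).

```python
import math
from collections import Counter
from itertools import combinations


def simpson_coefficient(tokens_a, tokens_b):
    """Simpson (overlap) coefficient: |A & B| / min(|A|, |B|) over token sets."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / min(len(set_a), len(set_b))


def pmi_table(documents):
    """PMI (log base 2) for every word pair that co-occurs in some document.

    Assumption: probabilities are estimated at document level, i.e.
    p(x) = (number of documents containing x) / (number of documents).
    Pair keys are tuples in alphabetical order.
    """
    n_docs = len(documents)
    word_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        vocab = sorted(set(doc))  # count each word once per document
        word_counts.update(vocab)
        pair_counts.update(combinations(vocab, 2))

    table = {}
    for (x, y), n_xy in pair_counts.items():
        p_xy = n_xy / n_docs
        p_x = word_counts[x] / n_docs
        p_y = word_counts[y] / n_docs
        table[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return table


# Hypothetical toy corpus, already tokenized.
docs = [
    ["court", "ruling", "appeal"],
    ["court", "ruling"],
    ["weather", "rain"],
    ["weather", "court"],
]

pmi = pmi_table(docs)
overlap = simpson_coefficient(docs[0], docs[1])
```

In this toy corpus, a positive PMI value marks a pair that co-occurs more often than independence would predict, and a negative value marks one that co-occurs less often. The Simpson coefficient reaches 1.0 whenever the smaller token set is fully contained in the larger one, which is worth keeping in mind when comparing articles of very different lengths.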

Keywords

Pointwise Mutual Information, Simpson coefficient, Doc2vec, BERT, Newspaper.

URL

http://paper.ijcsns.org/07_book/202406/20240601.pdf