Abstract
The article describes a new method for author identification across English and Arabic texts using machine learning and deep neural networks. The method is based on a comprehensive framework that employs five algorithms (Random Forest, Logistic Regression, k-Nearest Neighbors, Support Vector Machines, and Naive Bayes) alongside deep learning approaches. Using the TF-IDF method for feature extraction, the authors analyze two datasets: the Victorian Era Authorship Attribution dataset and a dataset for Arabic author identification. The results show that Support Vector Machines and Logistic Regression perform strongly in authorship attribution, effectively capturing nuanced writing styles with accuracy rates reaching 95%. The authors illustrate the proposed method through detailed comparative analyses and highlight its applicability in forensic linguistics, plagiarism detection, and literary analysis. The method significantly improves accuracy and computational efficiency in authorship attribution tasks. Its effectiveness is confirmed by higher precision, recall, and F1-scores than those obtained with traditional methods. The results advance the field of text mining and can be used to enhance security measures and to understand linguistic patterns across diverse languages. The study's novelty and scientific contribution lie in its cross-lingual approach and its integration of multiple advanced algorithms, providing robust tools for text analysis in multilingual settings.
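To make the TF-IDF step concrete, the following is a minimal sketch of TF-IDF feature extraction followed by a one-nearest-neighbor attribution, written in plain Python. It is an illustration under stated assumptions, not the authors' pipeline: the whitespace tokenizer, the smoothed IDF variant, the cosine-similarity decision rule, the four toy sentences, and the labels author_a and author_b are all invented here for demonstration. The abstract does not specify the tokenization, the TF-IDF weighting variant, or any classifier settings.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the source does not
    # state how texts were tokenized.
    return text.lower().split()


def fit_idf(docs):
    # Document frequency: number of documents containing each term.
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    # Assumption: smoothed IDF, ln((1 + n) / (1 + df)) + 1.
    return {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(text, idf):
    # Raw term counts times IDF, restricted to the training vocabulary,
    # then L2-normalized so a dot product equals cosine similarity.
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    # Both inputs are unit-length sparse vectors stored as dicts.
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def predict_author(text, train_vectors, train_labels, idf):
    # One-nearest-neighbor rule: return the label of the most similar
    # training document.
    query = tfidf_vector(text, idf)
    sims = [cosine(query, v) for v in train_vectors]
    best = max(range(len(sims)), key=sims.__getitem__)
    return train_labels[best]


# Hypothetical toy corpus, invented for this sketch only.
train_texts = [
    "upon my word the carriage was exceedingly fine",
    "upon my honour the carriage was most agreeable",
    "the fog crept over the marsh and the gallows",
    "the fog lay thick over the river and the marsh",
]
train_labels = ["author_a", "author_a", "author_b", "author_b"]

idf = fit_idf(train_texts)
train_vectors = [tfidf_vector(t, idf) for t in train_texts]
prediction = predict_author(
    "the marsh was hidden in fog", train_vectors, train_labels, idf
)
print(prediction)
```

In this sketch, terms that occur in every training document receive the lowest weight, while terms concentrated in one author's texts dominate the similarity score. Any of the five classifiers named in the abstract could be trained on the same vectors in place of the nearest-neighbor rule used here.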