Abstract
|
Breast cancer has recently become the second leading cause of cancer death in women. Although most studies report that this form of cancer is preventable and that many of its risks can be avoided in the early stages, most traditional methods detect and diagnose it only at a very late stage. Classification is one of the data mining techniques used for early-stage detection of this type of cancer. Feature selection methods can significantly improve classification accuracy when combined with classification methods, because they select the most important features of images or other data instances. The objective of this study is to investigate the potential benefit of using a feature selection algorithm as a pre-processing stage to enhance the classification accuracy of the support vector machine, and to propose a fusion scheme for selecting the best and most relevant features of mammogram images. For this purpose, four feature selection algorithms were chosen: mutual information (MI), the statistical dependence measure, the relief-based algorithm and the correlation-based algorithm. Extensive experiments were performed on one of the benchmark datasets, that of the Mammographic Image Analysis Society (MIAS), to test the proposed method on two classes, benign and malignant masses. The results showed that, with an 85%/15% data split, the proposed method achieves a classification accuracy of 75% and 93.75% and a positive rate of 87.5% and 88.89% for the top seven and top five features, respectively.
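
To make the MI-based ranking step concrete, the following is a minimal sketch of how features could be scored by mutual information with the class label and the top k retained. It is not the implementation used in this study: it assumes the features have already been discretized, the feature names and toy values are hypothetical, and the other three selection criteria, the fusion scheme and the support vector machine are not shown.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Estimate I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of feature values
    py = Counter(ys)            # marginal counts of class labels
    pxy = Counter(zip(xs, ys))  # joint counts
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def top_k_features(columns, labels, k):
    """Return the names of the k features with the highest MI score."""
    scores = {name: mutual_information(col, labels)
              for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy, made-up data (assumption): four samples, three discretized features.
labels = ["benign", "benign", "malignant", "malignant"]
columns = {
    "f_informative": [0, 0, 1, 1],  # tracks the label exactly
    "f_noise": [0, 1, 0, 1],        # independent of the label
    "f_partial": [0, 0, 0, 1],      # partially informative
}

selected = top_k_features(columns, labels, k=2)
print(selected)
```

In a full pipeline, only the columns named in `selected` would be passed on to the classifier.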
|
Keywords
|
Machine learning, classification, feature selection, support vector machine, breast cancer detection
|