Breast Cancer Prediction Using Machine Learning
Keywords:
Machine learning, Breast cancer, Dataset, Algorithm
Abstract
Breast cancer is one of the most common cancers in women and contributes greatly to the number of deaths that occur worldwide. It is marked by the presence of cancerous lumps inside the breast. A breast lump is a mass that develops in the breast; lumps vary in size and texture and can be either cancerous or non-cancerous. If the lump is non-cancerous, no further diagnosis needs to be carried out. If the lump is found to be cancerous, further diagnosis is carried out to check whether the cancer has affected the rest of the body. The tests used for diagnosis are MRI, mammogram, ultrasound, and biopsy. Breast cancer is a major cause of cancer death among women and is accountable for 16 percent of the overall deaths caused by cancer in the world. In this paper, we predict whether lumps present in the breast are cancerous. To achieve this, we use four algorithms: Support Vector Machines (SVM), K-Nearest Neighbour (KNN), Random Forest, and Naïve Bayes. We compare the performance of these machine learning algorithms using classification metrics and identify the best one for this task.
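As an illustration of how classifiers can be compared on classification metrics, the following is a minimal sketch in plain Python. It is not the implementation used in this paper. The helper names (`classification_metrics`, `best_model`), the choice of metrics (accuracy, precision, recall, F1), and the toy labels are all assumptions made for the example; the label 1 is assumed to mean "cancerous" and 0 "non-cancerous".

```python
from typing import Dict, Mapping, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Binary classification metrics, treating label 1 as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Confusion-matrix counts.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def best_model(y_true: Sequence[int],
               predictions: Mapping[str, Sequence[int]],
               metric: str = "f1") -> str:
    """Return the name of the model whose predictions score highest on `metric`."""
    scores = {name: classification_metrics(y_true, y_pred)[metric]
              for name, y_pred in predictions.items()}
    return max(scores, key=scores.get)


# Toy labels for illustration only; these are NOT results from the paper.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
toy_predictions = {
    "model_a": [1, 0, 1, 0, 0, 0, 1, 0],  # misses one cancerous case
    "model_b": [1, 1, 1, 1, 0, 0, 1, 0],  # flags one benign case as cancerous
}

for name, y_pred in toy_predictions.items():
    print(name, classification_metrics(y_true, y_pred))
print("best by recall:", best_model(y_true, toy_predictions, metric="recall"))
```

In a screening setting, recall on the cancerous class is often weighted heavily because a missed cancer is usually costlier than a false alarm, so the choice of `metric` can change which model is judged best even when accuracy is tied.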