DATA MINING CLASSIFICATION FOR CARDIOVASCULAR DISEASE PREDICTION
Abstract
According to the World Health Organization (WHO), cardiovascular diseases are among the leading causes of death worldwide. They are disorders of the heart and blood vessels that are common in the general population and include coronary heart disease, heart failure, stroke, and peripheral vascular disease. Major risk factors include high blood pressure, high cholesterol, and smoking. Premature deaths from heart disease can be prevented by controlling these risk factors and by identifying individuals at high risk of developing the disease. One of the most effective ways to identify and predict heart disease is to use data mining algorithms, which support diagnosis through predictive models such as Decision Tree, Naïve Bayes, Logistic Regression, and k-Nearest Neighbor (k-NN). In this study, identification was performed with classification algorithms including Naïve Bayes, Logistic Regression, Decision Tree, k-NN, Support Vector Machine (SVM), XGBoost, and Random Forest. The Random Forest algorithm achieved the highest accuracy, reaching 98%.
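The comparison described above, in which several classifiers are trained and then ranked by accuracy on held-out data, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the two features are hypothetical, and a majority-class baseline and a small hand-written k-NN stand in for the algorithms used in the study. It does not reproduce the study's dataset, models, or reported accuracy.

```python
import random
from collections import Counter


def make_synthetic_patients(n, seed=0):
    """Return (features, label) pairs. Features are made up, not clinical data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = disease present (hypothetical encoding)
        # Two hypothetical standardized features whose means shift with the label.
        features = [rng.gauss(1.5 * label, 1.0), rng.gauss(-1.0 * label, 1.0)]
        rows.append((features, label))
    return rows


def majority_classifier(train):
    """Baseline: always predict the most frequent training label."""
    most_common = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: most_common


def knn_classifier(train, k=5):
    """k-NN with squared Euclidean distance and a majority vote."""
    def predict(x):
        nearest = sorted(
            train,
            key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)),
        )[:k]
        return Counter(y for _, y in nearest).most_common(1)[0][0]
    return predict


def accuracy(predict, test):
    """Fraction of test rows whose predicted label matches the true label."""
    correct = sum(1 for x, y in test if predict(x) == y)
    return correct / len(test)


# Hold out the last 100 synthetic rows for evaluation.
data = make_synthetic_patients(400, seed=42)
train, test = data[:300], data[300:]

models = {
    "majority baseline": majority_classifier(train),
    "k-NN (k=5)": knn_classifier(train, k=5),
}
scores = {name: accuracy(model, test) for name, model in models.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.2%}")
print("highest accuracy:", best)
```

The same loop extends to further models by adding entries to `models`. In practice a library implementation and cross-validation would be preferable to this hand-rolled version and its single train/test split.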