Analysis of diabetes disease using k-nearest neighbor (KNN) and naive bayes methods
Keywords:
classification, Diabetes Mellitus, K-Nearest Neighbor, Machine Learning, Naïve Bayes
Abstract
Diabetes mellitus is a chronic disease whose prevalence continues to rise and which can cause serious complications if not detected early. A classification method that can support fast and accurate diagnosis is therefore needed. This study analyzes and compares the performance of the K-Nearest Neighbor (KNN) and Naïve Bayes algorithms in classifying diabetes. The dataset, obtained from Kaggle, contains 768 records with 8 attributes (Pregnancies, Glucose, BloodPressure, SkinThickness, Insulin, BMI, DiabetesPedigreeFunction, and Age) plus the Outcome label. It was split into 80% training data and 20% testing data. The models were evaluated using accuracy, precision, recall, and F1-score. KNN achieved its best accuracy, 75%, at k = 3, while Naïve Bayes achieved an accuracy of 77.92%. On this dataset, Naïve Bayes therefore outperformed KNN in classifying diabetes. This research is expected to serve as a reference for developing clinical decision support systems for the early diagnosis of diabetes.
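The comparison described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation. The abstract does not state the distance metric or the Naïve Bayes variant, so Euclidean distance and a Gaussian Naïve Bayes are assumptions here. The function names and the handful of (Glucose, BMI) records are invented for illustration and are not taken from the Kaggle dataset.

```python
import math
from collections import Counter, defaultdict


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training records closest to `query`."""
    # Assumption: Euclidean distance (the abstract does not name a metric).
    neighbours = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]


def fit_gaussian_nb(train_X, train_y):
    """Per class: store the prior and the (mean, variance) of each feature."""
    by_class = defaultdict(list)
    for x, label in zip(train_X, train_y):
        by_class[label].append(x)
    model = {}
    for label, rows in by_class.items():
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column)
            stats.append((mean, var + 1e-9))  # small term avoids division by zero
        model[label] = (len(rows) / len(train_X), stats)
    return model


def nb_predict(model, query):
    """Pick the class with the highest log posterior, treating features as independent."""
    def log_posterior(label):
        prior, stats = model[label]
        score = math.log(prior)
        for value, (mean, var) in zip(query, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1-score for a binary label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Invented (Glucose, BMI) records with Outcome labels; NOT the study's data.
train_X = [(85, 22.0), (90, 24.5), (100, 26.0), (150, 33.0), (165, 35.5), (180, 38.0)]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [(95, 25.0), (160, 34.0), (120, 30.0), (140, 27.0)]
test_y = [0, 1, 1, 1]

knn_preds = [knn_predict(train_X, train_y, x, k=3) for x in test_X]
nb_model = fit_gaussian_nb(train_X, train_y)
nb_preds = [nb_predict(nb_model, x) for x in test_X]

knn_scores = binary_metrics(test_y, knn_preds)
nb_scores = binary_metrics(test_y, nb_preds)
print("KNN (k=3):", knn_scores)
print("Naive Bayes:", nb_scores)
```

The scores printed for these toy records say nothing about the figures reported in the study; they only show how the four metrics are derived from predictions. On the real data, the features would normally be scaled before KNN, since Euclidean distance is otherwise dominated by attributes with large numeric ranges.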
License
Copyright (c) 2026 Machine Intelligence for Societal Advancement

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
