CUSTOMER SEGMENTATION USING K-MEANS CLUSTERING IN A RETAIL STORE
Abstract
Advances in information technology have transformed many aspects of human life, including business. Companies must use technology and data effectively to strengthen their competitive advantage. One increasingly relevant strategy is Customer Relationship Management (CRM), which places customer data at its center. Customer segmentation groups customers according to shared characteristics. This study applies the K-Means Clustering algorithm to customer segmentation in order to improve a store's marketing strategy. Customer data were collected from the Dan+Dan Telukjambe 2 store, and Exploratory Data Analysis (EDA) was used to understand the patterns and characteristics of the data. Preprocessing prepared the data for modeling and consisted of removing irrelevant columns, handling missing values, and transforming the data. Principal Component Analysis (PCA) was used to reduce the dimensionality of the data before K-Means Clustering was applied. The Elbow Method and the Silhouette Score were used to determine the optimal number of clusters, which was found to be six. Evaluation with the Silhouette Coefficient gave an average value of 0.66, indicating good clustering quality. Further analysis shows that age, purchasing power, occupation, and marital status are distributed differently in each cluster, which characterizes the individual customer segments. The resulting clusters provide information for developing more effective and targeted marketing strategies.
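The pipeline described in the abstract (scaling, PCA, K-Means, then selecting the number of clusters with the Elbow Method and the Silhouette Score) can be sketched roughly as follows. This is a minimal illustration and not the study's actual code: it assumes scikit-learn, uses synthetic data from `make_blobs` in place of the store's customer table, and the helper `choose_k`, the two PCA components, and the candidate range of k are all hypothetical choices.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def choose_k(features, k_values, n_components=2, seed=0):
    """Scale, project with PCA, then score K-Means for each candidate k."""
    # Standardize so that no single feature dominates the distance metric.
    scaled = StandardScaler().fit_transform(features)
    # Reduce dimensionality before clustering (component count is assumed).
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(scaled)

    results = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed)
        labels = model.fit_predict(reduced)
        results[k] = {
            # Within-cluster sum of squares, plotted against k for the Elbow Method.
            "inertia": model.inertia_,
            # Mean silhouette coefficient over all samples, in [-1, 1].
            "silhouette": silhouette_score(reduced, labels),
        }
    # Pick the k with the highest mean silhouette coefficient.
    best_k = max(results, key=lambda k: results[k]["silhouette"])
    return best_k, results


# Synthetic stand-in for the customer data (hypothetical, not the study's data).
X, _ = make_blobs(n_samples=300, centers=4, n_features=5, random_state=42)
best_k, results = choose_k(X, range(2, 9))
for k, scores in results.items():
    print(f"k={k}  inertia={scores['inertia']:.1f}  silhouette={scores['silhouette']:.3f}")
print("k with highest silhouette:", best_k)
```

In practice the inertia values are usually inspected on a plot for an "elbow", while the silhouette score gives a single number to compare across k. The two criteria need not agree, so the final choice of k remains a judgment call.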