Application of the K-Means Algorithm for Segmentation Analysis of YouTube Viewers in Indonesia
Abstract
K-Means is a common clustering method for segmentation analysis, yet academic research on YouTube audience segmentation in Indonesia remains limited. Indonesian YouTube audiences are diverse, with interests spanning entertainment, education, and news, so a more in-depth analysis is needed to identify specific user segments. Audience segmentation provides a deeper understanding of how people consume video, which can help content creators and digital industry players develop more effective content strategies. K-Means was chosen for this study because it can group YouTube viewers in Indonesia by their interaction patterns with content and because it scales to large datasets, which suits a platform with as many users as YouTube. The study uses three main features (views, duration, and engagement rate) to group viewers into five clusters. Cluster evaluation with the Silhouette Score (0.3445), Davies-Bouldin Index (0.9576), and Calinski-Harabasz Index (481.4730) indicates that the resulting segmentation is of good quality. The analysis reveals differences in video consumption patterns across clusters, reflecting variation in viewer preferences and engagement levels.
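The workflow summarized in the abstract (three features, five clusters, silhouette evaluation) can be illustrated with a minimal sketch. It is not the study's implementation: the data below is synthetic, the z-score standardization step is an assumption, and the function names `standardize`, `kmeans`, and `silhouette_score` are hypothetical helpers written only with the Python standard library. Only the Silhouette Score is computed here; the Davies-Bouldin and Calinski-Harabasz indices are omitted for brevity.

```python
import math
import random


def standardize(rows):
    """Rescale each column to zero mean and unit variance (z-scores)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate point assignment and centroid update."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (ranges from -1 to 1)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Synthetic stand-in data: (views, duration in seconds, engagement rate).
# These values are invented for illustration and are not the study's data.
data_rng = random.Random(42)
centers = [(1e3, 60, 0.02), (5e4, 300, 0.05), (2e5, 600, 0.01),
           (1e6, 120, 0.08), (5e6, 900, 0.03)]
rows = [[data_rng.gauss(v, 0.1 * v), data_rng.gauss(d, 0.1 * d),
         data_rng.gauss(e, 0.1 * e)]
        for v, d, e in centers for _ in range(30)]

features = standardize(rows)
labels, centroids = kmeans(features, k=5)
print("cluster sizes:", [labels.count(j) for j in range(5)])
print("silhouette:", round(silhouette_score(features, labels), 4))
```

Standardizing first keeps the view counts, which are orders of magnitude larger than the other two features, from dominating the Euclidean distances. In practice a library such as scikit-learn, which provides `KMeans`, `silhouette_score`, `davies_bouldin_score`, and `calinski_harabasz_score`, would likely replace these hand-written helpers.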