Geographic-Origin Music Classification from Numerical Audio Features: Integrating Unsupervised Clustering with Supervised Models

Pranolo, Andri and Sularso, Sularso and Anwar, Nuril and Putra, Agung Bella Utama and Wibawa, Aji Prasetya and Saifullah, Shoffan and Dreżewski, Rafał and Nuryana, Zalik and Andi, Tri Andi (2025) Geographic-Origin Music Classification from Numerical Audio Features: Integrating Unsupervised Clustering with Supervised Models. Buletin Ilmiah Sarjana Teknik Elektro, 7 (4). pp. 842-857.

13400-Article Text-67657-1-10-20251119.pdf - Published Version

Abstract

Classifying the geographic origin of music is a relevant task in music information retrieval, yet most studies have focused on genre or style recognition rather than regional origin. This study evaluates Support Vector Machine (SVM) and Convolutional Neural Network (CNN) models on the UCI Geographical Origin of Music dataset (1,059 tracks from 33 non-Western regions) using numerical audio features. To incorporate latent structure, we first applied K-means clustering with the optimal number of clusters (k=2) determined by the Elbow and Silhouette methods. The cluster assignments were used as auxiliary signals for training, while evaluation relied on the true region labels. Classification performance was assessed with Accuracy, Precision, Recall, and F1-score. Results show that SVM achieved 99.53% accuracy (95% CI: 97.38–99.92%), while CNN reached 98.58% accuracy (95% CI: 95.92–99.52%); Precision, Recall, and F1 mirrored these values. The differences confirm SVM’s superior performance on this dataset, though the near-perfect scores also suggest strong separability in the feature space and potential risks of overfitting. Learning-curve analysis indicated stable training, and cluster supervision provided small but consistent benefits. Overall, SVM remains a reliable baseline for tabular music features, while CNNs may require spectro-temporal representations to leverage their full potential. Future work should validate these findings across multiple datasets, apply cross-validation with statistical significance testing, and explore hybrid deep models for broader generalization.
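
The abstract does not state how the 95% confidence intervals were computed or how many tracks were in the test set. As a worked check only, the sketch below assumes a Wilson score interval on a hypothetical hold-out set of 212 tracks (roughly 20% of 1,059), with 211 correct predictions for SVM and 209 for CNN. These counts and the function name `wilson_interval` are assumptions made for illustration, not details taken from the article. Under those assumptions the calculation reproduces the reported accuracies and bounds.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = correct / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    )
    return center - half_width, center + half_width


# Assumed hold-out size and correct counts (not stated in the abstract).
N_TEST = 212
for name, n_correct in [("SVM", 211), ("CNN", 209)]:
    low, high = wilson_interval(n_correct, N_TEST)
    print(
        f"{name}: accuracy {100 * n_correct / N_TEST:.2f}% "
        f"(95% CI: {100 * low:.2f}-{100 * high:.2f}%)"
    )
# SVM: accuracy 99.53% (95% CI: 97.38-99.92%)
# CNN: accuracy 98.58% (95% CI: 95.92-99.52%)
```

The Wilson interval is used here because, unlike the simple normal approximation, it stays within [0, 1] when accuracy is close to 100%. Whether the authors used this method is not confirmed by the abstract.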

Item Type: Article
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering
Depositing User: BISTE UAD
Date Deposited: 16 May 2026 16:35
Last Modified: 16 May 2026 16:35
URI: https://alxiv.org/id/eprint/828
