Rāga forms the melodic framework for most of the music of the Indian subcontinent. Automatic rāga recognition is thus a fundamental step in the computational modelling of the Indian art-music traditions. In this work, we investigate the properties of rāga and the natural processes by which people identify it. We bring together and discuss previous computational approaches to rāga recognition, correlating them with human techniques, in both Karṇāṭaka (south Indian) and Hindustānī (north Indian) music traditions. The approaches based on first-order pitch distributions are further evaluated on a large, comprehensive dataset to understand their merits and limitations. We outline possible short- and mid-term future directions in this line of work.
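To make the first-order pitch-distribution approach concrete, here is a minimal sketch of such a classifier: fold pitch values into a pitch-class histogram and assign the rāga whose template distribution is closest. The function names, the semitone binning, and the Bhattacharyya distance are illustrative choices, not the specific methods evaluated in the paper.

```python
import math
from collections import Counter

def pitch_class_distribution(pitch_cents, bins=12):
    """Fold pitch values (cents above the tonic) into a normalized
    pitch-class histogram; bins=12 assumes semitone resolution."""
    counts = Counter(int(round(p / (1200 / bins))) % bins for p in pitch_cents)
    total = sum(counts.values())
    return [counts.get(b, 0) / total for b in range(bins)]

def bhattacharyya_distance(p, q):
    """One common distance for comparing histogram-based raga profiles."""
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return -math.log(bc) if bc > 0 else float("inf")

def classify(query_dist, templates):
    """Nearest-neighbour raga label by distribution distance.
    templates: dict mapping raga name -> template distribution."""
    return min(templates, key=lambda r: bhattacharyya_distance(query_dist, templates[r]))
```

In practice the template for each rāga would be estimated from many recordings, and finer bin resolutions than 12 are common for Indian art music, where intonation below the semitone level matters.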
We present Essentia 2.0, an open-source C++ library for audio analysis and audio-based music information retrieval released under the Affero GPL license. It contains an extensive collection of reusable algorithms which implement audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, and a large set of spectral, temporal, tonal and high-level music descriptors. The library is also wrapped in Python and includes a number of predefined executable extractors for the available music descriptors, which facilitates its use for fast prototyping and allows setting up research experiments very rapidly. Furthermore, it includes a Vamp plugin to be used with Sonic Visualiser for visualization purposes. The library is cross-platform and currently supports Linux, Mac OS X, and Windows systems. Essentia is designed with a focus on the robustness of the provided music descriptors and is optimized in terms of the computational cost of the algorithms. The provided functionality, specifically the music descriptors included out of the box and the signal processing algorithms, is easily expandable and allows for both research experiments and the development of large-scale industrial applications.
Automatic rāga recognition is one of the fundamental computational tasks in Indian art music. Motivated by the way seasoned listeners identify rāgas, we propose a rāga recognition approach based on melodic phrases. First, we extract melodic patterns from a collection of audio recordings in an unsupervised way. Next, we group similar patterns by exploiting complex-network concepts and techniques. Drawing an analogy to topic modeling in text classification, we then represent audio recordings using a vector space model. Finally, we employ a number of classification strategies to build a predictive model for rāga recognition. To evaluate our approach, we compile a music collection of over 124 hours, comprising 480 recordings and 40 rāgas. We obtain 70% accuracy with the full 40-rāga collection, and up to 92% accuracy with its 10-rāga subset. We show that phrase-based rāga recognition is a successful strategy, on par with the state of the art, and sometimes outperforms it. A by-product of our approach, which arguably is as important as the task of rāga recognition, is the identification of rāga-phrases. These phrases can be used as a dictionary of semantically meaningful melodic units for several computational tasks in Indian art music.
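The analogy to topic modeling treats each pattern cluster as a "melodic word" and each recording as a document over those words. A minimal sketch of one such vector space model is the classic tf-idf weighting from text retrieval; this is an illustrative bag-of-words construction, not the exact weighting scheme used in the paper.

```python
import math
from collections import Counter

def tfidf_vectors(recordings):
    """recordings: dict mapping recording id -> list of pattern-cluster ids
    (the 'melodic words' occurring in that recording).
    Returns (vocabulary, dict of tf-idf weighted vectors)."""
    n = len(recordings)
    df = Counter()  # document frequency: in how many recordings each pattern appears
    for patterns in recordings.values():
        df.update(set(patterns))
    vocab = sorted(df)
    idf = {p: math.log(n / df[p]) for p in vocab}
    vectors = {}
    for rid, patterns in recordings.items():
        tf = Counter(patterns)
        total = len(patterns)
        vectors[rid] = [tf[p] / total * idf[p] for p in vocab]
    return vocab, vectors
```

Note that a pattern occurring in every recording gets zero weight, which is exactly the desired behaviour: ubiquitous phrases carry no rāga-discriminative information, while rare, characteristic phrases are weighted up.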
The tonic is a fundamental concept in Indian art music. It is the base pitch, which an artist chooses in order to construct the melodies during a rāg(a) rendition, and all accompanying instruments are tuned using the tonic pitch. Consequently, tonic identification is a fundamental task for most computational analyses of Indian art music, such as intonation analysis, melodic motif analysis and rāg recognition. In this paper we review existing approaches for tonic identification in Indian art music and evaluate them on six diverse datasets for a thorough comparison and analysis. We study the performance of each method in different contexts such as the presence/absence of additional metadata, the quality of audio data, the duration of audio data, music tradition (Hindustani/Carnatic) and the gender of the singer (male/female). We show that the approaches that combine multi-pitch analysis with machine learning provide the best performance in most cases (90% identification accuracy on average), and are robust across the aforementioned contexts compared to the approaches based on expert knowledge. In addition, we also show that the performance of the latter can be improved when additional metadata is available to further constrain the problem. Finally, we present a detailed error analysis of each method, providing further insights into the advantages and limitations of the methods.
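The candidate-generation step shared by most of the reviewed approaches can be sketched as a histogram analysis of the pitch content: the most salient pitch bins become tonic candidates, which knowledge-based rules or a trained classifier then rank. The following is a simplified single-pitch illustration of that idea (names, bin width, and reference frequency are assumptions for the sketch, not parameters from the paper).

```python
import math
from collections import Counter

def tonic_candidates(pitches_hz, bin_cents=10, ref_hz=55.0, top_k=5):
    """Histogram the pitch values (Hz) on a cent scale and return the most
    frequent bins, converted back to Hz, as tonic candidates."""
    def to_cents(f):
        return 1200 * math.log2(f / ref_hz)
    # Skip unvoiced frames (pitch = 0) and bin the rest at bin_cents resolution.
    hist = Counter(int(to_cents(f) // bin_cents) for f in pitches_hz if f > 0)
    return [ref_hz * 2 ** (b * bin_cents / 1200) for b, _ in hist.most_common(top_k)]
```

The approaches evaluated in the paper operate on multi-pitch salience rather than a single predominant-pitch track, precisely so that the drone instrument reinforces the tonic bin; this sketch only conveys the histogram-peak intuition.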
Discovery of repeating structures in music is fundamental to its analysis, understanding and interpretation. We present a data-driven approach for the discovery of short-time melodic patterns in large collections of Indian art music. The approach first discovers melodic patterns within an audio recording and subsequently searches for their repetitions in the entire music collection. We compute similarity between melodic patterns using dynamic time warping (DTW). Furthermore, we investigate four different variants of the DTW cost function for rank refinement of the obtained results. The music collection used in this study comprises 1,764 audio recordings with a total duration of 365 hours. Over 13 trillion DTW distance computations are done for the entire dataset. Due to the computational complexity of the task, different lower bounding and early abandoning techniques are applied during DTW distance computation. An evaluation based on expert feedback on a subset of the dataset shows that the discovered melodic patterns are musically relevant. Several musically interesting relationships are discovered, yielding further scope for establishing novel similarity measures based on melodic patterns. The discovered melodic patterns can further be used in challenging computational tasks such as automatic rāga recognition, composition identification and music recommendation.

Keywords: Motifs, Pattern discovery, Time series, Melodic analysis, Indian art music
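Early abandoning is what makes trillions of DTW computations feasible: once a partial alignment already exceeds the best distance found so far, the full computation can be skipped. A minimal sketch of banded DTW with row-wise early abandoning follows; the band width and the squared-difference local cost are illustrative defaults, not the paper's exact configuration.

```python
def dtw_distance(a, b, window=None, abandon_above=float("inf")):
    """Banded DTW with early abandoning: if every cell in a row already
    exceeds abandon_above, the final distance must too, so we stop early."""
    n, m = len(a), len(b)
    w = max(window or max(n, m), abs(n - m))
    INF = float("inf")
    prev = [INF] * (m + 1)
    prev[0] = 0.0  # D[0][0]; all other boundary cells stay infinite
    for i in range(1, n + 1):
        cur = [INF] * (m + 1)
        lo, hi = max(1, i - w), min(m, i + w)
        for j in range(lo, hi + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            cur[j] = cost + min(prev[j], prev[j - 1], cur[j - 1])
        if min(cur[lo:hi + 1]) > abandon_above:
            return INF  # pruned: cannot beat the best-so-far threshold
        prev = cur
    return prev[m]
```

In a nearest-neighbour search, `abandon_above` is set to the distance of the best match found so far, and cheap lower bounds (e.g. of the LB_Keogh family) are checked before DTW is attempted at all, so most candidates are discarded without running the full recursion.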
We perform a comparative evaluation of methodologies for computing similarity between short-time melodic fragments of audio recordings of Indian art music. We experiment with 560 different combinations of procedures and parameter values. These include the choices made for the sampling rate of the melody representation, pitch quantization levels, normalization techniques and distance measures. The dataset used for evaluation consists of 157 and 340 annotated melodic fragments of Carnatic and Hindustani music recordings, respectively. Our results indicate that melodic fragment similarity is particularly sensitive to distance measures and normalization techniques. Sampling rates do not have a significant impact for Hindustani music, but can significantly degrade the performance for Carnatic music. Overall, the performed evaluation provides a better understanding of the processing steps and parameter settings for melodic similarity in Indian art music. Importantly, it paves the way for developing unsupervised melodic pattern discovery approaches, whose evaluation is a challenging and often ill-defined task.
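To make the compared processing choices concrete, here is a minimal sketch of two normalization variants and a distance measure of the kind such an evaluation combines; the function names are illustrative and not taken from the paper.

```python
import statistics

def tonic_normalize(pitch_cents, tonic_cents):
    """Express a fragment relative to the tonic, removing the
    transposition differences between artists."""
    return [p - tonic_cents for p in pitch_cents]

def zscore_normalize(pitch_cents):
    """Mean/variance normalization, an alternative that removes both
    transposition and pitch-range differences."""
    mu = statistics.fmean(pitch_cents)
    sd = statistics.pstdev(pitch_cents) or 1.0  # guard against flat fragments
    return [(p - mu) / sd for p in pitch_cents]

def cityblock(a, b):
    """City-block (L1) distance between equal-length fragments."""
    return sum(abs(x - y) for x, y in zip(a, b))
```

Each cell of such an evaluation grid is then a pipeline: resample the pitch track, quantize it, apply one normalization, and score fragment pairs with one distance, which is how the 560 combinations arise.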