“…A pilot study indeed suggested that this Cartesian hypothesis may provide an appropriate methodology for developing prediction models of affect description. A number of computational models are nowadays available that extract perception-related properties from musical audio, such as onset (e.g., Klapuri, 1999; Smith, 1994), beat (e.g., Toiviainen, 2001; Large & Kolen, 1994; Scheirer, 1998; Laroche, 2003), consonance (e.g., Aures, 1985; Daniel & Weber, 1997; Leman, 2000a), pitch (e.g., Clarisse et al., 2002; De Mulder et al., 2004), harmony and tonality (e.g., Terhardt, 1974; Parncutt, 1989; Leman, 1995, 2000b), and timbre (e.g., Cosi et al., 1994; Toiviainen, 1996; De Poli & Prandoni, 1997). Apart from a preliminary study by Scheirer et al. (2000), we know of no other attempts that relate these and similar audio-extracted structural features to affect-based descriptions of music.…”