2000
DOI: 10.1017/s1355771800003071

MARSYAS: a framework for audio analysis

Abstract: Existing audio tools handle the increasing amount of computer audio data inadequately. The typical tape-recorder paradigm for audio interfaces is inflexible and time-consuming, especially for large data sets. On the other hand, completely automatic audio analysis and annotation is impossible using current techniques. Alternative solutions are semi-automatic user interfaces that let users interact with sound in flexible ways based on content. This approach offers significant advantages over manual browsing, annotation…

Cited by 262 publications (145 citation statements). References 15 publications.

“…A number of efficient tools have been created that allow for the rich processing of audio (Tzanetakis & Cook, 2000; Leman, Lesaffre, & Tanghe, 2001; Lartillot & Toiviainen, 2007). Consequently, hundreds of features may now be extracted from audio (MFCCs, and various descriptive statistics of frame-based analysis of rhythm, pitch and more).…”
Section: Feature Extraction
Mentioning, confidence: 99%
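
The frame-based extraction these citing papers describe reduces to a short recipe: window the signal, compute per-frame coefficients, then summarise the trajectories with descriptive statistics. Below is a minimal sketch assuming Python with librosa and a hypothetical input file; MARSYAS itself is a C++ framework, so this only illustrates the idea, not its API.

```python
# Minimal sketch of frame-based feature extraction with descriptive
# statistics, in the spirit of the tools cited above. librosa and the
# input file name are assumptions, not part of the cited frameworks.
import numpy as np
import librosa

y, sr = librosa.load("example.wav", sr=22050)  # hypothetical input file

# Frame-based analysis: 13 MFCCs per ~23 ms frame (512 samples at 22.05 kHz).
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, n_fft=512, hop_length=256)

# Summarise the per-frame trajectories with descriptive statistics,
# turning a variable-length clip into a fixed-length feature vector.
features = np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])
print(features.shape)  # (26,): mean and std of each coefficient
```
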
“…In the case of music audio data, other descriptors are used; see for instance [25], [28], [30]. These features include the structure of the spectrum, time-domain features, and time-frequency descriptions.…”
Section: Audio Data Parameterization
Mentioning, confidence: 99%
“…Since research on the automatic detection of emotions in music is very recent, there is no significant comparison of descriptor sets and their performance for this purpose. Li and Ogihara applied the parameters provided in [28], describing timbral texture, rhythmic content, and pitch content features. The dimension of the final feature vector was 30.…”
Section: Audio Data Parameterization
Mentioning, confidence: 99%
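
As a rough illustration of the three descriptor groups named in this statement (timbral texture, rhythmic content, pitch content), the sketch below assembles one summary vector per group. All feature choices and librosa calls here are assumptions for illustration; [28] defines its own specific 30-dimensional set.

```python
# Rough sketch of a three-group feature vector (timbral texture,
# rhythmic content, pitch content). Every feature choice here is an
# illustrative assumption; [28] specifies its own 30-dimensional set.
import numpy as np
import librosa

y, sr = librosa.load("track.wav", sr=22050)  # hypothetical input file

# Timbral texture: mean and std of 10 MFCCs (20 values).
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=10)
timbre = np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Rhythmic content: estimated tempo plus onset-strength statistics (3 values).
onset_env = librosa.onset.onset_strength(y=y, sr=sr)
tempo, _beats = librosa.beat.beat_track(onset_envelope=onset_env, sr=sr)
rhythm = np.array([float(tempo), onset_env.mean(), onset_env.std()])

# Pitch content: mean chroma energy per pitch class (12 values).
chroma = librosa.feature.chroma_stft(y=y, sr=sr)
pitch = chroma.mean(axis=1)

vec = np.concatenate([timbre, rhythm, pitch])
print(vec.shape)  # dimensionality depends on the choices above
```
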
“…The feature extraction and the graphical user interface used for annotation were implemented using Marsyas (http://marsyas.sourceforge.net), a free software framework for Computer Audition research described in [12]. Figure 5.2 shows a screenshot of the user interface used for annotation and for experimenting with singing voice structure.…”
Section: Methods
Mentioning, confidence: 99%
“…We experimented with various features proposed in the literature, such as spectral shape features (Centroid, Rolloff, Relative Subband Energy) [12], Mel Frequency Cepstral Coefficients (MFCC) [13], and Linear Prediction Coefficients (LPC) [14]. The final feature set consists of the following features: Mean Centroid, Rolloff, and Flux; Mean Relative Subband Energy 1 (relative energy of the subband spanning the lowest 1/4th of the total bandwidth); Mean Relative Subband Energy 2 (relative energy of the second 1/4th of the total bandwidth); and Standard Deviation of the Centroid, Rolloff, and Flux.…”
Section: Feature Extraction
Mentioning, confidence: 99%
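
The eight summary features listed in this statement can be computed directly from a short-time magnitude spectrogram. The sketch below is one plausible reading, assuming common defaults (512-sample frames, an 85% rolloff threshold) that the quoted paper may not share.

```python
# One plausible implementation of the eight features quoted above,
# computed from an STFT magnitude spectrogram. Frame size and the 85%
# rolloff threshold are assumed defaults, not taken from the paper.
import numpy as np
import librosa

y, sr = librosa.load("voice.wav", sr=22050)              # hypothetical input
S = np.abs(librosa.stft(y, n_fft=512, hop_length=256))   # (257 bins, frames)
freqs = librosa.fft_frequencies(sr=sr, n_fft=512)

mag = S.sum(axis=0) + 1e-10                              # per-frame magnitude

# Spectral centroid: magnitude-weighted mean frequency per frame.
centroid = (freqs[:, None] * S).sum(axis=0) / mag

# Spectral rolloff: frequency below which 85% of the magnitude lies.
cum = np.cumsum(S, axis=0)
rolloff = freqs[np.argmax(cum >= 0.85 * mag, axis=0)]

# Spectral flux: change in the spectrum between consecutive frames.
flux = np.sqrt((np.diff(S, axis=1) ** 2).sum(axis=0))

# Relative subband energies: lowest and second quarter of the bandwidth.
P = S ** 2
tot = P.sum(axis=0) + 1e-10
q = S.shape[0] // 4
sub1 = P[:q].sum(axis=0) / tot
sub2 = P[q:2 * q].sum(axis=0) / tot

features = np.array([centroid.mean(), rolloff.mean(), flux.mean(),
                     sub1.mean(), sub2.mean(),
                     centroid.std(), rolloff.std(), flux.std()])
print(features)  # eight values, in the order listed in the quote
```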