2013 IEEE International Conference on Acoustics, Speech and Signal Processing 2013
DOI: 10.1109/icassp.2013.6637756

Sleepiness detection from speech by perceptual features

Abstract: We propose a two-class classification scheme with a small number of features for sleepiness detection. Unlike conventional methods that rely on the linguistic content of speech, we work with prosodic features extracted by psychoacoustic masking in the spectral and temporal domains. Our features also model the variations between non-sleepy and sleepy modes in a quasi-continuum space with the help of code words learned by a bag-of-features scheme. These improve the unweighted recall rates for unseen people and m…
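The abstract reports results as unweighted recall rates. As a rough illustration only, the sketch below computes unweighted average recall (the mean of the per-class recalls) for a two-class problem. The function name and the toy labels are assumptions made for this example and are not taken from the paper.

```python
def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; each class counts equally regardless of size."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        # Indices of the samples whose true label is class c.
        idx = [i for i, t in enumerate(y_true) if t == c]
        # Recall for class c: fraction of those samples predicted as c.
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Toy, made-up labels: 0 = non-sleepy, 1 = sleepy.
y_true = [0, 0, 0, 1]
y_pred = [0, 0, 0, 0]  # a classifier that always predicts "non-sleepy"
uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

On imbalanced data such as this toy case, plain accuracy rewards always predicting the majority class, while the unweighted measure does not.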

Cited by 6 publications (15 citation statements)
References 9 publications
“…The KSS being a semi-continuous measure, the choice is made to split the dataset into two classes: following [4], [6], [9], [15], the samples with a mean KSS>7.5 will be considered as Sleepy Language (SL). On the contrary, the samples with a KSS≤7.5 are labelled as Non Sleepy Language (NSL).…”
Section: Ground Truth: the Karolinska Sleeping Scale (mentioning)
confidence: 99%
“…Another version of this type of model has been developed with to predict “microsleeps” with similar success (Krajewski et al, 2008). Other variations also exist, such as the model developed by Günsel et al (2013), which uses prosodic features extracted by psychoacoustic masking, and the model developed by Thakare (2014), which utilizes automatic speech recognition (ASR) to identify key phonemes from which to extract features. This latter approach would require a robust ASR system in operational environments to account for environmental noise, reverberation, and channel distortions.…”
Section: Introduction (mentioning)
confidence: 99%
“…In 2011 the challenge was to classify speech recordings into sleepy vs non-sleepy, using KSS ratings of 7 and below as non-sleepy, and 8 and above as sleepy. In that challenge the baseline systems achieved an accuracy of about 70%, while a later study using the same speech data and task achieved a classification accuracy of over 80% [4]. Could the failure of the 2019 challenge be because of the switch from a classification task to a regression task?…”
Section: Introduction (mentioning)
confidence: 99%
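The citation statements above describe two ways of turning Karolinska scale (KSS) ratings into a sleepy / non-sleepy label: a cut on the mean rating at 7.5, and a cut on the rating at 7 and below versus 8 and above. A minimal sketch of both rules follows. The function names are hypothetical, and raising an error for ratings strictly between 7 and 8 under the second rule is an assumption of this sketch, since the quoted text does not cover that range.

```python
def label_by_mean_kss(mean_kss, threshold=7.5):
    """Rule from the first statement: mean KSS > 7.5 is SL, otherwise NSL."""
    return "SL" if mean_kss > threshold else "NSL"


def label_by_challenge_kss(kss):
    """Rule from the last statement: 7 and below non-sleepy, 8 and above sleepy."""
    if kss >= 8:
        return "sleepy"
    if kss <= 7:
        return "non-sleepy"
    # Assumption of this sketch: the quoted rule does not define this range.
    raise ValueError("rating between 7 and 8 is not covered by the quoted rule")


# Made-up ratings for illustration only.
mean_labels = [label_by_mean_kss(k) for k in (3.0, 7.5, 7.6, 9.0)]
challenge_labels = [label_by_challenge_kss(k) for k in (3, 7, 8, 9)]
```

For whole-number ratings the two rules give the same split; they can differ only for averaged ratings that fall strictly between 7 and 8.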