2020
DOI: 10.3390/s20092740

Using Complexity-Identical Human- and Machine-Directed Utterances to Investigate Addressee Detection for Spoken Dialogue Systems

Abstract: Human-machine addressee detection (H-M AD) is a modern paralinguistics and dialogue challenge that arises in multiparty conversations between several people and a spoken dialogue system (SDS) since the users may also talk to each other and even to themselves while interacting with the system. The SDS is supposed to determine whether it is being addressed or not. All existing studies on acoustic H-M AD were conducted on corpora designed in such a way that a human addressee and a machine played different dialogu…

Cited by 5 publications (10 citation statements)
References 29 publications
“…First, with greater importance on true wake-word independence. In Akhtiamov et al (2020) the classification process was improved by employing an ensemble classifier, consisting of several classification tasks that are combined in a late fusion approach, which allows combining the strength of the different methods into one singular system. Second, with a heightened sense of privacy by changing a system to ignore information that is not directed to the device either by using features with a limited use to detect what has been said (Baumann and Siegert, 2020) or by extending a wakeword detection system by an acoustic feature classification to improve the security of such a system from false activations (Wang et al, 2020).…”
Section: Time Development
Citation type: mentioning
confidence: 99%
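
The late fusion idea mentioned in the statement above can be illustrated with a minimal sketch. It assumes each classifier outputs posterior probabilities for two classes and that the fusion rule is a (weighted) average of those posteriors. The source does not state which fusion rule Akhtiamov et al (2020) used, so the function `late_fusion`, the class names, and the scores below are hypothetical.

```python
# Minimal late-fusion sketch (assumed rule: weighted average of posteriors).
# All names and numbers here are illustrative, not taken from the cited paper.

CLASSES = ("human-directed", "machine-directed")

def late_fusion(posteriors, weights=None):
    """Combine per-model class posteriors into one fused distribution.

    posteriors: list of [p_human, p_machine] lists, one per model.
    weights: optional per-model weights; normalised to sum to 1.
    """
    n_models = len(posteriors)
    if weights is None:
        weights = [1.0] * n_models
    total = sum(weights)
    weights = [w / total for w in weights]
    n_classes = len(posteriors[0])
    return [
        sum(w * p[k] for w, p in zip(weights, posteriors))
        for k in range(n_classes)
    ]

# Invented outputs of three separate classifiers for one utterance.
scores = [[0.40, 0.60], [0.70, 0.30], [0.20, 0.80]]
fused = late_fusion(scores)
label = CLASSES[max(range(len(fused)), key=fused.__getitem__)]
```

Each model is trained and run independently; only the final decision scores are merged, which is what distinguishes late fusion from combining features before classification.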
“…For using RBC as a test set, one of the best performances of 60.90% Unweighted Average Recall (UAR) was achieved, when using VACC data together with RBC data with an end-to-end (e2e) speech processing model (Akhtiamov et al, 2019). But using a more complex meta-model, that makes use of different models to combine different layers of information, gives a slightly better performance of 62.80 UAR (Akhtiamov et al, 2019, 2020).…”
Section: Studies Including Several Datasets
Citation type: mentioning
confidence: 99%
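
For readers unfamiliar with the metric quoted above, Unweighted Average Recall is the mean of the per-class recalls, so every class counts equally regardless of how many utterances it contains. The sketch below is a plain-Python illustration; the function name and the toy labels are invented and do not come from the RBC or VACC corpora.

```python
from collections import defaultdict

def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (UAR)."""
    total = defaultdict(int)    # utterances per true class
    correct = defaultdict(int)  # correctly recognised utterances per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)

# Toy, imbalanced example (invented labels).
y_true = ["machine", "machine", "machine", "machine", "human", "human"]
y_pred = ["machine", "machine", "machine", "human", "human", "machine"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Here the recall is 3/4 for the machine-directed class and 1/2 for the human-directed class, giving a UAR of 0.625, while plain accuracy is 4/6. With two classes, a UAR of 50% corresponds to chance level, which is useful context for the figures in the statement above.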