2014
DOI: 10.1109/taslp.2013.2294586

Latent Semantic Analysis for Multimodal User Input With Speech and Gestures

Abstract: This paper describes our work in semantic interpretation of a "multimodal language" with speech and gestures using latent semantic analysis (LSA). Our aim is to infer the domain-specific informational goal of multimodal inputs. The informational goal is characterized by lexical terms used in the spoken modality, partial semantics of gestures in the pen modality, as well as term co-occurrence patterns across modalities, leading to "multimodal terms." We designed and collected a multimodal corpus of nav…
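As a rough illustration of the technique named in the abstract, here is a minimal LSA sketch, assuming a toy vocabulary of "multimodal terms" that mixes spoken words with gesture semantic labels. The documents, labels, and fold-in query below are hypothetical placeholders, not the paper's corpus or implementation.

```python
import numpy as np

# Hypothetical multimodal inputs: each is a bag of terms drawn from both
# modalities (spoken words plus gesture labels such as "point:landmark").
docs = [
    ["how", "far", "point:landmark", "point:landmark"],  # distance query
    ["route", "from", "circle:area", "point:landmark"],  # route query
    ["what", "is", "point:landmark"],                    # identity query
]
vocab = sorted({t for d in docs for t in d})
index = {t: i for i, t in enumerate(vocab)}

# Term-document matrix of raw counts (TF-IDF weighting is also common).
A = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for t in d:
        A[index[t], j] += 1.0

# Truncated SVD gives the latent semantic space; k=2 for this toy example.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
Uk, sk = U[:, :k], s[:k]

def fold_in(terms):
    """Project a new bag of multimodal terms into the latent space."""
    q = np.zeros(len(vocab))
    for t in terms:
        if t in index:
            q[index[t]] += 1.0
    return (q @ Uk) / sk  # standard LSA query fold-in: S_k^-1 U_k^T q

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

# Infer the informational goal of a new input by cosine similarity to the
# latent representations of the training documents (columns of V_k^T).
query = fold_in(["how", "far", "point:landmark"])
sims = [cosine(query, Vt[:k, j]) for j in range(len(docs))]
print("closest goal:", int(np.argmax(sims)))  # expect 0 (distance query)
```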

Cited by 15 publications (6 citation statements)
References 36 publications (38 reference statements)
“…As a rule, the streams of unstructured data generated by such a subsystem are so intense that the performance of modern intelligent decision-making systems is insufficient to ontologize, in real time, all of the objects, events, and situations about which these streams carry information [11,12].…”
Section: Fig. 2. Multi-actor architecture of the agneuron in the simula… (unclassified)
“…Minotto et al. [35] used an RGB camera and a depth sensor as input streams and proposed a multimodal speaker diarization algorithm to extract speech features for fusion. Hui et al. [36] analyzed and fused the multimodal language of speech and gestures based on latent semantic analysis (LSA).…”
Section: B. Multimodal Interaction (mentioning)
confidence: 99%
“…Their experiments showed that multimodality outperformed single-mode recognition. In 2014, Hui and Meng [29] fused the user's speech and pen input at the feature level. The experimental results showed that the user's intention was understood and that robustness was improved.…”
Section: Related Work (mentioning)
confidence: 99%