Proceedings of the 30th ACM International Conference on Information & Knowledge Management 2021
DOI: 10.1145/3459637.3481895
AudiBERT

Cited by 28 publications (5 citation statements); references 24 publications.
“…As noted, the systematic review included 34 studies published between February 2019 and March 2024 that investigate the application of LLMs in various aspects of depression research (3, 7, 24, 24–54). These studies encompassed a wide variety of sample sizes, ranging from as few as 25 to over 632,000, and utilized data types ranging from clinical interview transcriptions and electronic health records to user-generated content on social media platforms (Table 1).…”
Section: Results (mentioning)
confidence: 99%
“…Model-level fusion strategies:
Learn shared representations from weighted modality-specific representations: Gated Multimodal Unit (GMU) [429]; parallel attention model; attention layer; sparse MLP (mixes vertical and horizontal information via weight sharing and sparse connections); multimodal encoder-decoder; multimodal factorized bilinear pooling (combines the compact output features of multimodal low-rank bilinear [430] with the robustness of multimodal compact bilinear [431]); multi-head intermodal attention fusion; transformer [295]; feed-forward network; low-rank multimodal fusion network [432] (references: [62, 65, 67, 76, 93, 100, 102, 106, 113, 117, 131, 135, 136, 142–144, 174, 218, 433])
Learn joint sparse representations: dictionary learning [20]
Learn and fuse outputs from different modality-specific parts at fixed time steps: cell-coupled LSTM with L-skip fusion mechanism [101]
Learn cross-modality representations that incorporate interactions between modalities: LXMERT [434]; transformer encoder with cross-attention layers (representations of one modality as query and the other as key/value, and vice versa); memory fusion network [435] (references: [82, 92, 129])
Horizontal and vertical kernels to capture patterns across different levels: CASER [309] (references: [170])…”
Section: Model Level (mentioning)
confidence: 99%
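Several of the fusion mechanisms listed above learn a gate over modality-specific representations before combining them. As a minimal, self-contained sketch (not taken from the cited survey), the Gated Multimodal Unit computes a sigmoid gate from the concatenated inputs and uses it to mix the two modality branches per dimension; the dimensions, random weights, and variable names below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def gmu_fuse(x_a, x_t, W_a, W_t, W_z):
    """Gated Multimodal Unit sketch: a learned gate z decides, per
    dimension, how much each modality contributes to the fused vector."""
    h_a = np.tanh(W_a @ x_a)  # audio branch representation
    h_t = np.tanh(W_t @ x_t)  # text branch representation
    # Gate computed from both inputs (sigmoid squashes to (0, 1))
    z = 1.0 / (1.0 + np.exp(-(W_z @ np.concatenate([x_a, x_t]))))
    return z * h_a + (1.0 - z) * h_t  # convex combination per dimension

d_a, d_t, d_h = 8, 6, 4  # illustrative modality and hidden sizes
x_audio = rng.standard_normal(d_a)
x_text = rng.standard_normal(d_t)
W_a = rng.standard_normal((d_h, d_a))
W_t = rng.standard_normal((d_h, d_t))
W_z = rng.standard_normal((d_h, d_a + d_t))

fused = gmu_fuse(x_audio, x_text, W_a, W_t, W_z)
print(fused.shape)  # (4,)
```

Because each fused dimension is a convex combination of two tanh outputs, every entry stays strictly inside (-1, 1); in a trained model the weight matrices would be learned rather than sampled.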
“…In order to improve the performance of multi-class deep neural networks [23]. In order to detect depression as early as possible through speech [33], Toto et al. used the DAIC-WOZ dataset to propose AudiBERT, a new deep learning framework that exploits the multimodal characteristics of the human voice. In order to predict Parkinson's disease (PD) [34], S. Kamoji et al. used the Freezing of Gait dataset, a Parkinson's clinical voice dataset, and a Parkinson's disease wave and spiral drawing dataset, and designed predictive models based on decision trees, KNN, transfer learning, and convolutional neural networks.…”
Section: Related Work (mentioning)
confidence: 99%