ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
DOI: 10.1109/icassp43922.2022.9747676

Automated Audio Captioning Using Transfer Learning and Reconstruction Latent Space Similarity Regularization

Cited by 15 publications (18 citation statements)
References 9 publications
“…There is a plethora of pretrained models available publicly. These pretrained models are often used for transfer learning to another related domain [13], hence there is a need for finetuning. In our case, we find that it is sufficient to simply use these pretrained models as it is without finetuning.…”
Section: Pretrained Embeddings (mentioning)
confidence: 99%
“…For instance, the AT-CNN [8], and the Audio Captioning Transformer [9] pretrains their model encoder on audio tagging as an pretraining task before finetuning their model directly on Automated Audio Captioning. Other authors also use self supervised supplementary objectives [3] or caption retrieval [10] to guide the model training. Self-Critical Sequence Training [11], a reinforcement learning tactic used in Image Captioning, has also been used to optimize the model for Audio Captioning.…”
Section: Model Architectures and Objectives (mentioning)
confidence: 99%
“…In this work, we build on the work of [3] and [4] by using their model architectures that has been proven to achieve competitive results. Henceforth, we refer [4] and [3] as System 1 and System 2 respectively. Figure 1 delineates their system architectures.…”
Section: Model Architectures and Objectives (mentioning)
confidence: 99%