Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries 2013
DOI: 10.1145/2467696.2467741

Multimodal alignment of scholarly documents and their presentations

Abstract: We present a multimodal system for aligning scholarly documents to corresponding presentations in a fine-grained manner (i.e., per presentation slide and per paper section). Our method improves upon a state-of-the-art baseline that employs only textual similarity. Based on an analysis of baseline errors, we propose a three-pronged alignment system that combines textual, image, and ordering information to establish alignment. Our results show a statistically significant improvement of 25%, confirming the import…

Cited by 7 publications (3 citation statements)
References 12 publications
“…Finally, alignment between different modalities (e.g., presentation, videos) and text was studied in different domains. Both Kan (2007) and Bahrani and Kan (2013) studied the problem of document to presentation alignment for scholarly documents. Kan (2007) focused on the discovery and crawling of document-presentation pairs, and a model to align between documents to corresponding presentations.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Kan (2007) focused on the discovery and crawling of document-presentation pairs, and a model to align between documents to corresponding presentations. In Bahrani and Kan (2013) they extended previous model to include also visual components of the slides. Aligning video and text was studied mainly in the setting of enriching videos with textual information (Bojanowski et al., 2015; Malmaud et al., 2015; Zhu et al., 2015).…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Based on these research results, the effectiveness of indicating explanation spots can be considered. For lecture retrieval, researches for aligning lecture slides with speech recognition results or articles are relatively common [4,5,6,7]. However, rarely are studies undertaken to align individual contents in the slide with utterances, as dealt with in this paper.…”
Section: Introduction
Citation type: mentioning
confidence: 99%