Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume 2021
DOI: 10.18653/v1/2021.eacl-main.99

Detecting Extraneous Content in Podcasts

Abstract: Podcast episodes often contain material extraneous to the main content, such as advertisements, interleaved within the audio and the written descriptions. We present classifiers that leverage both textual and listening patterns in order to detect such content in podcast descriptions and audio transcripts. We demonstrate that our models are effective by evaluating them on the downstream task of podcast summarization and show that we can substantively improve ROUGE scores and reduce the extraneous content genera…

Cited by 14 publications (7 citation statements)
References: 15 publications
“…2 Our participating system in TREC 2020 focuses on identifying salient segments from transcripts and using them as input to an abstractive summarizer (Song et al, 2020). Reddy et al (2021) develop classifiers to detect and eliminate extraneous marketing materials in podcasts to aid summarization. In this paper, we explore techniques that generate grounded podcast summaries where pieces of summary text are tied to short podcast clips.…”
Section: Related Work (mentioning)
confidence: 99%
“…Figure 3 is one illustrative example of a curve from listening data on Spotify that shows the proportion of listeners at each time point over the duration of a single podcast episode. In such curves, dips tend to correspond to ads or other extraneous material [68] and there are commonly sharp drops at the beginnings and the ends of episodes. These curves in general show some distinctive characteristics depending on the nature of the podcast; for example, well-known podcasts tend to have a sharper drop at the beginning than lesser known podcasts, since they attract a diverse group of listeners who may be curious about the podcast but find that they are not interested after a few seconds.…”
Section: Podcast Consumption and Feedback (mentioning)
confidence: 99%
“…These curves in general show some distinctive characteristics depending on the nature of the podcast; for example, well-known podcasts tend to have a sharper drop at the beginning than lesser known podcasts, since they attract a diverse group of listeners who may be curious about the podcast but find that they are not interested after a few seconds. Listening curves are useful for detecting extraneous content [69], assessing ad monetization, improving summarization, and devising user engagement metrics on podcast-access platforms (such as, for example, deriving thresholds on the amount of listening that counts as user satisfaction).…”
Section: Podcast Consumption and Feedback (mentioning)
confidence: 99%
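
The two statements above describe listening curves whose dips tend to line up with ads or other extraneous material. As a rough illustration only, and not the method of the cited papers, the sketch below flags stretches where a curve falls below a local median baseline. The function name `find_dips`, the `window` and `min_drop` parameters, their default values, and the synthetic curve are all assumptions made for this example.

```python
from statistics import median


def find_dips(curve, window=5, min_drop=0.05):
    """Return (start, end) index spans where the curve sits below its local baseline.

    curve: proportion of listeners at each time point (floats in [0, 1]).
    window: number of points on each side used for the local median (assumed value).
    min_drop: how far below the baseline a point must fall to count (assumed value).
    """
    spans = []
    start = None
    for i, value in enumerate(curve):
        lo = max(0, i - window)
        hi = min(len(curve), i + window + 1)
        baseline = median(curve[lo:hi])
        if baseline - value >= min_drop:
            # Point is well below its neighbourhood: open a span if none is open.
            if start is None:
                start = i
        elif start is not None:
            # Curve has recovered: close the current span.
            spans.append((start, i - 1))
            start = None
    if start is not None:
        spans.append((start, len(curve) - 1))
    return spans


# Synthetic curve (not real listening data): slow decay with one dip at points 10-12.
curve = [0.9 - 0.005 * t for t in range(30)]
for t in (10, 11, 12):
    curve[t] -= 0.2

print(find_dips(curve))
```

A median baseline is used here because a short dip barely shifts it, whereas a moving average would be dragged down by the dip itself. The sharp drops at the start and end of real episodes, mentioned in the statement above, would need separate handling.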
“…At the heart of the monetization ecosystem on social media are influencers (also known as content creators) [44], who engage in complex supply chains for the pursuit of revenue based on their Internet activities. As a pervasive type of Internet entrepreneurs engaged in cultural production, influencers often turn into media empires with impressive economic returns [37]. With the content creation economy booming globally and at unprecedented scale, public authorities tasked with the enforcement of existing Internet law (e.g.…”
Section: Introduction (mentioning)
confidence: 99%