2018 International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP) 2018
DOI: 10.1109/isai-nlp.2018.8692799

YouTube AV 50K: An Annotated Corpus for Comments in Autonomous Vehicles

Abstract: With one billion monthly viewers and millions of users discussing and sharing opinions, the comments below YouTube videos are a rich source of data for opinion mining and sentiment analysis. We introduce the YouTube AV 50K dataset, a freely available collection of more than 50,000 YouTube comments and metadata from below autonomous vehicle (AV)-related videos. We describe its creation process, content, and data format, and discuss its possible uses. In particular, we present a case study of the first self-driving car fata…

citations
Cited by 25 publications
(17 citation statements)
references
References 24 publications
“…In study [16], analysis on youtube's comments has done where social media and politics has been focused to score selected comments by sentiment analysis. Authors have developed blog ranking algorithm in study [13] to find out the implicit and hidden links given in a blog based on content analysis.…”
Section: Related Work (mentioning)
confidence: 99%
“…flips and cartwheels) and locomotion skills (walking and running) to train DL models for character motion animation instead of using motion capture systems. Reference [7] uses the video comments as a basis to create a sentiment analysis corpus for autonomous vehicles, [8] creates a video/audio corpus for lip reading training and [9] creates a corpus of 1.8 million 10sec audio clips for 632 audio event categories.…”
Section: Related Work (mentioning)
confidence: 99%
“…We are excited about the idea of using GANs for text regression. Given the nature of the TR-GAN model, it is not challenging to find an experimental dataset; for example, [27] collected 50,000 textual comments below YouTube videos, among which 20,000 are labelled by state-of-the-art algorithms and 1,000 are labelled manually. We also are interested to see how the generated languages look like, given that existing literatures of using GANs for NLG merely report original experimental results but instead numerical metrics.…”
Section: Future Work (mentioning)
confidence: 99%