2020
DOI: 10.31234/osf.io/fy8zx
Preprint

SAYCam: A large, longitudinal audiovisual dataset recorded from the infant’s perspective

Abstract: We introduce a new resource: the SAYCam corpus. Infants aged 6-32 months wore a head-mounted camera for approximately 2 hours per week, over the course of approximately two and a half years. The result is a large, naturalistic, longitudinal dataset of infant- and child-perspective videos. Transcription efforts are underway, with over 200,000 words of naturalistic dialogue already transcribed. Similarly, the dataset is searchable using a number of criteria (e.g., age of participant, location, setting, objects p…
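The abstract notes that the corpus can be searched by criteria such as participant age, location, and setting. As a rough illustration only, here is a minimal Python sketch of that kind of metadata filtering; the file name and column names are hypothetical assumptions, not the corpus's actual schema or API.

```python
# Hypothetical sketch: filtering a SAYCam-style metadata table by the criteria
# mentioned in the abstract (participant age, location, setting). The file
# "saycam_metadata.csv" and all column names are assumptions for illustration.
import pandas as pd

metadata = pd.read_csv("saycam_metadata.csv")

# Select recordings from 12-18-month-olds made at home in the kitchen.
subset = metadata[
    metadata["age_months"].between(12, 18)
    & (metadata["location"] == "home")
    & (metadata["setting"] == "kitchen")
]
print(subset[["video_id", "age_months", "duration_min"]].head())
```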

Cited by 29 publications (54 citation statements)
References 0 publications
“…Images depicting people, specifically the categories “man,” “woman,” and “child,” were not sampled according to census distributions (age, ethnicity, gender, etc.). Moreover, ecoset image and category distributions do not reflect the naturalistic, egocentric visual input typically encountered in the everyday life of infants and adults (46, 47).…”
Section: Methods
confidence: 92%
“…Moreover, ImageNet consists of statistically independent static frames, while infants receive a continuous stream of temporally correlated inputs (58). A better proxy of the real infant datastream is represented by the recently released SAYCam (59, 60) dataset, which contains head-mounted video camera data from three children (about 2 h/wk spanning ages 6 to 32 mo) (Fig. 3B).…”
Section: Deep Contrastive Learning On First-person Video Data From Children
confidence: 99%
“…These pathways were optimized to aggregate the resulting embeddings and their close neighbors (light brown points) and to separate the resulting embeddings and their farther neighbors (dark brown points). (B) Examples from the SAYCam dataset (59), which was collected by head-mounted cameras on infants for 2 h each week between ages 6 and 36 mo. (C) Neural predictivity for models trained on SAYCam and ImageNet.…”
Section: Deep Contrastive Learning On First-person Video Data From Children
confidence: 99%
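The citing work quoted above trains contrastive models on temporally correlated first-person video, pulling embeddings of nearby frames together and pushing other clips apart. The following is a minimal sketch of that general idea, assuming an InfoNCE-style loss and a ResNet-18 encoder; it is not the cited papers' exact training pipeline, and the hyperparameters are illustrative.

```python
# Minimal sketch (assumed setup, not the cited method): contrastive learning on
# pairs of temporally adjacent video frames. Matching frame pairs act as
# positives; all other pairs in the batch act as negatives.
import torch
import torch.nn.functional as F
import torchvision

encoder = torchvision.models.resnet18(weights=None)
encoder.fc = torch.nn.Linear(encoder.fc.in_features, 128)  # small projection head

def info_nce(anchor, positive, temperature=0.1):
    """anchor/positive: (N, D) embeddings of two temporally adjacent frames."""
    anchor = F.normalize(anchor, dim=1)
    positive = F.normalize(positive, dim=1)
    logits = anchor @ positive.t() / temperature   # (N, N) similarity matrix
    labels = torch.arange(anchor.size(0))          # matching pairs lie on the diagonal
    return F.cross_entropy(logits, labels)

# Random tensors standing in for SAYCam frame pairs (N, 3, 224, 224).
frames_t = torch.randn(8, 3, 224, 224)
frames_t1 = torch.randn(8, 3, 224, 224)
loss = info_nce(encoder(frames_t), encoder(frames_t1))
loss.backward()
```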
“…As a baseline, we test the embeddings created by randomly initialized models, examining whether or not the inductive biases conveyed by the architecture are sufficient to embed objects in the same relation more similarly. We then compare these results to models trained on the following datasets: SAYCam: this dataset offers longitudinal headcam video from a small number of babies (Sullivan et al., 2020). We use models trained on a single child's footage (child S), approximately two hours per week while the child was between 6-30 months old, a total of 221 hours.…”
Section: Colors
confidence: 99%
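The passage above compares embeddings from randomly initialized networks against trained ones. Below is a hedged sketch of such a comparison, assuming ResNet-50 encoders and cosine similarity over penultimate-layer features; the ImageNet weights here merely stand in for models trained on SAYCam footage, which are not bundled with torchvision.

```python
# Assumed comparison sketch: cosine similarity between two images' embeddings
# under a randomly initialized versus a pretrained encoder.
import torch
import torch.nn.functional as F
import torchvision

def embed(model, images):
    model.fc = torch.nn.Identity()   # use penultimate features as the embedding
    model.eval()
    with torch.no_grad():
        return model(images)

images = torch.randn(2, 3, 224, 224)   # stand-ins for two scene images

random_model = torchvision.models.resnet50(weights=None)
pretrained_model = torchvision.models.resnet50(weights="IMAGENET1K_V2")

for name, model in [("random", random_model), ("pretrained", pretrained_model)]:
    z = F.normalize(embed(model, images), dim=1)
    print(name, "cosine similarity:", float(z[0] @ z[1]))
```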
“…Quinn (2003) reviews two findings in infant relation categorization: categorizing one object as above/below another precedes categorizing an object as between other objects, and categorizing relations over specific objects predates abstract relations over varying objects. We model these phenomena with deep neural networks, including contemporary architectures specialized for relational learning and vision models pretrained on baby headcam footage (Sullivan et al., 2020). Across two computational experiments, we can account for most of the developmental findings, suggesting these neural network models are useful for studying the computational mechanisms of infant categorization.…”
confidence: 99%