We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,670 hours of dailylife activity video spanning hundreds of scenarios (household, outdoor, workplace, leisure, etc.) captured by 931 unique camera wearers from 74 worldwide locations and 9 different countries. The approach to collection is designed to uphold rigorous privacy and ethics standards, with consenting participants and robust de-identification procedures where relevant. Ego4D dramatically expands the volume of diverse egocentric video footage publicly available to the research community. Portions of the video are accompanied by audio, 3D meshes of the environment, eye gaze, stereo, and/or synchronized videos from multiple egocentric cameras at the same event. Furthermore, we present a host of new benchmark challenges centered around understanding the first-person visual experience in the past (querying an episodic memory), present (analyzing hand-object manipulation, audio-visual conversation, and social interactions), and future (forecasting activities). By publicly sharing this massive annotated dataset and benchmark suite, we aim to push the frontier of first-person perception.
We study the problem of online subspace learning in the context of sequential observations involving structured perturbations. In online subspace learning, the observations are an unknown mixture of two components presented to the model sequentially -the main effect which pertains to the subspace and a residual/error term. If no additional requirement is imposed on the residual, it often corresponds to noise terms in the signal which were unaccounted for by the main effect. To remedy this, one may impose 'structural' contiguity, which has the intended effect of leveraging the secondary terms as a covariate that helps the estimation of the subspace itself, instead of merely serving as a noise residual. We show that the corresponding online estimation procedure can be written as an approximate optimization process on a Grassmannian. We propose an efficient numerical solution, GOSUS, Grassmannian Online Subspace Updates with Structured-sparsity, for this problem. GOSUS is expressive enough in modeling both homogeneous perturbations of the subspace and structural contiguities of outliers, and after certain manipulations, solvable via an alternating direction method of multipliers (ADMM). We evaluate the empirical performance of this algorithm on two problems of interest: online background subtraction and online multiple face tracking, and demonstrate that it achieves competitive performance with the state-of-the-art in near real time.
Precise detection and quantification of white matter hyperintensities (WMH) observed in T2–weighted Fluid Attenuated Inversion Recovery (FLAIR) Magnetic Resonance Images (MRI) is of substantial interest in aging, and age related neurological disorders such as Alzheimer’s disease (AD). This is mainly because WMH may reflect comorbid neural injury or cerebral vascular disease burden. WMH in the older population may be small, diffuse and irregular in shape, and sufficiently heterogeneous within and across subjects. Here, we pose hyperintensity detection as a supervised inference problem and adapt two learning models, specifically, Support Vector Machines and Random Forests, for this task. Using texture features engineered by texton filter banks, we provide a suite of effective segmentation methods for this problem. Through extensive evaluations on healthy middle–aged and older adults who vary in AD risk, we show that our methods are reliable and robust in segmenting hyperintense regions. A measure of hyperintensity accumulation, referred to as normalized Effective WMH Volume, is shown to be associated with dementia in older adults and parental family history in cognitively normal subjects. We provide an open source library for hyperintensity detection and accumulation (interfaced with existing neuroimaging tools), that can be adapted for segmentation problems in other neuroimaging studies.
The Mild Cognitive Impairment (MCI) stage of AD may be optimal for clinical trials to test potential treatments for preventing or delaying decline to dementia. However, MCI is heterogeneous in that not all cases progress to dementia within the time frame of a trial, and some may not have underlying AD pathology. Identifying those MCIs who are most likely to decline during a trial and thus most likely to benefit from treatment will improve trial efficiency and power to detect treatment effects. To this end, employing multi-modal imaging-derived inclusion criteria may be especially beneficial. Here, we present a novel multi-modal imaging marker that predicts future cognitive and neural decline from [F-18]fluorodeoxyglucose positron emission tomography (PET), amyloid florbetapir PET, and structural magnetic resonance imaging (MRI), based on a new deep learning algorithm (randomized denoising autoencoder marker, rDAm). Using ADNI2 MCI data, we show that employing rDAm as a trial enrichment criterion reduces the required sample estimates by at least five times compared to the no-enrichment regime, and leads to smaller trials with high statistical power, compared to existing methods.
Intranasal administration provides a non-invasive drug delivery route that has been proposed to target macromolecules either to the brain via direct extracellular cranial nerve-associated pathways or to the periphery via absorption into the systemic circulation. Delivering drugs to nasal regions that have lower vascular density and/or permeability may allow more drug to access the extracellular cranial nerve-associated pathways and therefore favor delivery to the brain. However, relative vascular permeabilities of the different nasal mucosal sites have not yet been reported. Here, we determined that the relative capillary permeability to hydrophilic macromolecule tracers is significantly greater in nasal respiratory regions than in olfactory regions. Mean capillary density in the nasal mucosa was also approximately 5-fold higher in nasal respiratory regions than in olfactory regions. Applying capillary pore theory and normalization to our permeability data yielded mean pore diameter estimates ranging from 13–17 nm for the nasal respiratory vasculature compared to <10 nm for the vasculature in olfactory regions. The results suggest lymphatic drainage for CNS immune responses may be favored in olfactory regions due to relatively lower clearance to the bloodstream. Lower blood clearance may also provide a reason to target the olfactory area for drug delivery to the brain.
No abstract
We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,025 hours of dailylife activity video spanning hundreds of scenarios (household, outdoor, workplace, leisure, etc.) captured by 855 unique camera wearers from 74 worldwide locations and 9 different countries. The approach to collection is designed to uphold rigorous privacy and ethics standards with consenting participants and robust de-identification procedures where relevant. Ego4D dramatically expands the volume of diverse egocentric video footage publicly available to the research community. Portions of the video are accompanied by audio, 3D meshes of the environment, eye gaze, stereo, and/or synchronized videos from multiple egocentric cameras at the same event. Furthermore, we present a host of new benchmark challenges centered around understanding the first-person visual experience in the past (querying an episodic memory), present (analyzing hand-object manipulation, audio-visual conversation, and social interactions), and future (forecasting activities). By publicly sharing this massive annotated dataset and benchmark suite, we aim to push the frontier of first-person perception.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.