Scaling Egocentric Vision: The EPIC-KITCHENS Dataset

First-person vision is gaining interest as it offers a unique viewpoint on people's interaction with objects, their attention, and even intention. However, progress in this challenging domain has been relatively slow due to the lack of sufficiently large datasets. In this paper, we introduce EPIC-KITCHENS, a large-scale egocentric video benchmark recorded by 32 participants in their native kitchen environments. Our videos depict non-scripted daily activities: we simply asked each participant to start recording every time they entered their kitchen. Recording took place in 4 cities (in North America and Europe) by participants belonging to 10 different nationalities, resulting in highly diverse cooking styles. Our dataset features 55 hours of video consisting of 11.5M frames, which we densely labelled for a total of 39.6K action segments and 454.3K object bounding boxes. Our annotation is unique in that we had the participants narrate their own videos (after recording), thus reflecting true intention, and we crowd-sourced ground-truths based on these. We describe our object, action and anticipation challenges, and evaluate several baselines over two test splits, seen and unseen kitchens.

Rescaling Egocentric Vision: Collection, Pipeline and Challenges for EPIC-KITCHENS-100

Damen

et al. 2021

This paper introduces the pipeline to extend the largest dataset in egocentric vision, EPIC-KITCHENS. The effort culminates in EPIC-KITCHENS-100, a collection of 100 hours, 20M frames, and 90K actions in 700 variable-length videos, capturing long-term unscripted activities in 45 environments using head-mounted cameras. Compared to its previous version (Damen et al., Scaling Egocentric Vision, ECCV 2018), EPIC-KITCHENS-100 has been annotated using a novel pipeline that allows denser (54% more actions per minute) and more complete (128% more action segments) annotations of fine-grained actions. This collection enables new challenges such as action detection and evaluating the “test of time”, i.e. whether models trained on data collected in 2018 can generalise to new footage collected two years later. The dataset is aligned with 6 challenges: action recognition (full and weak supervision), action detection, action anticipation, cross-modal retrieval (from captions), as well as unsupervised domain adaptation for action recognition. For each challenge, we define the task, provide baselines and evaluation metrics.

The EPIC-KITCHENS Dataset: Collection, Challenges and Baselines

Damen

Doughty

Farinella

et al. 2021

IEEE Trans. Pattern Anal. Mach. Intell.

101

105

Action Recognition From Single Timestamp Supervision in Untrimmed Videos

Moltisanti

Fidler

Damen

2019

Recognising actions in videos relies on labelled supervision during training, typically the start and end times of each action instance. This supervision is not only subjective, but also expensive to acquire. Weak video-level supervision has been successfully exploited for recognition in untrimmed videos; however, it is challenged when the number of different actions in training videos increases. We propose a method that is supervised by single timestamps located around each action instance in untrimmed videos. We replace expensive action bounds with sampling distributions initialised from these timestamps. We then use the classifier's response to iteratively update the sampling distributions. We demonstrate that these distributions converge to the location and extent of discriminative action segments. We evaluate our method on three datasets for fine-grained recognition, with increasing number of different actions per video, and show that single timestamps offer a reasonable compromise between recognition performance and labelling effort, performing comparably to full temporal supervision. Our update method improves top-1 test accuracy by up to 5.4% across the evaluated datasets.

Trespassing the Boundaries: Labeling Temporal Bounds for Object Interactions in Egocentric Video

Moltisanti

Wray

Mayol-Cuevas

et al. 2017

Manual annotations of temporal bounds for object interactions (i.e. start and end times) are typical training input to recognition, localization and detection algorithms. For three publicly available egocentric datasets, we uncover inconsistencies in ground truth temporal bounds within and across annotators and datasets. We systematically assess the robustness of state-of-the-art approaches to changes in labeled temporal bounds, for object interaction recognition. As boundaries are trespassed, a drop of up to 10% is observed for both Improved Dense Trajectories and Two-Stream Convolutional Neural Network. We demonstrate that such disagreement stems from a limited understanding of the distinct phases of an action, and propose annotating based on the Rubicon Boundaries, inspired by a similarly named cognitive model, for consistent temporal bounds of object interactions. Evaluated on a public dataset, we report a 4% increase in overall accuracy, and an increase in accuracy for 55% of classes when Rubicon Boundaries are used for temporal annotations.

SEMBED: Semantic Embedding of Egocentric Action Videos

Wray

Moltisanti

Mayol-Cuevas

et al. 2016

We present SEMBED, an approach for embedding an egocentric object interaction video in a semantic-visual graph to estimate the probability distribution over its potential semantic labels. When object interactions are annotated using an unbounded choice of verbs, we embrace the wealth and ambiguity of these labels by capturing the semantic relationships as well as the visual similarities over motion and appearance features. We show how SEMBED can interpret a challenging dataset of 1225 freely annotated egocentric videos, outperforming SVM classification by more than 5%.

3-D monitoring of rubble mound breakwater damages

et al. 2018

Breakwaters play a crucial role in the protection of coastal zones. Their maintenance is critical to safeguard the daily activities of harbours and marine areas, and evaluating damage is necessary for timely preservation works. Traditional monitoring methods range from mechanical profilers to optical systems. Current methods, though, are typically expensive and require sophisticated technologies that demand a high degree of expertise to operate. In this paper, we propose an affordable yet accurate, fully automated method based on 3D cameras. Our technique is non-invasive, allowing fast, non-intrusive measurement of damage over time, simultaneously above and below sea water level. Experimental results obtained on laboratory breakwater models demonstrate that the proposed point cloud method, which does not depend on the imaging sensor and can be applied to any 3D dataset of a rubble mound breakwater, can achieve accurate damage estimation even when using a budget RGB-D camera. An additional advantage of RGB-D cameras is the possibility of obtaining measurements in the presence of water.

Monitoring Accropodes Breakwaters using RGB-D Cameras

Moltisanti

Farinella

Musumeci

et al. 2015

Davide Moltisanti
