Automatic video segmentation and action recognition have been long-standing problems in computer vision. Much work in the literature treats video segmentation and action recognition as two independent problems: segmentation is often done without a temporal model of the activity, while action recognition is usually performed on pre-segmented clips. In this paper we propose a novel method that avoids the limitations of both approaches by jointly performing video segmentation and action recognition. Unlike standard approaches based on extensions of dynamic Bayesian networks, our method is based on a discriminative temporal extension of the spatial bag-of-words model that has been very popular in object recognition. Classification is performed robustly within a multi-class SVM framework, while inference over the segments is done efficiently with dynamic programming. Experimental results on the honeybee, Weizmann, and Hollywood datasets illustrate the benefits of our approach compared to state-of-the-art methods.
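The joint segmentation-and-labeling step can be sketched with dynamic programming over segment boundaries. This is a minimal illustration, not the paper's actual model: the per-frame class scores, the additive segment score, and the function name `segment_and_label` are all simplifying assumptions standing in for the learned discriminative model.

```python
import numpy as np

def segment_and_label(frame_scores, max_seg_len):
    """Jointly segment a sequence and label each segment via dynamic programming.

    frame_scores: (T, K) array; frame_scores[t, k] is the score of class k at frame t.
    A segment [s, t) labeled k scores sum(frame_scores[s:t, k]) (a toy assumption).
    Returns the optimal total score and a list of (start, end, label) segments.
    """
    T, K = frame_scores.shape
    # Prefix sums let us score any segment for every class in O(K).
    prefix = np.vstack([np.zeros(K), np.cumsum(frame_scores, axis=0)])  # (T+1, K)
    best = np.full(T + 1, -np.inf)  # best[t]: best score for frames [0, t)
    best[0] = 0.0
    back = [None] * (T + 1)
    for t in range(1, T + 1):
        for s in range(max(0, t - max_seg_len), t):
            seg = prefix[t] - prefix[s]          # per-class score of segment [s, t)
            k = int(np.argmax(seg))              # best label for this segment
            cand = best[s] + seg[k]
            if cand > best[t]:
                best[t] = cand
                back[t] = (s, k)
    # Recover the segmentation by backtracking.
    segs, t = [], T
    while t > 0:
        s, k = back[t]
        segs.append((s, t, k))
        t = s
    return best[T], segs[::-1]
```

With two classes dominating the first and second halves of a sequence, the DP recovers the two segments and their labels in a single backward pass.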
The need for early detection of temporal events from sequential data arises in a wide spectrum of applications ranging from human-robot interaction to video security. While temporal event detection has been extensively studied, early detection is a relatively unexplored problem. This paper proposes a maximum-margin framework for training temporal event detectors to recognize partial events, enabling early detection. Our method is based on Structured Output SVM, but extends it to accommodate sequential data. Experiments on datasets of varying complexity, for detecting facial expressions, hand gestures, and human activities, demonstrate the benefits of our approach. To the best of our knowledge, this is the first paper in the computer vision literature to propose a learning formulation for early event detection.
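At test time, early detection amounts to scoring every partial segment ending at the current frame and firing as soon as some score clears a threshold. The sketch below illustrates only that inference loop; the linear scorer `(w, b)` over mean-pooled frame features is a hypothetical stand-in for the learned structured-output model, not the paper's formulation.

```python
import numpy as np

def earliest_detection(frames, w, b, threshold=0.0):
    """Fire at the earliest time t where some partial segment [s, t) scores above threshold.

    frames: (T, d) per-frame features; a segment's feature is the mean of its frames
    (a toy pooling choice). (w, b): a linear scoring function standing in for the
    trained detector. Returns (t, (s, t)) for the first detection, or (None, None).
    """
    T = len(frames)
    for t in range(1, T + 1):          # process the stream frame by frame
        for s in range(t):             # consider every partial event ending at t
            phi = frames[s:t].mean(axis=0)
            if float(w @ phi + b) > threshold:
                return t, (s, t)       # detected before the event has finished
    return None, None
```

Because the detector is trained on partial events, it can fire while the event is still unfolding, rather than waiting for a complete segment.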
In this work, we tackle the problem of crowd counting in images. We present a Convolutional Neural Network (CNN) based density estimation approach to solve this problem. Predicting a high resolution density map in one go is a challenging task. Hence, we present a two branch CNN architecture for generating high resolution density maps, where the first branch generates a low resolution density map, and the second branch incorporates the low resolution prediction and feature maps from the first branch to generate a high resolution density map. We also propose a multi-stage extension of our approach where each stage in the pipeline utilizes the predictions from all the previous stages. Empirical comparison with previous state-of-the-art crowd counting methods shows that our method achieves the lowest mean absolute error on three challenging crowd counting benchmarks: the Shanghaitech, World-Expo'10, and UCF datasets.
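The coarse-to-fine idea can be sketched in a few lines: a first branch predicts a half-resolution density map, a second branch upsamples it and fuses it with full-resolution information, and the count is the integral of the final map. The functions below are toy NumPy stand-ins for the two CNN branches (the names and the block-sum/upsampling operations are assumptions for illustration only).

```python
import numpy as np

def low_res_branch(full_res):
    """Stand-in for branch 1: a half-resolution density map.
    2x2 block sums keep the total count unchanged."""
    H, W = full_res.shape
    return full_res.reshape(H // 2, 2, W // 2, 2).sum(axis=(1, 3))

def high_res_branch(full_res, coarse):
    """Stand-in for branch 2: upsample the coarse prediction (spreading each
    cell's mass over its 2x2 block) and fuse it with full-resolution input."""
    up = np.kron(coarse, np.ones((2, 2))) / 4.0  # mass-preserving upsampling
    return 0.5 * up + 0.5 * full_res

def crowd_count(density_map):
    """The predicted count is the integral (sum) of the density map."""
    return float(density_map.sum())
```

Mass-preserving downsampling and upsampling mean the fused high-resolution map keeps the same total count as its input, so refinement improves localization without changing the integral.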
<b><i>Introduction:</i></b> Acute kidney injury (AKI) is strongly associated with poor outcomes in hospitalized patients with coronavirus disease 2019 (COVID-19), but data on the association of proteinuria and hematuria are limited to non-US populations. In addition, admission and in-hospital measures for kidney abnormalities have not been studied separately. <b><i>Methods:</i></b> This retrospective cohort study aimed to analyze these associations in 321 patients sequentially admitted between March 7, 2020 and April 1, 2020 at Stony Brook University Medical Center, New York. We investigated the association of proteinuria, hematuria, and AKI with outcomes of inflammation, intensive care unit (ICU) admission, invasive mechanical ventilation (IMV), and in-hospital death. We used ANOVA, <i>t</i> test, χ<sup>2</sup> test, and Fisher’s exact test for bivariate analyses and logistic regression for multivariable analysis. <b><i>Results:</i></b> Three hundred patients met the inclusion criteria for the study cohort. Multivariable analysis demonstrated that admission proteinuria was significantly associated with risk of in-hospital AKI (OR 4.71, 95% CI 1.28–17.38), while admission hematuria was associated with ICU admission (OR 4.56, 95% CI 1.12–18.64), IMV (OR 8.79, 95% CI 2.08–37.00), and death (OR 18.03, 95% CI 2.84–114.57). During hospitalization, de novo proteinuria was significantly associated with increased risk of death (OR 8.94, 95% CI 1.19–114.4, <i>p</i> = 0.04). In-hospital AKI increased the risk of death (OR 27.14, 95% CI 4.44–240.17), while recovery from in-hospital AKI decreased it (OR 0.001, 95% CI 0.001–0.06). <b><i>Conclusion:</i></b> Proteinuria and hematuria both at the time of admission and during hospitalization are associated with adverse clinical outcomes in hospitalized patients with COVID-19.
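The multivariable results above are reported as odds ratios with 95% confidence intervals, which come from exponentiating logistic-regression coefficients. A minimal reminder of that conversion, using hypothetical numbers (not the study's data):

```python
import math

def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic-regression coefficient and its standard error into
    an odds ratio with a (default 95%) confidence interval: OR = exp(beta),
    CI = exp(beta -/+ z * se)."""
    return math.exp(beta), (math.exp(beta - z * se), math.exp(beta + z * se))
```

An OR above 1 (with a CI excluding 1) indicates increased odds of the outcome, as for in-hospital AKI and death; an OR below 1, as for AKI recovery, indicates decreased odds.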
We propose a novel GAN-based framework for detecting shadows in images, in which a shadow detection network (D-Net) is trained together with a shadow attenuation network (A-Net) that generates adversarial training examples. The A-Net modifies the original training images, constrained by a simplified physical shadow model, and is focused on fooling the D-Net's shadow predictions. Hence, it effectively augments the training data for the D-Net with hard-to-predict cases. The D-Net is trained to predict shadows in both original images and generated images from the A-Net. Our experimental results show that the additional training data from the A-Net significantly improves the shadow detection accuracy of the D-Net. Our method outperforms the state-of-the-art methods on the most challenging shadow detection benchmark (SBU) and also obtains state-of-the-art results on a cross-dataset task, testing on UCF. Furthermore, the proposed method achieves accurate real-time shadow detection at 45 frames per second.
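The adversarial augmentation loop can be illustrated with toy stand-ins: a threshold "detector" in place of the D-Net and a brightness-scaling "attenuator" in place of the A-Net. This sketch only shows the interaction (the attenuator searches for the modification that most degrades the current detector); the actual networks, the physical shadow model, and the parameter grid below are all simplifications.

```python
import numpy as np

def d_net(img, tau=0.5):
    """Toy detector: predict 'shadow' wherever the image is darker than tau."""
    return img < tau

def a_net(img, mask, d_pred_fn, factors=(1.2, 1.6, 2.0)):
    """Toy attenuator: brighten the true shadow region (mask) by the factor
    that most degrades the detector's accuracy -- the hardest adversarial
    example, which is then added to the detector's training data."""
    def err(candidate):
        # Fraction of pixels where the detector disagrees with ground truth.
        return float((d_pred_fn(candidate) != mask).mean())
    candidates = []
    for k in factors:
        out = img.copy()
        out[mask] = np.clip(out[mask] * k, 0.0, 1.0)  # attenuate the shadow
        candidates.append(out)
    return max(candidates, key=err)
```

Training the detector on both the original image and the returned hard example is what drives the mutual improvement described above.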