We propose the density-independent hydrodynamics model (DIHM), a novel automatic method for coherency detection in crowded scenes. A major advantage of the DIHM is its ability to handle density that changes over time. Moreover, the DIHM avoids oversegmentation and thus achieves refined coherency detection. In the proposed DIHM, we first extract a motion flow field from the input video through particle initialization and dense optical flow. The particles of interest are then collected so that only the most motile and informative particles are retained. To represent each particle, we accumulate its contribution in a weighted form based on a kernel function. Next, smoothed particle hydrodynamics (SPH) is adopted to detect coherent regions. Finally, the detected coherent regions are refined to remove the effects of oversegmentation. We perform extensive experiments on three benchmark datasets and compare the results with 10 state-of-the-art coherency detection methods. Our results show that DIHM achieves superior coherency detection and outperforms the compared methods in average particle error rate (PER) at both the pixel and coherent-region levels, as well as in average coherent number error (CNE) and F-score.
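The kernel-weighted particle representation described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the Gaussian smoothing kernel, the 2-D normalization, and the bandwidth `h` are all assumptions for demonstration.

```python
import numpy as np

def sph_density(positions, h=1.0):
    """Estimate per-particle density by accumulating kernel-weighted
    contributions from all particles (the SPH-style weighting step).

    positions: (N, 2) array of particle positions from the flow field.
    h: smoothing length (kernel bandwidth) -- a hypothetical default.
    """
    diff = positions[:, None, :] - positions[None, :, :]  # (N, N, 2) pairwise offsets
    r2 = np.sum(diff ** 2, axis=-1)                       # squared pairwise distances
    # 2-D Gaussian kernel W(r, h) = exp(-r^2 / h^2) / (pi * h^2)
    w = np.exp(-r2 / h ** 2) / (np.pi * h ** 2)
    return w.sum(axis=1)                                  # density at each particle

# Particles in a tight cluster receive higher density than isolated ones,
# which is the cue a coherency detector can group on.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
dens = sph_density(pts)
```

Coherent regions then fall out of grouping particles whose kernel-weighted densities and flow directions agree; the sketch only shows the weighting itself.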
We propose a novel Gaussian kernel based integration model (GKIM) for anomalous entity detection and localization in pedestrian flows. The GKIM integrates spatio-temporal features into an efficient and robust motion representation that captures distinctive and meaningful information about anomalous entities. We next propose a block-based detection framework that trains a recurrent conditional random field (R-CRF) on the GKIM features. The trained R-CRF model is then used to detect and localize anomalous entities during the online testing stage. We conduct comprehensive experiments on three benchmark datasets and compare the performance of the proposed method with state-of-the-art anomalous entity detection methods. Our experiments show that the proposed GKIM outperforms the compared methods in terms of equal error rate (EER) and detection rate (DR) in both frame-level and pixel-level comparisons. The frame-level analysis detects the presence of an anomalous entity in a frame regardless of its location, whereas the pixel-level analysis localizes the anomalous entity in terms of its pixels.
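A Gaussian kernel integration of block features can be sketched as below. This is a hedged illustration only: the choice of optical-flow magnitudes as the feature, the centred kernel, and the `sigma` parameter are assumptions, not the paper's exact formulation.

```python
import numpy as np

def gaussian_integrate(features, sigma=1.0):
    """Integrate a block of spatio-temporal features with a Gaussian kernel.

    features: (H, W) array, e.g. optical-flow magnitudes inside one block.
    sigma: kernel spread -- a hypothetical parameter for this sketch.
    Returns a single kernel-weighted descriptor value for the block.
    """
    h, w = features.shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    # Gaussian weights centred on the block, normalised to sum to 1,
    # so central motion contributes more than block-border motion.
    g = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))
    g /= g.sum()
    return float(np.sum(g * features))
```

Per-block descriptors of this kind would then be stacked over time and fed to the R-CRF; the sketch covers only the kernel integration step.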
Deep learning-based chest Computed Tomography (CT) analysis has proven effective and efficient for COVID-19 diagnosis. Existing deep learning approaches rely heavily on large labeled datasets, which are difficult to acquire in a pandemic, so weakly-supervised approaches are in demand. In this paper, we propose an end-to-end weakly-supervised COVID-19 detection approach, ResNext+, that requires only volume-level labels yet provides slice-level predictions. The proposed approach incorporates a lung segmentation mask as well as spatial and channel attention to extract spatial features. In addition, a Long Short-Term Memory (LSTM) network is used to capture the axial dependency between slices. Moreover, a slice attention module applied before the final fully connected layer generates slice-level predictions without additional supervision. An ablation study demonstrates the contribution of the attention blocks and the segmentation mask block. Experimental results on publicly available datasets show a precision of 81.9% and an F1 score of 81.4%, compared with 76.7% precision and a 78.8% F1 score for the closest state-of-the-art method. The improvements of 5% in precision and 3% in F1 score demonstrate the effectiveness of the proposed method. It is worth noting that applying image enhancement approaches does not improve the performance of the proposed method, and sometimes even harms the scores, although the enhanced images have better perceptual quality.
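The slice attention idea, where per-slice scores both weight the volume-level prediction and act as slice-level predictions, can be sketched as follows. This is a simplified NumPy illustration under stated assumptions: the attention weight vector `w` would be learned in the real network, and the pooling shown here stands in for the module before the final fully connected layer.

```python
import numpy as np

def slice_attention(slice_feats, w):
    """Aggregate per-slice features into one volume descriptor.

    slice_feats: (S, D) features, one row per CT slice (e.g. LSTM outputs).
    w: (D,) attention weight vector -- learned in practice, fixed here.
    Returns the attended volume feature and the per-slice attention
    scores, which double as slice-level predictions without slice labels.
    """
    scores = slice_feats @ w                        # (S,) raw attention logits
    scores = scores - scores.max()                  # shift for numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()   # softmax over slices
    volume_feat = alpha @ slice_feats               # (D,) attention-weighted sum
    return volume_feat, alpha
```

Because only `volume_feat` feeds the supervised volume-level loss, the attention weights `alpha` are trained indirectly, which is what makes the slice-level output weakly supervised.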
People counting in high-density crowds is emerging as a new frontier in crowd video surveillance. Crowd counting in high-density crowds faces many challenges, such as severe occlusions, few pixels per head, and large variations in head size. In this paper, we propose a novel Density Independent and Scale Aware model (DISAM), which works as a head detector and accounts for the scale variation of heads in images. Our model is based on the intuition that the head is the only visible body part in high-density crowds. To deal with different scales, unlike off-the-shelf Convolutional Neural Network (CNN) based object detectors that feed general object proposals to the CNN, we generate scale-aware head proposals based on a scale map. These proposals are then fed to the CNN, which renders a response matrix of head probabilities. We then apply non-maximal suppression to obtain accurate head positions. We conduct comprehensive experiments on two benchmark datasets and compare the performance with other state-of-the-art methods. Our experiments show that the proposed DISAM outperforms the compared methods in both frame-level and pixel-level comparisons.
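The non-maximal suppression step that turns overlapping head responses into final detections can be sketched with standard greedy NMS. The IoU threshold of 0.5 is a common default, not a value taken from the paper.

```python
import numpy as np

def nms(boxes, scores, iou_thr=0.5):
    """Greedy non-maximal suppression over head proposals.

    boxes: (N, 4) array of [x1, y1, x2, y2] proposal boxes.
    scores: (N,) head probabilities from the CNN response matrix.
    iou_thr: overlap threshold -- a common default, assumed here.
    Returns indices of kept boxes, highest score first.
    """
    order = scores.argsort()[::-1]          # process best-scoring boxes first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the current best box with the remaining boxes.
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter)
        # Drop boxes that overlap the kept box too much (duplicate heads).
        order = order[1:][iou <= iou_thr]
    return keep
```

Each kept index corresponds to one detected head position; duplicates of the same head are suppressed by the overlap test.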