Wasserstein distance plays an increasingly important role in machine learning, stochastic programming, and image processing. Major efforts have been under way to address its high computational complexity, some leading to approximate or regularized variations such as the Sinkhorn distance. However, as we will demonstrate, regularized variations with a large regularization parameter degrade performance in several important machine learning applications, while a small regularization parameter fails because of numerical stability issues in existing algorithms. We address this challenge by developing an Inexact Proximal point method for the exact Optimal Transport problem (IPOT), in which the proximal operator is evaluated approximately at each iteration using projections onto the probability simplex. The algorithm (a) converges to the exact Wasserstein distance with a theoretical guarantee and robust regularization parameter selection, (b) alleviates the numerical stability issue, (c) has computational complexity similar to that of Sinkhorn, and (d) avoids the shrinking problem when applied to generative models. Furthermore, a new algorithm based on IPOT is proposed to obtain sharper Wasserstein barycenters.
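The abstract does not spell out the iteration, so the following is only a minimal sketch of how an inexact proximal point scheme for optimal transport might look. It assumes a Sinkhorn-style scaling as the approximate inner step and a KL-type proximal term. The function name `ipot` and the parameters `beta`, `outer_iters`, and `inner_iters` are illustrative assumptions, not taken from the source.

```python
import numpy as np


def ipot(mu, nu, C, beta=1.0, outer_iters=200, inner_iters=1):
    """Sketch of an inexact proximal point iteration for optimal transport.

    mu, nu : 1-D arrays of positive weights, each summing to 1.
    C      : (n, m) cost matrix.
    beta   : proximal step parameter (assumed name, not from the source).
    Returns the transport plan T and the transport cost <T, C>.
    """
    n, m = C.shape
    G = np.exp(-C / beta)      # kernel induced by the proximal term
    T = np.ones((n, m))        # initial plan (rescaled in the first step)
    b = np.ones(m) / m         # column scaling, kept across outer iterations

    for _ in range(outer_iters):
        # Fold the previous plan into the kernel: this is the proximal step.
        Q = G * T
        # Approximate inner solve: a few Sinkhorn-style scaling passes.
        for _ in range(inner_iters):
            a = mu / (Q @ b)
            b = nu / (Q.T @ a)
        T = a[:, None] * Q * b[None, :]

    return T, float(np.sum(T * C))


if __name__ == "__main__":
    mu = np.array([0.5, 0.5])
    nu = np.array([0.5, 0.5])
    C = np.array([[0.0, 1.0], [1.0, 0.0]])
    plan, cost = ipot(mu, nu, C)
    print(plan, cost)
```

Because each outer iteration multiplies the previous plan back into the kernel, the effective regularization weakens as iterations proceed. In this toy problem the exact optimal cost is 0, and the returned cost approaches it even though `beta` stays fixed at 1.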
Counting the number of work cycles per unit of time of earthmoving excavators is essential for calculating their productivity in earthmoving projects. Existing methods based on computer vision (CV) struggle to recognize the work cycles of earthmoving excavators effectively in long video sequences. Even the most advanced sequential pattern-based approach finds recognition difficult because it has to discern many atomic actions with a similar visual appearance. In this paper, we combine atomic actions with a similar visual appearance to build a stretching–bending sequential pattern (SBSP) containing only “Stretching” and “Bending” atomic actions. These two atomic actions are recognized using a deep learning-based single-shot detector (SSD). The intersection over union (IOU) is used to associate atomic actions in order to recognize the work cycle. In addition, we consider the impact of real-world factors (such as driver misoperation) on work cycle recognition, which has been neglected in existing studies. We propose using the time required to transition from “Stretching” to “Bending” within a work cycle to filter out abnormal work cycles caused by driver misoperation. A case study is used to evaluate the proposed method. The results show that SBSP can effectively recognize the work cycles of earthmoving excavators in real time in long video sequences and can calculate the productivity of earthmoving excavators accurately.
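The two mechanisms named in this abstract, IOU between detections and a duration filter on the “Stretching” to “Bending” transition, can be illustrated with a small sketch. This is not the authors' implementation. The box format `(x1, y1, x2, y2)`, the function names, the `(timestamp, label)` detection format, and the duration thresholds are all assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def count_work_cycles(detections, min_seconds, max_seconds):
    """Count Stretching -> Bending transitions whose duration is plausible.

    detections  : iterable of (timestamp_seconds, label), sorted by time.
    min_seconds : hypothetical lower bound for a normal transition.
    max_seconds : hypothetical upper bound for a normal transition.
    """
    stretch_start = None
    cycles = 0
    for timestamp, label in detections:
        if label == "Stretching":
            # Keep the first Stretching of a run as the start of the cycle.
            if stretch_start is None:
                stretch_start = timestamp
        elif label == "Bending" and stretch_start is not None:
            duration = timestamp - stretch_start
            # Transitions outside the window are treated as abnormal.
            if min_seconds <= duration <= max_seconds:
                cycles += 1
            stretch_start = None
    return cycles


if __name__ == "__main__":
    events = [
        (0.0, "Stretching"), (5.0, "Bending"),
        (10.0, "Stretching"), (40.0, "Bending"),
        (45.0, "Stretching"), (51.0, "Bending"),
    ]
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(count_work_cycles(events, min_seconds=2.0, max_seconds=15.0))
```

In this toy sequence the middle transition takes 30 seconds, which falls outside the assumed 2 to 15 second window, so it is discarded as abnormal. Dividing the remaining count by the observation time gives cycles per unit of time.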
This paper discusses the integration of a geographic information system (GIS) and moving objects in surveillance videos ("moving objects" hereinafter) by using motion detection, spatial mapping, and fusion representation techniques. This integration aims to overcome the limitations of conventional video surveillance systems, such as low efficiency in video searching, redundancy in video data transmission, and insufficient capability to position video content in geographic space. Furthermore, a model for integrating GIS and moving objects is established. The model includes a moving object extraction method and a fusion pattern for GIS and moving objects. Based on the established integration model, a prototype GIS and moving objects (GIS-MOV) system is constructed and used to analyze the possible applications of the integration of GIS and moving objects.
This work discusses the integration of multi-camera video moving objects (MCVO) and GIS. This integration was motivated by the characteristics of multi-camera videos distributed in the urban environment, namely, the large data volume, sparse distribution, and complex spatial–temporal correlation of MCVO, which result in low efficiency of manual browsing and retrieval of videos. To address these drawbacks, on the basis of multi-camera video moving object extraction, this paper first analyzed the characteristics of different video-GIS information fusion methods and investigated the integrated data organization of MCVO by constructing a spatial–temporal pipeline among different cameras. Then, a conceptual integration model of MCVO and GIS was proposed on the basis of spatial mapping, and the GIS-MCVO prototype system was constructed. Finally, this study analyzed the applications and potential benefits of the GIS-MCVO system, including a GIS-based user interface for video moving object expression in the virtual geographic scene, video compression storage, blind zone trajectory deduction, retrieval of MCVO, and video synopsis. Examples show that the integration of MCVO and GIS can improve the efficiency of expressing video information, achieve the compression of video data, and rapidly assist the user in browsing video objects from multiple cameras.