Wasserstein distance plays increasingly important roles in machine learning, stochastic programming and image processing. Major efforts have been under way to address its high computational complexity, some leading to approximate or regularized variations such as Sinkhorn distance. However, as we will demonstrate, regularized variations with large regularization parameter will degradate the performance in several important machine learning applications, and small regularization parameter will fail due to numerical stability issues with existing algorithms. We address this challenge by developing an Inexact Proximal point method for exact Optimal Transport problem (IPOT) with the proximal operator approximately evaluated at each iteration using projections to the probability simplex. The algorithm (a) converges to exact Wasserstein distance with theoretical guarantee and robust regularization parameter selection, (b) alleviates numerical stability issue, (c) has similar computational complexity to Sinkhorn, and (d) avoids the shrinking problem when apply to generative models. Furthermore, a new algorithm is proposed based on IPOT to obtain sharper Wasserstein barycenter.
Counting the number of work cycles per unit of time of earthmoving excavators is essential in order to calculate their productivity in earthmoving projects. The existing methods based on computer vision (CV) find it difficult to recognize the work cycles of earthmoving excavators effectively in long video sequences. Even the most advanced sequential pattern-based approach finds recognition difficult because it has to discern many atomic actions with a similar visual appearance. In this paper, we combine atomic actions with a similar visual appearance to build a stretching–bending sequential pattern (SBSP) containing only “Stretching” and “Bending” atomic actions. These two atomic actions are recognized using a deep learning-based single-shot detector (SSD). The intersection over union (IOU) is used to associate atomic actions to recognize the work cycle. In addition, we consider the impact of reality factors (such as driver misoperation) on work cycle recognition, which has been neglected in existing studies. We propose to use the time required to transform “Stretching” to “Bending” in the work cycle to filter out abnormal work cycles caused by driver misoperation. A case study is used to evaluate the proposed method. The results show that SBSP can effectively recognize the work cycles of earthmoving excavators in real time in long video sequences and has the ability to calculate the productivity of earthmoving excavators accurately.
This paper discusses the integration of a geographic information system (GIS) and moving objects in surveillance videos ("moving objects" hereinafter) by using motion detection, spatial mapping, and fusion representation techniques. This integration aims to overcome the limitations of conventional video surveillance systems, such as low efficiency in video searching, redundancy in video data transmission, and insufficient capability to position video content in geographic space. Furthermore, a model for integrating GIS and moving objects is established. The model includes a moving object extraction method and a fusion pattern for GIS and moving objects. From the established integration model, a prototype of GIS and moving objects (GIS-MOV) system is constructed and used to analyze the possible applications of the integration of GIS and moving objects.
This work discusses the integration of multi-camera video moving objects (MCVO) and GIS. This integration was motivated by the characteristics of multi-camera videos distributed in the urban environment, namely, large data volume, sparse distribution and complex spatial–temporal correlation of MCVO, thereby resulting in low efficiency of manual browsing and retrieval of videos. To address the aforementioned drawbacks, on the basis of multi-camera video moving object extraction, this paper first analyzed the characteristics of different video-GIS Information fusion methods and investigated the integrated data organization of MCVO by constructing a spatial–temporal pipeline among different cameras. Then, the conceptual integration model of MCVO and GIS was proposed on the basis of spatial mapping, and the GIS-MCVO prototype system was constructed in this study. Finally, this study analyzed the applications and potential benefits of the GIS-MCVO system, including a GIS-based user interface on video moving object expression in the virtual geographic scene, video compression storage, blind zone trajectory deduction, retrieval of MCVO, and video synopsis. Examples have shown that the integration of MCVO and GIS can improve the efficiency of expressing video information, achieve the compression of video data, rapidly assisting the user in browsing video objects from multiple cameras.
The top-k operation, i.e., finding the k largest or smallest elements from a collection of scores, is an important model component, which is widely used in information retrieval, machine learning, and data mining. However, if the top-k operation is implemented in an algorithmic way, e.g., using bubble algorithm, the resulting model cannot be trained in an end-to-end way using prevalent gradient descent algorithms. This is because these implementations typically involve swapping indices, whose gradient cannot be computed. Moreover, the corresponding mapping from the input scores to the indicator vector of whether this element belongs to the top-k set is essentially discontinuous. To address the issue, we propose a smoothed approximation, namely the SOFT (Scalable Optimal transport-based diFferenTiable) top-k operator. Specifically, our SOFT top-k operator approximates the output of the top-k operation as the solution of an Entropic Optimal Transport (EOT) problem. The gradient of the SOFT operator can then be efficiently approximated based on the optimality conditions of EOT problem. We apply the proposed operator to the k-nearest neighbors and beam search algorithms, and demonstrate improved performance.
Monitoring the work cycles of earthmoving excavators is an important aspect of construction productivity assessment. Currently, the most advanced method for the recognition of work cycles is the “Stretching-Bending” Sequential Pattern (SBSP), which is based on fixed-carrier video monitoring (FC-SBSP). However, the application of this method presupposes the availability of preconstructed installation carriers to act as a surveillance camera as well as installed and commissioned surveillance systems that work in tandem with them. Obviously, this method is difficult to apply to projects with no conditions for a monitoring camera installation or which have a short construction time. This highlights the potential application of Unmanned Aerial Vehicle (UAV) remote sensing, which is flexible and mobile. Unfortunately, few studies have been conducted on the application of UAV remote sensing for the work cycle monitoring of earthmoving excavators. This research is necessary because the use of UAV remote sensing for monitoring the work cycles of earthmoving excavators can improve construction productivity and save time and costs, especially in post-disaster reconstruction projects involving harsh construction environments, and emergency projects with short construction periods. In addition, the challenges posed by UAV shaking may have to be taken into account when using the SBSP for UAV remote sensing. To this end, this study used application experiments in which stabilization processing of UAV video data was performed for UAV shaking. The application experimental results show that the work cycle performance of UAV remote-sensing-based SBSP (UAV-SBSP) for UAV video data was 2.45% and 5.36% lower in terms of precision and recall, respectively, without stabilization processing than after stabilization processing. Comparative experiments were also designed to investigate the applicability of the SBSP oriented toward UAV remote sensing. Comparative experimental results show that the same level of performance was obtained for the recognition of work cycles with the UAV-SBSP as compared with the FC-SBSP, demonstrating the good applicability of this method. Therefore, the results of this study show that UAV remote sensing enables effective monitoring of earthmoving excavator work cycles in construction sites where monitoring cameras are not available for installation, and it can be used as an alternative technology to fixed-carrier video monitoring for onsite proximity monitoring.
Machine learning methods play increasingly important roles in pre-procedural planning for complex surgeries and interventions. Very often, however, researchers find the historical records of emerging surgical techniques, such as the transcatheter aortic valve replacement (TAVR), are highly scarce in quantity. In this paper, we address this challenge by proposing novel generative invertible networks (GIN) to select features and generate high-quality virtual patients that may potentially serve as an additional data source for machine learning. Combining a convolutional neural network (CNN) and generative adversarial networks (GAN), GIN discovers the pathophysiologic meaning of the feature space. Moreover, a test of predicting the surgical outcome directly using the selected features results in a high accuracy of 81.55%, which suggests little pathophysiologic information has been lost while conducting the feature selection. This demonstrates GIN can generate virtual patients not only visually authentic but also pathophysiologically interpretable.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.