Following improvements in deep neural networks, state-of-the-art networks have been proposed for human recognition using point clouds captured by LiDAR. However, the performance of these networks strongly depends on the training data. An issue with collecting training data is labeling. Labeling by humans is necessary to obtain the ground truth label; however, labeling requires huge costs. Therefore, we propose an automatic labeled data generation pipeline, for which we can change any parameters or data generation environments. Our approach uses a human model named Dhaiba and a background of Miraikan and consequently generated realistic artificial data. We present 500k+ data generated by the proposed pipeline. This paper also describes the specification of the pipeline and data details with evaluations of various approaches.
Following improvements in deep neural networks, state-of-the-art networks have been proposed for human segmentation using point clouds captured by light detection and ranging (LiDAR). However, the performance of these networks depends significantly on the training data. An issue with collecting training data is labeling. Labeling by humans is necessary to obtain ground-truth labels; however, labeling involves high costs. Therefore, we propose an automatically labeled data generation pipeline, for which we can change any parameters or data generation environments. Our approach uses a human model named Dhaiba and the background of Miraikan to generate realistic artificial data. We present 1M data generated by the proposed pipeline. Furthermore, we propose an ensemble learning method based on the generated data to make use of our data generation pipeline. This paper describes the specifications of the pipeline, the data details, and the ensemble learning method, with evaluations of various approaches.
In this paper, we propose an automatically labeled sequential data generation pipeline for human segmentation and velocity estimation with point clouds. Following advances in deep neural networks, state-of-the-art network architectures have been proposed for human recognition using point clouds captured by Light Detection and Ranging (LiDAR). However, almost all conventional datasets are either collections of single LiDAR scans with label information or sequential LiDAR scans without label information. This limitation has hindered research progress to date. Therefore, we have developed an automatically labeled sequential data generation pipeline in which we can control any parameter or data generation environment, and which provides pixel-wise, per-frame ground-truth segmentation and pixel-wise velocity information for human recognition. Our approach uses a precise human model and reproduces precise motion to generate realistic artificial data. We present more than 7K sequences generated by the proposed pipeline, where each sequence consists of 32 frames. With the proposed sequence generator, we confirm that human segmentation performance improves when using sequential data compared with using data from a single LiDAR scan. We also evaluate our data by comparing it with data generated under different conditions. In addition, we estimate pedestrian velocity with LiDAR using only data generated by the proposed pipeline.
This study proposes a route prediction method using a self-organizing incremental neural network. The route trajectory is predicted from two location parameters (the latitude and longitude of the center of a tropical storm) and meteorological information (the atmospheric pressure). The method accurately predicted the normalized atmospheric pressure data of East Asia in the topological space of latitude and longitude, with low calculation cost. This paper explains the algorithms for training the self-organizing incremental neural network, the procedure for refining the datasets, and the method for predicting the storm trajectory. The effectiveness of the proposed method was confirmed in experiments. Based on the experimental results, the possibility of improving the prediction model is discussed. Additionally, this paper explains the limitations of the proposed method and briefly outlines how they might be resolved. Although the proposed method was applied only to typhoon phenomena in the present study, it is potentially applicable to a wide range of global problems.
Future frame prediction in videos is a challenging problem because videos include complicated movements and large appearance changes. Learning-based future frame prediction approaches have been proposed in the literature. A common limitation of the existing learning-based approaches is a mismatch between training data and test data. In the future frame prediction task, we can obtain the ground truth data simply by waiting for a few frames. This means we can update the prediction model online in the test phase. We therefore propose an adaptive update framework for the future frame prediction task. The proposed framework consists of a pre-trained prediction network, a continuously updated prediction network, and a weight estimation network. We also show that our pre-trained prediction model achieves performance comparable to existing state-of-the-art approaches. We demonstrate that our approach outperforms existing methods, especially for dynamically changing scenes.
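The online-update idea in the abstract above can be illustrated with a deliberately small sketch on a scalar signal. Everything in it is a hypothetical simplification rather than the authors' method: the class name `AdaptiveBlender`, the one-parameter linear predictors that stand in for the two prediction networks, and the error-based blending rule that stands in for the weight estimation network are all assumptions made for illustration.

```python
class AdaptiveBlender:
    """Toy stand-in for a fixed model, an online-updated model, and a blend weight.

    Assumptions (not from the paper):
    - fixed model:  next = a_fixed * current, never updated
    - online model: next = a_online * current, updated by one SGD step
      each time the true next value becomes available
    - blend weight: computed from running squared errors of both models,
      replacing the learned weight estimation network
    """

    def __init__(self, a_fixed=1.0, lr=0.1, decay=0.9):
        self.a_fixed = a_fixed
        self.a_online = a_fixed   # the online model starts as a copy
        self.lr = lr
        self.decay = decay
        self.err_fixed = 1e-6     # running squared error, fixed model
        self.err_online = 1e-6    # running squared error, online model

    def predict(self, current):
        p_fixed = self.a_fixed * current
        p_online = self.a_online * current
        # The model with the smaller recent error receives the larger weight.
        w_online = self.err_fixed / (self.err_fixed + self.err_online)
        return (1.0 - w_online) * p_fixed + w_online * p_online

    def update(self, current, truth):
        """Call once the ground truth for the previous prediction has arrived."""
        e_fixed = truth - self.a_fixed * current
        e_online = truth - self.a_online * current
        d = self.decay
        self.err_fixed = d * self.err_fixed + (1.0 - d) * e_fixed ** 2
        self.err_online = d * self.err_online + (1.0 - d) * e_online ** 2
        # Gradient step on 0.5 * e_online**2 with respect to a_online.
        self.a_online += self.lr * e_online * current


def run_demo(steps=50):
    """Test-time dynamics (x_next = 0.5 * x) differ from the fixed model (x_next = x)."""
    blender = AdaptiveBlender()
    for _ in range(steps):
        blender.update(current=1.0, truth=0.5)
    return blender
```

In this toy setting the fixed model keeps predicting with its original gain, while the online model drifts toward the gain seen at test time and the blend weight shifts toward it as its running error shrinks.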
Consecutive LiDAR scans compose dynamic 3D sequences, which contain more information than a single frame. Similar to the development history of image and video perception, dynamic 3D sequence perception is starting to attract attention following inspiring research on static 3D data perception. This work proposes a spatio-temporal neural network for human segmentation with dynamic LiDAR point clouds. It takes a sequence of depth images as input and has a two-branch structure: a spatial segmentation branch and a temporal velocity estimation branch. The velocity estimation branch is designed to capture motion cues from the input sequence and propagate them to the segmentation branch, so that the segmentation branch segments humans according to both spatial and temporal features. The two branches are jointly learned on a generated dynamic point cloud dataset for human recognition. Our work fills a gap in dynamic point cloud perception by using the spherical representation of point clouds, and it achieves high accuracy. The experiments indicate that introducing temporal features benefits the segmentation of dynamic point clouds.
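The spherical representation mentioned above is commonly obtained by projecting each LiDAR point onto an azimuth/elevation grid whose pixel values are ranges, which gives a depth image. The following is a minimal sketch of that general projection, not the authors' implementation: the function name `spherical_projection`, the image size, the vertical field of view, and the rule of keeping the nearest return per pixel are all assumptions chosen for illustration.

```python
import math


def spherical_projection(points, height=4, width=8,
                         fov_up_deg=15.0, fov_down_deg=-15.0):
    """Project (x, y, z) points onto a height x width depth image.

    All default values are illustrative assumptions, not sensor settings
    from the paper. Pixels without a return stay 0.0; when several points
    fall into one pixel, the nearest one is kept.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down
    image = [[0.0] * width for _ in range(height)]

    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # a point at the sensor origin has no direction
        azimuth = math.atan2(y, x)                       # in [-pi, pi]
        elevation = math.asin(max(-1.0, min(1.0, z / r)))
        if not (fov_down <= elevation <= fov_up):
            continue  # outside the assumed vertical field of view

        # Map azimuth to a column and elevation to a row (top row = fov_up).
        col = int(0.5 * (1.0 - azimuth / math.pi) * width)
        row = int((fov_up - elevation) / fov * height)
        col = min(max(col, 0), width - 1)
        row = min(max(row, 0), height - 1)

        if image[row][col] == 0.0 or r < image[row][col]:
            image[row][col] = r
    return image


def project_sequence(frames, **kwargs):
    """Turn a list of point-cloud frames into a list of depth images."""
    return [spherical_projection(frame, **kwargs) for frame in frames]
```

Applying `project_sequence` to consecutive scans yields a stack of depth images of the kind a sequence-based network could take as input; a real sensor would need its own resolution and field-of-view values.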