HUMANNET—A Two-Tiered Deep Neural Network Architecture for Self-Occluding Humanoid Pose Reconstruction

Kulikajevas, Audrius; Maskeliūnas, Rytis; Damaševičius, Robertas; Scherer, Rafał

doi:10.3390/s21123945

Cited by 10 publications (6 citation statements)

References 56 publications (53 reference statements)


“…For each region of interest (ROI), a multi-objective loss function consisting of classification losses (1,2), localization losses (3,4), and mask segmentation (5) is used.…”

Section: Loss Function (mentioning)

confidence: 99%


Reconstruction of a 3D Human Foot Shape Model Based on a Video Stream Using Photogrammetry and Deep Neural Networks

et al. 2021


Reconstructed 3D foot models can be used for 3D printing and further manufacturing of individual orthopedic shoes, as well as in medical research and for online shoe shopping. This study presents a technique based on the approach and algorithms of photogrammetry. The presented technique was used to reconstruct a 3D model of the foot shape, including the lower arch, using smartphone images. The technique is based on modern computer vision and artificial intelligence algorithms designed for image processing, obtaining sparse and dense point clouds, depth maps, and a final 3D model. For the segmentation of foot images, the Mask R-CNN neural network was used, which was trained on foot data from a set of 40 people. The obtained accuracy was 97.88%. The result of the study was a high-quality reconstructed 3D model. The standard deviation of linear indicators in length and width was 0.95 mm, with an average creation time of 1 min 35 s recorded. Integration of this technique into the business models of orthopedic enterprises, Internet stores, and medical organizations will allow basic manufacturing and shoe-fitting services to be carried out and will help medical research to be performed via the Internet.
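The multi-objective loss mentioned in the citation statement above can be written out as a sum of its five numbered terms. This is only a sketch: the symbols are illustrative, the unweighted sum is an assumption (the citing paper may weight the terms), and the statement does not say which network stage each numbered term belongs to.

```latex
% Per-ROI multi-objective loss, assuming an unweighted sum of the five terms
% named in the citation statement. Symbol names are hypothetical.
\mathcal{L}_{\mathrm{ROI}}
  = \underbrace{\mathcal{L}_{\mathrm{cls}}^{(1)} + \mathcal{L}_{\mathrm{cls}}^{(2)}}_{\text{classification}}
  + \underbrace{\mathcal{L}_{\mathrm{loc}}^{(3)} + \mathcal{L}_{\mathrm{loc}}^{(4)}}_{\text{localization}}
  + \underbrace{\mathcal{L}_{\mathrm{mask}}^{(5)}}_{\text{mask segmentation}}
```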

“…Kulikajev et al carried out a series of studies [2][3][4] involving 3D reconstruction of the entire human body. These studies were unique due to their consideration of data as objects from an imperfect real-world frame; that is, the original data for reconstruction may include noise, glare, highlights, and low photo quality.…”

Section: Introduction (mentioning)

confidence: 99%

“…Recognition and detection of human poses are very widely used in neural networks, as they have excellent accuracy and effectiveness on larger datasets [34,35]. However, there is a limitation in the DNN model since the minute intersections or joints of a canine feature detection are very confused to detect the pose.…”

Section: Feature Extraction (mentioning)

confidence: 99%

Markerless Dog Pose Recognition in the Wild Using ResNet Deep Learning Model

2021

Self Cite


The analysis and perception of behavior has usually been a crucial task for researchers. The goal of this paper is to address the problem of recognition of animal poses, which has numerous applications in zoology, ecology, biology, and entertainment. We propose a methodology to recognize dog poses. The methodology includes the extraction of frames for labeling from videos and deep convolutional neural network (CNN) training for pose recognition. We employ a semi-supervised deep learning model of reinforcement. During training, we used a combination of restricted labeled data and a large amount of unlabeled data. Sequential CNN is also used for feature localization and to find the canine’s motions and posture for spatio-temporal analysis. To detect the canine’s features, we employ image frames to locate the annotations and estimate the dog posture. As a result of this process, we avoid starting from scratch with the feature model and reduce the need for a large dataset. We present the results of experiments on a dataset of more than 5000 images of dogs in different poses. We demonstrated the effectiveness of the proposed methodology for images of canine animals in various poses and behavior. The methodology implemented as a mobile app that can be used for animal tracking.


“…FPS uses Euclidean distance metric to iteratively search for the sampling points, and the selected point is that farthest from other unselected members in each iteration [16]. Kulikajevas et al [8] proposed a two-tiered deep neural network for self-occluding humanoid pose reconstruction, in which the clipping network is designed to clip the region of interest and down sampling it with FPS for the subsequent reconstruction network. Since FPS can well cover the whole set of points, several methods use it to extract the feature point [35,38].…”

Section: Introduction (mentioning)

confidence: 99%

UPSNet: Universal Point Cloud Sampling Network Without Knowing Downstream Tasks

Tian, Song, Jiang, et al. 2022

ITC


With the development of three-dimensional sensing technology, the data volume of point cloud grows rapidly. Therefore, point cloud is usually down-sampled in advance so as to save memory space and reduce the computational complexity for its downstream processing tasks such as classification, segmentation, reconstruction in learning based point cloud processing. Obviously, the sampled point clouds should be well representative and maintain the geometric structure of the original point clouds so that the downstream tasks can achieve satisfied performance based on the point clouds sampled from the original ones. Traditional point cloud sampling methods such as farthest point sampling and random sampling mainly heuristically select a subset of the original point cloud. However, they do not make full use of high-level semantic representation of point clouds, are sensitive to outliers. Some of other sampling methods are task oriented. In this paper, a Universal Point cloud Sampling Network without knowing downstream tasks (denoted as UPSNet) is proposed. It consists of three modules. The importance learning module is responsible for learning the mutual information between the points of input point cloud and calculating a group of variational importance probabilities to represent the importance of each point in the input point cloud, based on which a mask is designed to discard the points with lower importance so that the number of remaining points is controlled. Then, the regional learning module learns from the input point cloud to get the high dimensional space embedding of each region, and the global feature of each region are obtained by weighting the high dimensional space embedding with the variational importance probability. Finally, through the coordinate regression module, the global feature and the high dimensional space embedding of each region are cascaded for learning to obtain the sampled point cloud. 
A series of experiments are implemented in which the point cloud classification, segmentation, reconstruction and retrieval are performed on the reconstructed point clouds sampled with different point cloud sampling methods. The experimental results show that the proposed UPSNet can provide more reasonable sampling result of the input point cloud for the downstream tasks of classification, segmentation, reconstruction and retrieval, and is superior to the existing sampling methods without knowing the downstream tasks. The proposed UPSNet is not oriented to specific downstream tasks, so it has wide applicability.
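The farthest point sampling (FPS) step described in the citation statement above can be illustrated with a minimal sketch. It assumes the common greedy formulation, in which each iteration picks the point whose Euclidean distance to its nearest already-selected point is largest. The function name, the `start` parameter, and the pure-Python implementation are illustrative choices, not code from HUMANNET or UPSNet.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def farthest_point_sampling(points: List[Point], k: int, start: int = 0) -> List[int]:
    """Return the indices of k points chosen by greedy farthest point sampling.

    Hypothetical helper for illustration; `start` is the index of the seed point.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    if not 0 <= start < len(points):
        raise ValueError("start must be a valid index into points")

    selected = [start]
    # min_dist[i] is the distance from point i to its nearest selected point.
    min_dist = [math.dist(p, points[start]) for p in points]

    while len(selected) < k:
        # Choose the point that is farthest from the current selected set.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        # Update nearest-selected distances with the newly added point.
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


if __name__ == "__main__":
    cloud = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (5, 0, 0)]
    print(farthest_point_sampling(cloud, 3))
```

Each iteration costs O(n), so sampling k points from n costs O(nk). Because the procedure maximizes spread, isolated points tend to be selected early, which is consistent with the sensitivity to outliers that the UPSNet abstract attributes to heuristic samplers.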
