Optimized hand pose estimation CrossInfoNet-based architecture for embedded devices

Šimoník, Marek; Krumnikl, Michal

doi:10.1007/s00138-022-01332-8

Cited by 2 publications

(3 citation statements)

References 39 publications

“…We opted to only compare the two baselines because of the similarity in our approaches. On the first few samples both our architecture and CrossInfoMobileNet [12] performed well with insignificant differences, however as the number of samples increased, our proposed method exhibited superior performance compared to the competing state-of-the-art architectures. For instance, when evaluating the test samples based on the maximum joint error below a specified threshold, our method achieved a remarkable 96% accuracy at a threshold level of 40 mm.…”

Section: Error Analysis On NYU Datasets

mentioning

confidence: 90%
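
The threshold-based figure quoted above (share of test samples whose maximum joint error stays below a threshold) can be illustrated with a minimal sketch. The function names and the toy coordinates are hypothetical and are not taken from the citing paper; the sketch only assumes that each sample is a list of 3D joint positions in millimetres.

```python
import math
from typing import Sequence

Frame = Sequence[Sequence[float]]  # one sample: a list of (x, y, z) joints in mm


def max_joint_error(pred: Frame, gt: Frame) -> float:
    """Largest Euclidean distance between matching joints of one sample."""
    return max(math.dist(p, g) for p, g in zip(pred, gt))


def fraction_below(preds: Sequence[Frame], gts: Sequence[Frame],
                   threshold_mm: float) -> float:
    """Share of samples whose worst joint error is below the threshold."""
    errors = [max_joint_error(p, g) for p, g in zip(preds, gts)]
    return sum(e < threshold_mm for e in errors) / len(errors)


# Toy data (made up for illustration): two samples with two joints each.
preds = [[(0, 0, 0), (10, 0, 0)], [(0, 0, 0), (0, 0, 0)]]
gts = [[(0, 0, 0), (10, 0, 30)], [(0, 50, 0), (0, 0, 0)]]
score = fraction_below(preds, gts, threshold_mm=40.0)  # worst errors: 30 mm, 50 mm
```

Because the metric takes the worst joint per sample, a single badly placed joint is enough to count the whole sample as a failure at that threshold.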

“…This work presents lightweight network (LightWeightNet) hand pose estimation (HPE), a hand pose estimation based on convolutional neural network finetuned and optimized to mobile phone processors to minimize the computational cost and allow mobile phone users enjoy an immersive experience. The first attempt was the work of [11], where a CrossInfoMobilenet was presented replacing a computational critical CrossinfoNet [12]. Herein, we present an improved version of CrossInfoMobileNet with an additional depth-wise separable convolutions which greatly lowers the computational cost of a general convolutional neural network (CNN) model used in MobileNet3 [13].…”

mentioning

confidence: 99%
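
The cost reduction attributed to depth-wise separable convolutions in the statement above follows from simple weight counting. The sketch below is illustrative only: the kernel size and channel counts are assumptions chosen for the example, not values from the citing paper, and biases are ignored.

```python
def standard_conv_weights(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution."""
    return k * k * c_in * c_out


def separable_conv_weights(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Assumed layer shape for illustration: 3 x 3 kernel, 64 -> 128 channels.
standard = standard_conv_weights(3, 64, 128)
separable = separable_conv_weights(3, 64, 128)
ratio = separable / standard  # equals 1/c_out + 1/k**2
```

For this assumed layer the separable form needs 8,768 weights instead of 73,728, roughly 12% of the original count.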

Hand LightWeightNet: an optimized hand pose estimation for interactive mobile interfaces

Banzi, Leonard

2024

IJECE

In this paper, a hand pose estimation method is introduced that combines MobileNetV3 and CrossInfoNet into a single pipeline. The proposed approach is tailored for mobile phone processors through optimizations, modifications, and enhancements made to both architectures, resulting in a lightweight solution. MobileNetV3 provides the bottleneck for feature extraction and refinements while CrossInfoNet benefits the proposed system through a multitask information sharing mechanism. In the feature extraction stage, we utilized an inverted residual block that achieves a balance between accuracy and efficiency in limited parameters. Additionally, in the feature refinement stage, we incorporated a new best-performing activation function called “activate or not” ACON, which demonstrated stability and superior performance in learning linearly and non-linearly gates of the whole activation area of the network by setting hyperparameters to switch between active and inactive states. As a result, our network operated with 65% reduced parameters, but improved speed by 39% which is suitable for running in a mobile device processor. During experiment, we conducted test evaluation on three hand pose datasets to assess the generalization capacity of our system. On all the tested datasets, the proposed approach demonstrates consistently higher performance while using significantly fewer parameters than existing methods. This indicates that the proposed system has the potential to enable new hand pose estimation applications such as virtual reality, augmented reality and sign language recognition on mobile devices.
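
The "activate or not" behaviour mentioned in the abstract can be sketched with the ACON-C form from the original ACON paper, f(x) = (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x. This is an assumption for illustration: the abstract does not say which ACON variant the authors use or how its parameters are set, and the default values below are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def acon_c(x: float, p1: float = 1.0, p2: float = 0.0, beta: float = 1.0) -> float:
    """ACON-C: a smooth blend between the branches p1*x and p2*x."""
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x


# beta = 0 gives a purely linear response (the mean of the two branches),
# while a large beta approaches max(p1*x, p2*x), i.e. ReLU for p1=1, p2=0.
linear_out = acon_c(2.0, beta=0.0)
active_out = acon_c(2.0, beta=50.0)
inactive_out = acon_c(-2.0, beta=50.0)
```

With p1 = 1 and p2 = 0 the function reduces to Swish, so beta acts as the switch between a linear (inactive) and a ReLU-like (active) response.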

“…It comes with a new block type Squeeze-and-Excitation (SE) that better take into account feature maps based on their channel dependencies. Also, instead of the ReLU activation function, there is a Hard-swish function, which reduces the number of multiply-accumulate operations (MAC) but preserves nonlinearity [52]. It also comes separately in a version for more powerful and weaker target devices, and both versions can additionally be made minimalist or full.…”

Section: A Mobilenet

mentioning

confidence: 99%
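
The Hard-swish function named in the statement above is commonly defined as x * ReLU6(x + 3) / 6, a piecewise-linear stand-in for Swish that avoids evaluating an exponential. A minimal sketch under that standard definition (the helper names are hypothetical):

```python
import math


def relu6(x: float) -> float:
    """ReLU clipped to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def hard_swish(x: float) -> float:
    """Piecewise-linear approximation of Swish: x * ReLU6(x + 3) / 6."""
    return x * relu6(x + 3.0) / 6.0


def swish(x: float) -> float:
    """Reference Swish, x * sigmoid(x), shown for comparison."""
    return x / (1.0 + math.exp(-x))


# Hard-swish is exactly zero for x <= -3 and exactly x for x >= 3.
samples = [hard_swish(v) for v in (-4.0, 0.0, 1.0, 3.0)]
```

Between -3 and 3 the function stays nonlinear, which is how it keeps the expressive behaviour of Swish at a lower arithmetic cost.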

Facial Emotion Recognition for Mobile Devices: A Practical Review

Krumnikl, Maiwald

2024

IEEE Access

Communicating via email or various chat applications on smartphones is part of most people's daily lives. But in written form, human communication loses a lot of valuable information, such as the facial expressions and emotions of the person you are communicating with. Thanks to techniques from the field of image processing, it is now possible to capture these non-verbal phenomena, and supplement written input with their non-verbal characteristics. In this paper, we explore the possibilities of emotion recognition from front camera images in mobile and embedded devices. A total of 63 classification and 28 regression models based on twelve different neural network architectures optimized for low performance mobile devices were trained and evaluated for success rate and latency. The training and evaluation of each neural network model is performed within the Keras API of the TensorFlow library and then converted to the TensorFlow Lite standard to reduce memory and computational requirements. Great care is taken to ensure that the entire process, from face detection to emotion classification, can operate in real time. To demonstrate and compare the performance of the evaluated models, a freely available optimized application running on Android mobile devices is created and published on Google Play, the source code of which is also available.
