Shuang Xu scite author profile

A convenient and effective binocular vision system is set up, with which gesture information can be accurately extracted from a complex environment. The template calibration method is used to calibrate the binocular camera and obtain its parameters accurately. In the stereo matching phase, the BM algorithm is used to quickly and accurately match the images from the left and right cameras and obtain the parallax of the measured gesture. Combining the parallax with the triangulation principle yields a denser depth map. Finally, the depth information is remapped to the original color image to realize three-dimensional reconstruction and generate a three-dimensional cloud image. The cloud image information shows that the binocular vision system can effectively segment the gesture from the complex background.
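The matching and triangulation steps described in this abstract can be illustrated with a toy sketch. It is not the authors' implementation and does not call any stereo library: `sad_disparity` is a hypothetical helper that does block matching along a single rectified scanline using the sum of absolute differences, and `depth_from_disparity` applies the standard relation Z = f * B / d. The window size, focal length and baseline values are assumptions chosen only for the example.

```python
def sad_disparity(left_row, right_row, x, half_window, max_disparity):
    """Disparity at column x of the left scanline, found by minimising the
    sum of absolute differences (SAD) over a small window."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        # A point at column x in the left image appears at x - d in the right image.
        if x - d - half_window < 0:
            break  # the window would fall off the left edge of the right image
        cost = sum(
            abs(left_row[x + k] - right_row[x - d + k])
            for k in range(-half_window, half_window + 1)
        )
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Triangulation for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity corresponds to a point at infinity
    return focal_px * baseline_m / disparity_px


# Toy scanlines: the same intensity pattern, shifted by 3 pixels between views.
right_row = [0, 0, 9, 5, 7, 0, 0, 0, 0, 0]
left_row = [0, 0, 0, 0, 0, 9, 5, 7, 0, 0]

d = sad_disparity(left_row, right_row, x=6, half_window=1, max_disparity=4)
# Assumed camera parameters: 800 px focal length, 6 cm baseline.
z = depth_from_disparity(d, focal_px=800.0, baseline_m=0.06)
print(d, z)
```

Repeating the disparity search for every pixel gives a disparity map, and applying the triangulation formula to each entry gives the depth map that the abstract refers to.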


Gesture recognition based on an improved local sparse representation classification algorithm

Liao

et al. 2017

Cluster Comput


The sparse representation classification method has attracted wide attention and study in pattern recognition because of its good recognition and classification performance. When the sparse coefficients are solved by minimizing the l1 norm, all the training samples are used as the redundant dictionary, which makes the computational complexity high. To address the high computational complexity of the l1-norm-based solver, an l2-norm local sparse representation classification algorithm is proposed. The algorithm first uses the minimum l2 norm method to select a local dictionary, then solves for the sparse coefficients within that dictionary by minimizing the l1 norm and uses them for classification. The algorithm is verified on gesture recognition using a constructed gesture database. The experimental results show that it effectively reduces computation time while maintaining the recognition rate, and that it performs slightly better than the KNN-SRC algorithm.
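The two-stage idea in this abstract (shrink the dictionary first, then solve the l1 problem only on what remains) can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the abstract does not say exactly how the l2 norm selects the local dictionary, so the sketch assumes the k training samples nearest to the query in Euclidean distance; ISTA (iterative soft thresholding) stands in for whatever l1 solver the authors used; and all function names, labels and data are hypothetical.

```python
import math


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_local_dictionary(train, labels, query, k):
    """Assumed selection rule: keep the k samples closest to the query in l2 distance."""
    order = sorted(range(len(train)), key=lambda i: l2_distance(train[i], query))[:k]
    return [train[i] for i in order], [labels[i] for i in order]


def soft_threshold(value, threshold):
    return math.copysign(max(abs(value) - threshold, 0.0), value)


def ista(atoms, query, lam=0.01, iters=500):
    """Approximately minimise 0.5 * ||query - sum_j c_j * atoms[j]||^2 + lam * ||c||_1."""
    n = len(atoms)
    # The squared Frobenius norm bounds the Lipschitz constant, so this step is safe.
    step = 1.0 / sum(x * x for atom in atoms for x in atom)
    coeffs = [0.0] * n
    for _ in range(iters):
        recon = [sum(coeffs[j] * atoms[j][i] for j in range(n)) for i in range(len(query))]
        resid = [q - r for q, r in zip(query, recon)]
        grad = [-sum(a * r for a, r in zip(atoms[j], resid)) for j in range(n)]
        coeffs = [soft_threshold(coeffs[j] - step * grad[j], step * lam) for j in range(n)]
    return coeffs


def classify(train, labels, query, k=3, lam=0.01):
    """Pick the class whose atoms alone reconstruct the query with the smallest residual."""
    atoms, atom_labels = select_local_dictionary(train, labels, query, k)
    coeffs = ista(atoms, query, lam)
    best_label, best_resid = None, float("inf")
    for label in sorted(set(atom_labels)):
        recon = [0.0] * len(query)
        for c, atom, atom_label in zip(coeffs, atoms, atom_labels):
            if atom_label == label:
                recon = [r + c * a for r, a in zip(recon, atom)]
        resid = l2_distance(query, recon)
        if resid < best_resid:
            best_label, best_resid = label, resid
    return best_label


# Hypothetical three-dimensional "gesture features" with made-up labels.
train = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.1, 0.9, 0.0], [0.0, 0.0, 1.0]]
labels = ["fist", "fist", "palm", "palm", "point"]
print(classify(train, labels, [0.95, 0.05, 0.0]))
```

The cost saving comes from the l1 solver iterating over k atoms instead of the whole training set; the price is that a poor choice of k can drop the correct class from the local dictionary.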


Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese

Zhou, Dong, Xu

et al. 2018


Sequence-to-sequence attention-based models, which integrate acoustic, pronunciation and language models into a single neural network, have recently shown very promising results on automatic speech recognition (ASR) tasks. Among these models, the Transformer, a new sequence-to-sequence attention-based model relying entirely on self-attention without using RNNs or convolutions, achieves a new single-model state-of-the-art BLEU on neural machine translation (NMT) tasks. Given the outstanding performance of the Transformer, we extend it to speech and adopt it as the basic architecture of a sequence-to-sequence attention-based model for Mandarin Chinese ASR tasks. Furthermore, we compare a syllable-based model with a context-independent phoneme (CI-phoneme) based model, both using the Transformer, in Mandarin Chinese. Additionally, a greedy cascading decoder with the Transformer is proposed for mapping CI-phoneme sequences and syllable sequences into word sequences. Experiments on the HKUST datasets demonstrate that the syllable-based model with the Transformer performs better than its CI-phoneme based counterpart and achieves a character error rate (CER) of 28.77%, which is competitive with the state-of-the-art CER of 28.0% achieved by the joint CTC-attention based encoder-decoder network.
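The character error rate quoted in this abstract is conventionally the character-level edit distance between the reference transcript and the recognizer output, divided by the reference length. A minimal sketch of that metric follows; it assumes the usual Levenshtein definition with unit costs, which the abstract does not spell out, and the example strings are invented.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance with unit cost for substitution, deletion and insertion."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """CER = edit distance / number of reference characters."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One substitution (气 -> 汽) and one deletion (很) against a 6-character reference.
print(character_error_rate("今天天气很好", "今天天汽好"))
```

Because insertions count as errors, CER can exceed 100% when the hypothesis is much longer than the reference.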


Gesture recognition based on modified adaptive orthogonal matching pursuit algorithm

Sun

et al. 2017

Cluster Comput


To address the disadvantages of greedy algorithms in sparse solution, a modified adaptive orthogonal matching pursuit algorithm (MAOMP) is proposed in this paper. MAOMP improves on them by introducing sparsity estimation and a variable step size. The algorithm estimates the initial value of the sparsity by a matching test, which decreases the number of subsequent iterations. The step size is then adjusted at different stages to select atoms and approximate the true sparsity. The simulation results show that the proposed algorithm improves recognition accuracy and efficiency compared with other greedy algorithms.
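For context, the baseline that MAOMP modifies is plain orthogonal matching pursuit, sketched below. This is the standard fixed-sparsity OMP, not the modified algorithm: the abstract does not give the matching test or the step-size schedule, so neither is reproduced here. The helper names are hypothetical, the atoms are assumed to have unit norm, and the least-squares refit uses the normal equations with a small Gaussian elimination routine to keep the example dependency-free.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def solve_linear(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def omp(atoms, signal, sparsity):
    """Pick one atom per iteration, then refit all chosen atoms by least squares."""
    support, coeffs = [], []
    residual = signal[:]
    for _ in range(sparsity):
        # Select the unused atom most correlated with the current residual.
        candidates = [j for j in range(len(atoms)) if j not in support]
        best = max(candidates, key=lambda j: abs(dot(atoms[j], residual)))
        support.append(best)
        # Least squares on the support via the normal equations.
        gram = [[dot(atoms[i], atoms[j]) for j in support] for i in support]
        rhs = [dot(atoms[i], signal) for i in support]
        coeffs = solve_linear(gram, rhs)
        recon = [
            sum(c * atoms[j][k] for c, j in zip(coeffs, support))
            for k in range(len(signal))
        ]
        residual = [s - r for s, r in zip(signal, recon)]
    return dict(zip(support, coeffs))


# Unit-norm atoms; the signal is 2 * atoms[0] - 3 * atoms[2].
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.6, 0.8, 0.0]]
signal = [2.0, 0.0, -3.0]
print(omp(atoms, signal, sparsity=2))
```

Plain OMP needs the sparsity as an input and adds exactly one atom per iteration; those are the two constraints that the abstract's sparsity estimate and variable step size are meant to relax.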



Shuang Xu

Speech-Transformer: A No-Recurrence Sequence-to-Sequence Model for Speech Recognition

Asymmetric Two-Stream Architecture for Accurate RGB-D Saliency Detection

Hand gesture recognition based on convolution neural network

Grasping force prediction based on sEMG signals

Gesture recognition based on binocular vision

Gesture recognition based on an improved local sparse representation classification algorithm

Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese

Gesture recognition based on modified adaptive orthogonal matching pursuit algorithm
