Comparing random forest approaches to segmenting and classifying gestures

Joshi, Ajjen; Monnier, Camille; Betke, Margrit; Sclaroff, Stan

doi:10.1016/j.imavis.2016.06.001

Cited by 52 publications

(12 citation statements)

References 26 publications


“…A random forest constructs a set of decision trees during the training phase, as in Fig. 14, and finally outputs the class in the case of classification or the prediction in the case of regression [47].…”

Section: Random Forest Classifier (mentioning)

confidence: 99%

Supervised learning classifiers for Arabic gestures recognition using Kinect V2

Hisham

Hamouda

2019

SN Appl. Sci.


Human-Computer Interaction (HCI) refers to the interaction between computers and humans. One of the most important applications of HCI is sign language recognition. Several research works have aimed to interpret and translate sign language into spoken language to help hearing-impaired persons integrate with their communities. Sign language is the main means of communication for deaf and hearing-impaired persons, enabling them to communicate with their societies and with each other. According to the World Health Organization, 466 million people (about 5% of the world population) have hearing loss; 432 million (93%) of them are adults and 34 million (7%) are children. Hearing-impaired persons often have the same level of mental capability as hearing persons. The main problem is that most hearing persons cannot understand sign language and most hearing-impaired persons cannot read or write the spoken language. This creates a barrier between deaf persons and their societies, so developing an automatic sign language recognition system is necessary. This research introduces a dynamic Arabic Sign Language recognition system using Microsoft Kinect. Recognition relies on two machine learning algorithms, (a) Decision Tree and (b) Bayesian Network, after which the Ada-Boosting technique is applied to enhance recognition. The results were compared with two direct matching techniques, (a) Dynamic Time Warping and (b) Hidden Markov Model. The system was applied to 42 Arabic gestures related to the medical field. The experimental results showed that the recognition rate of the proposed system reached 91.18% for the Decision Tree classifier, 92.50% for the Bayesian classifier, and 93.7% after applying Ada-Boosting.
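The citation statement for this entry describes a random forest as a set of decision trees built during training, with the class as the output for classification. A minimal sketch of that idea follows, in plain Python with no external libraries. It is not the method of the cited paper or of the citing one: the function names, the Gini split criterion, the midpoint thresholds, the depth limit, and the toy two-feature "gesture" data are all assumptions made for illustration. It covers classification only; for regression the forest would average the trees' numeric predictions instead of voting.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, feature_ids):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(y)
    for f in feature_ids:
        values = sorted({row[f] for row in X})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(X, y, rng, max_depth=3):
    """Grow one tree; every node looks at a random subset of the features."""
    n_features = len(X[0])
    k = max(1, int(n_features ** 0.5))
    split = None
    if max_depth > 0:
        split = best_split(X, y, rng.sample(range(n_features), k))
    if split is None:
        return Counter(y).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [(r, lab) for r, lab in zip(X, y) if r[f] <= t]
    right = [(r, lab) for r, lab in zip(X, y) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [lab for _, lab in left], rng, max_depth - 1),
            build_tree([r for r, _ in right], [lab for _, lab in right], rng, max_depth - 1))

def predict_tree(node, row):
    """Walk from the root to a leaf. Internal nodes are tuples, leaves are labels."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def train_forest(X, y, n_trees=25, seed=0):
    """Training phase: fit each tree on a bootstrap sample of the data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # sample with replacement
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """Classification output: the class that most trees vote for."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters standing in for two gestures.
X_toy = [(0.0, 0.2), (0.3, 0.9), (0.5, 0.1), (0.8, 0.6), (1.0, 0.4),
         (9.0, 9.7), (9.2, 9.1), (9.5, 10.0), (9.8, 9.3), (10.0, 9.6)]
y_toy = ["wave"] * 5 + ["point"] * 5

forest = train_forest(X_toy, y_toy)
print(predict_forest(forest, (0.5, 0.5)), predict_forest(forest, (9.5, 9.5)))
```

Because the two toy clusters are far apart, a query point near either cluster should receive that cluster's label from the majority vote. Real gesture features, such as skeleton joint positions over time, would need far more trees and data than this sketch uses.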


“…However, besides not being intended for the HRI problem, it was acquired using the Microsoft Kinect 360 sensor. Thus, the multimodal nature of the data (color, depth, skeleton, audio, and user mask) leads the main works, such as (Li et al, 2017), (Joshi et al, 2017), (Efthimiou et al, 2016), (Neverova et al, 2016), to concentrate on exploiting the multimodality of the data, almost discarding the possibility of using color information alone. This, as discussed earlier, may restrict its use to environments equipped with such sensors, which is not the case in most environments, where RGB cameras are readily available.…”

Section: Related Work (unclassified)

Reconhecimento De Gestos Dinâmico Para a Interação Humano-Robô

Canuto

Samatelo

Vassallo

2019

Anais Do 14º Simpósio Brasileiro De Automação Inteligente


With the advance of technology, machines are coming closer to people. It is therefore necessary to develop interfaces, such as gestures, that provide an intuitive way of interacting. This work proposes a modification of the Star RGB technique, which condenses the temporal information of a video into a single RGB image. The proposal, called Star RGB+, applies the Star RGB technique to the color channels of a video. So, rather than a single RGB image, the proposal yields three images as a condensed representation of a gesture in an RGB video clip. In addition, an ensemble-like architecture is proposed that uses three pretrained VGG16 networks, one per image, as feature extractors, and a fully connected architecture as a classifier that receives the fused information from the extractors. The main experiments were carried out on the GRIT (Gesture Commands for Robot inTeraction) dataset, used for human-robot interaction, and achieved more than 97% accuracy, precision, recall, and F1-score, outperforming the authors' original results by more than 5% on every metric. To compare results with the original Star RGB proposal, a secondary experiment was carried out on the Montalbano dataset, achieving 92.34% accuracy and outperforming the authors' results by more than 9%. This shows the contribution of this work to the field of dynamic gesture recognition, particularly for gestures used in human-robot interaction.


“…For feature encoding, the representative methods include bag of visual words (BoVW) [20], vector of locally aggregated descriptors (VLAD) [22], and Fisher vector (FV) [7]. For the decision-making stage, the popular classifiers applied to the datasets in Table 2 include KNN [20,22], SVM [12,7], and random forest [31]. Table 3 summarises the attributes of the human pose estimation datasets used for evaluation by some of the work presented in this special issue.…”

Section: The State-of-the-art In Human Motion Analysis (mentioning)

confidence: 99%

Articulated motion and deformable objects

Wan

Escalera

Perales

et al.

2018

Pattern Recognition


This guest editorial introduces the twenty papers accepted for this Special Issue on Articulated Motion and Deformable Objects (AMDO). They are grouped into four main categories within the field of AMDO: human motion analysis (action/gesture), human pose estimation, deformable shape segmentation, and face analysis. For each of the four topics, a survey of the recent developments in the field is presented. The accepted papers are briefly introduced in the context of this survey. They contribute novel methods, algorithms with improved performance as measured on benchmarking datasets, as well as two new datasets for hand action detection and human posture analysis. The special issue should be of high relevance to the reader interested in AMDO recognition and promote future research directions in the field.
