Máté Szarvas scite author profile

Pedestrian detection with convolutional neural networks

This paper presents a novel pedestrian detection method based on the use of a convolutional neural network (CNN) classifier. Our method achieves high accuracy by automatically optimizing the feature representation to the detection task and regularizing the neural network. We evaluate the proposed method on a difficult database containing pedestrians in a city environment with no restrictions on pose, action, background and lighting conditions. The false positive rate (FPR) of the proposed CNN classifier is less than one fifth of the FPR of a support vector machine (SVM) classifier using Haar-wavelet features when the detection rate is 90%. The accuracy of the SVM classifier using the features learnt by the CNN is equivalent to the accuracy of the CNN, ...
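
The comparison in this abstract is an operating-point comparison: both classifiers are tuned to the same detection rate (90%) and their false positive rates are then compared. A minimal sketch of how such a figure could be computed from raw classifier scores is shown below. The function name `fpr_at_detection_rate` and the toy scores are hypothetical and are not taken from the paper.

```python
import math

def fpr_at_detection_rate(pos_scores, neg_scores, target_dr=0.90):
    """Pick the highest threshold that still detects at least target_dr
    of the positives, then report the false positive rate there."""
    pos = sorted(pos_scores, reverse=True)
    # Number of positives that must score at or above the threshold.
    k = max(1, math.ceil(target_dr * len(pos) - 1e-9))
    threshold = pos[k - 1]
    false_pos = sum(1 for s in neg_scores if s >= threshold)
    return threshold, false_pos / len(neg_scores)

# Toy scores (illustrative only): 10 pedestrian windows, 5 background windows.
pos = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.95, 0.85, 0.75, 0.3]
neg = [0.1, 0.2, 0.45, 0.35, 0.05]
threshold, fpr = fpr_at_detection_rate(pos, neg)
print(threshold, fpr)
```

Running the same function on the scores of two classifiers and dividing the resulting rates gives the kind of ratio the abstract reports.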

Real-time Pedestrian Detection Using LIDAR and Convolutional Neural Networks

Szarvas, Utsushi, Ogata

Finite-state transducer based modeling of morphosyntax with applications to Hungarian LVCSR

Szarvas, Furui

This article introduces a novel approach to modeling morphosyntax in morpheme-based speech recognizers. The proposed method is evaluated in our recent Hungarian large vocabulary continuous speech recognition (LVCSR) system. The architecture of the recognition system is based on the weighted finite state transducer (WFST) paradigm. The task domain is the recognition of fluently read sentences selected from a major daily newspaper. The vocabulary units used in the system are morpheme based in order to provide sufficient coverage of the large number of word-forms resulting from affixation and compounding. Besides the standard morpheme N-gram language model, we evaluate the novel stochastic morphosyntactic language model (SMLM) that describes the valid word-forms (morpheme combinations) of the language. Thanks to the flexible transducer-based architecture of the system, the morphosyntactic component is integrated seamlessly with the basic modules, with no need to modify the decoder itself. The proposed stochastic morphosyntactic language model reduces the error rate by 17.9% relative to the baseline trigram system. The morpheme error rate of the best configuration is 14.75% in a 1350 morpheme Hungarian dictation task.
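
The 17.9% figure is a relative reduction, not a difference in percentage points. The sketch below shows the arithmetic. It assumes, which the abstract does not state explicitly, that the 14.75% best configuration is the SMLM system the relative figure refers to; under that assumption the implied baseline error rate is roughly 17.97%. The function names are hypothetical.

```python
def relative_reduction(baseline, improved):
    """Relative error reduction: the fraction of the baseline error removed."""
    return (baseline - improved) / baseline

def baseline_from_relative(improved, rel):
    """Invert the formula above to recover the baseline error rate."""
    return improved / (1.0 - rel)

# Assumption: 14.75% is the error rate of the system that achieved the
# 17.9% relative reduction. The implied baseline is then about 17.97%.
implied_baseline = baseline_from_relative(14.75, 0.179)
print(round(implied_baseline, 2))
print(round(relative_reduction(implied_baseline, 14.75), 3))
```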

Finite-state transducer based Hungarian LVCSR with explicit modeling of phonological changes

Szarvas, Furui (2002)

This article describes the design and the experimental evaluation of the first Hungarian large vocabulary continuous speech recognition (LVCSR) system. The architecture of the recognition system is based on the recently proposed weighted finite state transducer (WFST) paradigm. The task domain is the recognition of fluently read sentences selected from a major daily newspaper. Recognition performance is evaluated using both monophone and triphone gender-independent acoustic models. The vocabulary units used in the system are morpheme based in order to provide sufficient coverage of the large number of word-forms resulting from affixation and compounding in Hungarian. The language model is a statistical morpheme bigram model. Besides the basic list-style pronunciation dictionary model, we evaluate a novel phonology modeling component that describes the phonological changes prevalent in fluent Hungarian. Thanks to the flexible transducer-based architecture of the system, the phonological component is integrated seamlessly with the basic modules, with no need to modify the decoder itself. The proposed phonological model reduces the error rate by 8.32% relative to the baseline triphone system. The morpheme error rate of the best configuration is 17.74% in a 1200 morpheme task with test set perplexity 70.
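
The abstract does not spell out how the morpheme error rate is computed. A common convention, assumed here, mirrors word error rate: the edit distance (substitutions, insertions, deletions) between the reference and recognized morpheme sequences, divided by the reference length. The sketch below follows that assumption; the function names and the example segmentation are illustrative only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]

def morpheme_error_rate(ref, hyp):
    """Edit distance normalized by the number of reference morphemes."""
    return edit_distance(ref, hyp) / len(ref)

# "házakban" ("in houses") segmented as ház + ak + ban; the hypothesis
# drops the plural morpheme, giving one deletion out of three morphemes.
print(morpheme_error_rate(["ház", "ak", "ban"], ["ház", "ban"]))
```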

Voxenter™ - intelligent voice enabled call center for Hungarian

Fegyó, Mihajlik, Szarvas et al. (2003)

Acoustic observation context modeling in segment based speech recognition

Szarvas, Matsunaga (1998)

Evaluation of the stochastic morphosyntactic language model on a one million word Hungarian dictation task

Szarvas, Furui (2003)

Untitled

Fegyó, Szarvas, Tatai et al. (2000)
