Michal Hradiš scite author profile

We introduce a novel method for odometry estimation using convolutional neural networks from 3D LiDAR scans. The original sparse data are encoded into 2D matrices for the training of proposed networks and for the prediction. Our networks show significantly better precision in the estimation of translational motion parameters comparing with state of the art method LOAM, while achieving real-time performance. Together with IMU support, high quality odometry estimation and LiDAR data registration is realized. Moreover, we propose alternative CNNs trained for the prediction of rotational motion parameters while achieving results also comparable with state of the art. The proposed method can replace wheel encoders in odometry estimation or supplement missing GPS data, when the GNSS signal absents (e.g. during the indoor mapping). Our solution brings real-time performance and precision which are useful to provide online preview of the mapping results and verification of the map completeness in real time.

show abstract

CNN for very fast ground segmentation in velodyne LiDAR data

Veľas

Španěl

Hradiš

et al. 2018

View full text Add to dashboard Cite

This paper presents a novel method for ground segmentation in Velodyne point clouds. We propose an encoding of sparse 3D data from the Velodyne sensor suitable for training a convolutional neural network (CNN). This general purpose approach is used for segmentation of the sparse point cloud into ground and non-ground points. The LiDAR data are represented as a multi-channel 2D signal where the horizontal axis corresponds to the rotation angle and the vertical axis the indexes channels (i.e. laser beams). Multiple topologies of relatively shallow CNNs (i.e. 3-5 convolutional layers) are trained and evaluated using a manually annotated dataset we prepared. The results show significant improvement of performance over the state-of-the-art method by Zhang et al. in terms of speed and also minor improvements in terms of accuracy.

show abstract

Multimodal Emotion Recognition for AVEC 2016 Challenge

Povolny

Matějka

Hradiš

et al. 2016

View full text Add to dashboard Cite

This paper describes a systems for emotion recognition and its application on the dataset from the AV+EC 2016 Emotion Recognition Challenge. The realized system was produced and submitted to the AV+EC 2016 evaluation, making use of all three modalities (audio, video, and physiological data). Our work primarily focused on features derived from audio. The original audio features were complement with bottleneck features and also text-based emotion recognition which is based on transcribing audio by an automatic speech recognition system and applying resources such as word embedding models and sentiment lexicons. Our multimodal fusion reached CCC=0.855 on dev set for arousal and 0.713 for valence. CCC on test set is 0.719 and 0.596 for arousal and valence respectively.

show abstract

CNN for license plate motion deblurring

Svoboda

Hradiš

Marsik

et al. 2016

View full text Add to dashboard Cite

In this work we explore the previously proposed approach of direct blind deconvolution and denoising with convolutional neural networks in a situation where the blur kernels are partially constrained. We focus on blurred images from a real-life traffic surveillance system, on which we, for the first time, demonstrate that neural networks trained on artificial data provide superior reconstruction quality on real images compared to traditional blind deconvolution methods. The training data is easy to obtain by blurring sharp photos from a target system with a very rough approximation of the expected blur kernels, thereby allowing custom CNNs to be trained for a specific application (image content and blur range). Additionally, we evaluate the behavior and limits of the CNNs with respect to blur direction range and length.

show abstract

What do you want to do next

Bednarik

Vrzáková

Hradiš

2012

View full text Add to dashboard Cite

Interaction intent prediction and the Midas touch have been a longstanding challenge for eye-tracking researchers and users of gazebased interaction. Inspired by machine learning approaches in biometric person authentication, we developed and tested an offline framework for task-independent prediction of interaction intents. We describe the principles of the method, the features extracted, normalization methods, and evaluation metrics. We systematically evaluated the proposed approach on an example dataset of gazeaugmented problem-solving sessions. We present results of three normalization methods, different feature sets and fusion of multiple feature types. Our results show that accuracy of up to 76% can be achieved with Area Under Curve around 80%. We discuss the possibility of applying the results for an online system capable of interaction intent prediction.

show abstract

MixedEmotions: An Open-Source Toolbox for Multimodal Emotion Analysis

Buitelaar

Wood

Negi

et al. 2018

IEEE Trans. Multimedia

View full text Add to dashboard Cite

Recently, there is an increasing tendency to embed functionalities for recognizing emotions from user generated media content in automated systems such as call-centre operations, recommendations and assistive technologies, providing richer and more informative user and content profiles. However, to date, adding these functionalities was a tedious, costly, and time consuming effort, requiring identification and integration of diverse tools with diverse interfaces as required by the use case at hand. The MixedEmotions Toolbox leverages the need for such functionalities by providing tools for text, audio, video, and linked data processing within an easily integrable plug-and-play platform. These functionalities include: (i) for text processing: emotion and sentiment recognition, (ii) for audio processing: emotion, age, and gender recognition, (iii) for video processing: face detection and tracking, emotion recognition, facial landmark localization, head pose estimation, face alignment, and body pose estimation, and (iv) for linked data: knowledge graph integration. Moreover, the MixedEmotions Toolbox is open-source and free. In this article, we present this toolbox in the context of the existing landscape, and provide a range of detailed benchmarks on standard test-beds showing its state-of-the-art performance. Furthermore, three real-world use-cases show its effectiveness, namely emotion-driven smart TV, call center monitoring, and brand reputation analysis.

show abstract

Gaze and conversational engagement in multiparty video conversation

Bednarik

Eivazi

Hradiš

2012

View full text Add to dashboard Cite

When using a multiparty video mediated system, interacting participants assume a range of various roles and exhibit behaviors according to how engaged in the communication they are. In this paper we focus on estimation of conversational engagement from gaze signal. In particular, we present an annotation scheme for conversational engagement, a statistical analysis of gaze behavior across varying levels of engagement, and we classify vectors of computed eye tracking measures. The results show that in 74% of cases the level of engagement can be correctly classified into either high or low level. In addition, we describe the nuances of gaze during distinct levels of engagement.

show abstract

12 3 4 5 6

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Michal Hradiš

Convolutional Neural Networks for Direct Text Deblurring

CNN for IMU assisted odometry estimation using velodyne LiDAR

CNN for very fast ground segmentation in velodyne LiDAR data

Multimodal Emotion Recognition for AVEC 2016 Challenge

CNN for license plate motion deblurring

What do you want to do next

MixedEmotions: An Open-Source Toolbox for Multimodal Emotion Analysis

Gaze and conversational engagement in multiparty video conversation

Contact Info

Product

Resources

About