Deep learning has recently gained popularity by achieving state-of-the-art performance in tasks involving text, sound, or image processing. Owing to this success, there have been efforts to apply it in more challenging scenarios, such as 3D data processing. This article surveys methods that apply deep learning to 3D data and classifies them according to how they exploit that data. From the results of the examined works, we conclude that systems employing 2D views of 3D data typically surpass voxel-based (3D) deep models, although the latter can close the gap with deeper architectures and heavy data augmentation. Consequently, larger-scale datasets and increased resolutions are required.
Gaze-based keyboards offer a flexible means of human-computer interaction for both disabled and able-bodied people. Despite their convenience, they remain error-prone: eye-tracking devices may misinterpret the user's gaze, resulting in typesetting errors, especially when operated in fast mode. As a potential remedy, we present a novel error detection system that aggregates the decisions of two distinct subsystems, each dealing with a disparate data stream. The first subsystem operates on gaze-related measurements and exploits the eye-transition pattern to flag a typo. The second is a brain-computer interface that utilizes a neural response, known as the Error-Related Potential (ErrP), which is inherently generated whenever the subject observes an erroneous action. Based on experimental data gathered from 10 participants in a spontaneous typesetting scenario, we first demonstrate that ErrP-based brain-computer interfaces can indeed be useful in the context of gaze-based typesetting, despite the putative contamination of EEG activity by eye-movement artefacts. We then show that the performance of this subsystem can be further improved by also considering the error detections of the gaze-related subsystem. Finally, the proposed bimodal error detection system is shown to significantly reduce the typesetting time on a gaze-based keyboard.
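The abstract does not specify how the two subsystems' decisions are aggregated. As a minimal sketch, one common option is a weighted average of per-keystroke error probabilities followed by a threshold; the weights, threshold, and function name below are illustrative assumptions, not the paper's actual rule:

```python
import numpy as np

def fuse_error_detectors(p_gaze, p_errp, w_gaze=0.5, w_errp=0.5, threshold=0.5):
    """Weighted-average fusion of two error-probability streams.

    p_gaze, p_errp: per-keystroke error probabilities emitted by the
    gaze-pattern subsystem and the ErrP-based BCI subsystem.
    Weights and threshold are illustrative; the paper's aggregation
    rule may differ.
    """
    p_gaze = np.asarray(p_gaze, dtype=float)
    p_errp = np.asarray(p_errp, dtype=float)
    fused = w_gaze * p_gaze + w_errp * p_errp
    return fused >= threshold  # True -> flag the keystroke as a typo

# toy example: three keystrokes scored by both subsystems
flags = fuse_error_detectors([0.2, 0.9, 0.6], [0.1, 0.8, 0.3])
```

Flagged keystrokes would then be undone automatically, which is how such a detector can shorten overall typesetting time.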
We present a dataset that combines multimodal biosignals and eye-tracking information gathered within a human-computer interaction framework. The dataset was developed under the MAMEM project, which aims to endow people with motor disabilities with the ability to edit and author multimedia content through mental commands and gaze activity. It includes EEG, eye-tracking, and physiological (galvanic skin response and heart rate) signals collected from 34 individuals (18 able-bodied and 16 motor-impaired). Data were collected during interaction with a specifically designed interface for web browsing and multimedia content manipulation, as well as during imaginary-movement tasks. The presented dataset will contribute to the development and evaluation of modern human-computer interaction systems and foster the reintegration of people with severe motor impairments into society.
In this paper, we present a novel 3D segmentation approach operating on point clouds generated from overlapping images. The proposed hybrid approach aims to effectively segment co-planar objects by leveraging the structural information of the 3D point cloud and the visual information of the 2D images, without resorting to learning-based procedures. More specifically, the proposed approach, H-RANSAC, is an extension of the well-known RANSAC plane-fitting algorithm that incorporates an additional consistency criterion based on the results of 2D segmentation. Our expectation that integrating 2D data into 3D segmentation yields more accurate results is validated experimentally in the domain of 3D city models. The results show that H-RANSAC can successfully delineate building components such as main facades and windows, and provides more accurate segmentation than the typical RANSAC plane-fitting algorithm.
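A minimal sketch of the idea, assuming the 2D cue arrives as a per-point segmentation label (e.g. obtained by projecting each 3D point into the images). The label-purity check below is a stand-in for H-RANSAC's actual consistency criterion, whose exact form the abstract does not detail; all parameter values are illustrative:

```python
import numpy as np

def fit_plane(pts):
    """Least-squares plane through >= 3 points: unit normal n and offset d with n.x + d = 0."""
    centroid = pts.mean(axis=0)
    _, _, vh = np.linalg.svd(pts - centroid)
    n = vh[-1]                      # direction of smallest variance
    return n, -n.dot(centroid)

def h_ransac(points, labels, n_iters=200, dist_thr=0.05, label_purity=0.8, rng=None):
    """Toy RANSAC plane fitting with an extra 2D-consistency check.

    `labels` holds one 2D-segmentation label per 3D point.  A candidate
    plane is accepted only if its inliers are dominated by a single
    label, mimicking (loosely) H-RANSAC's 2D consistency criterion.
    """
    rng = np.random.default_rng(rng)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        n, d = fit_plane(sample)
        inliers = np.abs(points @ n + d) < dist_thr
        if inliers.sum() < 3:
            continue
        # 2D-consistency: most inliers must share one segmentation label
        _, counts = np.unique(labels[inliers], return_counts=True)
        if counts.max() / inliers.sum() < label_purity:
            continue
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers

# toy example: a flat z = 0 facade plane (label 0) amid elevated clutter (label 1)
gen = np.random.default_rng(0)
plane_pts = np.c_[gen.uniform(-1, 1, (50, 2)), np.zeros(50)]
clutter = gen.uniform(-1, 1, (20, 3)) + np.array([0.0, 0.0, 2.0])
pts = np.vstack([plane_pts, clutter])
seg = np.array([0] * 50 + [1] * 20)
inliers = h_ransac(pts, seg, rng=1)
```

The consistency check rejects planes that cut across several 2D segments, which is how the 2D information steers the 3D fit towards semantically coherent components such as a single facade or window.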
Electroencephalography signals inherently deviate from the notion of regular spatial sampling, as they reflect the coordinated action of multiple distributed, overlapping cortical networks. Hence, the observed brain dynamics are influenced both by the topology of the sensor array and by the underlying functional connectivity. Neural engineers are currently exploiting advances in graph signal processing in an attempt to create robust and reliable brain decoding systems. In this direction, Geometric Deep Learning is a highly promising concept for combining the benefits of graph signal processing and deep learning towards revolutionising Brain-Computer Interfaces (BCIs). However, its exploitation has been hindered by its data-demanding character. As a remedy, we propose a novel data augmentation approach that combines multiplex network modelling of multichannel signals with a graph variant of the classical Empirical Mode Decomposition (EMD), and which proves to be a strong asset when combined with Graph Convolutional Neural Networks (GCNNs). As our graph-EMD algorithm makes no assumptions about linearity or stationarity, it is an appealing solution for analysing brain signals without artificially imposing regularities in either the temporal or the spatial domain. Our experimental results indicate that the proposed augmentation scheme leads to substantial improvements when combined with GCNNs. Using recordings from two distinct BCI applications and comparing against a state-of-the-art augmentation method, we illustrate the benefits of its use. By making the algorithm available to the BCI community, we hope to further foster the application of geometric deep learning in the field.
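The graph-EMD algorithm itself is not detailed in the abstract. The sketch below illustrates only the general mode-mixing style of EMD-based augmentation: decompose each trial into components, then recombine components drawn from different trials to synthesise surrogates. As a crude, fixed stand-in for data-driven intrinsic mode functions, it splits each single-channel trial into FFT frequency bands; all names and parameters are illustrative assumptions:

```python
import numpy as np

def band_split(x, n_bands=4):
    """Split a 1-D signal into n_bands frequency bands via the FFT.

    A fixed-band proxy for EMD's intrinsic mode functions; the paper's
    graph-EMD decomposition is data-driven and operates on graphs.
    The returned bands sum back to the original signal.
    """
    X = np.fft.rfft(x)
    edges = np.linspace(0, len(X), n_bands + 1, dtype=int)
    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        Xb = np.zeros_like(X)
        Xb[lo:hi] = X[lo:hi]
        bands.append(np.fft.irfft(Xb, n=len(x)))
    return np.stack(bands)            # shape: (n_bands, n_samples)

def mode_mix_augment(trials, n_new=10, n_bands=4, rng=None):
    """Synthesise surrogate trials by recombining bands from different
    (same-class) trials -- the core idea of mode-mixing augmentation."""
    rng = np.random.default_rng(rng)
    modes = np.stack([band_split(t, n_bands) for t in trials])
    out = []
    for _ in range(n_new):
        picks = rng.integers(0, len(trials), size=n_bands)  # one donor trial per band
        out.append(sum(modes[picks[b], b] for b in range(n_bands)))
    return np.stack(out)

# example: five toy single-channel trials of 128 samples each
trials = np.random.default_rng(1).standard_normal((5, 128))
augmented = mode_mix_augment(trials, n_new=20, rng=2)   # 20 surrogate trials
```

The surrogates preserve the per-band statistics of the class while introducing new combinations, which is what lets a data-hungry GCNN train on a small BCI dataset.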