The 2010 Silent Speech Challenge benchmark is updated with new results obtained with a Deep Learning strategy, using the same input features and decoding strategy as in the original article. A Word Error Rate of 6.4% is obtained, compared to the published value of 17.4%. Additional results comparing new auto-encoder-based features with the original features at reduced dimensionality, as well as decoding scenarios on two different language models, are also presented. The Silent Speech Challenge archive has been updated to contain both the original and the new auto-encoder features, in addition to the original raw data.
Index Terms: silent speech interface, multimodal speech recognition, deep learning, language model
INTRODUCTION
1.1. Silent speech interfaces and challenges
A Silent Speech Interface, or SSI, is defined as a device enabling speech processing in the absence of an exploitable audio signal, for example, speech recognition obtained exclusively from video images of the mouth, or from electromagnetic articulography (EMA) sensors glued to the tongue. Classic applications targeted by SSIs include:
1) Voice replacement for persons who have lost the ability to vocalize through illness or an accident, yet who retain the ability to articulate;
2) Speech communication in environments where silence is either necessary or desired: responding to a cellphone in meetings or public places without disturbing others; avoiding interference in call centers, conferences and classrooms; private communications by police, military, or business personnel.
7) Cortical implants for a "thought-driven" SSI.
Figure 1: Overview of an SSI, showing non-acoustic sensors and non-acoustic automatic speech recognition (ASR), which can be followed by speech synthesis, or retained as a phonetic, text, or other digital representation, depending on the application.
As a non-acoustic technology, SSIs initially stood somewhat apart from the main body of speech processing, where the standard techniques are intrinsically associated with an audio signal. Nevertheless, the novelty of the SSI concept and its exciting range of applications, perhaps aided by a growing interest in multi-modal speech processing, are gradually allowing SSI technology to join the speech processing mainstream. Activity in SSI research has remained strong since the publication of [1], which received the ISCA/Eurasip Best Paper Award in 2015. A recent survey of the literature reveals dozens of publications on SSI systems, using not only the original seven non-acoustic technologies mentioned above, but also two additional ones, namely, low-frequency air-borne ultrasound and micropower radar.
Despite this activity, SSIs today remain for the most part specialized laboratory instruments. The performance of any automatic speech recognition (ASR) system is most often characterized by a Word Error Rate, or WER, expressed as a percentage of the total number of words appearing in a corpus. To date, no SSI ASR system has been able to achieve WER parity with state-of-the-art acoustic ASR. Indeed, a numbe...
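The WER figure mentioned above can be illustrated with a minimal sketch. It assumes the conventional definition (substitutions, deletions and insertions from a word-level edit distance, divided by the number of reference words); the function name `word_error_rate` is hypothetical and this is not the scoring tool used for the benchmark.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the WER in percent, assuming whitespace-separated words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# One substituted word out of four reference words.
print(word_error_rate("a b c d", "a x c d"))  # 25.0
```

Because insertions are counted as errors, a WER computed this way can exceed 100% when the hypothesis is much longer than the reference.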
Partial deletions on the long arm of chromosome 13 lead to a number of different phenotypes depending on the size and position of the deleted region. The present study investigated 2 patients with 13q terminal (13qter) deletion syndrome, which manifested as anal atresia with rectoperineal fistula, complex type congenital heart disease, esophageal hiatus hernia with gastroesophageal reflux, facial anomalies and developmental and mental retardation. Array comparative genomic hybridization identified 2 regions of deletion on chromosome 13q31-qter; 20.38 Mb in 13q31.3-qter and 12.99 Mb in 13q33.1-qter in patients 1 and 2, respectively. Comparisons between the results observed in the present study and those obtained from patients in previous studies indicate that the gene encoding ephrin B2 (EFNB2) located in the 13q33.3-q34 region, and the gene coding for endothelin receptor type B, in the 13q22.1–31.3 region, may be suitable candidate genes for the observed urogenital/anorectal anomalies. In addition, the microRNA-17-92a-1 cluster host gene and the glypican 6 gene in the 13q31.3 region, as well as EFNB2 and the collagen type IV α1 chain (COL4A1) and COL4A2 genes in the 13q33.1-q34 region may together contribute to cardiovascular disease development. It is therefore possible that these genes may be involved in the pathogenesis of complex type congenital heart disease in patients with 13q deletion syndrome.
Abstract. Guided by attention, the human visual system is able to locate objects of interest in a complex scene. In this paper, we propose a novel visual saliency detection method, conditional saliency, for both image and video. Inspired by biological vision, the definition of visual saliency follows a strictly local approach. Given the surrounding area, the saliency is defined as the minimum uncertainty of the local region, namely the minimum conditional entropy, when perceptual distortion is considered. To simplify the problem, we approximate the conditional entropy by the lossy coding length of multivariate Gaussian data. The final saliency map is accumulated pixel by pixel and further segmented to detect the proto-objects. Experiments are conducted on both image and video, and the results indicate that the saliency measure is robust, reliable and feature-invariant.
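The coding-length approximation described in this abstract can be sketched roughly as follows. The abstract gives no formula, so the sketch assumes one commonly used expression for the lossy coding length of Gaussian data, L(W) = ((m + n)/2) log2 det(I + n/(eps^2 m) W W^T) + (n/2) log2(1 + mu^T mu / eps^2), and takes the conditional length as L(center and surround) minus L(surround). The minimization step mentioned in the abstract is omitted, and all names (`coding_length`, `conditional_saliency`, `eps`) are hypothetical.

```python
import math


def log2_det_spd(a):
    """log2 of the determinant of a symmetric positive-definite matrix (Cholesky)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    log_det = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(s)
                log_det += 2.0 * math.log2(low[i][j])
            else:
                low[i][j] = s / low[j][j]
    return log_det


def coding_length(samples, eps=0.5):
    """Approximate bits needed to code m vectors of dimension n up to distortion eps**2."""
    m, n = len(samples), len(samples[0])
    mean = [sum(x[d] for x in samples) / m for d in range(n)]
    centered = [[x[d] - mean[d] for d in range(n)] for x in samples]
    scale = n / (eps ** 2 * m)
    # a = I + scale * (sum of outer products of the centered samples)
    a = [[(1.0 if i == j else 0.0) + scale * sum(c[i] * c[j] for c in centered)
          for j in range(n)] for i in range(n)]
    bits_shape = 0.5 * (m + n) * log2_det_spd(a)
    bits_mean = 0.5 * n * math.log2(1.0 + sum(v * v for v in mean) / eps ** 2)
    return bits_shape + bits_mean


def conditional_saliency(center, surround, eps=0.5):
    """Extra bits needed for the center region once the surround is already coded."""
    return coding_length(center + surround, eps) - coding_length(surround, eps)


surround = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]]
similar = conditional_saliency([[0.05, 0.05]], surround)
distinct = conditional_saliency([[5.0, 5.0]], surround)
print(distinct > similar)  # True: the outlying center costs far more extra bits
```

In this toy setting each vector stands in for a feature vector extracted from an image patch; a center that resembles its surround adds almost no coding cost, while a dissimilar one adds many bits and would therefore be marked as salient.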
Groups of enterprises can guarantee each other and form complex networks in order to obtain loans from banks. Monitoring the financial status of such a network, and preventing or reducing systematic risk in case of a crisis, is an area of great concern for the regulatory commission and for the banks. Our ultimate goal is to develop a visual analytics approach and tool for risk dissolving and decision-making. We have consolidated four main analysis tasks conducted by financial experts: i) Multi-faceted Default Risk Visualization, whereby a hybrid representation is devised to predict the default risk and an interface is developed to visualize key indicators; ii) Risk Guarantee Patterns Discovery, whereby, following the Shneiderman mantra for designing interactive visualization applications, an interactive risk guarantee community detection approach and a motif-detection-based risk guarantee pattern discovery approach are described; iii) Network Evolution and Retrospective, whereby animation is used to help users understand the guarantee dynamics; iv) Risk Communication Analysis, whereby temporal diffusion path analysis helps the government and banks monitor the spread of the default status. The tool also provides insight for taking precautionary measures to prevent and dissolve systematic financial risk. We implement the system with case studies using real-world bank loan data. Two financial experts are consulted to endorse the developed tool. To the best of our knowledge, this is the first visual analytics tool developed to explore networked-guarantee loan risks in a systematic manner.
Assessing and predicting the default risk of networked-guarantee loans is critical for commercial banks and financial regulatory authorities. The guarantee relationships between loan companies are usually modeled as directed networks. Learning an informative low-dimensional representation of these networks is important for predicting the default risk of loan companies, and even for assessing the level of systematic financial risk. In this paper, we propose a high-order graph attention representation method (HGAR) to learn the embedding of guarantee networks. Because this financial network differs from other complex networks, such as social, language, or citation networks, we set binary roles for the vertices and define high-order adjacency measures based on financial domain characteristics. We design objective functions in addition to a graph attention layer to capture the importance of nodes. We implement a productive learning strategy and prove that its complexity is near-linear in the number of edges, so that it can scale to large datasets. Extensive experiments demonstrate the superiority of our model over state-of-the-art methods. We also evaluate the model in a real-world loan risk control system, and the results validate the effectiveness of our proposed approaches.