Recently, end-to-end sequence-to-sequence models for speech recognition have gained significant interest in the research community. While previous architecture choices revolved around time-delay neural networks (TDNN) and long short-term memory (LSTM) recurrent neural networks, we propose to use self-attention via the Transformer architecture as an alternative. Our analysis shows that deep Transformer networks with high learning capacity are able to exceed the performance of previous end-to-end approaches and even match conventional hybrid systems. Moreover, we trained very deep models with up to 48 Transformer layers for the encoder and decoder combined, using stochastic residual connections, which greatly improve generalizability and training efficiency. The resulting models outperform all previous end-to-end ASR approaches on the Switchboard benchmark. An ensemble of these models achieves 9.9% and 17.7% WER on the Switchboard and CallHome test sets, respectively. This finding brings our end-to-end models to competitive levels with previous hybrid systems. Further, with model ensembling the Transformers can outperform certain hybrid systems, which are more complicated in terms of both structure and training procedure.
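The abstract does not spell out how the stochastic residual connections work; a minimal sketch, assuming a stochastic-depth-style rule (drop each residual branch with some probability during training, rescale it at inference), could look like this. The function and variable names are illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_residual(x, sublayer, drop_prob=0.2, training=True):
    """Residual connection that randomly skips its sublayer.

    During training the sublayer output is dropped with probability
    `drop_prob`, so the layer reduces to the identity; at inference the
    output is kept but scaled by the survival probability, as in
    stochastic depth.
    """
    if training:
        if rng.random() < drop_prob:
            return x                          # skip the sublayer entirely
        return x + sublayer(x)
    # inference: expected-value rescaling of the residual branch
    return x + (1.0 - drop_prob) * sublayer(x)

# toy sublayer: a fixed linear map standing in for self-attention / FFN
W = rng.standard_normal((8, 8)) * 0.1
sublayer = lambda h: h @ W

x = rng.standard_normal((4, 8))
y = stochastic_residual(x, sublayer, drop_prob=0.2, training=False)
```

With many layers stacked, dropping branches at random effectively trains an ensemble of shallower networks, which is one intuition for why such connections help very deep (48-layer) stacks converge.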
It has been suggested that plant phytochromes are autophosphorylating serine/threonine kinases. However, the biochemical properties and functional roles of putative phytochrome kinase activity in plant light signalling are largely unknown. Here, we describe the biochemical and functional characterization of Avena sativa phytochrome A (AsphyA) as a potential protein kinase. We provide evidence that phytochrome-interacting factors (PIFs) are phosphorylated by phytochromes in vitro. Domain mapping of AsphyA shows that the photosensory core region, consisting of the PAS-GAF-PHY domains in the N-terminal region, is required for the observed kinase activity. Moreover, we demonstrate that transgenic plants expressing mutant versions of AsphyA, which display reduced activity in in vitro kinase assays, show hyposensitive responses to far-red light. Further analysis reveals that far-red light-induced phosphorylation and degradation of PIF3 are significantly reduced in these transgenic plants. Collectively, these results suggest a positive relationship between phytochrome kinase activity and photoresponses in plants.
Data hiding is a well-known technique that embeds secret data into a digital medium. Most existing schemes either suffer from low image quality or offer only restricted embedding capacity. In this paper, a new data hiding scheme based on a turtle shell is proposed to obtain better image quality and higher embedding capacity. In the proposed scheme, a secret digit is embedded into each cover pixel pair under the guidance of the turtle shell. Experimental results reveal that the proposed scheme not only ensures higher embedding capacity but also achieves better visual quality than existing schemes.
Implementation of the WHO Trauma Care Checklist was associated with substantial improvements in patient care process measures among a cohort of patients in diverse settings.
Data hiding is a technique for sending secret information under the cover of digital media. It is usually used to protect privacy and sensitive information when such information is transmitted via a public network. To date, high capacity remains one of the most important research aspects of data hiding. In this study, a new turtle shell-based data hiding scheme is proposed to further improve embedding capacity while guaranteeing good image quality. In the proposed scheme, a reference matrix is constructed and a location table is generated. Then, according to the reference matrix and the location table, each pixel pair is processed to conceal four secret bits. The experimental results indicate that the proposed scheme achieves higher embedding capacity and lower image distortion than some existing schemes.
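To make the turtle shell idea concrete, here is a simplified sketch: a reference matrix whose entries cycle through the base-8 digits (values step by 1 along a row and alternately by 2 and 3 between rows), and an embedding step that moves a cover pixel pair to the nearest coordinates holding the secret digit. This is an illustrative reconstruction, not the paper's exact algorithm: the full turtle-shell edge rules and the location table are omitted, and the function names are assumptions:

```python
import numpy as np

def build_reference_matrix(size=256):
    """Reference matrix with values 0..7: +1 (mod 8) along each row,
    and row base values shifting alternately by 2 and 3 (mod 8)."""
    M = np.zeros((size, size), dtype=np.uint8)
    base = 0
    for r in range(size):
        M[r] = (base + np.arange(size)) % 8
        base = (base + (2 if r % 2 == 0 else 3)) % 8
    return M

def embed_digit(p1, p2, digit, M, radius=2):
    """Replace pixel pair (p1, p2) by the nearest coordinates whose
    matrix value equals the base-8 secret digit (brute-force
    neighbourhood search; a simplification of the turtle-shell rules)."""
    best = None
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            x, y = p1 + dx, p2 + dy
            if 0 <= x < M.shape[0] and 0 <= y < M.shape[1] and M[x, y] == digit:
                dist = dx * dx + dy * dy
                if best is None or dist < best[0]:
                    best = (dist, x, y)
    return (best[1], best[2]) if best else (p1, p2)

def extract_digit(p1, p2, M):
    """The receiver recovers the digit by a simple matrix lookup."""
    return int(M[p1, p2])

M = build_reference_matrix()
stego = embed_digit(120, 77, 5, M)       # hide digit 5 in pixel pair (120, 77)
recovered = extract_digit(*stego, M)
```

Because each pixel pair carries one base-8 digit (3 bits) at small per-pixel distortion, such matrix-guided schemes trade a bounded pixel shift for capacity; the scheme in this study extends the idea to 4 bits per pair.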
Sequence-to-sequence (S2S) models have recently started to show state-of-the-art performance for automatic speech recognition (ASR). With these large and deep models, overfitting remains the largest problem, outweighing the performance improvements that can be obtained from better architectures. One solution to the overfitting problem is to increase the amount and variety of the available training data with the help of data augmentation. In this paper we examine the influence of three data augmentation methods on the performance of two S2S model architectures. One of the data augmentation methods comes from the literature, while the other two are our own development: a time perturbation in the frequency domain and sub-sequence sampling. Our experiments on Switchboard and Fisher data show state-of-the-art performance for S2S models that are trained solely on the speech training data and do not use additional text data. The source code is available at https://github.com/thaisonngn/pynn (arXiv:1910.13296v1 [eess.AS]).
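The two in-house augmentations named above can be sketched on a log-mel spectrogram as follows. This is a hedged reconstruction from the method names only: the exact stretch ranges, the frequency-domain implementation of the time perturbation, and the pairing of each crop with its label span are details from the paper that are simplified away here:

```python
import numpy as np

rng = np.random.default_rng(1)

def time_stretch(spec, low=0.8, high=1.2):
    """Perturb the apparent speaking rate by resampling frames with a
    random stretch factor (a time-domain stand-in for the paper's
    time perturbation in the frequency domain)."""
    factor = rng.uniform(low, high)
    n_frames = spec.shape[0]
    new_len = max(1, int(round(n_frames * factor)))
    # nearest-neighbour frame resampling along the time axis
    idx = np.minimum((np.arange(new_len) / factor).astype(int), n_frames - 1)
    return spec[idx]

def sub_sequence(spec, min_frac=0.5):
    """Sample a random contiguous sub-sequence of the utterance
    (sub-sequence sampling; the matching label span is omitted here)."""
    n = spec.shape[0]
    length = int(rng.integers(int(n * min_frac), n + 1))
    start = int(rng.integers(0, n - length + 1))
    return spec[start:start + length]

spec = rng.standard_normal((100, 40))    # 100 frames, 40 mel bins
stretched = time_stretch(spec)
cropped = sub_sequence(spec)
```

Both transforms change only the time axis, so the same label sequence (or a sub-span of it, for cropping) remains valid, which is what makes them cheap ways to multiply the effective training data.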