Mixup is a recently proposed technique that creates virtual training examples by combining existing ones. It has been successfully used in various machine learning tasks. This paper applies mixup to automatic speech recognition (ASR). More specifically, several strategies for acoustic model training are investigated, covering both conventional cross-entropy and novel lattice-free MMI models. Treating mixup as a method of both data augmentation and regularization, we compare it with the widely used speed perturbation and dropout techniques. Experiments on the Switchboard-1, AMI and TED-LIUM datasets show consistent word error rate improvements of up to 13% relative. Moreover, mixup is found to be particularly effective on test data mismatched to the training data.
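The idea of combining existing examples into virtual ones can be illustrated with a minimal sketch of the generic mixup operation: each example and its target are blended with a randomly chosen partner from the same batch, using a weight drawn from a Beta distribution. This is not the training recipe of the paper; the function name `mixup_batch`, the value `alpha=0.2`, and the use of soft (one-hot or posterior) targets are assumptions made for illustration only.

```python
import numpy as np


def mixup_batch(features, targets, alpha=0.2, rng=None):
    """Blend a batch with a shuffled copy of itself (generic mixup sketch).

    features: array of shape (batch, feature_dim), e.g. acoustic frames.
    targets:  array of shape (batch, num_classes), soft or one-hot labels.
    alpha:    Beta distribution parameter (hypothetical default).
    """
    rng = np.random.default_rng() if rng is None else rng
    # One mixing weight per batch, drawn from Beta(alpha, alpha).
    lam = rng.beta(alpha, alpha)
    # Pair every example with a randomly chosen partner from the batch.
    perm = rng.permutation(len(features))
    mixed_x = lam * features + (1.0 - lam) * features[perm]
    mixed_y = lam * targets + (1.0 - lam) * targets[perm]
    return mixed_x, mixed_y, lam
```

Because inputs and targets are mixed with the same weight, each mixed target row remains a valid probability distribution whenever the original rows are.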
This paper describes the Speech Technology Center (STC) system for the 5th CHiME challenge. This challenge considers the problem of distant multi-microphone conversational speech recognition in everyday home environments. Our efforts were focused on the single-array track; however, we participated in the multiple-array track as well. The system falls under ranking A of the challenge: the acoustic models retain frame-level tied phonetic targets, and the lexicon and language model are unchanged from the conventional ASR baseline. Our system employs a combination of 4 acoustic models based on convolutional and recurrent neural networks. Speaker adaptation with target speaker masks and a multi-channel speaker-aware acoustic model with neural network beamforming are the two major features of the system. Moreover, various techniques for improving acoustic models are applied, including array synchronization, data cleanup, alignment transfer, mixup, speed perturbation data augmentation, room simulation, and backstitch training. Our system ranked 3rd in the single-array track with a word error rate (WER) of 55.5% and 4th in the multiple-array track with a WER of 55.6% on the evaluation data, achieving a substantial improvement over the baseline system.
Electrical discharge machining (EDM) bearing currents that may occur within electric machines of variable-speed-drive motor systems have been recognized for a long time. One key influential factor, the machine's capacitive voltage divider "bearing voltage ratio" (BVR), strongly depends on the rotor-to-frame and the stator-winding-to-rotor capacitances; these are, in turn, affected by the design of the machine's stator slot. This paper presents an approach to improve the accuracy with which these capacitances can be estimated. It is based on the well-known plate capacitance equation, which is then corrected by normalization functions. The functions are defined by extensive parameter studies using electrostatic FEM simulations. The final expressions not only allow for the prediction of the stator-winding-to-rotor and rotor-to-frame capacitances but are also readily applicable. Thereby, they facilitate, for example, the clear-cut study of the sensitivity of the BVR towards changes in the different stator slot parameters.
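To make the relationship between the plate capacitance equation and the BVR concrete, the following sketch computes a parallel-plate capacitance scaled by a correction factor and feeds it into a simple capacitive voltage divider. It is only an illustration under stated assumptions: the abstract does not give the normalization functions, so `correction` is a hypothetical placeholder for them, and the divider expression BVR = C_wr / (C_wr + C_rf + 2 C_b) is a commonly used simplified model that is assumed here rather than taken from the paper.

```python
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m


def plate_capacitance(area_m2, gap_m, eps_r=1.0, correction=1.0):
    """Parallel-plate capacitance C = eps_0 * eps_r * A / d, in farads.

    correction is a hypothetical stand-in for a normalization function
    (for example one fitted to FEM results); 1.0 gives the ideal formula.
    """
    return correction * EPSILON_0 * eps_r * area_m2 / gap_m


def bearing_voltage_ratio(c_wr, c_rf, c_b=0.0):
    """Capacitive divider ratio (assumed simplified model).

    c_wr: stator-winding-to-rotor capacitance
    c_rf: rotor-to-frame capacitance
    c_b:  capacitance of one bearing (two bearings assumed in parallel)
    """
    return c_wr / (c_wr + c_rf + 2.0 * c_b)
```

With capacitances obtained this way, the sensitivity of the BVR to a slot parameter could be explored by varying the corresponding area or gap and recomputing the ratio.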