We describe a new approach to speech recognition, in which all Hidden Markov Model (HMM) states share the same Gaussian Mixture Model (GMM) structure, with the same number of Gaussians in each state. The model is defined by a vector of moderate dimension (say, 50) associated with each state, together with a global mapping from this vector space to the space of GMM parameters. This model gives better results than a conventional model, and the extra structure offers many new opportunities for modeling innovations while maintaining compatibility with most standard techniques.
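As a rough illustration of this parameterization, here is a minimal NumPy sketch. It is not the authors' code: all dimensions are invented, each state is reduced to a single vector, and the covariances are left as identity matrices. It shows the key idea that a state's full GMM is expanded from a low-dimensional state vector through globally shared projections.

```python
import numpy as np

# Illustrative dimensions (not from the paper's experiments).
S = 50    # state-vector dimension ("say, 50")
D = 39    # acoustic feature dimension
I = 400   # number of Gaussians shared by every state
J = 2000  # number of HMM states

rng = np.random.default_rng(0)

# Globally shared parameters: the mapping from state vectors to GMM parameters.
M = rng.standard_normal((I, D, S))   # mean-projection matrices, one per Gaussian
w = rng.standard_normal((I, S))      # weight-projection vectors
Sigma = np.stack([np.eye(D)] * I)    # per-Gaussian covariances, shared by all states

# Per-state parameters: just one low-dimensional vector each.
v = rng.standard_normal((J, S))

def state_gmm(j):
    """Expand state j's vector into its full GMM parameters."""
    means = M @ v[j]                        # mu_ji = M_i v_j, shape (I, D)
    logits = w @ v[j]                       # w_i . v_j, shape (I,)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                # softmax mixture weights
    return weights, means, Sigma
```

Note how cheap each state is: J vectors of length S, while the bulk of the parameters (M, w, Sigma) is shared, which is what opens the door to the cross-system sharing described next.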
Although multilingual speech recognition has been studied before, it has proved very difficult to improve over separately trained systems. The usual approach has been to use some kind of "universal phone set" covering multiple languages. We report experiments on a different approach to multilingual speech recognition, in which the phone sets are entirely distinct but the model shares across languages those parameters that are not tied to specific states. We use a model called a "Subspace Gaussian Mixture Model," in which each state's distribution is a Gaussian Mixture Model with a common structure, constrained to lie in a subspace of the total parameter space. The parameters that define this subspace can be shared across languages. We obtain substantial word error rate (WER) improvements with this approach, especially with very small amounts of in-language training data.
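To make the parameter split concrete, here is a schematic sketch of which parameters would be shared and which would stay language-specific. Language codes, state counts, and dimensions are all invented for illustration; the shared block plays the role of the M, w, and Sigma parameters in the sketch above.

```python
import numpy as np

rng = np.random.default_rng(0)
S, D, I = 50, 39, 400  # subspace, feature, and Gaussian counts (illustrative)

# Shared across all languages: the subspace mapping from state
# vectors to GMM parameters, estimated from pooled training data.
shared = {
    "M": rng.standard_normal((I, D, S)),   # mean-projection matrices
    "w": rng.standard_normal((I, S)),      # weight-projection vectors
    "Sigma": np.stack([np.eye(D)] * I),    # covariances
}

# Per-language: phone sets are entirely distinct, so each language
# keeps its own set of state vectors (counts here are made up).
per_language = {
    "es": rng.standard_normal((1500, S)),  # Spanish state vectors
    "en": rng.standard_normal((2500, S)),  # English state vectors
}
```

Because only the small per-language vectors must be estimated from in-language data, the approach is most helpful exactly where the abstract reports the largest gains: when in-language training data is scarce.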
Preparing a lexicon for a speech recognition system can take significant effort in languages where the written form is not strictly phonetic. On the other hand, even in languages where the written form is quite phonetic, some common words are often mispronounced. In this paper, we use a combination of lexicon learning techniques to explore whether a lexicon can be learned when only a small lexicon is available for bootstrapping. We find that for a phonetic language such as Spanish, the learned lexicon outperforms one built from generic rules or hand-crafted pronunciations. For a more complex language such as English, lexicon learning remains feasible, though with some loss of accuracy.
Task scheduling is one of the most difficult problems associated with cloud computing: by its nature, it belongs to the class of nondeterministic polynomial time (NP)-hard problems. Various heuristic and meta-heuristic approaches have been used to search for optimal solutions. Task scheduling deals with allocating tasks to the most suitable machines so that computing resources are used efficiently, yielding a better makespan. The literature reports meta-heuristic algorithms such as the genetic algorithm (GA), particle swarm optimization (PSO), ant colony optimization (ACO), and hybrids of these. In this paper, we present a novel meta-heuristic technique, genetic algorithm enabled particle swarm optimization (PSOGA), a hybrid of PSO and GA. PSOGA combines the diversification property of PSO with the intensification property of the GA. The proposed algorithm outperforms the other techniques considered, achieving a lower makespan in the majority of cases, with up to a 22.2% improvement in system performance, and it converges faster than the alternatives.
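A hybrid loop of this general shape can be sketched as below. This is an illustrative mock-up under assumed details, not the authors' PSOGA implementation: particle positions encode task-to-VM assignments, a standard PSO velocity update moves the swarm, and a GA-style step (uniform crossover with the global best plus random mutation) refines the population. All constants, task lengths, and VM speeds are invented.

```python
import numpy as np

rng = np.random.default_rng(1)

N_TASKS, N_VMS = 40, 8
task_len = rng.uniform(100, 1000, N_TASKS)  # task lengths (MI), illustrative
vm_speed = rng.uniform(500, 2000, N_VMS)    # VM speeds (MIPS), illustrative

def makespan(assign):
    """Finish time of the busiest VM under a task-to-VM assignment."""
    load = np.zeros(N_VMS)
    for t, vm in enumerate(assign):
        load[vm] += task_len[t] / vm_speed[vm]
    return load.max()

POP, ITERS, W, C1, C2, PM = 30, 200, 0.7, 1.5, 1.5, 0.05
pos = rng.uniform(0, N_VMS, (POP, N_TASKS))  # continuous positions
vel = np.zeros((POP, N_TASKS))
pbest = pos.copy()
pbest_fit = np.array([makespan(p.astype(int)) for p in pos])
gbest = pbest[pbest_fit.argmin()].copy()

for _ in range(ITERS):
    # PSO step: velocity and position updates move the swarm.
    r1, r2 = rng.random((2, POP, N_TASKS))
    vel = W * vel + C1 * r1 * (pbest - pos) + C2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, 0, N_VMS - 1e-9)

    # GA step: uniform crossover with the global best, then mutation.
    mask = rng.random((POP, N_TASKS)) < 0.5
    pos = np.where(mask, pos, gbest)
    mut = rng.random((POP, N_TASKS)) < PM
    pos[mut] = rng.uniform(0, N_VMS - 1e-9, mut.sum())

    # Evaluate (rounding positions to discrete VM indices) and update bests.
    fit = np.array([makespan(p.astype(int)) for p in pos])
    better = fit < pbest_fit
    pbest[better], pbest_fit[better] = pos[better], fit[better]
    gbest = pbest[pbest_fit.argmin()].copy()

print("best makespan:", pbest_fit.min())
```

The continuous-position-with-rounding encoding is one common way to apply PSO to a discrete assignment problem; other encodings (e.g., permutation-based) are equally plausible readings of the abstract.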