Sequence-discriminative training of deep neural networks (DNNs) is investigated on a 300-hour American English conversational telephone speech task. Different sequence-discriminative criteria are compared: maximum mutual information (MMI), minimum phone error (MPE), state-level minimum Bayes risk (sMBR), and boosted MMI (bMMI). Two heuristics are investigated to improve the performance of DNNs trained with sequence-based criteria: lattices are regenerated after the first iteration of training; and, for MMI and bMMI, frames where the numerator and denominator hypotheses are disjoint are removed from the gradient computation. Starting from a competitive DNN baseline trained using cross-entropy, the different sequence-discriminative criteria are shown to lower word error rates by 8-9% relative, on average, with little difference observed between the criteria investigated. The experiments are done using the open-source Kaldi toolkit, which makes it possible for the wider community to reproduce these results.
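The frame-rejection heuristic for MMI and bMMI can be sketched as follows. This is a minimal illustration, not the Kaldi implementation: it assumes per-frame state occupancies (posteriors) have already been computed from the numerator and denominator lattices, and the function names are hypothetical.

```python
import numpy as np

def mmi_grad_with_frame_rejection(num_post, den_post):
    """num_post, den_post: (T, S) arrays of per-frame state occupancies
    from the numerator and denominator lattices. Returns the per-frame
    MMI error signal, with 'disjoint' frames zeroed out."""
    grad = num_post - den_post                  # standard MMI error signal
    # A frame is disjoint when no state has nonzero occupancy in both
    # the numerator and denominator posteriors on that frame.
    overlap = (num_post * den_post).sum(axis=1)  # (T,)
    grad[overlap == 0.0] = 0.0                  # reject disjoint frames
    return grad
```

Zeroing these frames keeps the gradient from being dominated by frames where the reference path has no support in the denominator lattice.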
Ghrelin, an endogenous ligand of the growth hormone secretagogue receptor (GHS-R), is the only circulating agent known to powerfully promote a positive energy balance. Such action is mediated predominantly by central nervous system pathways controlling food intake, energy expenditure, and nutrient partitioning. The ghrelin pathway may therefore offer therapeutic potential for the treatment of catabolic states. However, the potency of the endogenous hormone ghrelin is limited by its short half-life and the fragility of its bioactivity-ensuring acylation at serine 3. Therefore, we tested the metabolic effects of two recently generated GHS-R agonists, BIM-28125 and BIM-28131, compared with ghrelin. All agents were administered continuously for 1 mo in doses of 50 and 500 nmol x kg(-1) x day(-1) using implanted subcutaneous minipumps in rats. High-dose treatment with either agonist or ghrelin increased body weight gain by promoting fat mass, whereas BIM-28131 was the only compound that also significantly increased lean mass. Food intake increased during treatment with BIM-28131 or ghrelin, whereas no effects on energy expenditure were detected. At the lower dose, only BIM-28131 had a significant effect on body weight; this also held true when the compound was administered by subcutaneous injection three times/day. No symptoms or signs of undesired effects were observed in any of the studies or treated groups. These results characterize BIM-28131 as a promising GHS-R agonist with an attractive action profile for the treatment of catabolic disease states such as cachexia.
Severe caloric restriction, octreotide, and pituitary hormone replacement did not produce weight loss. Gastric bypass surgery led to reduced food cravings, significant weight loss, and amelioration of obesity-related comorbidities. Correction of fasting hyperinsulinemia, normalization of postprandial insulin responses, and reductions in active ghrelin and leptin concentrations were also observed.
Embeddings in machine learning are low-dimensional representations of complex input patterns, with the property that simple geometric operations like Euclidean distances and dot products can be used for classification and comparison tasks. We introduce meta-embeddings, which live in more general inner product spaces and which are designed to better propagate uncertainty through the embedding bottleneck. Traditional embeddings are trained to maximize between-class and minimize within-class distances. Meta-embeddings are trained to maximize relevant information throughput. As a proof of concept in speaker recognition, we derive an extractor from the familiar generative Gaussian PLDA model (GPLDA). We show that GPLDA likelihood ratio scores are given by Hilbert space inner products between Gaussian likelihood functions, which we term Gaussian meta-embeddings (GMEs). Meta-embedding extractors can be generatively or discriminatively trained. GMEs extracted by GPLDA have fixed precisions and do not propagate uncertainty. We show that a generalization to heavy-tailed PLDA gives GMEs with variable precisions, which do propagate uncertainty. Experiments on NIST SRE 2010 and 2016 show that the proposed method applied to i-vectors without length normalization is up to 20% more accurate than GPLDA applied to length-normalized i-vectors.
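The Gaussian meta-embedding scoring described above has a simple closed form. The sketch below is an illustrative reconstruction under stated assumptions (not the authors' code): each GME is represented by natural parameters (a, B) of a Gaussian likelihood function f(z) = exp(a'z - z'Bz/2), the prior on the speaker variable z is standard normal, and the same-versus-different-speaker log-likelihood ratio is obtained by pooling (i.e. multiplying likelihoods, which adds natural parameters) and normalizing by the individual expectations.

```python
import numpy as np

def log_expectation(a, B):
    """log E_{z ~ N(0, I)}[exp(a'z - z'Bz/2)], computed in closed form
    by completing the square: 0.5*a'(I+B)^{-1}a - 0.5*logdet(I+B)."""
    M = np.eye(len(a)) + B
    _, logdet = np.linalg.slogdet(M)
    return 0.5 * a @ np.linalg.solve(M, a) - 0.5 * logdet

def gme_llr(a1, B1, a2, B2):
    """LLR that two GMEs share one speaker: pooling two GMEs multiplies
    the likelihood functions, i.e. adds their natural parameters."""
    return (log_expectation(a1 + a2, B1 + B2)
            - log_expectation(a1, B1)
            - log_expectation(a2, B2))
```

In the fixed-precision (GPLDA) case B is the same for every embedding; the heavy-tailed generalization mentioned in the abstract makes B depend on the input, which is how uncertainty is propagated.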