Speech technology plays an important role in everyday life. Among other uses, speech serves human-computer interaction, for instance in information retrieval and online shopping. For an unwritten language, however, speech technology is difficult to create, because it cannot be built from the standard combination of pre-trained speech-to-text and text-to-speech subsystems. The research presented in this paper takes the first steps towards speech technology for unwritten languages. Specifically, the aim of this work was 1) to learn speech-to-meaning representations without using text as an intermediate representation, and 2) to test whether the learned representations are sufficient to regenerate speech or translated text, or to retrieve images that depict the meaning of an utterance in an unwritten language. The results suggest that it is possible to build systems that go directly from speech to meaning and from meaning to speech, bypassing the need for text.
Recent advances in activity-based travel modeling and integrated land use and transportation modeling have significantly advanced the understanding of and the capacity to model location choices and travel behavior more realistically. These advances, however, come with greater data requirements, and the risk and the substantial cost involved with adoption of these models have slowed their move to operational use. The purpose of this research was twofold. First, the study addressed one aspect of an incremental approach that more carefully balanced the risks and benefits of moving operational models in new directions: replacement of the choice model of home-based work destination in the four-step travel model system with a pair of choice models at the level of the individual worker. The new choice models were implemented as long-term choices in the linked land use model system. Second, the models were used to provide a way to derive matches between workers and their workplace with commonly available data. These matches complemented synthetic populations and provided a key input for activity-based travel models. The models predicted whether a worker would choose to work at home on a long-term basis; if he or she did not, an out-of-home job was chosen. These models linked an individual worker to a specific job at a workplace and therefore directly predicted commuting patterns. The paper presents the model specifications, estimation results, and results of validation of the models against observed commuting data from the Census Transportation Planning Package. The model reproduced observed commuting flows well, and computational performance was fast, even though the model operated at the level of the individual worker and job.
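The two-stage decision described above (work at home or not, then selection of an out-of-home job) can be illustrated with a minimal sketch. It assumes both stages are logit models, which the abstract does not state; the function names and the utility values passed in are hypothetical and do not come from the paper's estimated specifications.

```python
import math
import random


def logit_probabilities(utilities):
    """Multinomial logit: P(i) = exp(V_i) / sum_j exp(V_j)."""
    # Subtract the maximum utility for numerical stability.
    top = max(utilities)
    exps = [math.exp(v - top) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def assign_workplace(home_utility, job_utilities, rng):
    """Return "home" or the index of the chosen out-of-home job.

    Assumption: stage 1 is a binary logit in which the utility of
    working out of home is normalized to 0; stage 2 is a multinomial
    logit over the candidate jobs.
    """
    p_home = logit_probabilities([home_utility, 0.0])[0]
    if rng.random() < p_home:
        return "home"
    probs = logit_probabilities(job_utilities)
    return rng.choices(range(len(job_utilities)), weights=probs)[0]


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical utilities for one worker and three candidate jobs.
    print(assign_workplace(-2.0, [0.5, 0.1, -0.3], rng))
```

Repeating the draw for every worker in a synthetic population would give one worker-to-job match per person, which is the kind of input the abstract says activity-based travel models need.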
Clustering is an important data processing tool for interpreting microarray data and for genomic network inference. In this article, we propose a clustering algorithm based on the hierarchical Dirichlet process (HDP). HDP clustering introduces a hierarchical structure into the statistical model, which captures the hierarchical features prevalent in biological data such as gene expression data. We develop a Gibbs sampling algorithm based on the Chinese restaurant metaphor for HDP clustering. We apply the proposed HDP algorithm to both regulatory network segmentation and gene expression clustering. The HDP algorithm is shown to outperform several popular clustering algorithms by revealing the underlying hierarchical structure of the data. For the yeast cell cycle data, we compare the HDP result to the standard result and show that the HDP algorithm provides more information and reduces unnecessary cluster fragmentation.
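The Chinese restaurant metaphor mentioned above can be sketched in a few lines. The example below draws cluster ("table") assignments from a single-level Chinese restaurant process prior only; it is not the authors' Gibbs sampler and omits both the likelihood term and the shared top-level "franchise" that a full HDP adds. The function name and the concentration parameter `alpha` are illustrative choices.

```python
import random
from collections import Counter


def crp_assignments(n_customers, alpha, seed=0):
    """Draw table assignments from a Chinese restaurant process prior.

    Each new customer joins existing table k with probability
    proportional to its occupancy, or opens a new table with
    probability proportional to alpha (alpha must be > 0).
    """
    rng = random.Random(seed)
    counts = []       # counts[k] = number of customers at table k
    assignments = []  # table index chosen by each customer
    for _ in range(n_customers):
        weights = counts + [alpha]  # last entry is the new-table option
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new table
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments


if __name__ == "__main__":
    tables = crp_assignments(n_customers=20, alpha=1.0)
    print(Counter(tables))
```

Larger values of `alpha` tend to produce more tables, so the number of clusters does not have to be fixed in advance.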
Due to its ecological and biotechnological relevance, polyphosphate in microalgae is currently the focus of intense research. Numerous biological functions are performed by or dependent on polyphosphate, and they depend, among other factors, on its chain length. Chain length determination is important for understanding polyphosphate metabolism and for maximizing intracellular polyphosphate abundance per unit weight of biomass. 31P-DOSY NMR virtually separates various polyphosphate polymers in a mixture based on different translational diffusion coefficients. The diffusion coefficient of a polyphosphate molecule correlates with its molecular weight, enabling determination of individual chain lengths. Moreover, the polydispersity index can also be uniquely determined by DOSY as a measure of the overall chain-length distribution of polyphosphates. By contrast, conventional 31P NMR is only able to estimate the average chain length of the entire polyphosphate pool. Therefore, DOSY provides the opportunity to deepen our insight into polyphosphate metabolism and dynamics in algal biomass.
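To make the quantities above concrete, the following sketch computes number-average and weight-average chain lengths and their ratio, the polydispersity index, from a discrete chain-length distribution. It treats molecular weight as proportional to chain length (end groups ignored), which is a simplifying assumption. The second function inverts a power law D = k * M**(-a), a commonly used DOSY calibration form; the abstract states only that D correlates with molecular weight, so the power law and its constants `k` and `a` are assumptions, not values from the paper.

```python
def chain_length_averages(lengths, counts):
    """Return (number average, weight average, polydispersity index).

    lengths[i] is a chain length in phosphate units and counts[i] is
    the number (or mole fraction) of chains with that length.
    """
    total_chains = sum(counts)
    total_units = sum(n * c for n, c in zip(lengths, counts))
    number_avg = total_units / total_chains
    weight_avg = sum(n * n * c for n, c in zip(lengths, counts)) / total_units
    return number_avg, weight_avg, weight_avg / number_avg


def mass_from_diffusion(d, k, a):
    """Invert the assumed calibration D = k * M**(-a) to obtain M."""
    return (d / k) ** (-1.0 / a)


if __name__ == "__main__":
    # Hypothetical mixture: equal numbers of 10-mers and 20-mers.
    print(chain_length_averages([10, 20], [1, 1]))
```

A polydispersity index of exactly 1 corresponds to a pool in which every chain has the same length; broader distributions give larger values.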