A pseudoknot forms in an RNA when nucleotides in a loop pair with a region outside the helices that close the loop. Pseudoknots occur relatively rarely in RNA but are highly overrepresented in functionally critical motifs in large catalytic RNAs, in riboswitches, and in regulatory elements of viruses. Pseudoknots are usually excluded from RNA structure prediction algorithms. When included, these pairings are difficult to model accurately, especially in large RNAs, because allowing this structure dramatically increases the number of possible incorrect folds and because it is difficult to search the fold space for an optimal structure. We have developed a concise secondary structure modeling approach that combines SHAPE (selective 2′-hydroxyl acylation analyzed by primer extension) experimental chemical probing information and a simple, but robust, energy model for the entropic cost of single pseudoknot formation. Structures are predicted with iterative refinement, using a dynamic programming algorithm. This melded experimental and thermodynamic energy function predicted the secondary structures and the pseudoknots for a set of 21 challenging RNAs of known structure ranging in size from 34 to 530 nt. On average, 93% of known base pairs were predicted, and all pseudoknots in well-folded RNAs were identified. Information is encoded in an RNA molecule at two levels: in its primary sequence and in its ability to form higher-order secondary and tertiary structures. Nearly all RNAs can fold to form some secondary structure and, in many RNAs, highly structured regions encode important regulatory motifs. Such structured regulatory elements can be composed of canonical base pairs but may also feature specialized and distinctive RNA structures. Among the best characterized of these specialized structures are RNA pseudoknots. Pseudoknots are relatively rare but occur overwhelmingly in functionally important regions of RNA (2-4).
For example, all of the large catalytic RNAs contain pseudoknots (5, 6); roughly two-thirds of the known classes of riboswitches contain pseudoknots that appear to be essential for ligand binding and gene regulatory functions (7); and pseudoknots occur prominently in the regulatory elements that viruses use to usurp cellular metabolism (3). Pseudoknots are thus harbingers of biological function. An important and challenging goal is to identify these structures reliably. Pseudoknots are excluded from the most widely used algorithms that model RNA secondary structure (8). This exclusion reflects both the challenge of incorporating the pseudoknot structure into the efficient dynamic programming algorithm used in the most popular secondary structure prediction approaches and the additional computational effort required. The prediction of lowest free energy structures with pseudoknots is NP-complete (9), which means that no known algorithm finds the lowest free energy structure in time that is polynomial in sequence length. In addition, allowing pseudoknots greatly increases the number of (incorrect) hel...
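The definition of a pseudoknot given in the abstract above can be made concrete with a small sketch. Under the usual convention that a secondary structure is a list of base pairs (i, j) with i < j, a structure is pseudoknotted when two pairs cross, that is, i < k < j < l. The function names and the toy structures below are hypothetical illustrations and are not taken from the paper; the paper's SHAPE-directed prediction method is not reproduced here.

```python
def find_crossing_pairs(pairs):
    """Return every pair of base pairs ((i, j), (k, l)) with i < k < j < l.

    `pairs` is a list of (position, position) tuples; this input format is
    an assumption made for the illustration.
    """
    # Orient each pair as (5' index, 3' index) and sort by the 5' index.
    normalized = sorted((min(a, b), max(a, b)) for a, b in pairs)
    crossings = []
    for idx, (i, j) in enumerate(normalized):
        for k, l in normalized[idx + 1:]:
            # Crossing: the second pair opens inside the first and closes outside it.
            if i < k < j < l:
                crossings.append(((i, j), (k, l)))
    return crossings


def has_pseudoknot(pairs):
    """True if at least two base pairs cross."""
    return bool(find_crossing_pairs(pairs))


# Hypothetical toy structures (not from the 21-RNA test set).
nested = [(1, 10), (2, 9), (3, 8)]               # a plain hairpin stem
knotted = [(1, 10), (2, 9), (5, 15), (6, 14)]    # loop nucleotides 5-6 pair downstream

print(has_pseudoknot(nested))
print(has_pseudoknot(knotted))
```

The quadratic scan is adequate for checking a single known structure. It says nothing about the much harder problem described above, which is searching over all possible structures once crossing pairs are allowed.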
The potential for genome-wide association studies to relate phenotypes to specific genetic variation is greatly increased when data can be combined or compared across multiple studies. To facilitate replication and validation across studies, RTI International (Research Triangle Park, North Carolina) and the National Human Genome Research Institute (Bethesda, Maryland) are collaborating on the consensus measures for Phenotypes and eXposures (PhenX) project. The goal of PhenX is to identify 15 high-priority, well-established, and broadly applicable measures for each of 21 research domains. PhenX measures are selected by working groups of domain experts using a consensus process that includes input from the scientific community. The selected measures are then made freely available to the scientific community via the PhenX Toolkit. Thus, the PhenX Toolkit provides the research community with a core set of high-quality, well-established, low-burden measures intended for use in large-scale genomic studies. PhenX measures will have the most impact when included at the experimental design stage. The PhenX Toolkit also includes links to standards and resources in an effort to facilitate data harmonization to legacy data. Broad acceptance and use of PhenX measures will promote cross-study comparisons to increase statistical power for identifying and replicating variants associated with complex diseases and with gene-gene and gene-environment interactions.
The need for comprehensive analysis to compare and combine data across multiple studies in order to validate and extend results is widely recognized. This paper aims to assess the extent of data compatibility in the substance abuse and addiction (SAA) sciences through an examination of measure commonality, defined as the use of similar measures, across grants funded by the National Institute on Drug Abuse (NIDA) and the National Institute on Alcohol Abuse and Alcoholism (NIAAA). Data were extracted from applications of funded, active grants involving human-subjects research in four scientific areas (epidemiology, prevention, services, and treatment) and six frequently assessed scientific domains. A total of 548 distinct measures were cited across 141 randomly sampled applications. Commonality, as assessed by density (range of 0–1) of shared measurement, was examined. Results showed that commonality was low and varied by domain/area. Commonality was most prominent for (1) diagnostic interviews (structured and semi-structured) for substance use disorders and psychopathology (density of 0.88), followed by (2) scales to assess dimensions of substance use problems and disorders (0.70), (3) scales to assess dimensions of affect and psychopathology (0.69), (4) measures of substance use quantity and frequency (0.62), (5) measures of personality traits (0.40), and (6) assessments of cognitive/neurologic ability (0.22). The areas of prevention (density of 0.41) and treatment (0.42) had greater commonality than epidemiology (0.36) and services (0.32). To address the lack of measure commonality, NIDA and its scientific partners recommend and provide common measures for SAA researchers within the PhenX Toolkit.
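The abstract above reports commonality as a density between 0 and 1 but does not say how the density was computed. As a rough sketch under a loudly stated assumption, one common network definition treats each grant as a node, links two grants when they cite at least one measure in common, and divides the number of links by the number of possible grant pairs. The function name, grant labels, and measure labels below are hypothetical, and the authors' actual calculation may differ.

```python
from itertools import combinations


def measure_sharing_density(grant_measures):
    """Fraction of grant pairs that share at least one measure.

    ASSUMPTION: this is ordinary undirected network density, not
    necessarily the definition used in the study.
    `grant_measures` maps a grant label to the set of measures it cites.
    """
    grants = list(grant_measures)
    n = len(grants)
    if n < 2:
        return 0.0  # density is undefined for fewer than two grants; report 0
    linked = sum(
        1
        for a, b in combinations(grants, 2)
        if grant_measures[a] & grant_measures[b]  # non-empty intersection
    )
    possible = n * (n - 1) / 2
    return linked / possible


# Hypothetical toy data: four grants citing placeholder measures.
toy = {
    "grant_1": {"m1", "m2"},
    "grant_2": {"m2"},
    "grant_3": {"m3"},
    "grant_4": {"m3", "m1"},
}
density = measure_sharing_density(toy)
print(density)
```

Under this definition a density of 1 would mean every pair of grants shares a measure, and a density of 0 would mean no two grants do.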
Background: The purpose of this manuscript is to describe the PhenX RISING network and the site experiences in the implementation of PhenX measures into ongoing population-based genomic studies. Methods: Eighty PhenX measures were implemented across the seven PhenX RISING groups, thirty-three of which were used at more than two sites, allowing for cross-site collaboration. Each site used between four and 37 individual measures, and five of the sites are validating the PhenX measures through comparison with other study measures. Self-administered and computer-based administration modes are being evaluated at several sites, which required changes to the original PhenX Toolkit protocols. A network-wide data use agreement was developed to facilitate data sharing and collaboration. Results: PhenX Toolkit measures have been collected for more than 17,000 participants across the PhenX RISING network. The process of implementation provided information that was used to improve the PhenX Toolkit. The Toolkit was revised to allow researchers to select self- or interviewer administration when creating the data collection worksheets, and ranges of specimens necessary to run biological assays have been added to the Toolkit. Conclusions: The PhenX RISING network has demonstrated that the PhenX Toolkit measures can be implemented successfully in ongoing genomic studies. The next step will be to conduct gene/environment studies.
The PhenX (consensus measures for Phenotypes and eXposures) Toolkit offers well-established, broadly validated measures of phenotypes and exposures relevant to investigators in human genomics, epidemiology, and biomedical research. This methods report describes the infrastructure and processes used to develop the content and features of the Toolkit. The PhenX consensus process is robust, yet flexible, as evidenced by its application to a range of research domains. During the initial phase of PhenX, from March 2008 through April 2010, working groups of content experts addressed 21 research domains and selected 295 measures for the Toolkit. The PhenX Steering Committee prioritized and defined the scope of the domains and guided the consensus process with input from liaisons representing the National Institutes of Health. After the 21 domains were completed, another project to add breadth and depth to the Toolkit for substance abuse and addiction (SAA) research served to validate the consensus process. With the support of the SAA Scientific Panel to define the scope for one core and six specialty collections and SAA working groups to select measures, the PhenX project team added 44 measures to the Toolkit in 2012. Now being used by more than 1,000 researchers, the PhenX Toolkit offers a catalog of measures, supporting documentation, and tools for collaborative research. It used a consensus process that can serve as a template for investigators who are considering a similar approach.
Background: Single birth cohort studies have been the basis for many discoveries about early life risk factors for childhood asthma but are limited in scope by sample size and characteristics of the local environment and population. The Children’s Respiratory and Environmental Workgroup (CREW) was established to integrate multiple established asthma birth cohorts and to investigate asthma phenotypes and associated causal pathways (endotypes), focusing on how they are influenced by interactions between genetics, lifestyle, and environmental exposures during the prenatal period and early childhood. Methods and results: CREW is funded by the NIH Environmental influences on Child Health Outcomes (ECHO) program, and consists of 12 individual cohorts and three additional scientific centers. The CREW study population is diverse in terms of race, ethnicity, geographical distribution, and year of recruitment. We hypothesize that there are phenotypes in childhood asthma that differ based on clinical characteristics and underlying molecular mechanisms. Furthermore, we propose that asthma endotypes and their defining biomarkers can be identified based on personal and early life environmental risk factors. CREW has three phases: 1) to pool and harmonize existing data from each cohort, 2) to collect new data using standardized procedures, and 3) to enroll new families during the prenatal period to supplement and enrich extant data and enable unified systems approaches for identifying asthma phenotypes and endotypes. Conclusions: The overall goal of the CREW program is to develop a better understanding of how early life environmental exposures and host factors interact to promote the development of specific asthma endotypes.
The PhenX Toolkit provides researchers with recommended, well-established, low-burden measures suitable for human-subjects research. The database of Genotypes and Phenotypes (dbGaP) is the data repository for a variety of studies funded by the National Institutes of Health (NIH), including genome-wide association studies (GWAS). The dbGaP requires that investigators provide a data dictionary of study variables as part of the data submission process. Thus, dbGaP is a unique resource that can help investigators identify studies that share the same or similar variables. As a proof of concept, variables from 16 studies deposited in dbGaP were mapped to PhenX measures. Soon, investigators will be able to search dbGaP using PhenX variable identifiers and find comparable and related variables in these 16 studies. To enhance effective data exchange, PhenX measures, protocols, and variables were modeled in Logical Observation Identifiers Names and Codes (LOINC). PhenX domains and measures are also represented in the Cancer Data Standards Registry and Repository (caDSR). Associating PhenX measures with existing standards (LOINC and caDSR) and mapping to dbGaP study variables extends the utility of these measures by revealing new opportunities for cross-study analysis.