SPROUTS (Structural Prediction for pRotein fOlding UTility System) is a new database that provides access to various structural data sets and integrated functionalities not yet available to the community. The originality of the SPROUTS database lies in giving access to a variety of structural analyses in one place, with strong interaction between them. SPROUTS currently combines data on 429 structures that capture representative folds with three kinds of results: the prediction of critical residues expected to belong to the folding nucleus, the MIR (Most Interacting Residues); the description of the structures in terms of modular fragments, the TEF (Tightened End Fragments); and the calculation, at each position, of the free energy change gradient upon mutation to one of the 19 other amino acids. All database results can be displayed, downloaded as text files and Excel spreadsheets, and visualized on the protein structure. SPROUTS is a unique resource for accessing and visualizing state-of-the-art characteristics of protein folding and for analysing the effect of point mutations on protein structure. It is available at http://bioinformatics.eas.asu.edu/sprouts.html.
Scientific data is now inevitably digital and stored in a wide variety of formats in heterogeneous systems. Scientists need an integrated view of remote or local heterogeneous data sources, together with advanced tools for data access, analysis, and visualization. Building a digital library for scientific data requires accessing and manipulating data extracted from flat files or databases, documents retrieved from the Web, and data generated by software. We present an approach to wrapping web data sources, databases, flat files, or data generated by tools through a database view mechanism. Generally, a wrapper has two tasks: first, it sends a query to the source to retrieve data; second, it builds the expected output with respect to the virtual structure. Our wrappers perform these two tasks with, respectively, a retrieval component based on an intermediate object view mechanism called search views, which maps the source capabilities to attributes, and an eXtensible Markup Language (XML) engine. The originality of the approach consists of: 1) a generic view mechanism to seamlessly access data sources with limited capabilities and 2) the ability to wrap data sources as well as the useful specific tools they may provide. Our approach has been developed and demonstrated as part of the multidatabase system supporting queries via uniform object protocol model (OPM) interfaces.
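The two wrapper tasks described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `SearchView`, `build_xml`, and `wrap`, the in-memory record list standing in for a remote source, and the XML tag names are all assumptions made for illustration. The search view only accepts criteria on attributes the source can actually search, which is one simple way to model a source with limited capabilities.

```python
import xml.etree.ElementTree as ET


class SearchView:
    """Hypothetical search view: exposes only the attributes a source can search on."""

    def __init__(self, records, searchable):
        self.records = records            # stand-in for a remote source's contents
        self.searchable = set(searchable)  # attributes the source is able to filter on

    def retrieve(self, **criteria):
        # Task 1: send the query to the source, rejecting unsupported criteria.
        unsupported = set(criteria) - self.searchable
        if unsupported:
            raise ValueError(f"source cannot search on: {sorted(unsupported)}")
        return [
            r for r in self.records
            if all(r.get(key) == value for key, value in criteria.items())
        ]


def build_xml(rows, root_tag="result", row_tag="entry"):
    # Task 2: build the expected output; tag names here are illustrative only.
    root = ET.Element(root_tag)
    for row in rows:
        entry = ET.SubElement(root, row_tag)
        for key, value in row.items():
            ET.SubElement(entry, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def wrap(view, **criteria):
    """Compose retrieval and XML construction into a single wrapper call."""
    return build_xml(view.retrieve(**criteria))
```

A call such as `wrap(view, organism="human")` returns an XML string for the matching records, while a criterion on an attribute outside `searchable` raises `ValueError` instead of silently returning nothing.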
XML management systems vary widely in their expressive power and query-processing efficiency, and users should choose the XMLMS that best meets their needs. The Extensible Markup Language has become the standard for information interchange on the Web. Developed primarily as a document markup language more powerful than HTML yet less complex than SGML, XML does not require content to adhere to structural rules. XML gives a single, human-readable syntax for representing data, including data in relational format. Hence XML appeals to both the document and the database communities. Early developers of XML content storage tools, who came from the database community, regarded XML as yet another data format for adapting relational and sometimes object-relational data-processing tools. While this use of XML is acceptable, it does not harness XML's full power. XML is inherently semistructured. However, documents subscribing to the data-centric view of XML are highly structured and can be represented equivalently in tables or in XML with document type definitions (DTDs) or XML schema specifications (see the sidebar, "Related W3C Documents," for this and other XML specifications). As in traditional relational databases, sibling element order is unimportant in such documents. We refer to documents with implicitly ordered XML content as document-centric. The file's element order (as siblings in a tree-like representation) conveys its implicit order, whereas a document attribute or tag expresses an explicit order. Although it is easy to express explicit order in relational databases, capturing the implicit order while converting a document-centric XML document into a relational database is a problem. Besides the implicit order, document-centric XML documents allow little or no structure, deep nesting, and hyperlinked components.
Tables can represent implicit order, nesting, and hyperlinks but only with costly time and space transformations. This article studies the data- and document-centric uses of XML management systems (XMLMS). We want to provide XML data users with a guideline for choosing the data management system that best meets their needs. Because the systems we test are first-generation
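The implicit-order problem raised in the XML management abstract above can be made concrete with a small Python sketch. It shows one common way to turn implicit sibling order into an explicit column when shredding a document into relational-style rows. The function name `shred` and the row layout `(node_id, parent_id, position, tag, text)` are assumptions chosen for illustration, not a scheme taken from the article.

```python
import xml.etree.ElementTree as ET


def shred(xml_text):
    """Flatten an XML document into rows (node_id, parent_id, position, tag, text).

    The 'position' column makes the implicit sibling order explicit, so it
    survives storage in an unordered relational table.
    """
    root = ET.fromstring(xml_text)
    rows = []

    def visit(elem, parent_id, position):
        node_id = len(rows) + 1  # ids are assigned in document (pre-order) sequence
        rows.append((node_id, parent_id, position, elem.tag, (elem.text or "").strip()))
        for pos, child in enumerate(elem, start=1):
            visit(child, node_id, pos)

    visit(root, None, 1)
    return rows
```

Sorting the rows that share a `parent_id` by `position` recovers the original sibling order. The extra column and the extra sort are a small instance of the time and space cost the abstract attributes to representing implicit order in tables.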
Today, scientific data is inevitably digitized, stored in a wide variety of heterogeneous formats, and is accessible over the Internet. Scientists need to access an integrated view of multiple remote or local heterogeneous data sources. They then integrate the results of complex queries and apply further analysis and visualization to support the task of scientific discovery. Building such a digital library for scientific discovery requires accessing and manipulating data extracted from flat files or databases, documents retrieved from the Web, as well as data that is locally materialized in warehouses or is generated by software. We consider several tasks to provide optimized and seamless integration of biomolecular data. Challenges to be addressed include capturing and representing source capabilities; developing a methodology to acquire and represent semantic knowledge and metadata about source contents, overlap in source contents, and access costs; and decision support to select sources and capabilities using cost based and semantic knowledge, and generating low cost query evaluation plans.