Nadezhda T. Doncheva scite author profile

Proteins and their functional interactions form the backbone of the cellular machinery. Their connectivity network needs to be considered for the full understanding of biological phenomena, but the available information on protein–protein associations is incomplete and exhibits varying levels of annotation granularity and reliability. The STRING database aims to collect, score and integrate all publicly available sources of protein–protein interaction information, and to complement these with computational predictions. Its goal is to achieve a comprehensive and objective global network, including direct (physical) as well as indirect (functional) interactions. The latest version of STRING (11.0) more than doubles the number of organisms it covers, to 5090. The most important new feature is an option to upload entire, genome-wide datasets as input, allowing users to visualize subsets as interaction networks and to perform gene-set enrichment analysis on the entire input. For the enrichment analysis, STRING implements well-known classification systems such as Gene Ontology and KEGG, but also offers additional, new classification systems based on high-throughput text-mining as well as on a hierarchical clustering of the association network itself. The STRING resource is available online at https://string-db.org/.


The STRING database in 2017: quality-controlled protein–protein association networks, made broadly accessible

Szklarczyk et al. 2016


A system-wide understanding of cellular function requires knowledge of all functional interactions between the expressed proteins. The STRING database aims to collect and integrate this information, by consolidating known and predicted protein–protein association data for a large number of organisms. The associations in STRING include direct (physical) interactions, as well as indirect (functional) interactions, as long as both are specific and biologically meaningful. Apart from collecting and reassessing available experimental data on protein–protein interactions, and importing known pathways and protein complexes from curated databases, interaction predictions are derived from the following sources: (i) systematic co-expression analysis, (ii) detection of shared selective signals across genomes, (iii) automated text-mining of the scientific literature and (iv) computational transfer of interaction knowledge between organisms based on gene orthology. In the latest version 10.5 of STRING, the biggest changes are concerned with data dissemination: the web frontend has been completely redesigned to reduce dependency on outdated browser technologies, and the database can now also be queried from inside the popular Cytoscape software framework. Further improvements include automated background analysis of user inputs for functional enrichments, and streamlined download options. The STRING resource is available online, at http://string-db.org/.


The STRING database in 2021: customizable protein–protein networks, and functional characterization of user-uploaded gene/measurement sets

et al. 2020


Cellular life depends on a complex web of functional associations between biomolecules. Among these associations, protein–protein interactions are particularly important due to their versatility, specificity and adaptability. The STRING database aims to integrate all known and predicted associations between proteins, including both physical interactions as well as functional associations. To achieve this, STRING collects and scores evidence from a number of sources: (i) automated text mining of the scientific literature, (ii) databases of interaction experiments and annotated complexes/pathways, (iii) computational interaction predictions from co-expression and from conserved genomic context and (iv) systematic transfers of interaction evidence from one organism to another. STRING aims for wide coverage; the upcoming version 11.5 of the resource will contain more than 14 000 organisms. In this update paper, we describe changes to the text-mining system, a new scoring-mode for physical interactions, as well as extensive user interface features for customizing, extending and sharing protein networks. In addition, we describe how to query STRING with genome-wide, experimental data, including the automated detection of enriched functionalities and potential biases in the user's query data. The STRING resource is available online, at https://string-db.org/.


Cytoscape StringApp: Network Analysis and Visualization of Proteomics Data

et al. 2018


Protein networks have become a popular tool for analyzing and visualizing the often long lists of proteins or genes obtained from proteomics and other high-throughput technologies. One of the most popular sources of such networks is the STRING database, which provides protein networks for more than 2000 organisms, including both physical interactions from experimental data and functional associations from curated pathways, automatic text mining, and prediction methods. However, its web interface is mainly intended for inspection of small networks and their underlying evidence. The Cytoscape software, on the other hand, is much better suited for working with large networks and offers greater flexibility in terms of network analysis, import, and visualization of additional data. To include both resources in the same workflow, we created stringApp, a Cytoscape app that makes it easy to import STRING networks into Cytoscape, retains the appearance and many of the features of STRING, and integrates data from associated databases. Here, we introduce many of the stringApp features and show how they can be used to carry out complex network analysis and visualization tasks on a typical proteomics data set, all through the Cytoscape user interface. stringApp is freely available from the Cytoscape app store: http://apps.cytoscape.org/apps/stringapp.


The STRING database in 2023: protein–protein association networks and functional enrichment analyses for any sequenced genome of interest

et al. 2022


Much of the complexity within cells arises from functional and regulatory interactions among proteins. The core of these interactions is increasingly known, but novel interactions continue to be discovered, and the information remains scattered across different database resources, experimental modalities and levels of mechanistic detail. The STRING database (https://string-db.org/) systematically collects and integrates protein–protein interactions—both physical interactions as well as functional associations. The data originate from a number of sources: automated text mining of the scientific literature, computational interaction predictions from co-expression, conserved genomic context, databases of interaction experiments and known complexes/pathways from curated sources. All of these interactions are critically assessed, scored, and subsequently automatically transferred to less well-studied organisms using hierarchical orthology information. The data can be accessed via the website, but also programmatically and via bulk downloads. The most recent developments in STRING (version 12.0) are: (i) it is now possible to create, browse and analyze a full interaction network for any novel genome of interest, by submitting its complement of encoded proteins, (ii) the co-expression channel now uses variational auto-encoders to predict interactions, and it covers two new sources, single-cell RNA-seq and experimental proteomics data and (iii) the confidence in each experimentally derived interaction is now estimated based on the detection method used, and communicated to the user in the web-interface. Furthermore, STRING continues to enhance its facilities for functional enrichment analysis, which are now fully available also for user-submitted genomes.
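The programmatic access mentioned in the abstract above can be illustrated with a minimal sketch. It only builds a request URL and performs no network call. The endpoint layout (`/api/<format>/network`) and the parameter names `identifiers`, `species` and `required_score` are assumptions based on STRING's public API documentation, not on the abstract itself; the helper name `build_network_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed base address of the STRING REST API (not stated in the abstract).
STRING_API_BASE = "https://string-db.org/api"


def build_network_url(identifiers, species, required_score=400, output_format="tsv"):
    """Build (but do not send) a STRING 'network' request URL.

    identifiers    -- list of protein or gene names, e.g. ["TP53", "MDM2"]
    species        -- NCBI taxonomy identifier, e.g. 9606 for human
    required_score -- assumed confidence threshold on a 0-1000 scale
    """
    params = {
        # Multiple identifiers are assumed to be separated by a carriage return.
        "identifiers": "\r".join(identifiers),
        "species": species,
        "required_score": required_score,
    }
    return f"{STRING_API_BASE}/{output_format}/network?{urlencode(params)}"


example_url = build_network_url(["TP53", "MDM2"], species=9606)
```

The resulting URL could then be fetched with any HTTP client; the exact response columns should be checked against the current API documentation before relying on them.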


Topological analysis and interactive visualization of biological networks and protein structures

et al. 2012


Nature Protocols 7(4), 670 (2012). Introduction: Current high-throughput techniques such as yeast two-hybrid screens for protein interaction partners produce great volumes of experimental data that can be integrated and explored to gain insight into biological processes performed by interacting molecules 1-6. Furthermore, structural biologists study the interactions of residues in protein structures to understand complex protein structure-function relationships 7-11. Commonly, large-scale interaction data are represented as networks and initially analyzed by graph-theoretic methods to characterize the topological network structure and its global and local interaction properties 12-18. A number of software tools are available for the visual exploration and computational analysis of networks 19-21. General software libraries for network analysis are the Java framework JUNG 22, the C++ library LEDA 23, the Python package NetworkX 24 and R packages such as igraph 25, statnet 26, sna 27, tnet 28 and QuACN 29. However, they cannot be applied by users without programming expertise. In contrast, sophisticated free software platforms such as Pajek 30, VisANT 31, ONDEX 32 and BIANA 33 provide graphical user interfaces for the analysis of biological networks. In addition, the free and stand-alone Cytoscape platform has gained considerable interest in recent years because of its open-source code development and its rapidly growing community of users and developers 34,35. In particular, its functionality is easily extendable by additional plug-ins that support specific network analysis tasks. For example, software protocols are already available for cluster analysis with the TransClust and ClusterExplorer plug-ins 36, as well as for the integration of physical and genetic interactions into module maps with the PanGIA plug-in 37.
Here we demonstrate how to apply two of our Cytoscape plug-ins, NetworkAnalyzer 38 and RINalyzer 39, for the standard and advanced analysis of network topologies. NetworkAnalyzer performs a comprehensive analysis of network topologies without requiring advanced knowledge in graph theory or programming expertise 38. In particular, it supports the characterization of molecular networks in terms of scale-free and small-world properties, modularity and hierarchical structure 5,12,13,40, the identification of important network nodes and edges based on topological parameters 11,41-43, and the comparison of networks with regard to their topology 44-47. Since its initial release in 2007, NetworkAnalyzer has been extended by additional features and topological parameters and is widely used in academia and industry as indicated by thousands of software downloads. Recently, this plug-in became an integral part of each standard installation of Cytoscape, and its source code was published under the GNU Lesser General Public License. Basically, NetworkAnalyzer efficiently computes a number of topological parameters, including node degree, clustering and topological coefficient, charact...
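Two of the topological parameters named in the excerpt above, node degree and clustering coefficient, can be sketched in a few lines. This is a textbook formulation for an undirected, unweighted graph stored as an adjacency dictionary; it is not NetworkAnalyzer's implementation, and the function names and the example graph are hypothetical.

```python
from itertools import combinations


def node_degree(adj, node):
    """Number of neighbors of `node` in an undirected graph."""
    return len(adj[node])


def clustering_coefficient(adj, node):
    """Fraction of neighbor pairs of `node` that are themselves connected.

    Returns 0.0 for nodes with fewer than two neighbors, where the
    ratio is undefined (a common convention, assumed here).
    """
    neighbors = sorted(adj[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    # Count edges among the neighbors of `node`.
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical example: a triangle A-B-C with an extra node D attached to A.
example = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
```

In this example graph, node A has three neighbors of which only one pair (B, C) is connected, so its clustering coefficient is one third, while B and C each sit in a fully connected neighborhood.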


Dense genotyping of immune-related disease regions identifies nine new risk loci for primary sclerosing cholangitis

Liu, Hov, Folseraas et al. 2013

Nat Genet



Primary sclerosing cholangitis (PSC) is a severe liver disease of unknown etiology leading to fibrotic destruction of the bile ducts and ultimately to the need for liver transplantation (1-3). We compared 3,789 PSC cases of European ancestry to 25,079 population controls across 130,422 SNPs genotyped using the Immunochip (4). We identified 12 genome-wide significant associations outside the human leukocyte antigen (HLA) complex, 9 of which were new, increasing the number of known PSC risk loci to 16. Despite comorbidity with inflammatory bowel disease (IBD) in 72% of the cases, 6 of the 12 loci showed significantly stronger association with PSC than with IBD, suggesting overlapping yet distinct genetic architectures for these two diseases. We incorporated association statistics from 7 diseases clinically occurring with PSC in the analysis and found suggestive evidence for 33 additional pleiotropic PSC risk loci. Together with network analyses, these findings add to the genetic risk map of PSC and expand on the relationship between PSC and other immune-mediated diseases.


Cytoscape stringApp: Network analysis and visualization of proteomics data

Doncheva, Morris, Gorodkin et al. 2018

Preprint



Protein networks have become a popular tool for analyzing and visualizing the often long lists of proteins or genes obtained from proteomics and other high-throughput technologies. One of the most popular sources of such networks is the STRING database, which provides protein networks for more than 2000 organisms, including both physical interactions from experimental data and functional associations from curated pathways, automatic text mining, and prediction methods. However, its web interface is mainly intended for inspection of small networks and their underlying evidence. The Cytoscape software, on the other hand, is much better suited for working with large networks and offers greater flexibility in terms of network analysis, import and visualization of additional data. To include both resources in the same workflow, we created stringApp, a Cytoscape app that makes it easy to import STRING networks into Cytoscape, retains the appearance and many of the features of STRING, and integrates data from associated databases. Here, we introduce many of the stringApp features and show how they can be used to carry out complex network analysis and visualization tasks on a typical proteomics dataset, all through the Cytoscape user interface. stringApp is freely available from the Cytoscape app store: http://apps.cytoscape.org/apps/stringapp.

