Automated domain annotation is an important tool for structural informatics. These pipelines typically involve searching query sequences against hidden Markov model (HMM) profiles, yielding matches to profiles for various domains. However, domain annotation can be ambiguous or inaccurate when proteins contain domains with non-contiguous residue ranges, and especially when insertional domains are hosted within them. Here, we present DomainMapper, an algorithm that accurately assigns a unique domain structure annotation to a query sequence, including those with complex topologies. We validate our domain assignments using the AlphaFold database and confirm that non-contiguity is pervasive (10.74% of all domains in yeast and 4.52% in human). Using this resource, we find that certain folds have strong propensities to be non-contiguous or insertional across the Tree of Life. DomainMapper is freely available and can be run as a single command-line function.
Cross-linking mass spectrometry (XL-MS) is emerging as a method at the crossroads of structural and cellular biology, uniquely capable of identifying protein-protein interactions with residue-level resolution and on the proteome-wide scale. With the development of cross-linkers that can form linkages inside cells and easily cleave during fragmentation on the mass spectrometer (MS-cleavable cross-links), it has become increasingly facile to identify contacts between any two proteins in complex samples, including in live cells or tissues. Photo-cross-linkers possess the advantages of high temporal resolution and high reactivity, thereby engaging all residue types (rather than just lysine); nevertheless, photo-cross-linkers have not enjoyed widespread use and are yet to be employed for proteome-wide studies because their products are challenging to identify. Here, we demonstrate the synthesis and application of two heterobifunctional photo-cross-linkers that feature diazirines and N-hydroxy-succinimidyl carbamate groups, the latter of which unveil doubly fissile MS-cleavable linkages upon acyl transfer to protein targets. Moreover, these cross-linkers demonstrate high water-solubility and cell-permeability. Using these compounds, we demonstrate the feasibility of proteome-wide photo-cross-linking in cellulo. These studies elucidate a small portion of Escherichia coli's interaction network, albeit with residue-level resolution. With further optimization, these methods will enable the detection of protein quinary interaction networks in their native environment at residue-level resolution, and we expect that they will prove useful toward the effort to explore the molecular sociology of the cell.
Automated domain annotation plays a number of important roles in structural informatics and typically involves searching query sequences against Hidden Markov Model (HMM) profiles. This process can be ambiguous or inaccurate when proteins contain domains with non-contiguous residue ranges, and especially when insertional domains are hosted within them. Here we present DomainMapper, an algorithm that accurately assigns a unique domain structure annotation to any query sequence, including those with complex topologies. We validate our domain assignments using the AlphaFold database and confirm that non-contiguity is pervasive (6.5% of all domains in yeast and 2.5% in human). Using this resource, we find that certain folds have strong propensities to be non-contiguous or insertional across the Tree of Life, likely underlying evolutionary preferences for domain topology. DomainMapper is freely available and can be run as a single command line function.
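To make the two topologies named in the abstract concrete, the following is a minimal sketch of how non-contiguous and insertional domains could be represented and flagged from residue ranges. It is not DomainMapper's implementation: the `Domain` class and the `gaps`, `is_insertional`, and `classify` helpers are hypothetical names introduced here for illustration, and the sketch assumes the domain boundaries are already known (it performs no HMM search, scoring, or overlap resolution).

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical representation: an inclusive residue range (start, end).
Segment = Tuple[int, int]


@dataclass
class Domain:
    """A domain described by one or more residue segments (illustrative only)."""
    name: str
    segments: List[Segment]

    def span(self) -> Segment:
        """First and last residue covered by any segment of the domain."""
        return (min(s for s, _ in self.segments),
                max(e for _, e in self.segments))

    def is_non_contiguous(self) -> bool:
        """A domain built from more than one segment is non-contiguous."""
        return len(self.segments) > 1


def gaps(domain: Domain) -> List[Segment]:
    """Residue ranges lying between consecutive segments of a domain."""
    segs = sorted(domain.segments)
    return [(a_end + 1, b_start - 1)
            for (_, a_end), (b_start, _) in zip(segs, segs[1:])
            if b_start - a_end > 1]


def is_insertional(guest: Domain, host: Domain) -> bool:
    """True if the guest lies entirely inside one gap of the host."""
    g_start, g_end = guest.span()
    return any(lo <= g_start and g_end <= hi for lo, hi in gaps(host))


def classify(domains: List[Domain]) -> Dict[str, List[str]]:
    """Label each domain as contiguous, non-contiguous, and/or insertional."""
    labels: Dict[str, List[str]] = {}
    for d in domains:
        tags: List[str] = []
        if d.is_non_contiguous():
            tags.append("non-contiguous")
        if any(is_insertional(d, h) for h in domains if h is not d):
            tags.append("insertional")
        labels[d.name] = tags or ["contiguous"]
    return labels


# Made-up example: domain B sits in the gap between the two halves of domain A.
example = [
    Domain("A", [(1, 50), (120, 200)]),
    Domain("B", [(55, 115)]),
]
print(classify(example))
# {'A': ['non-contiguous'], 'B': ['insertional']}
```

Storing each domain as a list of segments, rather than a single start and end, is what lets a host domain and the domain inserted into it be described without overlapping ranges.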