There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders, representing academia, industry, funding agencies, and scholarly publishers, have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them and some exemplar implementations in the community.
PRofile ALIgNEment (PRALINE) is a fully customizable multiple sequence alignment application. In addition to a number of available alignment strategies, PRALINE can integrate information from database homology searches to generate a homology-extended multiple alignment. PRALINE also provides a choice of seven different secondary structure prediction programs that can be used individually or in combination as a consensus for integrating structural information into the alignment process. The program can be used through two separate interfaces: one has been designed to cater to more advanced needs of researchers in the field, and the other for standard construction of high confidence alignments. The web-based output is designed to facilitate the comprehensive visualization of the generated alignments by means of five default colour schemes based on: residue type, position conservation, position reliability, residue hydrophobicity and secondary structure, depending on the options set. A user can also define a custom colour scheme by selecting which colour will represent one or more amino acids in the alignment. All generated alignments are also made available in the PDF format for easy figure generation for publications. The grouping of sequences, on which the alignment is based, can also be visualized as a dendrogram. PRALINE is available at .
Recent advances in protein engineering have come from creating multi-functional chimeric proteins containing modules from various proteins. These modules are typically joined via an oligopeptide linker, the correct design of which is crucial for the desired function of the chimeric protein. Here we analyse the properties of naturally occurring inter-domain linkers with the aim of designing linkers for domain fusion. Two main types of linker were identified: helical and non-helical. Helical linkers are thought to act as rigid spacers separating two domains. Non-helical linkers are rich in prolines, which also leads to structural rigidity and isolation of the linker from the attached domains. This means that both linker types are likely to act as a scaffold to prevent unfavourable interactions between folding domains. Based on these results we have constructed a linker database intended for the rational design of linkers for domain fusion, which can be accessed via the Internet at http://mathbio.nimr.mrc.ac.uk.
Mycobacterial pathogens use specialized type VII secretion (T7S) systems to transport crucial virulence factors across their unusual cell envelope into infected host cells. These virulence factors lack classical secretion signals and the mechanism of substrate recognition is not well understood. Here we demonstrate that the model T7S substrates PE25/PPE41, which form a heterodimer, are targeted to the T7S pathway ESX-5 by a signal located in the C terminus of PE25. Site-directed mutagenesis of residues within this C terminus resulted in the identification of a highly conserved motif, i.e., YxxxD/E, which is required for secretion. This motif was also essential for the secretion of LipY, another ESX-5 substrate. Pathogenic mycobacteria have several different T7S systems and we identified a PE protein that is secreted by the ESX-1 system, which allowed us to compare substrate recognition of these two T7S systems. Surprisingly, this ESX-1 substrate contained a C-terminal signal functionally equivalent to that of PE25. Exchange of these C-terminal secretion signals between the PE proteins restored secretion, but each PE protein remained secreted via its own ESX secretion system, indicating that an additional signal(s) provides system specificity. Remarkably, the YxxxD/E motif was also present in and required for efficient secretion of the ESX-1 substrates CFP-10 and EspB. Therefore, our data show that the YxxxD/E motif is a general secretion signal that is present in all known mycobacterial T7S substrates or substrate complexes.
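The YxxxD/E notation in the abstract above reads as a tyrosine, any three residues, then an aspartate or glutamate. As a minimal sketch of how such a pattern could be located in a protein sequence, the following uses a regular expression. The function name `find_motif` and the toy sequence are hypothetical illustrations, not the authors' method or real PE25, LipY, CFP-10 or EspB sequences. The sketch scans the whole sequence, whereas the abstract places the signal in the C terminus.

```python
import re

# YxxxD/E: tyrosine, any three residues, then aspartate (D) or glutamate (E).
# The lookahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=(Y[A-Z]{3}[DE]))")


def find_motif(sequence):
    """Return (0-based start, matched text) for every YxxxD/E occurrence."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence for illustration only (not a real protein).
toy_sequence = "MSGAYAAADLGYSSGE"
print(find_motif(toy_sequence))  # [(4, 'YAAAD'), (11, 'YSSGE')]
```

A sequence match alone does not show that a protein is secreted; the abstract establishes the motif's role by mutagenesis, and notes that an additional signal provides system specificity.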
The FAIR principles have been widely cited, endorsed and adopted by a broad range of stakeholders since their publication in 2016. By intention, the 15 FAIR guiding principles do not dictate specific technological implementations, but provide guidance for improving Findability, Accessibility, Interoperability and Reusability of digital resources. This has likely contributed to the broad adoption of the FAIR principles, because individual stakeholder communities can implement their own FAIR solutions. However, it has also resulted in inconsistent interpretations that carry the risk of leading to incompatible implementations. Thus, while the FAIR principles are formulated on a high level and may be interpreted and implemented in different ways, for true interoperability we need to support convergence in implementation choices that are widely accessible and (re)-usable. We introduce the concept of FAIR implementation considerations to assist accelerated global participation and convergence towards accessible, robust, widespread and consistent FAIR implementations. Any self-identified stakeholder community may either choose to reuse solutions from existing implementations, or when they spot a gap, accept the challenge to create the needed solution, which, ideally, can be used again by other communities in the future. Here, we provide interpretations and implementation considerations (choices and challenges) for each FAIR principle.
New findings are presented for the ~50-residue KH motif, a domain recently discovered in RNA‐binding proteins. The conserved sequence is ~10 residues larger than previously reported. Profile searches have revealed new members of this family, including two, E. coli NusA and human GAP‐associated p62 phosphoprotein, for which RNA‐binding data exist. A nusA homolog was detected in the RNA polymerase gene complex of six archaebacterial species and may encode an antiterminator. All KH‐containing proteins are linked with RNA and the KH motif most probably functions as a nucleic acid binding domain.