Physical interactions between members of the MYB and bHLH transcription factor (TF) families regulate many important biological processes in plants. Not all reported MYB–bHLH interactions can be explained by the known binding sites in the R3 repeat of the MYB DNA-binding domain. Notably, most of the sequence diversity of MYB TFs lies in their non-MYB regions, which contain small orphan subgroup-defining motifs not yet linked to molecular functions. Here, we identified the motif mediating interaction between MYB TFs from subgroup 12 and their bHLH partners. Unlike other known MYB–bHLH interaction sites, the motif is located in the centre of the predicted disordered non-MYB region. We characterised the core motif, which enabled accurate prediction of previously unknown bHLH-interacting MYB TFs in Arabidopsis thaliana, and we confirmed its functional importance in planta. Our results indicate a correlation between the MYB–bHLH interaction affinity and the phenotypic output controlled by the TF complex. The identification of an interaction motif outside R3 indicates that MYB–bHLH interactions must have arisen multiple times independently, and suggests that many more motifs of functional relevance remain to be harvested from subgroup-specific studies.
Protein domains constitute regions of distinct structural properties and molecular functions that are retained when removed from the rest of the protein. However, due to the lack of tertiary structure, the identification of domains has been largely neglected for long (>50 residues) intrinsically disordered regions. Here we present a sequence‐based approach to assess and visualize domain organization in long intrinsically disordered regions based on compositional sequence biases. An online tool to find putative intrinsically disordered domains (IDDomainSpotter) in any protein sequence or sequence alignment using any particular sequence trait is available at http://www.bio.ku.dk/sbinlab/IDDomainSpotter. Using this tool, we have identified a putative domain enriched in hydrophilic and disorder‐promoting residues (Pro, Ser, and Thr) and depleted in positive charges (Arg and Lys) bordering the folded DNA‐binding domains of several transcription factors (p53, GCR, NAC46, MYB28, and MYB29). This domain, from two different MYB transcription factors, was characterized biophysically to determine its properties. Our analyses show the domain to be extended, dynamic and highly disordered. It connects the DNA‐binding domain to other disordered domains and is present and conserved in several transcription factors from different families and domains of life. This example illustrates the potential of IDDomainSpotter to predict, from sequence alone, putative domains of functional interest in otherwise uncharacterized disordered proteins.
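The idea of profiling compositional sequence bias along a disordered region can be illustrated with a minimal sliding-window sketch. This is not the IDDomainSpotter implementation: the function name `window_bias_scores`, the scoring rule (fraction of enriched residues minus fraction of depleted residues per window), and the default window length are all assumptions chosen for illustration. Only the residue sets (Pro, Ser, Thr enriched; Arg, Lys depleted) come from the abstract.

```python
from typing import FrozenSet, List


def window_bias_scores(
    sequence: str,
    enriched: FrozenSet[str] = frozenset("PST"),  # Pro, Ser, Thr (from the abstract)
    depleted: FrozenSet[str] = frozenset("RK"),   # Arg, Lys (from the abstract)
    window: int = 15,                             # assumed window length
) -> List[float]:
    """Return one bias score per window position along the sequence.

    Hypothetical scoring rule: (count of enriched residues minus count of
    depleted residues) divided by the window length, so scores lie in [-1, 1].
    """
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")

    scores = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        plus = sum(1 for aa in chunk if aa in enriched)
        minus = sum(1 for aa in chunk if aa in depleted)
        scores.append((plus - minus) / window)
    return scores


if __name__ == "__main__":
    # Made-up toy sequence, not a real protein: a Pro/Ser/Thr-rich stretch
    # followed by an Arg/Lys-rich stretch.
    toy = "PSTPSTPSSTPPSTSPTSRKRKKRRKKRKRRK"
    for position, score in enumerate(window_bias_scores(toy, window=8), start=1):
        print(position, round(score, 2))
```

Stretches where the score stays high over many consecutive windows would be candidates for a compositionally distinct region under this assumed rule; any threshold for calling a putative domain would have to be chosen and validated separately.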
Intrinsically disordered proteins and regions with their associated short linear motifs play key roles in transcriptional regulation. The disordered MYC-interaction motif (MIM) mediates interactions between MYC and MYB transcription factors in Arabidopsis thaliana that are critical for constitutive and induced glucosinolate (GLS) biosynthesis. GLSs comprise a class of plant defense compounds that evolved in the ancestor of the Brassicales order. We used a diverse set of search strategies to discover additional occurrences of the MIM in other proteins and in other organisms, and evaluated the findings by means of structural predictions, interaction assays, and biophysical experiments. Our search revealed numerous MIM instances spread throughout the angiosperm lineage. Experiments verify that several of the newly discovered MIM-containing proteins interact with MYC TFs. Only hits found within the same transcription factor family and having similar characteristics could be validated, indicating that structural predictions and sequence similarity are good indicators of whether the presence of a MIM mediates interaction. The experimentally validated MIMs are found in organisms outside the Brassicales order, showing that MIM function is broader than regulating GLS biosynthesis.