Porphyrins and metalloporphyrins are the key pigments of life on Earth as we know it, because they include chlorophyll (a magnesium-containing metalloporphyrin) and heme (iron protoporphyrin). In eukaryotes, porphyrins and heme are synthesized by a multistep pathway that involves eight enzymes. The first and rate-controlling step is the formation of delta-aminolevulinic acid (ALA) from glycine plus succinyl CoA, catalyzed by ALA synthase. Intermediate steps occur in the cytoplasm, with formation of the monopyrrole porphobilinogen and the tetrapyrroles hydroxymethylbilane and a series of porphyrinogens, which are serially decarboxylated. Heme is utilized chiefly for the formation of hemoglobin in erythrocytes, myoglobin in muscle cells, cytochromes P-450 and mitochondrial cytochromes, and other hemoproteins in hepatocytes. The rate-controlling step of heme breakdown is catalyzed by heme oxygenase (HMOX), of which there are two isoforms, called HMOX1 and HMOX2. HMOX breaks down heme to form biliverdin, carbon monoxide, and iron. The porphyrias are a group of disorders, mainly inherited, in which there are defects in normal porphyrin and heme synthesis. The cardinal clinical features are cutaneous manifestations (due to the skin-damaging effects of excess deposited porphyrins) or neurovisceral attacks of pain, sometimes with weakness, delirium, seizures, and the like (probably due mainly to neurotoxic effects of ALA). The treatment of choice for the acute hepatic porphyrias is intravenous heme therapy, which repletes a critical regulatory heme pool in hepatocytes and leads to downregulation of hepatic ALA synthase, whose upregulation is a biochemical hallmark of all forms of acute porphyria in relapse.
Background: Heme is an essential molecule and plays vital roles in many biological processes. The structural determination of a large number of heme proteins has made it possible to study the detailed chemical and structural properties of the heme binding environment. Knowledge of these characteristics can provide valuable guidelines in the design of novel heme proteins and help us predict unknown heme binding proteins. Results: In this paper, we constructed a non-redundant dataset of 125 heme-binding protein chains and found that these heme proteins encompass at least 31 different structural folds with all-α class as the dominating scaffold. Heme binding pockets are enriched in aromatic and non-polar amino acids with fewer charged residues. The differences between apo and holo forms of heme proteins in terms of the structure and the binding pockets have been investigated. In most cases the proteins undergo small conformational changes upon heme binding. We also examined the CP (cysteine-proline) heme regulatory motifs and demonstrated that the conserved dipeptide has structural implications in protein-heme interactions. Conclusions: Our analysis revealed that heme binding pockets show special features and that most of the heme proteins undergo small conformational changes after heme binding, suggesting the apo structures can be used for structure-based heme protein prediction and as scaffolds for future heme protein design.
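The statement that heme binding pockets are enriched in some residue types can be illustrated with a simple log-odds comparison of pocket composition against a background composition. The abstract does not say which statistic the authors used, so the sketch below is only one plausible choice; the function name `log2_enrichment` and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def log2_enrichment(pocket_residues: str, background_residues: str) -> dict[str, float]:
    """Return log2(pocket frequency / background frequency) per residue type.

    Assumption: a plain log-odds ratio; the paper's actual measure is not
    given in the abstract. Positive values mean enrichment in the pocket.
    """
    if not pocket_residues or not background_residues:
        raise ValueError("both residue strings must be non-empty")
    pocket = Counter(pocket_residues)
    background = Counter(background_residues)
    scores = {}
    for aa, count in pocket.items():
        if background[aa] == 0:
            continue  # ratio undefined without a background observation
        pocket_freq = count / len(pocket_residues)
        background_freq = background[aa] / len(background_residues)
        scores[aa] = math.log2(pocket_freq / background_freq)
    return scores


# Toy data (invented): one-letter codes of pocket residues vs. all residues.
scores = log2_enrichment("FFAK", "FAKKAAKK")
```

With these toy strings, phenylalanine scores positive (over-represented in the pocket) and lysine scores negative, which mirrors the direction of the trend reported in the abstract without reproducing its data.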
The newly developed transcription activator-like effector protein (TALE) and clustered regularly interspaced short palindromic repeats/Cas9 transcription factors (TFs) offered a powerful and precise approach for modulating gene expression. In this article, we systematically investigated the potential of these new tools in activating the stringently silenced pluripotency gene Oct4 (Pou5f1) in mouse and human somatic cells. First, with a number of TALEs and sgRNAs targeting various regions in the mouse and human Oct4 promoters, we found that the most efficient TALE-VP64s bound around −120 to −80 bp, while highly effective sgRNAs targeted from −147 to −89 bp upstream of the transcription start sites to induce high activity of luciferase reporters. In addition, we observed significant transcriptional synergy when multiple TFs were applied simultaneously. Although individual TFs exhibited marginal activity to up-regulate endogenous gene expression, optimized combinations of TALE-VP64s could enhance endogenous Oct4 transcription up to 30-fold in mouse NIH3T3 cells and 20-fold in human HEK293T cells. More importantly, the enhancement of OCT4 transcription ultimately generated OCT4 proteins. Furthermore, examination of different epigenetic modifiers showed that histone acetyltransferase p300 could enhance both TALE-VP64 and sgRNA/dCas9-VP64 induced transcription of endogenous OCT4. Taken together, our study suggested that engineered TALE-TF and dCas9-TF are useful tools for modulating gene expression in mammalian cells.
Chameleon sequences have been implicated in amyloid-related diseases. Here we report an analysis of two types of chameleon sequences, chameleon-HS (Helix vs. Strand) and chameleon-HE (Helix vs. Sheet), based on known structures in the Protein Data Bank. Our survey shows that the longest chameleon-HS is eight residues while the longest chameleon-HE is seven residues. We have done a detailed analysis of the local and global environment that might contribute to the unique conformation of a chameleon sequence. We found that the existence of chameleon sequences does not present a problem for secondary structure prediction programs, including first-generation prediction programs, such as the Chou-Fasman algorithm, and third-generation prediction programs that utilize evolutionary information. We have also investigated the possible implications of chameleon sequences for structural conservation and functional diversity of alternatively spliced protein isoforms.
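A chameleon sequence in the sense used above is an identical stretch of residues that adopts different secondary structures in different contexts. The sketch below shows one minimal way to search for such stretches, assuming each chain comes with a per-residue secondary-structure string in which `H` marks helix and `E` marks extended strand (a DSSP-like convention). The function name `find_chameleons`, the fixed window length, and the toy chains are hypothetical; the abstract does not describe the authors' actual search procedure or how it separates the HS and HE types.

```python
from collections import defaultdict


def find_chameleons(chains: list[tuple[str, str]], k: int) -> list[str]:
    """Return k-mers seen fully helical (H) in one place and fully extended (E) in another.

    chains: (sequence, secondary_structure) pairs of equal-length strings.
    Assumption: 'H' = helix and 'E' = strand; any other letter is ignored.
    """
    states = defaultdict(set)
    for seq, ss in chains:
        if len(seq) != len(ss):
            raise ValueError("sequence and secondary structure must have equal length")
        for i in range(len(seq) - k + 1):
            window = ss[i:i + k]
            if window == "H" * k:
                states[seq[i:i + k]].add("H")
            elif window == "E" * k:
                states[seq[i:i + k]].add("E")
    # Keep only k-mers observed in both conformations.
    return sorted(kmer for kmer, seen in states.items() if seen == {"H", "E"})


# Toy chains (invented): VLIV is helical in the first and a strand in the second.
toy = [("AKVLIVAG", "CHHHHHHC"), ("GGVLIVTT", "CCEEEECC")]
hits = find_chameleons(toy, 4)
```

Running the search for increasing `k` until no hits remain would give the longest chameleon in a dataset, which is the kind of quantity the survey reports.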
Computational evaluation of protein–DNA interaction is important for the identification of DNA-binding sites and genome annotation. By calculating the binding affinity between a protein and DNA, it can validate binding motifs predicted by sequence-based approaches. Such an evaluation should take into account structural information to deal with the complicated effects from DNA structural deformation, distance-dependent multi-body interactions and solvation contributions. In this paper, we present a knowledge-based potential built on interactions between protein residues and DNA tri-nucleotides. The potential, which explicitly considers the distance-dependent two-body, three-body and four-body interactions between protein residues and DNA nucleotides, has been optimized in terms of a Z-score. We have applied this knowledge-based potential to evaluate the binding affinities of zinc-finger protein–DNA complexes. The predicted binding affinities are in good agreement with the experimental data (with a correlation coefficient of 0.950). On a larger test set containing 48 protein–DNA complexes with known experimental binding free energies, our potential has achieved a high correlation coefficient of 0.800 when compared with the experimental data. We have also used this potential to identify binding motifs in DNA sequences of transcription factors (TFs). The TFs in 79.4% of the known TF–DNA complexes have accurately found their native binding sequences from a large pool of DNA sequences. When tested in a genome-scale search for TF-binding motifs of the cyclic AMP regulatory protein (CRP) of Escherichia coli, this potential ranks all known binding motifs of CRP in the top 15% of all candidate sequences.
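Optimizing a potential "in terms of a Z-score" usually means making the native complex score well below a set of decoys, measured in units of the decoy spread. The abstract does not define the Z-score it uses, so the sketch below assumes the common form: native energy minus the decoy mean, divided by the decoy standard deviation. The function name `energy_z_score` and the numbers are hypothetical.

```python
import statistics


def energy_z_score(native_energy: float, decoy_energies: list[float]) -> float:
    """Z-score of a native energy relative to decoy energies.

    Assumption: Z = (E_native - mean(E_decoys)) / std(E_decoys), using the
    population standard deviation. More negative means better separation.
    """
    if len(decoy_energies) < 2:
        raise ValueError("need at least two decoy energies")
    spread = statistics.pstdev(decoy_energies)
    if spread == 0:
        raise ValueError("decoy energies have zero spread")
    return (native_energy - statistics.fmean(decoy_energies)) / spread


# Toy energies (invented, arbitrary units).
z = energy_z_score(-3.0, [-1.0, 1.0])
```

In a motif search of the kind described above, the same idea applies with candidate DNA sequences in place of decoys: the native binding sequence should stand out with a strongly negative score.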
Structural domains are considered the basic units of protein folding, evolution, function and design. Automatic decomposition of protein structures into structural domains remains a challenging and unsolved problem despite many years of investigation. Manual inspection still plays a key role in domain decomposition of a protein structure. We have previously developed a computer program, DomainParser, using network flow algorithms. The algorithm partitions a protein structure into domains accurately when the number of domains to be partitioned is known. However, the performance drops when this number is unclear (the overall performance is 74.5% over a set of 1317 protein chains). Through utilization of various types of structural information, including the hydrophobic moment profile, we have developed an effective method for assessing the most probable number of domains a structure may have. The core of this method is a neural network, which is trained to discriminate correctly partitioned domains from incorrectly partitioned domains. When compared with the manual decomposition results given in the SCOP database, our new algorithm achieves higher decomposition accuracy (81.9%) on the same data set.
DNA-binding proteins play critical roles in biological processes including gene expression, DNA packaging and DNA repair. They bind to DNA target sequences with different degrees of binding specificity, ranging from highly specific to non-specific. Alterations of DNA-binding specificity, due to either genetic variation or somatic mutations, can lead to various diseases. In this study, a comparative analysis of protein-DNA complex structures was carried out to investigate the structural features that contribute to binding specificity. Protein-DNA complexes were grouped into three general classes based on degrees of binding specificity: highly specific (HS), multi-specific (MS), and non-specific (NS). Our results show a clear trend of structural features among the three classes, including amino acid binding propensities, simple and complex hydrogen bonds, major/minor groove and base contacts, and DNA shape. We found that aspartate is enriched in highly specific DNA binding proteins and predominantly binds to a cytosine through a single hydrogen bond or two consecutive cytosines through bidentate hydrogen bonds. Aromatic residues, histidine and tyrosine, are highly enriched in the HS and MS groups and may contribute to specific binding through different mechanisms. To further investigate the role of protein flexibility in specific protein-DNA recognition, we analysed the conformational changes between the bound and unbound states of DNA-binding proteins and structural variations. The results indicate that highly specific and multi-specific DNA-binding domains have larger conformational changes upon DNA binding and a larger degree of flexibility in both bound and unbound states.