As a manually curated, non-automated BLAST analysis of the published Pichia pastoris genome sequences revealed many differences between the gene annotations of the strains GS115 and CBS7435, RNA-Seq analysis, supported by proteomics, was performed to improve the genome annotation. Detailed analyses of sequence alignments and protein domain predictions were carried out to extend the functional genome annotation to all P. pastoris sequences. This allowed the identification of 492 new ORFs and 4916 hypothetical UTRs, as well as the correction of 341 incorrect ORF predictions, which were mainly due to the presence of upstream ATG or erroneous intron predictions. Moreover, 175 previously erroneously annotated ORFs need to be removed from the annotation. In total, we have annotated 5325 ORFs. Regarding the functionality of those genes, we improved all gene and protein descriptions, thereby increasing the percentage of ORFs with functional annotation from 48% to 73%. Furthermore, we defined functional groups covering 25 biological cellular processes of interest by grouping all genes that are part of each defined process. All data are presented in the newly launched genome browser and database available at www.pichiagenome.org. In summary, we present a wide spectrum of curation of the P. pastoris genome annotation from gene level to protein function.
Megasphaera elsdenii is a Gram-negative ruminal bacterium. It is being investigated as a probiotic supplement for ruminants as it may provide benefits for energy balance and animal productivity. Furthermore, it is of biotechnological interest due to its capability of producing various volatile fatty acids. Here we report the complete genome sequence of M. elsdenii DSM 20460, the type strain for the species.
The anaerobic Gram-negative coccus Megasphaera elsdenii is found in cattle, sheep, and other ruminants. Elsden et al. (2,3) were the first to isolate this strain, and they have already described its capability to produce a variety of volatile fatty acids. This organism is interesting for the chemical industry as a possible biocatalyst, but characterization of its metabolism is also important for understanding the function of the rumen. The spectrum of short-chain carboxylic acids that is produced depends largely on the carbon source used by the microorganism. However, the metabolic pathways which underlie these observations have been unclear until now. Furthermore, carboxylic acids can also be used as carbon sources. For example, lactic acid is among the most preferred substrates of M. elsdenii. This renders this bacterium a very beneficial member of the rumen community. With the uptake of lactic acid, M. elsdenii can relieve acidosis, a dreaded condition of livestock (1).
The genome of Megasphaera elsdenii was sequenced with a combination of next-generation sequencing methods. A first draft assembly (Roche 454 GS, FLX Titanium; 773,553 reads with a total of 164.6 Mb; 68-fold coverage) generated with Newbler 2.5.3 consisted of 56 contigs, which could be joined into 1 scaffold. To improve the quality of the sequence by eliminating the 454 sequencing errors in homopolymer stretches, the genome was subsequently sequenced using the Illumina paired-end method (HiSeq 2000; 13,481,796 reads with a total of 1.35 Gb; 558-fold coverage).
The Illumina reads were aligned to the already-assembled scaffold with the Genomics Workbench 4.7.1 program (CLC, Aarhus, Denmark). The final consensus sequence was derived by counting instances of each nucleotide at a position and then letting the majority decide the nucleotide in the consensus sequence.
The annotation was performed using Prodigal gene finder (5), tRNAscan-SE 1.21 (7), and RNAmmer 1.2 (6). Additionally, the origin of replication was predicted with OriginX (11), and the genome was scanned against Rfam to find other small RNA species. Functional annotation of the predicted genes was performed using a reciprocal best-hit strategy (8) against a group of phylogenetically related organisms. In addition, the putative proteins were searched against Uniprot, Clusters of Orthologous Groups (COG) (9), Pfam (4), and Superfam (10) databases.
The draft genome includes 2,474,718 bases, with a GC content of 53%. The number of putative genes totals 2,220, with an average GC content in the coding regions of 54%. There are seven instances of the ribosomal 5S-23S-16S cluster, and 64 predicted...
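The majority rule used above to derive the final consensus sequence can be illustrated with a minimal Python sketch. It is not the CLC Genomics Workbench implementation: the function name `majority_consensus`, the `(start, sequence)` read representation, the restriction to ungapped alignments, and the handling of uncovered positions and ties are all assumptions made for illustration only.

```python
from collections import Counter


def majority_consensus(scaffold, aligned_reads):
    """Build a consensus by per-position majority vote over aligned reads.

    scaffold      -- draft sequence the reads were aligned to (str)
    aligned_reads -- iterable of (start, sequence) pairs; start is 0-based and
                     the alignment is assumed to be ungapped (hypothetical
                     simplification, real alignments also contain indels)
    """
    # One nucleotide counter per scaffold position.
    counts = [Counter() for _ in scaffold]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(scaffold):
                counts[pos][base] += 1

    consensus = []
    for ref_base, column in zip(scaffold, counts):
        if not column:
            # Assumption: positions without read coverage keep the draft base.
            consensus.append(ref_base)
            continue
        top = max(column.values())
        winners = sorted(b for b, n in column.items() if n == top)
        if len(winners) == 1 or ref_base not in winners:
            # Clear majority, or a tie that excludes the draft base
            # (assumption: then take the alphabetically first tied base).
            consensus.append(winners[0])
        else:
            # Assumption: a tie that includes the draft base keeps it.
            consensus.append(ref_base)
    return "".join(consensus)


if __name__ == "__main__":
    draft = "ACGTTA"
    reads = [(0, "ACGA"), (1, "CGAT"), (2, "GATA")]
    print(majority_consensus(draft, reads))
```

In this toy input all three reads carry an A at the fourth position, so the vote replaces the draft base T there. Correcting homopolymer-length errors, as described for the 454 data, would additionally require gapped alignments, which this sketch does not model.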
The Crabtree phenotype defines whether a yeast can perform simultaneous respiration and fermentation under aerobic conditions at high growth rates. It gives Crabtree-positive yeasts the evolutionary advantage of consuming glucose faster and producing ethanol to outcompete other microorganisms in sugar-rich environments. While a number of genetic events are associated with the emergence of the Crabtree effect, its evolution remains unresolved. Here we show that overexpression of a single Gal4-like transcription factor is sufficient to convert Crabtree-negative Komagataella phaffii (Pichia pastoris) into a Crabtree-positive yeast. Upregulation of the glycolytic genes and a significant increase in glucose uptake rate due to the overexpression of the Gal4-like transcription factor lead to an overflow metabolism, triggering both short-term and long-term Crabtree phenotypes. This indicates that a single genetic perturbation leading to overexpression of one gene may have been sufficient as the first molecular event towards respiro-fermentative metabolism in the course of yeast evolution.