About 85% of the maize genome consists of highly repetitive sequences that are interspersed with low-copy, gene-coding sequences. The maize community has dealt with this genomic complexity by constructing an integrated genetic and physical map (iMap), but this resource alone was not sufficient for ensuring the quality of the current sequence build. For this purpose, we constructed a genome-wide, high-resolution optical map of the maize inbred line B73 genome containing >91,000 restriction sites (averaging 1 site/∼23 kb) accrued from mapping genomic DNA molecules. Our optical map comprises 66 contigs, averaging 31.88 Mb in size and spanning 91.5% (2,103.93 Mb/∼2,300 Mb) of the maize genome. A new algorithm was created that considered both optical map and unfinished BAC sequence data for placing 60/66 (2,032.42 Mb) optical map contigs onto the maize iMap. The alignment of optical maps against numerous data sources yielded comprehensive results that proved revealing and productive. For example, gaps were uncovered and characterized within the iMap, the FPC (fingerprinted contigs) map, and the chromosome-wide pseudomolecules. Such alignments also suggested amended placements of FPC contigs on the maize genetic map and proactively guided the assembly of chromosome-wide pseudomolecules, especially within complex genomic regions. Lastly, we think that the full integration of B73 optical maps with the maize iMap would greatly facilitate maize sequence finishing efforts, making the finished sequence a valuable reference for comparative studies among cereals or other maize inbred lines and cultivars.
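The summary figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below uses only numbers stated above; the variable names are illustrative, and because the abstract reports ">91,000" sites, the computed spacing should be read as an upper bound.

```python
# Figures quoted in the abstract (variable names are illustrative).
mapped_mb = 2103.93   # total span of the optical map contigs, in Mb
genome_mb = 2300.0    # approximate maize genome size, in Mb
n_contigs = 66        # number of optical map contigs
min_sites = 91_000    # abstract reports ">91,000" restriction sites

coverage_pct = 100 * mapped_mb / genome_mb      # share of genome spanned
mean_contig_mb = mapped_mb / n_contigs          # average contig size
max_spacing_kb = 1000 * mapped_mb / min_sites   # upper bound on site spacing

print(f"coverage: {coverage_pct:.1f}%")
print(f"mean contig size: {mean_contig_mb:.2f} Mb")
print(f"site spacing: at most about {max_spacing_kb:.1f} kb")
```

The three results are consistent with the 91.5% coverage, 31.88 Mb average contig size, and roughly 23 kb site spacing reported in the abstract.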
Background: Malignant pleural mesothelioma (MM) is an aggressive, asbestos-related pulmonary cancer that is increasing in incidence. Because diagnosis is difficult and the disease is relatively rare, most patients present at a clinically advanced stage where possibility of cure is minimal. To improve surveillance and detection of MM in the high-risk population, we completed a series of clinical studies to develop a noninvasive test for early detection.
Methodology/Principal Findings: We conducted multi-center case-control studies in serum from 117 MM cases and 142 asbestos-exposed control individuals. Biomarker discovery, verification, and validation were performed using SOMAmer proteomic technology, which simultaneously measures over 1000 proteins in unfractionated biologic samples. Using univariate and multivariate approaches we discovered 64 candidate protein biomarkers and derived a 13-marker random forest classifier with an AUC of 0.99±0.01 in training, 0.98±0.04 in independent blinded verification and 0.95±0.04 in blinded validation studies. Sensitivity and specificity at our pre-specified decision threshold were 97%/92% in training and 90%/95% in blinded verification. This classifier accuracy was maintained in a second blinded validation set with a sensitivity/specificity of 90%/89% and combined accuracy of 92%. Sensitivity correlated with pathologic stage; 77% of Stage I, 93% of Stage II, 96% of Stage III and 96% of Stage IV cases were detected. An alternative decision threshold in the validation study yielding 98% specificity would still detect 60% of MM cases. In a paired sample set the classifier AUC of 0.99 and 91%/94% sensitivity/specificity was superior to that of mesothelin with an AUC of 0.82 and 66%/88% sensitivity/specificity. The candidate biomarker panel consists of both inflammatory and proliferative proteins, processes strongly associated with asbestos-induced malignancy.
Significance: The SOMAmer biomarker panel discovered and validated in these studies provides a solid foundation for surveillance and diagnosis of MM in those at highest risk for this disease.
Lung cancer remains the most common cause of cancer-related mortality. We applied a highly multiplexed proteomic technology (SOMAscan) to compare protein expression signatures of non-small cell lung cancer (NSCLC) tissues with healthy adjacent and distant tissues from surgical resections. In this first report of SOMAscan applied to tissues, we highlight 36 proteins that exhibit the largest expression differences between matched tumor and non-tumor tissues. The concentrations of twenty proteins increased and sixteen decreased in tumor tissue, thirteen of which are novel for NSCLC. NSCLC tissue biomarkers identified here overlap with a core set identified in a large serum-based NSCLC study with SOMAscan. We show that large-scale comparative analysis of protein expression can be used to develop novel histochemical probes. As expected, relative differences in protein expression are greater in tissues than in serum. The combined results from tissue and serum present the most extensive view to date of the complex changes in NSCLC protein expression and provide important implications for diagnosis and treatment.
Recent studies have identified a small number of genomic rearrangements that occur frequently in the general population. Bioinformatics tools are now available for systematic genome-wide surveys of higher-order structures predisposing to such common variations in genomic architecture. Segmental duplications (SDs) constitute up to 5 per cent of the genome and play an important role in generating additional rearrangements and in disease aetiology. We conducted a genome-wide database search for a form of SD, palindromic segmental duplications (PSDs), which consist of paired, inverted duplications, and which predispose to inversions, duplications and deletions. The survey was complemented by a search for SDs in tandem orientation (TSDs) that can mediate duplications and deletions but not inversions. We found more than 230 distinct loci with higher-order genomic structure that can mediate genomic variation; of these, about 180 contained a PSD. A number of these sites were previously identified as harbouring common inversions or as being associated with specific genomic diseases characterised by duplications, deletions or inversions. Most of the regions, however, were previously unidentified; their characterisation should identify further common rearrangements and may indicate localisations for additional genomic disorders. The widespread distribution of complex chromosomal architecture suggests a potentially high degree of plasticity of the human genome and could uncover another level of genetic variation within human populations.
http://zhoulab.usc.edu/NeMo/.
Background: CT screening for lung cancer is effective in reducing mortality, but there are areas of concern, including a positive predictive value of 4% and development of interval cancers. A blood test that could manage these limitations would be useful, but development of such tests has been impaired by variations in blood collection that may lead to poor reproducibility across populations.
Results: Blood-based proteomic profiles were generated with SOMAscan technology, which measured 1033 proteins. First, preanalytic variability was evaluated with Sample Mapping Vectors (SMV), which are panels of proteins that detect confounders in protein levels related to sample collection. A subset of well collected serum samples not influenced by preanalytic variability was selected for discovery of lung cancer biomarkers. The impact of sample collection variation on these candidate markers was tested in the subset of samples with higher SMV scores so that the most robust markers could be used to create disease classifiers. The discovery sample set (n = 363) was from a multi-center study of 94 non-small cell lung cancer (NSCLC) cases and 269 long-term smokers and benign pulmonary nodule controls. The analysis resulted in a 7-marker panel with an AUC of 0.85 for all cases (68% adenocarcinoma, 32% squamous) and an AUC of 0.93 for squamous cell carcinoma in particular. This panel was validated by making blinded predictions in two independent cohorts (n = 138 in the first validation and n = 135 in the second). The model was recalibrated for a panel format prior to unblinding the second cohort. The AUCs overall were 0.81 and 0.77, and for squamous cell tumors alone were 0.89 and 0.87. The estimated negative predictive value for a 15% disease prevalence was 93% overall and 99% for squamous lung tumors. The proteins in the classifier function in destruction of the extracellular matrix, metabolic homeostasis and inflammation.
Conclusions: Selecting biomarkers resistant to sample processing variation led to robust lung cancer biomarkers that performed consistently in independent validations. They form a sensitive signature for detection of lung cancer, especially squamous cell histology. This non-invasive test could be used to improve the positive predictive value of CT screening, with the potential to avoid invasive evaluation of nonmalignant pulmonary nodules.
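The abstract above quotes a negative predictive value at an assumed 15% disease prevalence. As a minimal sketch of how predictive values depend on prevalence, the function below applies Bayes' rule to a sensitivity/specificity pair. The function name and the operating point (80% sensitivity, 60% specificity) are hypothetical and are not taken from the study; only the 15% prevalence comes from the abstract.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Hypothetical operating point, NOT reported in the abstract;
# only the 15% prevalence is taken from the text.
ppv, npv = predictive_values(sensitivity=0.80, specificity=0.60, prevalence=0.15)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Because most screened individuals are disease-free at this prevalence, even a moderately specific test can reach a high NPV while its PPV stays low, which is why the two values are usually reported together with the assumed prevalence.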
Progression from health to disease is accompanied by complex changes in protein expression in both the circulation and affected tissues. Large-scale comparative interrogation of the human proteome can offer insights into disease biology, lead to the discovery of new biomarkers for diagnostics and new targets for therapeutics, and identify patients most likely to benefit from treatment. Although genomic studies provide an increasingly sharper understanding of basic biological and pathobiological processes, they ultimately only offer a prediction of relative disease risk, whereas proteins offer an immediate assessment of "real-time" health and disease status. We have recently developed a new proteomic technology, based on modified aptamers, for biomarker discovery that is capable of simultaneously measuring more than a thousand proteins from small volumes of biological samples such as plasma, tissues, or cells. Our technology is enabled by SOMAmers (Slow Off-rate Modified Aptamers), a new class of protein binding reagents that contain chemically modified nucleotides that greatly expand the physicochemical diversity of nucleic acid-based ligands. Such modifications introduce functional groups that are absent in natural nucleic acids but are often found in protein-protein, small molecule-protein, and antibody-antigen interactions. The use of these modifications expands the range of possible targets for SELEX (Systematic Evolution of Ligands by EXponential Enrichment), results in improved binding properties, and facilitates selection of SOMAmers with slow dissociation rates. Our assay works by transforming protein concentrations in a mixture into a corresponding DNA signature, which is then quantified on current commercial DNA microarray platforms. In essence, we take advantage of the dual nature of SOMAmers as both folded binding entities with defined shapes and unique nucleic acid sequences recognizable by specific hybridization probes. Currently, our assay is capable of simultaneously measuring 1,030 proteins, with sub-pM detection limits, an average dynamic range of > 3 logs for each analyte, an overall dynamic range of at least 7 logs, and a throughput of one million analytes per week. Our collection includes SOMAmers that specifically recognize most of the complement cascade proteins. We have used this assay to identify potential biomarkers in a range of diseases such as malignancies, cardiovascular disorders, and inflammatory conditions. In this chapter, we describe the application of our technology to discovering large-scale protein expression changes associated with chronic kidney disease and non-small cell lung cancer. With this new proteomics technology, which is fast, economical, highly scalable, and flexible, we now have a powerful tool that enables whole-proteome proteomics and biomarker discovery and advances the next generation of evidence-based, "personalized" diagnostics and therapeutics.
The recent development of microarray technology provided unprecedented opportunities to understand the genetic basis of aging. So far, many microarray studies have addressed aging-related expression patterns in multiple organisms and under different conditions. The number of relevant studies continues to increase rapidly. However, efficient exploitation of these vast data is frustrated by the lack of an integrated data mining platform or other unifying bioinformatic resource to enable convenient cross-laboratory searches of array signals. To facilitate the integrative analysis of microarray data on aging, we developed a web database and analysis platform ‘Gene Aging Nexus’ (GAN) that is freely accessible to the research community to query/analyze/visualize cross-platform and cross-species microarray data on aging. By providing the possibility of integrative microarray analysis, GAN should be useful in building the systems-biology understanding of aging. GAN is accessible at .