Combination of Fetal Fraction Estimators Based on Fragment Lengths and Fragment Counts in Non-Invasive Prenatal Testing

The reliability of non-invasive prenatal testing (NIPT) depends heavily on accurate estimation of the fetal fraction. Several methods have been proposed to date, using different attributes of the analyzed genomic material, for example the length and genomic location of sequenced DNA fragments. These two sources of information are relatively unrelated, but so far there have been no published attempts to combine them into an improved predictor. We collected 2454 samples from women carrying a single euploid male fetus who were undergoing NIPT. Fetal fractions were calculated using several proposed predictors and the state-of-the-art SeqFF method, and the predictions were compared with the reference Y-based method. We demonstrate that prediction based on the length of sequenced DNA fragments may achieve nearly the same precision as state-of-the-art methods based on genomic location. We also show that combining several sample attributes leads to a predictor with higher prediction accuracy than any single approach. Finally, appropriate weighting of samples in the training process may achieve higher accuracy for samples with low fetal fraction and so make subsequent testing for genomic aberrations more reliable. We propose several improvements in fetal fraction estimation, with a special focus on the samples most prone to wrong conclusions.
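The idea of combining predictors with sample weighting can be pictured with a minimal sketch. It assumes a linear combination fitted by weighted least squares, with weights that grow as the reference fetal fraction shrinks. The abstract does not name the model or the weighting scheme, so both choices, along with every function name here, are hypothetical illustrations and not the authors' method.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_weighted_combination(features, reference, weights):
    """Weighted least squares over single-predictor estimates.

    features:  one tuple per sample, e.g. (length_based, location_based)
    reference: reference fetal fraction per sample (e.g. Y-based)
    weights:   per-sample weights (assumption: larger for low fetal fraction)
    Returns [intercept, coef_1, coef_2, ...].
    """
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    # Normal equations: (X^T W X) beta = X^T W y
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights))
             for j in range(k)] for i in range(k)]
    xtwy = [sum(w * r[i] * y for r, w, y in zip(rows, weights, reference))
            for i in range(k)]
    return solve_linear(xtwx, xtwy)


def predict(coefs, feature_row):
    """Combined fetal fraction estimate for one sample."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], feature_row))
```

In this sketch, passing `weights = [1.0 / y for y in reference]` would make errors on low-fraction samples count more during fitting; any other monotone scheme could be substituted.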

Automated prediction of the clinical impact of structural copy number variations

Gažiová

Sládeček

Pös

et al. 2022

Sci Rep

Copy number variants (CNVs) play an important role in many biological processes, including the development of genetic diseases, making them attractive targets for genetic analyses. Interpreting the effect of these structural variants is challenging because the number of genes, regulatory elements, and other genomic elements affected by a CNV is highly variable. This has created a demand for interpretation tools that would relieve researchers, laboratory diagnosticians, genetic counselors, and clinical geneticists of the laborious process of annotating and classifying CNVs. We designed and validated a prediction method (ISV; Interpretation of Structural Variants) that is based on boosted trees and takes into account annotations of CNVs from several publicly available databases. The presented approach achieved more than 98% prediction accuracy on both copy number loss and copy number gain variants, while also allowing CNVs to be assigned “uncertain” significance in predictions. We believe that ISV’s prediction capability and explainability have great potential to guide users to more precise interpretations and classifications of CNVs.

On the critical evaluation and confirmation of germline sequence variants identified using massively parallel sequencing

Kubiritova

Gyurászová

Nagyová

et al. 2019

Journal of Biotechnology

Mitochondrial DNA copy number changes, heteroplasmy, and mutations in plasma-derived exosomes and brain tissue of glioblastoma patients

Soltész

Pös

Wlachovska

et al. 2022

Molecular and Cellular Probes

Privacy-preserving storage of sequenced genomic data

et al. 2021

Background: Current and future applications of genomic data may raise ethical and privacy concerns. Processing and storing this data introduces a risk of abuse by potential offenders, since the human genome contains sensitive personal information. For this reason, we have developed a privacy-preserving method, named Varlock, that provides secure storage of sequenced genomic data. We used a public set of population allele frequencies to mask the personal alleles detected in genomic reads. Each personal allele described by the public set is masked by a randomly selected population allele with respect to its frequency. Masked alleles are preserved in an encrypted confidential file that can be shared in whole or in part using public-key cryptography. Results: Our method masked the personal variants and introduced new variants detected in a personal masked genome. Alternative alleles with lower population frequency were masked and introduced more often. We performed a joint PCA analysis of personal and masked VCFs, showing that the VCFs between the two groups cannot be trivially mapped. Moreover, the method is reversible, and personal alleles in specific genomic regions can be unmasked on demand. Conclusion: Our method masks personal alleles within genomic reads while preserving valuable non-sensitive properties of sequenced DNA fragments for further research. Personal alleles in the desired genomic regions may be restored and shared with patients, clinics, and researchers. We suggest that the method can provide an additional security layer for storing and sharing raw aligned reads.
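The masking step described in this abstract can be illustrated with a minimal sketch, assuming each site is reduced to a single observed allele and the public set is a per-site dictionary of allele frequencies. This is not the Varlock implementation: the function names and data layout are hypothetical, and the encryption of the confidential record with public-key cryptography is left out entirely.

```python
import random


def mask_allele(personal_allele, population_freqs, rng):
    """Draw a replacement allele at random, weighted by population frequency.

    Returns (masked_allele, personal_allele). The second item stands in for
    what would be written to the confidential record so masking is reversible.
    """
    alleles = list(population_freqs)
    weights = [population_freqs[a] for a in alleles]
    masked = rng.choices(alleles, weights=weights, k=1)[0]
    return masked, personal_allele


def mask_genotypes(personal, population, seed=None):
    """Mask every site that the public frequency set describes."""
    rng = random.Random(seed)
    masked, secret = {}, {}
    for site, allele in personal.items():
        freqs = population.get(site)
        if freqs is None:
            # Site not described by the public set: left unchanged.
            masked[site] = allele
            continue
        masked[site], secret[site] = mask_allele(allele, freqs, rng)
    return masked, secret


def unmask(masked, secret, sites):
    """Restore personal alleles only at the requested sites."""
    restored = dict(masked)
    for site in sites:
        if site in secret:
            restored[site] = secret[site]
    return restored
```

Because the replacement is drawn from population frequencies, a masked site may by chance keep the personal allele; in this sketch, reversibility depends only on the `secret` record, which is the part that would need to be encrypted in practice.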

Privacy preserving storage of sequenced genomic data

Hekel

Budiš

Kucharík

et al. 2020

Preprint

Introduction: Current and future applications of genomic data may raise ethical and privacy concerns. Processing and storing genomic data introduces a risk of abuse by a potential adversary, since the human genome contains information about sensitive personal traits. For this reason, we developed a privacy-preserving method, called Varlock, for secure storage and dissemination of sequenced genomic data. Materials and methods: Varlock uses a set of population allele frequencies to mask personal alleles detected in genomic reads. Each detected allele is replaced by a randomly selected population allele with respect to its frequency. Masked alleles are preserved in an encrypted confidential file that can be shared, in whole or in part, using public-key cryptography. Results: Our method masked personal variants and introduced new variants called on an individual’s genome, while alternative alleles with lower population frequency were masked and introduced more often. We performed a joint PCA analysis of personal and masked VCFs, showing that the VCFs between the two groups cannot be trivially mapped. Moreover, the method is reversible; therefore, personal alleles can be unmasked in specific genomic regions on demand. Conclusion: Our method masks personal alleles within mapped reads while preserving valuable non-sensitive properties of sequenced DNA fragments for further research. Accordingly, masked reads can be stored publicly, since they are stripped of sensitive personal information. Personal alleles may be restored in arbitrary genomic regions for interested parties: patients, medical units, and researchers.

Evaluation and Limitations of Different Approaches Among COVID-19 Fatal Cases Using Whole-exome Sequencing Data

Forgacova

Holesova

Hekel

et al. 2022

Preprint

Background: COVID-19, caused by SARS-CoV-2 infection, may result in various disease symptoms and severity, ranging from asymptomatic, through mild, up to very severe and fatal cases. Although environmental, clinical, and social factors play important roles in both susceptibility to SARS-CoV-2 infection and COVID-19 disease progression, it is becoming evident that both pathogen and host genetic factors are important too. Here we report whole-exome sequencing (WES) findings of 27 individuals who died as a result of COVID-19, focusing especially on frequencies of DNA variants in genes previously associated with SARS-CoV-2 infection and COVID-19 severity. Results: We selected risk DNA variants/alleles or target genes using four different approaches: 1) aggregated GWAS results from the GWAS Catalog; 2) selected publications from PubMed; 3) the aggregated results of the Host Genetics Initiative (HGI) database; and 4) a commercial DNA variant annotation/interpretation tool providing its own knowledgebase. We divided these variants/genes into those reported to influence the susceptibility to SARS-CoV-2 infection and those influencing COVID-19 severity. Based on these, we compared frequencies of alleles among the fatal COVID-19 cases to frequencies identified in two population control datasets (the non-Finnish European population from the gnomAD database and genomic frequencies specific for the Slovak population from our own database). Our comparisons delineated a trend of higher frequencies of severe COVID-19 associated risk alleles among fatal COVID-19 cases, when compared to both control population datasets. This trend reached statistical significance specifically when using the HGI-derived variant list. We also analyzed other approaches to WES data evaluation, showing their usage as well as their limitations.
Conclusions: Although our results support the likely involvement of host genetic factors pointed out by previous studies of COVID-19 disease severity, careful consideration of the molecular-testing strategies and the evaluated genomic positions may have a strong impact on the utility of genomic testing.

Evaluation and limitations of different approaches among COVID-19 fatal cases using whole-exome sequencing data

et al. 2023

Background: COVID-19, caused by the SARS-CoV-2 infection, may result in various disease symptoms and severity, ranging from asymptomatic, through mildly symptomatic, up to very severe and even fatal cases. Although environmental, clinical, and social factors play important roles in both susceptibility to the SARS-CoV-2 infection and progress of COVID-19 disease, it is becoming evident that both pathogen and host genetic factors are important too. In this study, we report findings from whole-exome sequencing (WES) of 27 individuals who died due to COVID-19, focusing especially on frequencies of DNA variants in genes previously associated with the SARS-CoV-2 infection and the severity of COVID-19. Results: We selected the risk DNA variants/alleles or target genes using four different approaches: 1) aggregated GWAS results from the GWAS Catalog; 2) selected publications from PubMed; 3) the aggregated results of the Host Genetics Initiative (HGI) database; and 4) a commercial DNA variant annotation/interpretation tool providing its own knowledgebase. We divided these variants/genes into those reported to influence the susceptibility to the SARS-CoV-2 infection and those influencing the severity of COVID-19. Based on the above, we compared the frequencies of alleles found in the fatal COVID-19 cases to the frequencies identified in two population control datasets (the non-Finnish European population from the gnomAD database and genomic frequencies specific for the Slovak population from our own database). When compared to both control population datasets, our analyses indicated a trend of higher frequencies of severe COVID-19 associated risk alleles among fatal COVID-19 cases. This trend reached statistical significance specifically when using the HGI-derived variant list. We also analysed other approaches to WES data evaluation, demonstrating their utility as well as their limitations.
Conclusions: Although our results support the likely involvement of host genetic factors pointed out by previous studies looking into the severity of COVID-19 disease, careful consideration of the molecular-testing strategies and the evaluated genomic positions may have a strong impact on the utility of genomic testing.

Rastislav Hekel

Combination of Fetal Fraction Estimators Based on Fragment Lengths and Fragment Counts in Non-Invasive Prenatal Testing
