Charlotte M. Brannon scite author profile

Background: As pharmacogenomics data becomes increasingly integral to clinical treatment decisions, appropriate data storage and sharing protocols need to be adopted. One promising option for secure, high-integrity storage and sharing is Ethereum smart contracts. Ethereum is a blockchain platform, and smart contracts are immutable pieces of code running on virtual machines in this platform that can be invoked by a user or another contract (in the blockchain network). The 2019 iDASH (Integrating Data for Analysis, Anonymization, and Sharing) competition for Secure Genome Analysis challenged participants to develop time- and space-efficient Ethereum smart contracts for gene-drug relationship data. Methods: Here we design a specific smart contract to store and query gene-drug interactions in Ethereum using an index-based, multi-mapping approach. Our contract stores each pharmacogenomics observation, a gene-variant-drug triplet with outcome, in a mapping searchable by a unique identifier, allowing for time- and space-efficient storage and query. This solution ranked in the top three at the 2019 iDASH competition. We further improve our "challenge solution" and develop an alternate "fastQuery" smart contract, which combines identical gene-variant-drug combinations into a single storage entry, leading to significantly better scalability and query efficiency. Results: On a private, proof-of-authority network, both our challenge and fastQuery solutions exhibit approximately linear memory and time usage for inserting into and querying small databases (<1,000 entries). For larger databases (1,000 to 10,000 entries), fastQuery maintains this scaling. Furthermore, both solutions can query by a single field ("0-AND") or a combination of fields ("1- or 2-AND"). Specifically, the challenge solution can complete a 2-AND query from a small database (100 entries) in 35 ms using 0.1 MB of memory. For the same query, fastQuery has a 2-fold improvement in time and a 10-fold improvement in memory. Conclusion: We show that pharmacogenomics data can be stored and queried efficiently using the Ethereum blockchain. Our solutions could potentially be used to store a range of clinical data and extended to other fields requiring high-integrity data storage and efficient access.
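The index-based, multi-mapping idea in this abstract can be illustrated off-chain with a minimal Python sketch. It is not the authors' Solidity contract: the class names (`ObservationStore`, `GroupedStore`), the field types, and the placeholder values are all assumptions made for illustration. Each observation is stored under a unique identifier, per-field indexes map a value to the identifiers that contain it, and a multi-field query intersects those identifier sets. The second class mimics the grouping described for fastQuery by keeping one entry per gene-variant-drug triplet and tallying outcomes.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    gene: str
    variant: str
    drug: str
    outcome: str


class ObservationStore:
    """One entry per observation, plus one index per searchable field (assumed layout)."""

    FIELDS = ("gene", "variant", "drug")

    def __init__(self):
        self._by_id = {}  # unique identifier -> Observation
        self._next_id = 0
        # field name -> field value -> set of identifiers
        self._index = {f: defaultdict(set) for f in self.FIELDS}

    def insert(self, gene, variant, drug, outcome):
        obs_id = self._next_id
        self._next_id += 1
        obs = Observation(gene, variant, drug, outcome)
        self._by_id[obs_id] = obs
        for f in self.FIELDS:
            self._index[f][getattr(obs, f)].add(obs_id)
        return obs_id

    def query(self, gene=None, variant=None, drug=None):
        """Return observations matching every field that is not None."""
        wanted = {"gene": gene, "variant": variant, "drug": drug}
        filters = {f: v for f, v in wanted.items() if v is not None}
        if not filters:
            return [self._by_id[i] for i in sorted(self._by_id)]
        # .get avoids creating empty index entries for unseen values
        id_sets = [self._index[f].get(v, set()) for f, v in filters.items()]
        matching = set.intersection(*id_sets)
        return [self._by_id[i] for i in sorted(matching)]


class GroupedStore:
    """One entry per gene-variant-drug triplet, with outcome tallies (assumed layout)."""

    def __init__(self):
        self._entries = defaultdict(Counter)  # (gene, variant, drug) -> outcome counts

    def insert(self, gene, variant, drug, outcome):
        self._entries[(gene, variant, drug)][outcome] += 1

    def entry_count(self):
        return len(self._entries)

    def outcomes(self, gene, variant, drug):
        return dict(self._entries.get((gene, variant, drug), Counter()))


if __name__ == "__main__":
    # Placeholder values only; these are not real pharmacogenomics records.
    store = ObservationStore()
    store.insert("GENE_A", "v1", "DRUG_X", "improved")
    store.insert("GENE_A", "v1", "DRUG_X", "unchanged")
    store.insert("GENE_B", "v2", "DRUG_X", "improved")
    print(store.query(gene="GENE_A", drug="DRUG_X"))
```

In this sketch the grouped variant stores fewer entries whenever the same triplet is observed repeatedly, which is the intuition behind the scalability claim. The timings and memory figures reported in the abstract come from the authors' Ethereum contracts and say nothing about this Python illustration.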


Functional genomics data: privacy risk assessment and technological mitigation

Gürsoy, Liu, et al. 2021. Nat Rev Genet


Storing and analyzing a genome on a blockchain

Gürsoy, Brannon, Wagner, et al. 2020. Preprint


The genomic characterization of individuals promises to be immensely useful for medical research. Moreover, sequencing, analysis, and interpretation of patients' genomes are projected to be a staple of healthcare in the future. A critical barrier to expanding personal genome sequencing is the difficulty of storing genomic data securely and with high integrity. While cloud storage offers solutions to access such data from any place and device, its security, data integrity, and robustness vulnerabilities, such as single-point-of-failure losses, have not yet been addressed. Here, we developed novel tools for decentralized storage, access, and analysis of genome sequencing data on private blockchain networks. Storing and analyzing large-scale data on a blockchain can be challenging because of the slow transaction speed and limitations on querying data stored on-chain. Hence, current genomic blockchain applications only log links to the data. We overcome this challenge by implementing data compression techniques and nested database indexing. Our tools provide open-source blockchain-based storage and access for advanced genomic analyses such as variant calling.


Storing and analyzing a genome on a blockchain

Gürsoy, Brannon, et al. 2022. Genome Biol


There are major efforts underway to make genome sequencing a routine part of clinical practice. A critical barrier to these efforts is the lack of practical solutions for data ownership and integrity. Blockchain provides solutions to these challenges in other realms, such as finance. However, its use in genomics is stymied by the difficulty of storing large-scale data on-chain, slow transaction speeds, and limitations on querying. To overcome these roadblocks, we developed a private blockchain network to store genomic variants and reference-aligned reads on-chain. It uses nested database indexing with an accompanying tool suite to rapidly access and analyze the data.


Division of labor in bacteria

Brannon, Ackermann. 2018


Privacy-preserving genotype imputation with fully homomorphic encryption

Gürsoy, Chielle, Brannon, et al. 2022. Cell Systems


Recurrent repeat expansions in human cancer genomes

Erwin, Gürsoy, Al-Abri, et al. 2022. Nature


Expansion of a single repetitive DNA sequence, termed a tandem repeat (TR), is known to cause more than 50 diseases [1,2]. However, repeat expansions are often not explored beyond neurological and neurodegenerative disorders. In some cancers, mutations accumulate in short tracts of TRs, a phenomenon termed microsatellite instability; however, larger repeat expansions have not been systematically analysed in cancer [3–8]. Here we identified TR expansions in 2,622 cancer genomes spanning 29 cancer types. In seven cancer types, we found 160 recurrent repeat expansions (rREs), most of which (155/160) were subtype specific. We found that rREs were non-uniformly distributed in the genome with enrichment near candidate cis-regulatory elements, suggesting a potential role in gene regulation. One rRE, a GAAA-repeat expansion, located near a regulatory element in the first intron of UGT2B7 was detected in 34% of renal cell carcinoma samples and was validated by long-read DNA sequencing. Moreover, in preliminary experiments, treating cells that harbour this rRE with a GAAA-targeting molecule led to a dose-dependent decrease in cell proliferation. Overall, our results suggest that rREs may be an important but unexplored source of genetic variation in human cancer, and we provide a comprehensive catalogue for further study.


