2021
DOI: 10.1101/2021.02.08.430070
Preprint

On the application of BERT models for nanopore methylation detection

Abstract: DNA methylation is a common epigenetic modification, which is widely associated with various biological processes, such as gene expression, aging, and disease. Nanopore sequencing provides a promising methylation detection approach through monitoring abnormal signal shifts for detecting modified bases in target motif regions. Recently, model-based approaches, especially those with deep learning models, have achieved significant performance improvements on nanopore methylation detection. In this work, we explo…

Cited by 4 publications (6 citation statements)
References 8 publications

“…Several different versions of nanopore chemistry have been developed by ONT to improve the accuracy of single-molecule sequencing (Fig. 1A [9, 12–23]). Both the first pore version, termed R6 ("R" for Reader), and the subsequent R7 pore series yielded high error rates and only mediocre accuracy [11].…”
Section: Introduction (mentioning)
confidence: 99%
“…Specifically, the electric current patterns, also known as "squiggles," resulting from the passage of modified bases through the pores differs from the patterns produced by the passage of unmodified bases [26, 30]. The difference can be determined after nanopore read basecalling and alignment by (1) statistical tests comparing the electric current pattern to an in silico reference or the pattern from a nonmodified control sample [20, 31]; (2) pre-trained supervised learning models, e.g., neural network [23, 32–37], machine learning model [38], and Hidden Markov Models (HMM) [9, 39]. However, DNA-methylation detection using nanopore sequencing presents a methodological challenge, i.e., the capacity to detect modifications in different CpGs that are in close proximity to one another on a DNA fragment (i.e., nonsingleton), as it is assumed that all CpGs within a 10-bp region share the same methylation status.…”
Section: Introduction (mentioning)
confidence: 99%
“…Here, we phrase DNA methylation-site detection as a Natural Language Processing (NLP) problem and propose a novel framework to address it. Previous studies for identifying methylation sites usually use BERT, a classic NLP approach, or, in the context of DNA sequences, the variant DNABERT (36), either as a model that accepts embeddings from Word2vec, or as an encoder that generates embeddings for input to a deep neural network (23, 25, 32, 33, 37).…”
Section: Introduction (mentioning)
confidence: 99%