Background
Repeat elements constitute a large proportion of the human genome, and recent evidence indicates that repeat element expression has functional roles in both physiological and pathological states. In cancer specifically, transcription of endogenous retrotransposons is often suppressed in order to attenuate an anti-tumor immune response, whereas aberrant expression of heterochromatin-derived satellite RNA has been identified as a tumor driver. These insights demonstrate separate functions for the dysregulation of distinct repeat subclasses in either the attenuation or progression of human solid tumors. For hematopoietic malignancies, such as acute myeloid leukemia (AML), very few studies have examined the expression or dysregulation of repeat elements.
Methods
To study the expression of repeat elements in AML, we performed total-RNA sequencing of healthy CD34+ cells and of leukemic blast cells from primary AML patient material. We also developed an integrative bioinformatic approach that quantifies the expression of repeat transcripts from all repeat subclasses (SINE/ALU, LINE and ERV elements and satellite repeats) in relation to the expression of gene and other non-repeat transcripts. This novel approach yields an instructive signature (the ‘rep/gene’ ratio) for repeat element expression and was extended to the analysis of poly(A)-RNA sequencing datasets from the Blueprint and TCGA consortia, which together comprise 120 AML patient samples.
Results
We found that repeat element expression is generally down-regulated during hematopoietic differentiation and that relative changes in repeat to gene expression (i.e. ‘rep/gene’ ratios) can stratify AML patients by risk and correlate with overall survival probabilities. A high repeat to gene expression ratio identifies AML patient subgroups with a favorable prognosis, whereas a low repeat to gene expression ratio is prevalent in AML patient subgroups with a poor prognosis.
Conclusions
We developed an integrative bioinformatic approach that defines a general model for the analysis of repeat element dysregulation in physiological and pathological development. We find that changes in repeat to gene expression (‘rep/gene’ ratios) correlate with hematopoietic differentiation and can sub-stratify AML patients into low-risk and high-risk subgroups. Thus, the ‘rep/gene’ expression ratio can serve as a valuable biomarker for AML and could also provide insights into differential patient response to epigenetic drug treatment.
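As an illustration of the ‘rep/gene’ ratio described in the Methods, the following minimal sketch computes the quantity for a single sample. It is not the pipeline used in this study: the abstract does not specify how reads are assigned or normalized, so the sketch assumes a simple per-feature count table and defines the ratio as the summed counts of repeat-derived features divided by the summed counts of all other (gene and non-repeat) features. The function name `rep_gene_ratio`, the feature identifiers, and the example counts are all hypothetical.

```python
from typing import Iterable, Mapping


def rep_gene_ratio(counts: Mapping[str, float], repeat_ids: Iterable[str]) -> float:
    """Return summed repeat counts divided by summed non-repeat counts.

    Assumption (not from the source): the ratio is a plain quotient of
    summed read counts, with no length or library-size normalization.
    """
    repeats = set(repeat_ids)
    # Counts from features annotated as repeats (SINE, LINE, ERV, satellite).
    rep_total = sum(c for feature, c in counts.items() if feature in repeats)
    # Counts from every other feature (genes and other non-repeat transcripts).
    gene_total = sum(c for feature, c in counts.items() if feature not in repeats)
    if gene_total == 0:
        raise ValueError("no non-repeat counts; ratio is undefined")
    return rep_total / gene_total


# Hypothetical counts for one sample; values are made up for illustration.
example_counts = {
    "AluY": 300, "L1HS": 100, "HERVK": 50, "GSAT": 50,  # repeat features
    "GAPDH": 1500, "ACTB": 500,                         # gene features
}
example_repeats = ["AluY", "L1HS", "HERVK", "GSAT"]

print(rep_gene_ratio(example_counts, example_repeats))  # 500 / 2000 = 0.25
```

Because the ratio is computed within each sample, it is a relative measure: a change can reflect altered repeat expression, altered gene expression, or both.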