DNA methylation is a key epigenetic factor regulating gene expression. While promoter-associated methylation has been extensively studied, recent publications have revealed that functionally important methylation also occurs in intergenic and distal regions, and varies across genes and tissue types. Given the growing importance of inter-platform integrative genomic analyses, there is an urgent need for methods that construct gene-level methylation summaries accounting for the potentially complex relationships between methylation and expression. We introduce a novel sequential penalized regression approach to construct gene-specific methylation profiles (GSMPs), which identify, for each gene and tissue type, a sparse set of CpGs that best explain gene expression, together with weights indicating the direction and strength of association. Using TCGA and MD Anderson colorectal cohorts to build and validate our models, we demonstrate that our strategy explains expression variability better than standard approaches and produces gene-level scores showing key methylation differences across recently discovered colorectal cancer subtypes. We share an R Shiny app that presents GSMP results for colorectal, breast, and pancreatic cancer, with plans to extend it to all TCGA cancer types. Our approach yields tissue-specific, gene-specific sparse lists of functionally important CpGs that can be used to construct gene-level methylation scores that are maximally correlated with gene expression for use in integrative models, and produces a tissue-specific summary of which genes appear to be strongly regulated by methylation. Our results introduce an important resource to the biomedical community for integrative genomics analyses involving DNA methylation.

Integrative models like iBAG and iCluster require calculation of gene-level summaries for each genomic platform.
For a platform such as copy number, it is relatively easy to devise a reasonable strategy for computing gene-level summaries (e.g., the average copy number in the coding region of the gene), but such a simple strategy may not work well for platforms like methylation that affect expression in more complex and subtle ways. In the existing literature, we have encountered two strategies for constructing gene-level methylation summaries: (1) computing the average methylation level across all probes located in the gene's promoter region, or (2) using the methylation level of the single probe that appears to be most negatively correlated with gene expression. While reasonable, both of these strategies appear to be simplistic and could miss the most important epigenetic effects for a given gene. This can be seen in a study of methylation and expression of the genes EREG and AREG in colorectal cancer (CRC) that motivated this work (Lee et al. 2016). It was hypothesized that (a) higher gene expression of EREG and AREG, which encode the EGFR ligands epiregulin and amphiregulin, is associated with increased sensitivity to anti-EGFR therapy, and (b) this expression is largely modulated by methylation....
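To make the two standard gene-level summaries described earlier in this section concrete, the following is a minimal Python sketch of strategy (1), the promoter average, and strategy (2), the single most negatively correlated probe. It is an illustration only and not the GSMP method. The probe names, the methylation values, the expression values, and the function names `promoter_average` and `most_negative_probe` are all hypothetical and chosen for this toy example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def promoter_average(meth, promoter_probes):
    """Strategy (1): per-sample mean methylation over the promoter probes."""
    n_samples = len(meth[promoter_probes[0]])
    return [
        sum(meth[p][i] for p in promoter_probes) / len(promoter_probes)
        for i in range(n_samples)
    ]

def most_negative_probe(meth, expr):
    """Strategy (2): the probe whose correlation with expression is lowest."""
    return min(meth, key=lambda p: pearson(meth[p], expr))

# Toy data (invented): methylation per probe across four samples,
# and expression of one hypothetical gene in the same four samples.
meth = {
    "cg_promoter_1": [0.8, 0.6, 0.4, 0.2],
    "cg_promoter_2": [0.5, 0.5, 0.6, 0.4],
    "cg_distal_1": [0.9, 0.7, 0.3, 0.1],
}
expr = [1.0, 2.0, 4.0, 5.0]

avg_summary = promoter_average(meth, ["cg_promoter_1", "cg_promoter_2"])
best_probe = most_negative_probe(meth, expr)
print(avg_summary)
print(best_probe)
```

In this invented data set the probe selected by strategy (2) lies outside the promoter, so the promoter average in strategy (1) never sees it. Strategy (2), in turn, keeps one probe and discards the rest. This is the kind of limitation the text attributes to both approaches.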