Smoking accounts for approximately 80–90% of lung cancer cases, and lung cancer is the most frequent cause of cancer-related death in humans. Tobacco smoke contains over 60 carcinogens, and cells dividing at the time of carcinogen exposure are at particular risk of neoplasia. The present study aimed to investigate global gene expression differences between lung adenocarcinoma (LUAD) tumour samples of current smokers and non-smokers, in an attempt to elucidate the biological mechanisms underlying divergent smoking effects. Tumour samples from current smokers and non-smokers were analysed using bioinformatics tools to examine differences in molecular drivers of cancer initiation and progression, and to evaluate the effect of smoking and sex on epithelial-mesenchymal transition (EMT). We identified 1150 differentially expressed genes with clearly distinct expression profiles between the smoking subgroups. These genes were primarily involved in the cell cycle, DNA replication and DNA repair, and in the VEGF, GnRH, ErbB and T cell receptor signalling pathways. Our results show that smoking clearly affected E2F transcriptional activity and DNA repair pathways, including mismatch repair, base excision repair and homologous recombination. We observed that sex could modify the effects of PLA2G2A and PRG4 in LUAD tumour samples, whereas sex and smoking status might have a biological effect on the EMT-related genes HEY2, OLFM1, SFRP1 and STRAP. We also identified potential epigenetic changes that smoking alone might induce in EMT-related genes, which may serve as potential diagnostic and prognostic biomarkers for LUAD patients.
Electronic supplementary material
The online version of this article (10.1007/s13353-020-00569-1) contains supplementary material, which is available to authorized users.