Joseph Larkman scite author profile

Joseph Larkman

3 Publications

41 Citation Statements Received

153 Citation Statements Given

Affiliations

University Hospitals Birmingham NHS Foundation Trust, University of Birmingham

Publications

A random forest based biomarker discovery and power analysis framework for diagnostics research

Acharjee, Larkman, et al. 2020. BMC Med Genomics

Background: Biomarker identification is one of the major goals of functional genomics and translational medicine studies. Large-scale omics data are increasingly being accumulated and can provide vital means for identifying biomarkers for the early diagnosis of complex disease and/or for advanced patient/disease stratification. These tasks are clearly interlinked, and it is essential that an unbiased and stable methodology is applied to address them. Although many biomarker identification approaches, primarily based on machine learning, have recently been developed, exploring potential associations between biomarker identification and the design of future experiments remains a challenge.

Methods: Using both simulated and published experimentally derived datasets, we assessed the performance of several state-of-the-art Random Forest (RF) based decision approaches: the Boruta method, permutation-based feature selection without correction, permutation-based feature selection with correction, and backward-elimination-based feature selection. We also conducted a power analysis to estimate the number of samples required for potential future studies.

Results: We present a number of RF-based stable feature selection methods and compare their performance on simulated and published experimentally derived datasets. Across all of the scenarios considered, Boruta was the most stable methodology, whilst the Permutation (Raw) approach offered the largest number of relevant features when allowed to stabilise over a number of iterations. Finally, we developed and made available a web interface (https://joelarkman.shinyapps.io/PowerTools/) to streamline power calculations, thereby aiding the design of potential future studies within a translational medicine context.

Conclusions: We developed an RF-based biomarker discovery framework and provide a web interface for it, termed PowerTools, that supports the design of appropriate and cost-effective future omics studies.
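The abstract does not say which power calculation PowerTools performs. As a rough illustration of what a sample-size estimate of this kind involves, the sketch below uses the standard normal-approximation formula for a two-sided, two-sample comparison of means, n per group = 2 * ((z_(1-alpha/2) + z_(power)) / d)^2. The function name `samples_per_group`, the standardized effect size `d`, and the default `alpha` and `power` values are assumptions made for this example; they are not taken from the paper or the tool.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate per-group sample size for a two-sided, two-sample test of means.

    Hypothetical helper: uses the normal approximation, not the method
    implemented in PowerTools (which the abstract does not specify).
    effect_size is the standardized mean difference (Cohen's d).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for a two-sided test
    z_power = std_normal.inv_cdf(power)          # quantile matching the target power

    # Round up: a fractional sample is not possible and rounding down loses power.
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Smaller effects need many more samples to reach the same power.
    for d in (0.2, 0.5, 0.8):
        print(f"d = {d}: {samples_per_group(d)} samples per group")
```

Because the required sample size scales with 1/d^2, halving the detectable effect size roughly quadruples the number of samples, which is why the choice of selected features (and their effect sizes) matters for study design.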

PowerTools: A web based user-friendly tool for future translational study design

Acharjee, Larkman, Cardoso, et al. 2020. Preprint

Effect of biomarker identification on power analysis for diagnostics research

Acharjee, Larkman, Cardoso, et al. 2020. Preprint

Background: Biomarker identification is one of the major goals of the functional genomics and translational medicine remits. Large-scale omics data are increasingly being accumulated and can provide vital means for identifying biomarkers for the early diagnosis of complex disease and/or patient/disease stratification for prospective studies. These tasks are clearly interlinked, and it is essential that an unbiased and stable methodology is applied to address them. Although many biomarker identification approaches, primarily based on machine learning, have recently been developed, exploring potential associations between biomarker identification and the design of future experiments remains a challenge.

Methods: Using both simulated and published experimentally derived (real) datasets, we compared the performance of four feature selection methods built on Random Forest (RF), a decision-tree-based machine learning approach: Boruta, permutation-based feature selection without correction, permutation-based feature selection with correction, and backward-elimination-based feature selection. We also conducted a power analysis to estimate the number of samples required for potential future studies, using the stable features derived in the previous step.

Results: We present a number of RF-based stable feature selection methods and compare their performance on simulated and published experimentally derived datasets. Across all of the scenarios considered, Boruta was the most stable methodology, whilst Permutation (Raw) offered the largest number of relevant features when allowed to stabilise over a number of iterations. Finally, we developed a web interface (https://joelarkman.shinyapps.io/PowerTools/) to streamline power calculations and aid future study design within a translational medicine context.

Conclusions: We developed a pipeline to discover biomarkers using RF methods. The web interface, "PowerTools", offers the potential for designing appropriate and cost-effective future omics studies.
