Aiden Smith scite author profile

Aiden Smith

3 Publications

23 Citation Statements Received

105 Citation Statements Given

Affiliations

University of Leicester

Publications

Understanding the impact of sex and stage differences on melanoma cancer patient survival: a SEER-based study

2020

Background This paper investigates the difference in survival of melanoma patients across stage and sex by utilising net survival measures. Metrics are presented at both the individual and population level. Methods Flexible parametric models were fitted to estimate life-expectancy metrics to be applied to a group of 104,938 subjects with a melanoma skin cancer diagnosis from 2000 to 2017. Period analysis was used for better predictions for newly diagnosed patients, and missing-stage information was imputed for 9918 patients. Female relative survival was assigned to male subjects to demonstrate the survival discrepancies experienced between sexes. Results At the age of 60, males diagnosed at the regional stage lose an average of 4.99 years of life compared to the general population, and females lose 4.79 years, demonstrating the sex variation in expected mortality. In 2017, males contributed 3545 more life years lost than females, and a potential 1931 life years could be preserved if sex differences in survival were eliminated. Conclusions This study demonstrates the survival differences across population subgroups as a result of a melanoma cancer diagnosis. Females experience better prognosis across age and stage at diagnosis; however, further investigation is necessary to better understand the mechanisms behind this difference.

Generating high-fidelity synthetic time-to-event datasets to improve data transparency and accessibility

Smith, Lambert, Rutherford

2022

BMC Med Res Methodol

Background A lack of available data and statistical code being published alongside journal articles provides a significant barrier to open scientific discourse, and reproducibility of research. Information governance restrictions inhibit the active dissemination of individual level data to accompany published manuscripts. Realistic, high-fidelity time-to-event synthetic data can aid in the acceleration of methodological developments in survival analysis and beyond by enabling researchers to access and test published methods using data similar to that which they were developed on. Methods We present methods to accurately emulate the covariate patterns and survival times found in real-world datasets using synthetic data techniques, without compromising patient privacy. We model the joint covariate distribution of the original data using covariate specific sequential conditional regression models, then fit a complex flexible parametric survival model from which to generate survival times conditional on individual covariate patterns. We recreate the administrative censoring mechanism using the last observed follow-up date information from the initial dataset. Metrics for evaluating the accuracy of the synthetic data, and the non-identifiability of individuals from the original dataset, are presented. Results We successfully create a synthetic version of an example colon cancer dataset consisting of 9064 patients which aims to show good similarity to both covariate distributions and survival times from the original data, without containing any exact information from the original data, therefore allowing them to be published openly alongside research. Conclusions We evaluate the effectiveness of the methods for constructing synthetic data, as well as providing evidence that there is minimal risk that a given patient from the original data could be identified from their individual unique patient information. 
Synthetic datasets using this methodology could be made available alongside published research without breaching data privacy protocols, and allow for data and code to be made available alongside methodological or applied manuscripts to greatly improve the transparency and accessibility of medical research.
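The three steps described in the Methods above (sequential conditional covariate models, survival times generated conditional on covariates, and administrative censoring) can be illustrated with a minimal Python sketch. This is not the authors' implementation: a Weibull proportional hazards model stands in for the flexible parametric survival model, a single fixed follow-up horizon stands in for censoring based on last observed follow-up dates, and every parameter value and the function name `simulate_patient` are hypothetical.

```python
import math
import random


def simulate_patient(rng, admin_censor_time=10.0):
    """Generate one synthetic patient record (illustrative parameters only)."""
    # Step 1: sequential conditional covariate models.
    # Sex is drawn first, then age is drawn conditional on sex.
    sex = 1 if rng.random() < 0.5 else 0
    age = rng.gauss(70.0 if sex == 1 else 68.0, 10.0)

    # Step 2: survival time conditional on the covariates.
    # Assumed Weibull PH model: S(t) = exp(-base_rate * exp(log_hr) * t**shape).
    # Solving S(t) = u for t gives the inverse-transform draw below.
    shape, base_rate = 1.2, 0.05
    log_hr = 0.03 * (age - 70.0) + 0.2 * sex
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is always defined
    time = (-math.log(u) / (base_rate * math.exp(log_hr))) ** (1.0 / shape)

    # Step 3: administrative censoring at a fixed follow-up horizon.
    event = 1 if time <= admin_censor_time else 0
    return {
        "sex": sex,
        "age": age,
        "time": min(time, admin_censor_time),
        "event": event,
    }


rng = random.Random(0)
synthetic = [simulate_patient(rng) for _ in range(200)]
```

Each record holds the simulated covariates, the observed follow-up time, and an event indicator (1 for an observed event, 0 for administrative censoring). In a fuller version, the covariate and survival model parameters would be estimated from the original data rather than fixed by hand.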

Improving Data Transparency and Accessibility in the Research Community through the Construction of Accurately Simulated Time-to-Event Datasets

Smith, Lambert, Rutherford

2021

Preprint

Background A lack of availability of data and statistical code being published alongside journal articles provides a significant barrier to open scientific discourse, and reproducibility of research. Information governance restrictions inhibit the active dissemination of individual level data to accompany published manuscripts. Realistic, accurate time-to-event synthetic data can aid in the acceleration of methodological developments in survival analysis and beyond by enabling researchers to access and test published methods using data similar to that which they were developed on. Methods This paper presents methods to accurately replicate the covariate patterns and survival times found in real-world datasets using simulation techniques, without compromising individual patient identifiability. We model the joint covariate distribution of the original data using covariate specific sequential conditional regression models, then fit a complex flexible parametric survival model from which to simulate survival times conditional on individual covariate patterns. We recreate the administrative censoring mechanism using the last observed follow-up date information from the initial dataset. Metrics for evaluating the accuracy of the synthetic data, and the non-identifiability of individuals from the original dataset, are presented. Results We successfully create a synthetic version of an example colon cancer dataset consisting of 9064 patients which aims to show good similarity to both covariate distributions and survival times from the original data, without containing any exact information from the original data, therefore allowing them to be published openly alongside research. Conclusions We evaluate the effectiveness of the simulation methods for constructing synthetic data, as well as providing evidence that it is almost impossible that a given patient from the original data could be identified from their individual unique date information.
Simulated datasets using this methodology could be made available alongside published research without breaching data privacy protocols, and allow for data and code to be made available alongside methodological or applied manuscripts to greatly improve the transparency and accessibility of medical research.