Background: Hepatocellular carcinoma (HCC) is a common primary liver cancer with poor overall survival. We hypothesized that there are HCC-associated cell-types that impact patient survival. Methods: We combined liver single nucleus (snRNA-seq), single cell (scRNA-seq), and bulk RNA-sequencing (RNA-seq) data to search for cell-type differences in HCC. To first identify cell-types in HCC, adjacent non-tumor tissue, and normal liver, we integrated single-cell level data from a healthy liver cohort (n = 9 non-HCC samples) collected in the Strasbourg University Hospital; an HCC cohort (n = 1 non-HCC, n = 14 HCC-tumor, and n = 14 adjacent non-tumor samples) collected in the Singapore General Hospital and National University; and another HCC cohort (n = 3 HCC-tumor and n = 3 adjacent non-tumor samples) collected in the Dumont-UCLA Liver Cancer Center. We then leveraged these single cell level data to decompose the cell-types in liver bulk RNA-seq data from HCC patients’ tumor (n = 361) and adjacent non-tumor tissue (n = 49) from the Cancer Genome Atlas (TCGA) multi-center cohort. For replication, we decomposed 221 HCC and 209 adjacent non-tumor liver microarray samples from the Liver Cancer Institute (LCI) cohort collected by the Liver Cancer Institute and Zhongshan Hospital of Fudan University. Results: We discovered a tumor-associated proliferative cell-type, Prol (80.4% tumor cells), enriched for cell cycle and mitosis genes. In the liver bulk tissue from the TCGA cohort, the proportion of the Prol cell-type is significantly increased in HCC and associates with a worse overall survival. Independently from our decomposition analysis, we reciprocally show that Prol nuclei/cells significantly over-express both tumor-elevated and survival-decreasing genes obtained from the bulk tissue. Our replication analysis in the LCI cohort confirmed that an increased estimated proportion of the Prol cell-type in HCC is a significant marker for a shorter overall survival. Finally, we show that somatic mutations in the tumor suppressor genes TP53 and RB1 are linked to an increase of the Prol cell-type in HCC. Conclusions: By integrating liver single cell, single nucleus, and bulk expression data from multiple cohorts we identified a proliferating cell-type (Prol) enriched in HCC tumors, associated with a decreased overall survival, and linked to TP53 and RB1 somatic mutations.
Modern data-driven research increasingly depends on quantitative analysis, yet effective mechanisms for ensuring the transparency and reproducibility of data and analyses have yet to be developed and widely adopted. The scientific community widely recognizes the importance and benefits of sharing research products. In biomedical research, it is imperative not only to publish a detailed description of the study design, methodology, results, and interpretation, but also to make all research products publicly available, shareable, and well documented to increase transparency and reproducibility. Current efforts to share research products rely mostly on individual researchers and are widely but variably enforced by these individuals and research organizations. However, a growing body of evidence in recent years points to a reproducibility problem across scientific disciplines: published results often contain analyses that cannot be replicated because the documentation, code, and data required to reproduce them are missing. Our results indicate that only 36% of the scientific manuscripts published in prominent biomedical journals share raw data and 9% share code. We hope that our analysis informs the biomedical community and encourages it to design effective strategies that researchers can widely adopt to improve the current state of transparency and reproducibility in data-driven biomedical research.
Data-driven computational analysis is becoming increasingly important in biomedical research as the amount of data being generated continues to grow. However, the lack of established practices for sharing research outputs, such as data, source code, and methods, undermines the transparency and reproducibility of studies, both of which are critical to the advancement of science. Many published studies are not reproducible because insufficient documentation, code, and data are shared. We conducted a comprehensive analysis of 453 manuscripts published between 2016 and 2021 and found that 50.1% of them fail to share the analytical code. Even among those that did disclose their code, the vast majority failed to offer additional research outputs, such as data. Furthermore, only one in ten papers organized their code in a structured and reproducible manner. We found a significant association between the presence of a code availability statement and increased code availability (p = 2.71 × 10⁻⁹). Additionally, a greater proportion of studies conducting secondary analyses shared their code compared to those conducting primary analyses (p = 1.15 × 10⁻⁷). In light of our findings, we propose raising awareness of code sharing practices and taking immediate steps to enhance code availability to improve reproducibility in biomedical research. By increasing transparency and reproducibility, we can promote scientific rigor, encourage collaboration, and accelerate scientific discoveries. We must prioritize open science practices, including sharing code, data, and other research products, to ensure that biomedical research can be replicated and built upon by others in the scientific community.
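An association such as the one reported between code availability statements and code sharing is typically assessed on a 2×2 contingency table. The abstract does not say which test the authors used, so the following is only a minimal sketch of one common choice, a two-sided Fisher exact test, written with the Python standard library. The function name and the counts in the usage example are hypothetical placeholders, not the study's data.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows could be "has a code availability statement" / "has none",
    columns "shares code" / "does not share code" (an assumed layout).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    p_value = fisher_exact_two_sided(8, 2, 1, 5)
    print(f"two-sided p = {p_value:.4f}")
```

With tables of a few hundred manuscripts the exact test remains cheap to compute; a chi-squared test would be a reasonable alternative at that sample size.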
Twitter is one of the most popular microblogging and social networking services, where users can post, retweet, comment, and engage in collaborative discussions. However, improper use of Twitter can be detrimental to science and can even harm mental health. Analyzing the tweets and Twitter data of researchers can therefore help identify appropriate ways of using Twitter to advance research careers. Existing literature has analyzed the activity of scientists on Twitter, including the relationship between Twitter mentions and article citations, the benefits of Twitter for developing and distributing scientific knowledge, metrics relevant to predicting highly cited articles, and the type of content that researchers tweet. For example, Eysenbach et al. analyzed tweets and citations and how citations can be predicted from tweets. Most existing studies analyzed a limited number of researchers, which compromises the generalizability of their results. In our study, we took a comprehensive and systematic approach, using data-driven methods to analyze 167,000 scientists who published research papers on PubMed. We examined parameters such as number of followers, number of friends, citation count, and K-index in light of the gender, ancestry, and profession of the researchers.
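The abstract does not define the K-index. Assuming it refers to the Kardashian index proposed by Hall (2014), which compares a researcher's actual follower count with the count expected from their citations (F = 43.3 · C^0.32), it can be sketched as follows. The function names and the input values are illustrative and not taken from the study.

```python
def expected_followers(citations: int) -> float:
    """Follower count expected from a citation count under Hall's fit."""
    return 43.3 * citations ** 0.32


def k_index(followers: int, citations: int) -> float:
    """Ratio of actual to expected followers (assumed Kardashian index)."""
    if citations <= 0:
        # The expected follower count is zero or undefined without citations.
        raise ValueError("citations must be positive")
    return followers / expected_followers(citations)


if __name__ == "__main__":
    # Hypothetical researcher: 2,000 followers and 500 citations.
    print(round(k_index(2000, 500), 2))
```

Under this definition a value near 1 means the follower count is in line with the citation record, while larger values indicate a social media profile that outpaces it.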