This study aimed to explore the prognostic impact of the spatial distribution of tumor‐infiltrating lymphocytes (TILs), quantified by deep learning (DL) approaches on digitized whole‐slide images stained with hematoxylin and eosin, in patients with colorectal cancer (CRC). The prognostic impact of the spatial distribution of TILs in patients with CRC was explored in the Yonsei cohort (n = 180) and validated in The Cancer Genome Atlas (TCGA) cohort (n = 268). Two experienced pathologists manually graded TILs at the most invasive margin (IM) on a scale of 0–3 using the Klintrup–Mäkinen (KM) grading method, and these grades were compared with the DL approaches. Inter‐rater agreement for TILs was measured using Cohen's kappa coefficient. On multivariate analysis of spatial TIL features derived by DL approaches and clinicopathological variables including tumor stage, microsatellite instability, and KRAS mutation, TIL density within 200 μm of the IM (f_im200) remained the most significant prognostic factor for progression‐free survival (PFS) (hazard ratio [HR] 0.004 [95% confidence interval, CI, 0.0001–0.15], p = 0.0028) in the Yonsei cohort. On multivariate analysis using the TCGA dataset, f_im200 retained prognostic significance for PFS (HR 0.031 [95% CI 0.001–0.645], p = 0.024). Inter‐rater agreement of manual KM grading was low in both the Yonsei (κ = 0.109) and the TCGA (κ = 0.121) cohorts. Survival analysis based on KM grading showed a statistically significant difference in PFS in the TCGA cohort, but not in the Yonsei cohort. Automatic quantification of TILs at the IM based on DL approaches shows prognostic utility for predicting PFS and could provide robust and reproducible TIL density measurement in patients with CRC.
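As an illustration of the agreement statistic mentioned above, the following is a minimal sketch of unweighted Cohen's kappa for two raters assigning categorical grades such as KM scores of 0–3. The function name and the example grades are hypothetical and are not taken from the study; the abstract does not state whether a weighted variant was used, so the unweighted form is an assumption.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of categorical grades."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must grade the same non-empty set of cases")
    n = len(rater_a)

    # Observed agreement: fraction of cases where both raters gave the same grade.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal frequency per grade.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[g] * count_b[g] for g in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical grade for every case.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical KM grades (0-3) from two raters on eight slides.
grades_a = [0, 1, 2, 3, 1, 2, 0, 3]
grades_b = [0, 1, 2, 3, 2, 1, 0, 3]
kappa = cohens_kappa(grades_a, grades_b)
```

Kappa corrects the raw agreement rate for the agreement expected by chance, which is why values near 0.1, as reported for manual KM grading, indicate agreement only slightly better than chance.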
Tumor mutational burden (TMB) is a genomic biomarker that can help identify patients most likely to benefit from immunotherapy across a wide range of tumor types, including bladder cancer. DNA sequencing, such as whole exome sequencing (WES), is typically used to determine the number of acquired mutations in the tumor. However, WES is expensive, time-consuming, and not applicable to all patients, and is therefore difficult to incorporate into clinical practice. This study investigates the feasibility of predicting TMB in bladder cancer patients from histological image features. We design an automated whole slide image analysis pipeline that predicts bladder cancer patient TMB from histological features extracted using transfer learning on deep convolutional networks. The pipeline is evaluated on a large, publicly available histopathology image dataset for a cohort of 253 patients with bladder cancer obtained from The Cancer Genome Atlas (TCGA) project. Experimental results show that our technique provides over 73% classification accuracy and an area under the receiver operating characteristic curve of 0.75 in distinguishing low and high TMB patients. In addition, the predicted low and high TMB patients have statistically different survival (p = 0.047). Our results suggest that bladder cancer patient TMB is predictable from histological image features derived from digitized H&E slides. Our method is extensible to histopathology images of other organs for predicting patient clinical outcomes.
A growing issue in modern cyberspace is the direct identification of malicious activity over network connections. The boom of the machine learning industry in the past few years has led to increasing use of machine learning technologies, which are especially prevalent in the network intrusion detection research community. In applying these fairly contemporary techniques, the community has realized that datasets are pivotal for identifying malicious packets and connections, particularly datasets that carry the label information needed to construct learning models. However, there is a shortage of relevant, publicly available datasets for researchers in the network intrusion detection community. Thus, in this paper, we introduce a method to construct labeled flow data by combining packet meta-information with IDS logs to infer labels for intrusion detection research. Specifically, we designed a NetFlow-compatible format because a large body of network devices, such as routers and switches, can export NetFlow records from raw traffic. In doing so, the introduced method would help researchers access relevant network flow datasets along with label information.
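The abstract does not give the details of the labeling procedure, so the following is only a minimal sketch of one plausible way to combine flow records with IDS logs: a flow inherits the signature of an alert that shares its 5-tuple and falls within its time window. The `Flow` and `Alert` records, their fields, and the matching rule are all hypothetical assumptions, not the authors' format or method.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical 5-tuple: (src_ip, src_port, dst_ip, dst_port, protocol).
FlowKey = Tuple[str, int, str, int, str]


@dataclass
class Flow:
    key: FlowKey
    start: float          # flow start time, seconds since epoch
    end: float            # flow end time, seconds since epoch
    label: str = "benign"  # default label when no alert matches


@dataclass
class Alert:
    key: FlowKey
    timestamp: float      # time the IDS raised the alert
    signature: str        # alert name, used here as the inferred label


def label_flows(flows: List[Flow], alerts: List[Alert]) -> List[Flow]:
    """Label each flow with the first IDS alert matching its 5-tuple and time window."""
    # Index alerts by 5-tuple so each flow only scans its own candidates.
    alerts_by_key: Dict[FlowKey, List[Alert]] = {}
    for alert in alerts:
        alerts_by_key.setdefault(alert.key, []).append(alert)

    for flow in flows:
        for alert in alerts_by_key.get(flow.key, []):
            if flow.start <= alert.timestamp <= flow.end:
                flow.label = alert.signature
                break
    return flows


# Hypothetical example: one flow overlaps an alert, the other does not.
key = ("10.0.0.5", 51234, "192.0.2.10", 80, "tcp")
flows = label_flows(
    [Flow(key, 100.0, 110.0), Flow(key, 200.0, 210.0)],
    [Alert(key, 105.0, "sql-injection-attempt")],
)
```

In this sketch unmatched flows keep the default `benign` label; a real pipeline would also need to decide how to handle clock skew between the flow exporter and the IDS, and alerts that match several flows.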