Computational integrative analysis has become a significant approach in the data-driven exploration of biological problems. Many integration methods for cancer subtyping have been proposed, but evaluating these methods has become a complicated problem due to the lack of gold standards. Moreover, questions of practical importance remain to be addressed regarding how the choice of data types and their combinations affects the performance of integrative studies. Here, we constructed three classes of benchmarking datasets of nine cancers in TCGA by considering all eleven combinations of four multi-omics data types. Using these datasets, we conducted a comprehensive evaluation of ten representative integration methods for cancer subtyping in terms of accuracy (measured by combining clustering accuracy and clinical significance), robustness, and computational efficiency. We subsequently investigated the influence of different omics data on cancer subtyping and the effectiveness of their combinations. Refuting the widely held intuition that incorporating more types of omics data always produces better results, our analyses showed that there are situations where integrating more omics data negatively impacts the performance of integration methods. Our analyses also suggested several effective combinations for most of the cancers studied, which may be of particular interest to researchers in omics data analysis.
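The figure of eleven combinations is consistent with taking every subset of two or more of the four data types: C(4,2) + C(4,3) + C(4,4) = 6 + 4 + 1 = 11. The short sketch below enumerates them under that assumption. The abstract does not name the four data types, so the labels and the function name here are placeholders, not the authors' identifiers.

```python
from itertools import combinations

# Placeholder labels: the abstract does not name the four omics data types.
OMICS_TYPES = ["omics_1", "omics_2", "omics_3", "omics_4"]


def multi_omics_combinations(types):
    """Return every subset containing two or more of the given data types."""
    return [
        combo
        for size in range(2, len(types) + 1)
        for combo in combinations(types, size)
    ]


combos = multi_omics_combinations(OMICS_TYPES)
for combo in combos:
    print(" + ".join(combo))
print(len(combos), "combinations")
```

Each tuple in `combos` would correspond to one benchmarking configuration if the datasets were built this way.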
License plate recognition is a core module of intelligent transportation systems, and license plate location is an important part of it. The Haar-like cascade classifier works well for face detection, but its application to license plate localization depends largely on the selection of positive and negative samples. In this paper, we study how to choose good samples for Haar-like cascade classifiers, together with image postprocessing methods, to achieve good location results. We hope the study will help guide sample preparation for other object detection tasks that use Haar-like cascade classifiers.
Cast indexing is an important video mining technique that gives viewers the ability to efficiently retrieve scenes, events, and stories of interest from a long video. This paper proposes a novel cast indexing approach based on Normalized Graph Cuts (NCuts) and Page Ranking. The system first uses a face tracker to group the face images in each shot into face sets, and then extracts local SIFT features as the feature representation. There are two key problems in cast indexing. One is to find an optimal partition that clusters the face sets into the main cast. The other is to exploit the latent relationships among characters to provide a more accurate cast ranking. For the first problem, we model each face set as a graph node and adopt NCuts to obtain an optimal graph partition. A novel local neighborhood distance, which is robust to outliers, is proposed to measure the distance between face sets for NCuts. For the second problem, we build a relation graph of characters from their co-occurrence information, and then adopt the PageRank algorithm to estimate the Important Factor (IF) of each character. The PageRank IF is fused with the content-based retrieval score for the final ranking. Extensive experiments are carried out on movies, TV series, and home videos. Promising results demonstrate the effectiveness of the proposed methods.
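The ranking step described above can be illustrated with a minimal sketch. It runs a standard power-iteration PageRank on a symmetric co-occurrence matrix and then combines the result with a retrieval score. The abstract does not specify the damping factor, the fusion rule, or any data, so the value 0.85, the weighted linear fusion with `alpha`, the function names, and the toy numbers are all assumptions made for illustration, not the authors' method.

```python
def pagerank(weights, damping=0.85, iterations=100, tol=1e-10):
    """Power-iteration PageRank on a weighted co-occurrence matrix.

    weights[j][i] is how often characters j and i appear together.
    """
    n = len(weights)
    rank = [1.0 / n] * n
    out_sum = [sum(row) for row in weights]
    for _ in range(iterations):
        # Rank held by characters with no co-occurrences is spread uniformly.
        dangling = sum(rank[j] for j in range(n) if out_sum[j] == 0)
        new_rank = []
        for i in range(n):
            incoming = sum(
                rank[j] * weights[j][i] / out_sum[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_rank.append(
                (1 - damping) / n + damping * (incoming + dangling / n)
            )
        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return rank


def fuse(importance, retrieval, alpha=0.5):
    """Assumed fusion rule: weighted sum of normalized importance and retrieval score."""
    top = max(importance)
    return [
        alpha * (imp / top) + (1 - alpha) * ret
        for imp, ret in zip(importance, retrieval)
    ]


# Toy data (hypothetical): co-occurrence counts for three characters.
co_occurrence = [
    [0, 5, 2],
    [5, 0, 1],
    [2, 1, 0],
]
importance = pagerank(co_occurrence)
fused = fuse(importance, retrieval=[0.2, 0.9, 0.4], alpha=0.5)
print(importance)
print(fused)
```

In this toy graph the character with the most co-occurrences receives the highest importance, and `alpha` controls how much that importance outweighs the content-based score in the final ordering.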