We propose a method for the semi-automated refinement of the results of feature subset selection algorithms. Feature subset selection is a preliminary step in data analysis that identifies the most useful subset of features (columns) in a data table. So-called filter techniques rank features using statistical measures of their correlation. Usually, a measure is applied to all entities (rows) of a data table. However, statistical aggregation masks the differing contributions of subsets of data entities. Feature and entity subset selection are thus highly interdependent. Because high-dimensional data tables are difficult to visualize, most feature subset selection algorithms are applied as a black box at the outset of an analysis. Our visualization technique, SmartStripes, allows users to step into the feature subset selection process. It enables the investigation of dependencies and interdependencies between different feature and entity subsets. A user may even choose to control the iterations manually, taking into account the ranking measures, the contributions of different entity subsets, and the semantics of the features.
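The masking effect described above can be illustrated with a minimal sketch. It is not the SmartStripes method itself: the toy table, the group names `A` and `B`, and the helper `score_by_group` are hypothetical, and Pearson correlation is assumed as the ranking measure purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def score_by_group(rows):
    """Score one feature against the target on all rows and per entity subset.

    Each row is (feature_value, target_value, group_label).
    """
    overall = pearson([r[0] for r in rows], [r[1] for r in rows])
    per_group = {}
    for group in sorted({r[2] for r in rows}):
        subset = [r for r in rows if r[2] == group]
        per_group[group] = pearson([r[0] for r in subset], [r[1] for r in subset])
    return overall, per_group


# Hypothetical toy table: in subset A the target rises with the feature,
# in subset B it falls, so the two trends cancel in the aggregate.
rows = [(x, x, "A") for x in range(1, 6)] + [(x, 6 - x, "B") for x in range(1, 6)]

overall, per_group = score_by_group(rows)
print("all rows:", overall)
print("per subset:", per_group)
```

In this toy table, a filter applied to all rows would rank the feature as uninformative, although it is perfectly correlated (positively or negatively) with the target within each entity subset.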
The assessment of patient well-being is highly relevant for the early detection of diseases, for assessing the risks of therapies, and for evaluating therapy outcomes. The knowledge needed to assess a patient's well-being is tacit knowledge and can thus be used only by the physicians themselves. The rationale of this research approach is to use visual interfaces to capture the mental models of experts and make them available more explicitly. We present a visual active learning system that enables physicians to label the well-being state in the histories of patients suffering from prostate cancer. The labeled instances are learned iteratively in an active learning approach. In addition, the system provides models and visual interfaces for a) estimating the number of patients needed for learning, b) suggesting meaningful learning candidates, and c) giving visual feedback on test candidates. We present the results of two evaluation strategies that prove the validity of the applied model. In a representative real-world use case, we learned the feedback of physicians on a data collection of more than 16,000 prostate cancer histories.
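The iterative labeling loop can be sketched as follows. The abstract does not specify the learner or the candidate suggestion strategy, so this sketch assumes uncertainty sampling with a one-dimensional threshold model as a stand-in. All names (`fit_threshold`, `most_uncertain`, `active_learning`, `oracle`) and the toy score values are hypothetical; `oracle` takes the place of the physician who supplies labels.

```python
def fit_threshold(labeled):
    """Toy model: midpoint between the largest score labeled 0 and the smallest labeled 1."""
    max_negative = max(x for x, y in labeled.items() if y == 0)
    min_positive = min(x for x, y in labeled.items() if y == 1)
    return (max_negative + min_positive) / 2


def most_uncertain(pool, labeled, threshold):
    """Uncertainty sampling: pick the unlabeled candidate closest to the decision threshold."""
    candidates = [x for x in pool if x not in labeled]
    return min(candidates, key=lambda x: abs(x - threshold))


def active_learning(pool, oracle, seed_labels, rounds):
    """Repeatedly refit the model, query the oracle for one label, and store the answer."""
    labeled = dict(seed_labels)
    for _ in range(rounds):
        threshold = fit_threshold(labeled)
        query = most_uncertain(pool, labeled, threshold)
        labeled[query] = oracle(query)  # in the real system, a physician labels here
    return labeled, fit_threshold(labeled)


# Hypothetical pool of one-dimensional "well-being scores" in [0, 1].
pool = [i / 16 for i in range(17)]


def oracle(score):
    """Stand-in expert whose (hidden) decision boundary lies at 0.6."""
    return 1 if score >= 0.6 else 0


labeled, threshold = active_learning(pool, oracle, {0.0: 0, 1.0: 1}, rounds=4)
print("labels collected:", len(labeled))
print("learned threshold:", threshold)
```

With two seed labels and four queries, the learned threshold approaches the oracle's hidden boundary while most of the pool remains unlabeled, which is the motivation for letting the model suggest learning candidates.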