It is conventional wisdom in machine learning and data mining that logical models such as rule sets are more interpretable than other models, and that among such rule-based models, simpler models are more interpretable than more complex ones. In this position paper, we question this latter assumption by focusing on one particular aspect of interpretability, namely the plausibility of models. Roughly speaking, we equate the plausibility of a model with the likeliness that a user accepts it as an explanation for a prediction. In particular, we argue that-all other things being equal-longer explanations may be more convincing than shorter ones, and that the predominant bias for shorter models, which is typically necessary for learning powerful discriminative models, may not be suitable when it comes to user acceptance of the learned models. To that end, we first recapitulate evidence for and against this postulate, and then report the results of an evaluation in a crowdsourcing study based on about 3,000 judgments. The results do not reveal a strong preference for simple rules, whereas we can observe a weak preference for longer rules in some domains. We then relate these results to well-known cognitive biases such as the conjunction fallacy, the representative heuristic, or the recogition heuristic, and investigate their relation to rule length and plausibility.
We present a framework for efficiently exploiting free-text annotations as a complementary resource to image classification. A novel approach called Semantic Concept Mapping (SCM) is used to classify entities occurring in the text to a custom-defined set of concepts. SCM performs unsupervised classification by exploiting the relations between common entities codified in the Wordnet thesaurus. SCM exploits Targeted Hypernym Discovery (THD) to map unknown entities extracted from the text to concepts in Wordnet. We show how the result of SCM/THD can be fused with the outcome of Knowledge Assisted Image Analysis (KAA), a classification algorithm that extracts and labels multiple segments from an image. In the experimental evaluation, THD achieved an accuracy of 75%, and SCM an accuracy of 52%. In one of the first experiments with fusing the results of a free-text and image-content classifier, SCM/THD + KAA achieved a relative improvement of 49% and 31% over the text-only and image-content-only baselines.
Abstract. Targeted Hypernym Discovery (THD) performs unsupervised classification of entities appearing in text. A hypernym mined from the free-text of the Wikipedia article describing the entity is used as a class. The type as well as the entity are cross-linked with their representation in DBpedia, and enriched with additional types from DBpedia and YAGO knowledge bases providing a semantic web interoperability. The system, available as a web application and web service at entityclassifier.eu, currently supports English, German and Dutch.
The motivation of this paper is to increase the user perceived precision of results of Content Based Information Retrieval (CBIR) systems with Query Refinement (QR), Visual Analysis (VA) and Relevance Feedback (RF) algorithms. The proposed algorithms were implemented as modules into K-Space CBIR system. The QR module discovers hypernyms for the given query from a free text corpus (Wikipedia) and uses these hypernyms as refinements for the original query. Extracting hypernyms from Wikipedia makes it possible to apply query refinement to more queries than in related approaches that use static predefined thesaurus such as Wordnet. The VA Module uses the K-Means algorithm for clustering the images based on low-level features. The RF Module uses the preference information expressed by the user to build user profiles by applying SOM-based supervised classification, which is further optimized by a hybrid Particle Swarm Optimization (PSO) algorithm. The experiments evaluating the performance of QR and VA modules show promising results.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.