A method for automatic knowledge acquisition from categorical data is presented. Empirical implications are generated from the data according to their frequencies. An implication is inserted into the knowledge base under construction only if its validity in the data differs statistically significantly from the weight that a PROSPECTOR-like inference mechanism composes from the weights of the implications already present in the base. A comparison with classical machine learning algorithms is discussed. The method is implemented as part of the Knowledge EXplorer system.
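The composition of weights mentioned above can be sketched as follows. This is only an illustration under assumptions: the abstract does not give the combining function, so the odds-multiplication rule commonly associated with PROSPECTOR-style inference is assumed here, and the names `combine` and `compose` are hypothetical.

```python
from typing import Iterable


def combine(w1: float, w2: float) -> float:
    """Combine two weights from [0, 1] by multiplying their odds.

    ASSUMPTION: this pseudo-Bayesian rule stands in for the
    "PROSPECTOR-like" mechanism; the abstract does not specify it.
    """
    numerator = w1 * w2
    denominator = numerator + (1.0 - w1) * (1.0 - w2)
    if denominator == 0.0:
        # Happens only when one weight is 0 and the other is 1.
        raise ValueError("contradictory weights 0 and 1 cannot be combined")
    return numerator / denominator


def compose(weights: Iterable[float]) -> float:
    """Compose the weights of all applicable implications.

    0.5 is the neutral element of combine(), so an empty
    collection of weights yields 0.5.
    """
    result = 0.5
    for w in weights:
        result = combine(result, w)
    return result
```

Under this assumed rule, a candidate implication would be added to the base only when its observed validity differs significantly from `compose(...)` evaluated over the weights of the implications already in the base; the significance test itself is not sketched here.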
Relations between two Boolean attributes derived from data can be quantified by truth functions defined on four-fold tables corresponding to pairs of the attributes. In the paper, several classes of such quantifiers (implicational, double implicational, and equivalence ones) with truth values in the unit interval are investigated. A method for constructing the double implicational and equivalence quantifiers that are logically nearest to a given implicational quantifier (and vice versa) is described and justified.
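As a minimal sketch of what a quantifier on a four-fold table looks like, the following code counts the four frequencies for two Boolean attributes and evaluates one implicational quantifier. The function names are hypothetical, and the choice of the ratio a / (a + b) as the example quantifier is an assumption made for illustration, not taken from the abstract.

```python
from typing import Sequence, Tuple


def fourfold_table(xs: Sequence[bool], ys: Sequence[bool]) -> Tuple[int, int, int, int]:
    """Return (a, b, c, d) for two Boolean attributes of equal length.

    a: x and y, b: x and not y, c: not x and y, d: neither.
    """
    if len(xs) != len(ys):
        raise ValueError("attributes must have the same number of rows")
    a = b = c = d = 0
    for x, y in zip(xs, ys):
        if x and y:
            a += 1
        elif x:
            b += 1
        elif y:
            c += 1
        else:
            d += 1
    return a, b, c, d


def implication_ratio(a: int, b: int, c: int, d: int) -> float:
    """Example implicational quantifier with values in [0, 1].

    ASSUMPTION: a / (a + b) is used purely as an illustration; it
    depends only on a and b, and returns 0.0 when x never holds.
    """
    return a / (a + b) if a + b > 0 else 0.0
```

For instance, `implication_ratio(*fourfold_table(xs, ys))` gives the truth value of "x implies y" under this illustrative quantifier.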
Relations between two Boolean attributes derived from data can be quantified by truth functions defined on fourfold tables corresponding to pairs of the attributes. Several classes of such quantifiers (implicational, double implicational, and equivalence ones) with truth values in the unit interval were investigated within the framework of the theory of data mining methods. The definition of double implicational quantifiers is based on the idea of a conjunction of the implications in both directions (similarly for equivalence). In fuzzy logic theory, there are well-defined classes of fuzzy operators, namely t-norms representing various types of evaluations of fuzzy conjunction (and t-conorms representing fuzzy disjunction). I assert that each t-norm applied to an implicational quantifier gives a double implicational quantifier. Analogously, each t-conorm applied to a double implicational quantifier gives an equivalence quantifier. Logical properties of the obtained quantifiers are discussed. The method is illustrated by examples of well-known quantifiers and operators.
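The construction of a double implicational quantifier from a t-norm can be sketched as follows. The sketch assumes that the t-norm is applied to the values of one implicational quantifier taken in both directions, that is on the frequencies (a, b) and (a, c) of the fourfold table; the base quantifier a / (a + b) and all function names are illustrative assumptions rather than the paper's definitions.

```python
from typing import Callable

TNorm = Callable[[float, float], float]


def implication_ratio(a: int, b: int) -> float:
    """Illustrative implicational quantifier a / (a + b) (an assumption)."""
    return a / (a + b) if a + b > 0 else 0.0


# Three standard t-norms on the unit interval.
def t_min(x: float, y: float) -> float:
    return min(x, y)


def t_product(x: float, y: float) -> float:
    return x * y


def t_lukasiewicz(x: float, y: float) -> float:
    return max(0.0, x + y - 1.0)


def double_implication(a: int, b: int, c: int, t_norm: TNorm) -> float:
    """Fuzzy conjunction of the implications in both directions.

    The forward direction uses (a, b), the backward direction (a, c).
    """
    return t_norm(implication_ratio(a, b), implication_ratio(a, c))
```

With a = 30, b = 10, c = 20 the two directions evaluate to 0.75 and 0.6, so the minimum t-norm gives 0.6, the product t-norm about 0.45, and the Łukasiewicz t-norm about 0.35. An equivalence quantifier would be obtained analogously with a t-conorm, which is not sketched here.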
When reusing existing ontologies for publishing a dataset in RDF (or developing a new ontology), preference may be given to those providing extensive subcategorization for important classes (denoted as focus classes). The subcategories may consist not only of named classes but also of compound class expressions. We define the notion of focused categorization power of a given ontology, with respect to a focus class and a concept expression language, as the (estimated) weighted count of the categories that can be built from the ontology’s signature, conform to the language, and are subsumed by the focus class. For the sake of tractable initial experiments we then formulate a restricted concept expression language based on existential restrictions, and heuristically map it to syntactic patterns over ontology axioms (so-called FCE patterns). The characteristics of the chosen concept expression language and associated FCE patterns are investigated using three different empirical sources derived from ontology collections: first, the concept expression pattern frequency in class definitions; second, the occurrence of FCE patterns in the Tbox of ontologies; and last, the ‘meaningfulness’ of class expressions generated from the Tbox of ontologies (through the FCE patterns), as assessed by different groups of users, which yields a ‘quality ordering’ of the concept expression patterns. The complementary analyses are then compared and summarized. To allow for further experimentation, a web-based prototype was also implemented, which covers the whole process of ontology reuse from keyword-based ontology search through the FCP computation to the selection of ontologies and their enrichment with new concepts built from compound expressions.