Given a family of Horn clauses, what is the minimal number of Horn clauses implying all other clauses in the family? What is the maximal number of Horn clauses from the family without having resolvents of a certain kind? We consider various problems of this type, and give some sharp bounds. We also consider the probability that a random family of a given size implies all other clauses in the family, and we prove the existence of a sharp threshold.
Selman and Kautz [22] proposed an approach to compute a Horn CNF approximation of CNF formulas. We extend this approach and propose a new algorithm for the Horn least upper bound that involves renaming variables. Although we provide negative results for the quality of approximation, experimental results for random CNF demonstrate that the proposed algorithm improves both computational efficiency and approximation quality. We observe an interesting behavior in the Horn approximation sizes and errors which we call the "Horn bump": Maxima occur in an intermediate range of densities below the satisfiability threshold.
Function annotation efforts provide a foundation to our understanding of cellular processes and the functioning of the living cell. This motivates high-throughput computational methods to characterize new protein members of a particular function. Research work has focused on discriminative machine-learning methods, which promise to make efficient,
de novo
predictions of protein function. Furthermore, available function annotation exists predominantly for individual proteins rather than residues of which only a subset is necessary for the conveyance of a particular function. This limits discriminative approaches to predicting functions for which there is sufficient residue-level annotation, e.g., identification of DNA-binding proteins or where an excellent global representation can be divined. Complete understanding of the various functions of proteins requires discovery and functional annotation at the residue level. Herein, we cast this problem into the setting of multiple-instance learning, which only requires knowledge of the protein’s function yet identifies functionally relevant residues and need not rely on homology. We developed a new multiple-instance leaning algorithm derived from AdaBoost and benchmarked this algorithm against two well-studied protein function prediction tasks: annotating proteins that bind DNA and RNA. This algorithm outperforms certain previous approaches in annotating protein function while identifying functionally relevant residues involved in binding both DNA and RNA, and on one protein-DNA benchmark, it achieves near perfect classification.
Selman and Kautz [22] proposed an approach to compute a Horn CNF approximation of CNF formulas. We extend this approach and propose a new algorithm for the Horn least upper bound that involves renaming variables. Although we provide negative results for the quality of approximation, experimental results for random CNF demonstrate that the proposed algorithm improves both computational efficiency and approximation quality. We observe an interesting behavior in the Horn approximation sizes and errors which we call the "Horn bump": Maxima occur in an intermediate range of densities below the satisfiability threshold.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.