Information-theoretic measures form a fundamental class of similarity measures for comparing clusterings, alongside pair-counting and set-matching measures. In this paper, we discuss the need for correction for chance in information-theoretic measures for clustering comparison. We observe that the baseline for such measures, i.e., the average value between random partitions of a data set, is not constant and tends to vary more when the ratio of the number of data points to the number of clusters is small. A similar effect occurs in some non-information-theoretic measures, such as the well-known Rand Index. Assuming a hypergeometric model of randomness, we derive an analytical formula for the expected mutual information between a pair of clusterings and then propose adjusted versions of several popular information-theoretic measures. Examples demonstrate the need for, and usefulness of, the adjusted measures.
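To make the chance correction concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the hypergeometric model fixes the cluster sizes of both clusterings and treats each contingency-table cell as hypergeometrically distributed. It also assumes one particular adjusted form, (MI - E[MI]) / (max(H(U), H(V)) - E[MI]); the abstract does not state which normalizer is used, so that choice and all function names here are illustrative.

```python
from collections import Counter
from math import exp, lgamma, log


def entropy(labels):
    """Shannon entropy (in nats) of a clustering given as a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(u, v):
    """Mutual information (in nats) between two clusterings of the same points."""
    n = len(u)
    a, b = Counter(u), Counter(v)
    joint = Counter(zip(u, v))
    return sum((nij / n) * log(n * nij / (a[i] * b[j]))
               for (i, j), nij in joint.items())


def expected_mutual_information(u, v):
    """Expected MI when cluster sizes are fixed and points are assigned at
    random (hypergeometric model for each contingency-table cell)."""
    n = len(u)
    sizes_u = list(Counter(u).values())
    sizes_v = list(Counter(v).values())
    emi = 0.0
    for ai in sizes_u:
        for bj in sizes_v:
            # nij = 0 contributes nothing, so the sum starts at max(1, ...).
            for nij in range(max(1, ai + bj - n), min(ai, bj) + 1):
                # Log of the hypergeometric probability of observing nij,
                # computed with lgamma to avoid overflowing factorials.
                log_p = (lgamma(ai + 1) + lgamma(bj + 1)
                         + lgamma(n - ai + 1) + lgamma(n - bj + 1)
                         - lgamma(n + 1) - lgamma(nij + 1)
                         - lgamma(ai - nij + 1) - lgamma(bj - nij + 1)
                         - lgamma(n - ai - bj + nij + 1))
                emi += (nij / n) * log(n * nij / (ai * bj)) * exp(log_p)
    return emi


def adjusted_mutual_information(u, v):
    """Chance-adjusted MI; the max-entropy normalizer is an assumption."""
    mi = mutual_information(u, v)
    emi = expected_mutual_information(u, v)
    denom = max(entropy(u), entropy(v)) - emi
    # Convention assumed here: return 1.0 when the normalizer vanishes
    # (for example, both clusterings consist of a single cluster).
    return (mi - emi) / denom if abs(denom) > 1e-12 else 1.0


if __name__ == "__main__":
    u = [0, 0, 1, 1]
    v = [0, 1, 0, 1]
    print(expected_mutual_information(u, v))
    print(adjusted_mutual_information(u, v))
```

With two clusters of size 2 on four points, the expected mutual information under this model is already log(2)/3 rather than zero, which illustrates why an unadjusted measure lacks a constant baseline when there are few points per cluster.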
There is an urgent clinical need for antimalarial compounds that target malaria caused by both Plasmodium falciparum and Plasmodium vivax. The M1 and M17 metalloexopeptidases play key roles in Plasmodium hemoglobin digestion and are validated drug targets. We used a multitarget strategy to rationally design inhibitors capable of potent inhibition of the M1 and M17 aminopeptidases from both P. falciparum (Pf-M1 and Pf-M17) and P. vivax (Pv-M1 and Pv-M17). The novel chemical series contains a hydroxamic acid zinc binding group to coordinate the catalytic zinc ion(s), and a variety of hydrophobic groups to probe the S1′ pockets of the four target enzymes. Structural characterization by cocrystallization showed that selected compounds utilize new and unexpected binding modes; most notably, compounds substituted with bulky hydrophobic substituents displace the Pf-M17 catalytic zinc ion. Excitingly, key compounds of the series potently inhibit all four molecular targets and show antimalarial activity comparable to current clinical candidates.