Ming Li scite author profile

Minimum description length induction, Bayesianism, and Kolmogorov complexity

The relationship between the Bayesian approach and the minimum description length approach is established. We sharpen and clarify the general modeling principles MDL and MML, abstracted as the ideal MDL principle and defined from Bayes's rule by means of Kolmogorov complexity. The basic condition under which the ideal principle should be applied is encapsulated as the Fundamental Inequality, which in broad terms states that the principle is valid when the data are random relative to every contemplated hypothesis, and these hypotheses are also random relative to the (universal) prior. Basically, the ideal principle states that the prior probability associated with the hypothesis should be given by the algorithmic universal probability, and that the sum of the negative log universal probability of the model and the negative log of the probability of the data given the model should be minimized. If we restrict the model class to the finite sets, then application of the ideal principle turns into Kolmogorov's minimal sufficient statistic. In general we show that data compression is almost always the best strategy, both in hypothesis identification and prediction.
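The selection rule described in this abstract can be written out as a short derivation. This is a sketch in notation that the abstract itself does not use: H for a hypothesis, D for the data, \mathbf{m} for the universal probability, and K for prefix Kolmogorov complexity are all assumed symbols, and the last line holds only under the randomness conditions the abstract calls the Fundamental Inequality.

```latex
% Bayes's rule: the posterior of hypothesis H given data D
P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)}

% Maximizing the posterior is the same as minimizing its negative logarithm;
% P(D) does not depend on H and can be dropped. Ideal MDL takes P(H) = \mathbf{m}(H):
H_{\mathrm{MDL}} = \arg\min_{H} \bigl\{ -\log \mathbf{m}(H) - \log P(D \mid H) \bigr\}

% By the coding theorem, -\log \mathbf{m}(H) = K(H) + O(1). When the data are random
% relative to H, -\log P(D \mid H) is close to K(D \mid H), so the rule becomes
H_{\mathrm{MDL}} \approx \arg\min_{H} \bigl\{ K(H) + K(D \mid H) \bigr\}
```

Read this way, the rule selects the hypothesis that gives the shortest two-part description of the data: first the model, then the data given the model.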

On spaced seeds for similarity search

Keich et al. 2004

Discrete Applied Mathematics

119

125

Shared Information and Program Plagiarism Detection

Chen, Francia et al. 2004

IEEE Trans. Inform. Theory

188

Entity Disambiguation by Knowledge and Text Jointly Embedding

Fang, Zhang, Wang et al. 2016

For most entity disambiguation systems, the secret recipes are feature representations for mentions and entities, most of which are based on Bag-of-Words (BoW) representations. BoW commonly has several drawbacks: (1) it ignores the intrinsic meaning of words and entities; (2) it often results in high-dimensional vector spaces and expensive computation; (3) methods of designing handcrafted representations may differ considerably across applications, lacking a general guideline. In this paper, we propose a different approach named EDKate. We first learn low-dimensional continuous vector representations for entities and words by jointly embedding a knowledge base and text in the same vector space. Then we use these embeddings to design simple but effective features and build a two-layer disambiguation model. Extensive experiments on real-world data sets show that (1) the embedding-based features are very effective: even a single embedding-based feature can beat the combination of several BoW-based features; (2) the advantage is even more pronounced on a difficult set where the mention-entity prior does not work well; (3) the proposed embedding method is much better than trivial implementations of some off-the-shelf embedding algorithms; (4) in comparisons with existing methods and systems, EDKate's results are also positive.

Semi-supervised learning by disagreement

Zhou 2009

Knowl Inf Syst

329

Normalized Information Distance

Vitányi, Balbach, Cilibrasi et al.

The normalized information distance is a universal distance measure for objects of all kinds. It is based on Kolmogorov complexity and thus uncomputable, but there are ways to utilize it. First, compression algorithms can be used to approximate the Kolmogorov complexity if the objects have a string representation. Second, for names and abstract concepts, page count statistics from the World Wide Web can be used. These practical realizations of the normalized information distance can then be applied to machine learning tasks, especially clustering, to perform feature-free and parameter-free data mining. This chapter discusses the theoretical foundations of the normalized information distance and both practical realizations. It presents numerous examples of successful real-world applications based on these distance measures, ranging from bioinformatics to music clustering to machine translation.
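The first realization mentioned in this abstract, approximating Kolmogorov complexity with a compressor, is commonly written as the normalized compression distance, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. The sketch below is illustrative only: the choice of zlib as the compressor and the helper names `compressed_size` and `ncd` are assumptions, not taken from the chapter.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (a stand-in for K)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Values near 0 suggest the inputs share most of their information;
    values near 1 suggest they share little. zlib is an assumed compressor,
    and any real compressor only approximates the ideal distance.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    english = b"the quick brown fox jumps over the lazy dog " * 20
    binary = bytes(range(256)) * 4
    print("same text:     ", ncd(english, english))
    print("different data:", ncd(english, binary))
```

A pairwise matrix of such distances can be passed to any standard clustering routine, which is what makes the approach feature-free: nothing in the computation depends on the kind of data beyond its byte representation.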

Fast Haar Transforms for Graph Neural Networks

Wang et al. 2020

Neural Networks

Social participation of the elderly in China: The roles of conventional media, digital access and social media engagement

Huang et al. 2020

Telematics and Informatics
