Irina Matveeva scite author profile

Regularization and Semi-supervised Learning on Large Graphs

Abstract. We consider the problem of labeling a partially labeled graph. This setting may arise in a number of situations from survey sampling to information retrieval to pattern recognition in manifold settings. It is also of potential practical importance, when the data is abundant, but labeling is expensive or requires human assistance. Our approach develops a framework for regularization on such graphs. The algorithms are very simple and involve solving a single, usually sparse, system of linear equations. Using the notion of algorithmic stability, we derive bounds on the generalization error and relate it to structural invariants of the graph. Some experimental results testing the performance of the regularization algorithm and the usefulness of the generalization bound are presented.


A geometric view on bilingual lexicon extraction from comparable corpora

Gaussier

Renders

Matveeva

et al. 2004


High accuracy retrieval with multiple nested ranker

Matveeva

Burges

Burkard

et al. 2006


Term representation with Generalized Latent Semantic Analysis

Matveeva

Levow

Farahat

et al. 2007


Tikhonov regularization and semi-supervised learning on large graphs


The SED heuristic for morpheme discovery

Matveeva

Goldsmith

et al. 2005


Document representation and multilevel measures of document similarity

Matveeva

2006


Using morphology and syntax together in unsupervised learning

Matveeva

Goldsmith

et al. 2005


Unsupervised learning of grammar is a problem that can be important in many areas ranging from text preprocessing for information retrieval and classification to machine translation. We describe an MDL based grammar of a language that contains morphology and lexical categories. We use an unsupervised learner of morphology to bootstrap the acquisition of lexical categories and use these two learning processes iteratively to help and constrain each other. To be able to do so, we need to make our existing morphological analysis less fine grained. We present an algorithm for collapsing morphological classes (signatures) by using syntactic context. Our experiments demonstrate that this collapse preserves the relation between morphology and lexical categories within new signatures, and thereby minimizes the description length of the model.



