We describe a framework for inducing probabilistic grammars from corpora of positive samples. First, samples are incorporated by adding ad-hoc rules to a working grammar; subsequently, elements of the model (such as states or nonterminals) are merged to achieve generalization and a more compact representation. The choice of what to merge and when to stop is governed by the Bayesian posterior probability of the grammar given the data, which formalizes a trade-off between a close fit to the data and a default preference for simpler models ('Occam's Razor'). The general scheme is illustrated using three types of probabilistic grammars: Hidden Markov models, class-based ¢ -grams, and stochastic context-free grammars.
Sather extends the notion of an iterator in a powerful new way. We argue that iteration abstractions belong in class interfaces on an equal footing with routines. Sather iterators were derived from CLU iterators but are much more flexible and better suited for object-oriented programming. We retain the property that iterators are structured, i.e., strictly bound to a controlling structured statement. We motivate and describe the construct along with several simple examples. We compare it with iteration based on CLU
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.