Finite-state cascades represent an attractive architecture for parsing unrestricted text. Deterministic parsers specified by finite-state cascades are fast and reliable. They can be extended at modest cost to construct parse trees with finite feature structures. Finally, such deterministic parsers do not necessarily involve trading off accuracy against speed-they may in fact be more accurate than exhaustive-search stochastic contextfree parsers.
Finite-State CascadesOf current interest in corpus-oriented computational linguistics are techniques for bootstrapping broad-coverage parsers from text corpora. The work described here is a step along the way toward a bootstrapping scheme that involves inducing a tagger from word distributions, a lowlevel "chunk" parser from a tagged corpus, and lexical dependencies from a chunked corpus. In particular, I describe a chunk parsing technique based on what I will call a finite-state cascade. Though I shall not address the question of inducing such a parser from a corpus, the parsing technique has been implemented and is being used in a project for inducing lexical dependencies from corpora in English and German. The resulting parsers are robust and very fast.A finite-state cascade consists of a sequence of levels. Phrases at one level are built on phrases at the previous level, and there is no recursion: phrases never contain same-level or higher-level phrases. Two levels of special importance are the level of chunks and the level of simplex clauses [2,1]. Chunks are the non-recursive cores of "major" phrases, i.e., NP, VP, PP, AP, AdvP. Simplex clauses are clauses in which embedded clauses have been turned into siblingstail recursion has been replaced with iteration, so to speak. To illustrate, (1) shows a parse tree represented as a sequence of levels.
This paper refines the analysis of cotraining, defines and evaluates a new co-training algorithm that has theoretical justification, gives a theoretical justification for the Yarowsky algorithm, and shows that co-training and the Yarowsky algorithm are based on different independence assumptions.
Problem Setting and NotationA bootstrapping problem consists of a space of instances X , a set of labels L, a function
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.