2006
DOI: 10.1007/11872436_9
Stochastic Analysis of Lexical and Semantic Enhanced Structural Language Model

Abstract: In this paper, we present a directed Markov random field model that integrates trigram models, structural language models (SLM), and probabilistic latent semantic analysis (PLSA) for the purpose of statistical language modeling. The SLM is essentially a generalization of shift-reduce probabilistic push-down automata, thus more complex and powerful than probabilistic context-free grammars (PCFGs). The added context-sensitiveness due to trigrams and PLSAs and violation of tree structure in the topology o…

Cited by 2 publications (12 citation statements); references 17 publications (24 reference statements).
“…When combining n-gram, m-SLM, and PLSA together to build a composite generative language model under the directed MRF paradigm (Wang et al. 2005b, 2006), the composite language model is simply a complicated generative model that has four operators: WORD-PREDICTOR, TAGGER, CONSTRUCTOR, and SEMANTIZER. The TAGGER and CONSTRUCTOR in the SLM and the SEMANTIZER in PLSA remain unchanged; the WORD-PREDICTORs in the n-gram, m-SLM, and PLSA, however, are combined to form a stronger WORD-PREDICTOR that generates the next word, w_{k+1}, depending not only on the m most recently exposed headwords h_{-m}^{-1} in the word-parse k-prefix but also on its n-gram history w_{k-n+2}^{k} and its semantic content g_{k+1}.…”
Section: The Composite n-gram/SLM/PLSA Language Model
confidence: 99%
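The combined WORD-PREDICTOR described in this citation statement can be illustrated with a toy sketch. This is not the paper's directed-MRF composition or its estimated parameters: it stands in a simple linear interpolation of three hypothetical component predictors (n-gram, headword-conditioned, and topic-conditioned), and all vocabularies, weights, and component distributions below are invented for illustration.

```python
# Hypothetical sketch of a composite WORD-PREDICTOR: the next-word
# distribution conditions jointly on the n-gram history, the exposed
# headwords, and a PLSA-style topic mixture. Linear interpolation is a
# stand-in for the paper's directed-MRF combination; components and
# weights are illustrative assumptions, not the paper's model.
from collections import defaultdict

VOCAB = ["the", "cat", "sat", "mat", "dog"]

def ngram_pred(history):
    # Toy n-gram component: uniform here purely for illustration.
    return {w: 1.0 / len(VOCAB) for w in VOCAB}

def slm_pred(headwords):
    # Toy syntactic component conditioned on exposed headwords.
    probs = {w: 1.0 for w in VOCAB}
    if "cat" in headwords:          # a headword nudges a verb-like word
        probs["sat"] += 1.0
    z = sum(probs.values())
    return {w: p / z for w, p in probs.items()}

def plsa_pred(topic_mixture):
    # Toy semantic component: mixture of topic-specific unigram models.
    topic_unigrams = {
        "pets":  {"cat": 0.4, "dog": 0.4, "the": 0.1, "sat": 0.05, "mat": 0.05},
        "other": {w: 0.2 for w in VOCAB},
    }
    probs = defaultdict(float)
    for topic, weight in topic_mixture.items():
        for w, p in topic_unigrams[topic].items():
            probs[w] += weight * p
    return dict(probs)

def composite_word_predictor(history, headwords, topic_mixture,
                             lambdas=(0.4, 0.3, 0.3)):
    """Interpolate the three component predictors and renormalize."""
    comps = [ngram_pred(history), slm_pred(headwords), plsa_pred(topic_mixture)]
    probs = {w: sum(l * c.get(w, 0.0) for l, c in zip(lambdas, comps))
             for w in VOCAB}
    z = sum(probs.values())
    return {w: p / z for w, p in probs.items()}

dist = composite_word_predictor(["the", "cat"], ["cat"], {"pets": 0.7, "other": 0.3})
print(max(dist, key=dist.get))
```

The design point the statement makes is that the three component WORD-PREDICTORs merge into a single conditional distribution over the next word, while the TAGGER, CONSTRUCTOR, and SEMANTIZER operators are reused unchanged.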
“…The composite n-gram/m-SLM/PLSA language model can be formulated as a rather complex chain-tree-table directed MRF model (Wang et al. 2006), where the hidden information is the parse tree T and the semantic content g. The n-gram encodes local word interactions, the m-SLM models the sentence's syntactic structure, and the PLSA captures the document's semantic content; all interact together to constrain the generation of natural language. The WORD-PREDICTOR generates the next word w_{k+1} with probability p…”
Section: The Composite n-gram/SLM/PLSA Language Model
confidence: 99%