A Systematic Cross-Comparison of Sequence Classifiers

Rosenfeld, Binyamin; Feldman, Ronen; Fresko, Moshe

doi:10.1137/1.9781611972764.61

Cited by 11 publications

(9 citation statements)

References 10 publications


“…CRFs are a statistical sequence modeling framework that is reported to outperform other popular learning models, including MaxEnt (maximum entropy), in a number of natural language processing applications [35]. CRFs modeling is first applied to Chinese word segmentation in [31], treating it as a binary decision task to determine whether a character is the beginning of a word in a sentence.…”

Section: Baseline System (mentioning)

confidence: 99%
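
The binary formulation quoted above (deciding, for each character, whether it begins a word) can be illustrated with a minimal Python sketch. The function names `words_to_labels` and `labels_to_words` and the sample sentence are hypothetical, chosen only for illustration; they are not taken from the cited work, and the classifier that would predict the labels is left out.

```python
def words_to_labels(words):
    """Flatten a segmented sentence into characters and binary labels.

    Label 1 marks a character that begins a word, label 0 any other character.
    """
    chars, labels = [], []
    for word in words:
        for i, ch in enumerate(word):
            chars.append(ch)
            labels.append(1 if i == 0 else 0)
    return chars, labels


def labels_to_words(chars, labels):
    """Rebuild the segmentation from characters and begin-of-word labels."""
    words = []
    for ch, label in zip(chars, labels):
        if label == 1 or not words:
            words.append(ch)   # start a new word
        else:
            words[-1] += ch    # extend the current word
    return words


# Hypothetical example sentence, already segmented into three words.
segmented = ["我们", "喜欢", "自然语言"]
chars, labels = words_to_labels(segmented)
restored = labels_to_words(chars, labels)
```

A sequence classifier such as a CRF would be trained to predict `labels` from `chars`; the round trip here only shows that the binary labels carry the full segmentation.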

Integrating unsupervised and supervised word segmentation: The role of goodness measures

Zhang

Kit

2011

Information Sciences


“…CRFs often outperform maximum entropy Markov model (MEMM) [2], another popular structure learning method. The main reason is that, among directed graphical models, CRFs do not suffer from the label bias problem as much as MEMM and other conditional Markov models do [1].…”

Section: Introduction (mentioning)

confidence: 99%

A Simple and Efficient Model Pruning Method for Conditional Random Fields

Zhao

Kit

2009

Computer Processing of Oriental Languages. Language Technology for the Knowledge-Based Economy


Abstract. Conditional random fields (CRFs) have been quite successful in various machine learning tasks. However, as current machines become able to handle larger and larger data sets, CRF models trained for real applications quickly inflate. Researchers now often have to use models with tens of millions of features. This paper considers pruning an existing CRF model to reduce storage and speed up decoding. We propose a simple but efficient rank metric for feature groups, rather than for the individual features that previous work usually focuses on. A series of experiments on two typical labeling tasks, Chinese word segmentation and named entity recognition, is carried out to check the effectiveness of the proposed method. The results are quite positive and show that CRF models are highly redundant, even when using a carefully selected label set and feature templates.


“…It often outperforms a maximum entropy (MaxEnt) model [2], another popular machine learning method in NLP. The main reason is that, among directed graphical models, CRF does not suffer from the label bias problem as much as the MaxEnt and other conditional Markov models do [1].…”

Section: Introduction (mentioning)

confidence: 99%

“…(2) L-BFGS [4] is a typical algorithm for CRFs training. To label an unseen sequence, we compute the most likely labeling Y* as in (3) by Viterbi algorithm [5].…”

Section: Introduction (mentioning)

confidence: 99%
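
The Viterbi step mentioned in the statement above can be sketched as follows. This is a generic, minimal Python version over additive (log-domain) scores, not the cited paper's implementation; equation (3) is not reproduced here, and the `emissions` and `transitions` score tables, like the numbers in the usage example, are assumptions made for illustration.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence.

    emissions[t][y]   : score of label y at position t
    transitions[a][b] : score of moving from label a to label b
    Scores are additive, e.g. log-potentials of a linear-chain model.
    """
    if not emissions:
        return []
    n_labels = len(emissions[0])
    best = list(emissions[0])      # best score of any path ending in each label
    backpointers = []              # one list of predecessor labels per step
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for y in range(n_labels):
            # Pick the predecessor label that maximizes the path score into y.
            prev = max(range(n_labels), key=lambda a: best[a] + transitions[a][y])
            new_best.append(best[prev] + transitions[prev][y] + emissions[t][y])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final label.
    y = max(range(n_labels), key=lambda a: best[a])
    path = [y]
    for pointers in reversed(backpointers):
        y = pointers[y]
        path.append(y)
    path.reverse()
    return path


# Hypothetical scores for a 3-position sequence with 2 labels.
emissions = [[3.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
free_switch = [[0.0, 0.0], [0.0, 0.0]]        # no cost for changing label
costly_switch = [[0.0, -5.0], [-5.0, 0.0]]    # changing label is penalized
print(viterbi(emissions, free_switch))    # [0, 1, 1]
print(viterbi(emissions, costly_switch))  # [0, 0, 0]
```

The two calls differ only in the transition scores, which shows how the dynamic program trades per-position evidence against label-to-label compatibility. The running time is proportional to the sequence length times the square of the number of labels.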

Scaling Conditional Random Fields by One-Against-the-Other Decomposition

Zhang

Kit

2008

J. Comput. Sci. Technol.


As a powerful sequence labeling model, conditional random fields (CRFs) have had successful applications in many natural language processing (NLP) tasks. However, the high complexity of CRFs training only allows a very small tag (or label) set, because the training becomes intractable as the tag set enlarges. This paper proposes an improved decomposed training and joint decoding algorithm for CRF learning. Instead of training a single CRF model for all tags, it trains a binary sub-CRF independently for each tag. An optimal tag sequence is then produced by a joint decoding algorithm based on the probabilistic output of all sub-CRFs involved. To test its effectiveness, we apply this approach to tackling Chinese word segmentation (CWS) as a sequence labeling problem. Our evaluation shows that it can reduce the computational cost of this language processing task by 40-50% without any significant performance loss on various large-scale data sets.
