Named entity recognition using a character-based probabilistic approach

Whitelaw, Casey; Patrick, Jon

doi:10.3115/1119176.1119208

Cited by 11 publications

(4 citation statements)

References 5 publications

“…Some years later, many researchers incorporated machine learning algorithms to their systems, but there was still a strong dependency on external resources and domain-specific features and rules (Tjong Kim Sang and De Meulder, 2003). In addition, the majority of the systems used Maximum Entropy (Bender et al, 2003; Chieu and Ng, 2003b; Curran and Clark, 2003; Florian et al, 2003b; Klein et al, 2003) and Hidden Markov Models (Florian et al, 2003b; Klein et al, 2003; Mayfield et al, 2003; Whitelaw and Patrick, 2003). Furthermore, McCallum and Li (2003) used a CRF combined with web-augmented lexicons.…”

Section: Related Work (mentioning)

confidence: 99%

Modeling Noisiness to Recognize Named Entities using Multitask Neural Networks on Social Media

Aguilar, Monroy, González, et al. 2018

Proceedings of the 2018 Conference of the North American Chapter Of the Association for Computational Linguistics: Hu

Recognizing named entities in a document is a key task in many NLP applications. Although current state-of-the-art approaches to this task reach a high performance on clean text (e.g. newswire genres), those algorithms dramatically degrade when they are moved to noisy environments such as social media domains. We present two systems that address the challenges of processing social media data using character-level phonetics and phonology, word embeddings, and Part-of-Speech tags as features. The first model is a multitask end-to-end Bidirectional Long Short-Term Memory (BLSTM)-Conditional Random Field (CRF) network whose output layer contains two CRF classifiers. The second model uses a multitask BLSTM network as feature extractor that transfers the learning to a CRF classifier for the final prediction. Our systems outperform the current F1 scores of the state of the art on the Workshop on Noisy User-generated Text 2017 dataset by 2.45% and 3.69%, establishing a more suitable approach for social media environments.

“…Modeling at the orthographic level has been shown to be a successful method of named entity recognition. Orthographic Tries (Cucerzan and Yarowsky, 1999; Whitelaw and Patrick, 2003) and character n-gram modelling are two methods for capturing orthographic features. While Tries give a rich representation of a word, they are fixed to one boundary of a word and cannot extend beyond unseen character sequences.…”

Section: Character N-gram Modelling (mentioning)

confidence: 99%

“…To the best knowledge of the authors, the only other attempt to use computational inference methods for this task is Whitelaw and Patrick (2003). Here we assumed all words in the training and raw data sets that were not sentence initial, did not occur in a title sentence, and did not immediately follow punctuation were in the correct case.…”

Section: Normalising Case Information (mentioning)

confidence: 99%

Meta-learning orthographic and contextual models for language independent named entity recognition

Munro, Ler, Patrick 2003

Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003

Self Cite

“…Chieu and Ng [9] successfully used local features, which are near the word, and global features, which are in the whole document together. Klein et al [14] and Whitelaw et al [15] report that character-based features are useful for recognizing some special structure for the name entity.…”

Section: Name Entity Recognition Paraphrases Acquisition and Heuris (mentioning)

confidence: 99%

An evidence‐based iterative content trust algorithm for the credibility of online news

Zeng, Wang 2009

Concurrency and Computation

SUMMARY: People encounter more information than they can possibly use every day. But all information is not necessarily of equal value. In many cases, certain information appears to be better, or more trustworthy, than other information. And the challenge that most people then face is to judge which information is more credible. In this paper we propose a new problem called Corroboration Trust, which studies how to find credible news events by seeking more than one source to verify information on a given topic. We design an evidence-based corroboration trust algorithm called TrustNewsFinder, which utilizes the relationships between news articles and related evidence information (person, location, time and keywords about the news). A news article is trustworthy if it provides many pieces of trustworthy evidence, and a piece of evidence is likely to be true if it is provided by many trustworthy news articles. Our experiments show that TrustNewsFinder successfully finds true events among conflicting information and identifies trustworthy news better than the popular search engines.
