2014
DOI: 10.1162/coli_a_00167

Learning Representations for Weakly Supervised Natural Language Processing Tasks

Abstract: Finding the right representations for words is critical for building accurate NLP systems when domain-specific labeled data for the task is scarce. This article investigates novel techniques for extracting features from n-gram models, Hidden Markov Models, and other statistical language models, including a novel Partial Lattice Markov Random Field model. Experiments on part-of-speech tagging and information extraction, among other tasks, indicate that features taken from statistical language models, in combina…
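As a rough illustration of the feature-extraction idea sketched in the abstract (a minimal sketch only, not the article's actual implementation), the snippet below computes per-token posterior distributions over the latent states of a Hidden Markov Model via the forward-backward algorithm; such posteriors can serve as dense word features for a supervised tagger. The HMM parameters here are hypothetical toy values rather than anything trained in the paper.

```python
import numpy as np

# Toy HMM parameters (hypothetical); in practice the HMM would be trained on a
# large unlabeled corpus and its state posteriors appended to a tagger's features.
pi = np.array([0.6, 0.4])                 # initial state distribution
A = np.array([[0.7, 0.3],                 # state transition matrix
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],            # emission probabilities per state
              [0.1, 0.3, 0.6]])           # (3-word toy vocabulary)

def state_posteriors(obs):
    """Return a (T, K) matrix of P(state_t = k | obs) via forward-backward."""
    T, K = len(obs), len(pi)
    alpha = np.zeros((T, K))
    beta = np.zeros((T, K))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

# Each row is a dense, low-dimensional representation of a token in context.
features = state_posteriors([0, 2, 1])
print(features)
```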

Cited by 49 publications (34 citation statements) · References 56 publications
“…They have recently been shown to capture both semantic and syntactic information about words very well, setting performance records in several word similarity tasks (Mikolov et al., 2013; Pennington et al., 2014). Using word embeddings that have been trained a priori has become common practice for enhancing many other NLP tasks (Parikh et al., 2014; Huang et al., 2014). A common method of training a neural network is to randomly initialize all parameters and then optimize them using an optimization algorithm.…”
Section: Word Embeddings
confidence: 99%
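The excerpt contrasts random parameter initialization with embeddings trained a priori. The following PyTorch sketch (hypothetical vocabulary size and dimensionality, not code from the cited papers) shows both ways of populating an embedding lookup table.

```python
import torch
import torch.nn as nn

vocab_size, embed_dim = 10_000, 300

# Strategy 1: random initialization, learned from scratch during training.
random_embeddings = nn.Embedding(vocab_size, embed_dim)

# Strategy 2: start from pre-trained vectors (e.g. word2vec/GloVe loaded into a
# tensor elsewhere) and optionally keep fine-tuning them on the target task.
pretrained = torch.randn(vocab_size, embed_dim)   # stand-in for loaded vectors
pretrained_embeddings = nn.Embedding.from_pretrained(pretrained, freeze=False)

token_ids = torch.tensor([[3, 17, 42]])
print(pretrained_embeddings(token_ids).shape)     # torch.Size([1, 3, 300])
```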
“…Fortunately, there is some indication that other typical measures of extraction performance, such as precision and recall of extracted relations, correlate with the standard perplexity metric used in language modeling. In Figure 1, we show experiments from previous work [29] that demonstrate how the perplexity of a Hidden Markov Model correlates strongly with the model's accuracy in a standard "set expansion" WIE task.…”
Section: The NL Objective
confidence: 98%
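The perplexity metric referenced in this excerpt is the exponential of the negative mean per-token log-likelihood, PPL = exp(-(1/T) Σ_t log p(w_t | context)). A minimal sketch, with made-up token probabilities standing in for an HMM's held-out predictions:

```python
import numpy as np

# Hypothetical log P(w_t | context) values assigned by a language model
# (an HMM or otherwise) to a held-out corpus of T tokens.
token_log_probs = np.log(np.array([0.05, 0.20, 0.01, 0.10]))

# Perplexity: exponential of the negative average per-token log-likelihood.
perplexity = np.exp(-token_log_probs.mean())
print(f"perplexity = {perplexity:.1f}")  # lower is better
```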
“…This objective has … ([29]). Number labels indicate the number of latent states in the HMM, and performance is shown for three training corpus sizes (the full corpus consists of approximately 60 million tokens).…”
Section: The NL Objective
confidence: 99%
“…This method aims to extract the knowledge of previously trained models from source domains and use it to facilitate the training of learning tasks in target domains where labeled data may be limited. To date, transfer learning has been widely applied in image recognition [14,15,16], natural language processing [17,18,19], and robotics [20], and has achieved considerable success. Yet its applications in marketing campaign analysis remain few.…”
Section: Introduction
confidence: 99%
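As a generic illustration of the transfer-learning recipe this excerpt describes (reuse a model trained on a source domain, then adapt it with limited target-domain labels), here is a minimal PyTorch sketch; the layer sizes and data are assumptions, not details of any cited system.

```python
import torch
import torch.nn as nn

# Stand-in for a feature extractor pre-trained on a source domain.
source_model = nn.Sequential(
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
)
for param in source_model.parameters():
    param.requires_grad = False        # keep source-domain knowledge fixed

target_head = nn.Linear(32, 2)         # new classifier for the target task
optimizer = torch.optim.Adam(target_head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One training step on a small labeled target-domain batch.
x = torch.randn(16, 128)
y = torch.randint(0, 2, (16,))
logits = target_head(source_model(x))
loss = loss_fn(logits, y)
loss.backward()
optimizer.step()
```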