2015
DOI: 10.1007/978-3-319-16501-1_2

Learning Text Patterns Using Separate-and-Conquer Genetic Programming

Abstract: The problem of extracting knowledge from large volumes of unstructured textual information has become increasingly important. We consider the problem of extracting text slices that adhere to a syntactic pattern and propose an approach capable of generating the desired pattern automatically, from a few annotated examples. Our approach is based on Genetic Programming and generates extraction patterns in the form of regular expressions that may be input to existing engines without any post-processing. K…

Cited by 19 publications (23 citation statements)
References 19 publications (23 reference statements)
“…Then, once a regular expression is found that provides adequate performance on a subset of the examples, we restart the evolutionary search from the scratch by using only the remaining examples that are not yet solved adequately. This procedure is also inspired by a recent proposal designed for extraction of short text snippets [5]: differently from the cited paper, here we focus on classification instead of extraction and allow the generation of regular expressions that do not exhibit perfect precision.…”
Section: Our Approach (mentioning)
confidence: 99%
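
The loop described in the statement above (find a pattern that handles part of the examples, then restart the search on the examples still unsolved) can be illustrated with a minimal sketch. This is not the cited paper's implementation: `toy_search` is a hypothetical stand-in that picks from a fixed candidate list instead of running an evolutionary search, `solves` and `separate_and_conquer` are illustrative names, and joining the partial patterns with alternation is an assumption made for the example.

```python
import re
from typing import List, Optional, Tuple

# (text, snippet expected to be extracted from that text)
Example = Tuple[str, str]


def solves(pattern: str, example: Example) -> bool:
    """True if the pattern extracts exactly the wanted snippet and nothing else."""
    text, wanted = example
    return [m.group(0) for m in re.finditer(pattern, text)] == [wanted]


def toy_search(candidates: List[str], examples: List[Example]) -> Optional[str]:
    """Hypothetical stand-in for the evolutionary search: return the candidate
    that solves the most examples, or None if no candidate solves any."""
    best, best_count = None, 0
    for pattern in candidates:
        count = sum(solves(pattern, ex) for ex in examples)
        if count > best_count:
            best, best_count = pattern, count
    return best


def separate_and_conquer(candidates: List[str], examples: List[Example]) -> str:
    """Repeat the search on the examples not yet solved, then join the
    partial patterns with alternation (an assumption of this sketch)."""
    remaining = list(examples)
    found: List[str] = []
    while remaining:
        pattern = toy_search(candidates, remaining)
        if pattern is None:
            break  # nothing covers the remaining examples
        found.append(pattern)
        # "Separate": drop the solved examples before the next search.
        remaining = [ex for ex in remaining if not solves(pattern, ex)]
    return "|".join(f"(?:{p})" for p in found)
```

Each pass removes at least one example, so the loop terminates; examples that no candidate solves are simply left uncovered.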
“…Our algorithm, like theirs, has an evolutionary search phase using the separate-and-conquer strategy, followed by an improvement phase. The separate-and-conquer strategy [BLMT15], which in the context of policy mining means learning one rule at a time, instead of an entire policy at once, is essential to obtain good results. We also adopt their fitness function, which, in turn, is based on Xu and Stoller's rule quality metric [XS15].…”
Section: Policy Mining (mentioning)
confidence: 99%
“…Experiments showed the validity of the approach when compared to standard techniques for the task at hand. Other applications of ensemble methods to GP includes the use of querying-by-committee methods [26,2] and of a divide-and-conquer strategy, in which a solution need to work well only on a subset of the entire training set [31,1]. With respect to ensembles of regression models, a quite recent contribution was proposed in [38]. The idea explored by the authors was to generate several regression models by concurrently executing multiple independent instances of a GP and, subsequently to analyze several strategies for fusing predictions from the multiple regression models.…”
Section: Related Work (mentioning)
confidence: 99%