Learning Text Patterns Using Separate-and-Conquer Genetic Programming

Bartoli, Alberto; Lorenzo, Andrea De; Medvet, Eric; Tarlao, Fabiano

doi:10.1007/978-3-319-16501-1_2

Cited by 19 publications

(23 citation statements)

References 19 publications

(23 reference statements)

“…Then, once a regular expression is found that provides adequate performance on a subset of the examples, we restart the evolutionary search from scratch by using only the remaining examples that are not yet solved adequately. This procedure is also inspired by a recent proposal designed for extraction of short text snippets [5]: differently from the cited paper, here we focus on classification instead of extraction and allow the generation of regular expressions that do not exhibit perfect precision.…”

Section: Our Approach (mentioning)

confidence: 99%

Evolutionary Learning of Syntax Patterns for Genic Interaction Extraction

Bartoli, Lorenzo, Medvet, et al. 2015

Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation

Self Cite


There is an increasing interest in the development of techniques for automatic relation extraction from unstructured text. The biomedical domain, in particular, is a sector that may greatly benefit from those techniques due to the huge and ever increasing amount of scientific publications describing observed phenomena of potential clinical interest. In this paper, we consider the problem of automatically identifying sentences that contain interactions between genes and proteins, based solely on a dictionary of genes and proteins and a small set of sample sentences in natural language. We propose an evolutionary technique for learning a classifier that is capable of detecting the desired sentences within scientific publications with high accuracy. The key feature of our proposal, that is internally based on Genetic Programming, is the construction of a model of the relevant syntax patterns in terms of standard part-of-speech annotations. The model consists of a set of regular expressions that are learned automatically despite the large alphabet size involved. We assess our approach on two realistic datasets and obtain 77% accuracy, a value sufficiently high to be of practical interest and that is in line with significant baseline methods.
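The citation statement above describes a separate-and-conquer loop: find one regular expression that performs adequately on part of the examples, then restart the search on the examples it has not yet solved. A minimal sketch of that outer loop follows. It is an illustration only, not the authors' implementation: the evolutionary (Genetic Programming) search is replaced by a hypothetical stand-in that picks from a fixed list of candidate patterns, and the names `search_one_pattern`, `separate_and_conquer`, `classify` and the `min_precision` threshold are assumptions introduced here.

```python
import re


def precision_and_hits(pattern, examples):
    """Return (precision, matched positive texts) of `pattern` on `examples`."""
    rx = re.compile(pattern)
    matched = [(text, label) for text, label in examples if rx.search(text)]
    hits = [text for text, label in matched if label]
    precision = len(hits) / len(matched) if matched else 0.0
    return precision, hits


def search_one_pattern(examples, candidates, min_precision):
    """Hypothetical stand-in for the evolutionary search.

    Among the candidates whose precision is adequate, pick the one that
    covers the most positive examples. A real system would evolve the
    pattern instead of choosing from a fixed list.
    """
    best, best_hits = None, []
    for pattern in candidates:
        precision, hits = precision_and_hits(pattern, examples)
        if precision >= min_precision and len(hits) > len(best_hits):
            best, best_hits = pattern, hits
    return best, best_hits


def separate_and_conquer(examples, candidates, min_precision=0.6):
    """Learn patterns one at a time, each on the still-unsolved examples."""
    remaining = list(examples)
    learned = []
    while any(label for _, label in remaining):
        pattern, hits = search_one_pattern(remaining, candidates, min_precision)
        if pattern is None:  # nothing adequate left to learn
            break
        learned.append(pattern)
        covered = set(hits)
        # "Separate": drop the positives this pattern already solves.
        remaining = [(t, y) for t, y in remaining if not (y and t in covered)]
    return learned


def classify(text, learned):
    """A text is positive if any learned pattern matches it."""
    return any(re.search(p, text) for p in learned)


examples = [
    ("A binds B", True),
    ("C binds D", True),
    ("E binds nothing", False),
    ("F activates G", True),
    ("H was sequenced", False),
]
candidates = [r"\bbinds\b", r"\bactivates\b", r"\bwas\b"]
learned = separate_and_conquer(examples, candidates)
print(learned)
```

With this toy data the first round accepts the `binds` pattern even though its precision is imperfect (two of three matches are positive), and the second round, run only on the remaining examples, picks up the `activates` pattern. The final classifier is the set of patterns taken together.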


“…Our algorithm, like theirs, has an evolutionary search phase using the separate-and-conquer strategy, followed by an improvement phase. The separate-and-conquer strategy [BLMT15], which in the context of policy mining means learning one rule at a time, instead of an entire policy at once, is essential to obtain good results. We also adopt their fitness function, which, in turn, is based on Xu and Stoller's rule quality metric [XS15].…”

Section: Policy Mining (mentioning)

confidence: 99%

Greedy and evolutionary algorithms for mining relationship-based access control policies

Bui, Stoller 2019

Computers & Security


Relationship-based access control (ReBAC) provides a high level of expressiveness and flexibility that promotes security and information sharing. We formulate ReBAC as an object-oriented extension of attribute-based access control (ABAC) in which relationships are expressed using fields that refer to other objects, and path expressions are used to follow chains of relationships between objects. ReBAC policy mining algorithms have the potential to significantly reduce the cost of migration from legacy access control systems to ReBAC, by partially automating the development of a ReBAC policy from an existing access control policy and attribute data. This paper presents two algorithms for mining ReBAC policies from access control lists (ACLs) and attribute data represented as an object model: a greedy algorithm guided by heuristics, and a grammar-based evolutionary algorithm. An evaluation of the algorithms on four sample policies and two large case studies demonstrates their effectiveness.

An access control list (ACL) policy is a tuple ⟨CM, OM, Act, SP_0⟩, where CM is a class model, OM is an object model, Act is a set of actions, and SP_0 ⊆ OM × OM × Act is a subject-permission relation. Conceptually, SP_0 is the union of the resources' access control lists. A ReBAC policy π is consistent with an ACL policy ⟨CM, OM, Act, SP_0⟩ if they have the same class model, object model, and actions and [[π]] = SP_0. A ReBAC policy consistent with a given ACL policy can be trivially constructed, by creating a separate


“…Experiments showed the validity of the approach when compared to standard techniques for the task at hand. Other applications of ensemble methods to GP include the use of querying-by-committee methods [26,2] and of a divide-and-conquer strategy, in which a solution needs to work well only on a subset of the entire training set [31,1]. With respect to ensembles of regression models, a quite recent contribution was proposed in [38]. The idea explored by the authors was to generate several regression models by concurrently executing multiple independent instances of a GP and, subsequently, to analyze several strategies for fusing predictions from the multiple regression models.…”

Section: Related Work (mentioning)

confidence: 99%

Pruning Techniques for Mixed Ensembles of Genetic Programming Models

Castelli, Gonçalves, Manzoni, et al. 2018

Lecture Notes in Computer Science


The objective of this paper is to define an effective strategy for building an ensemble of Genetic Programming (GP) models. Ensemble methods are widely used in machine learning due to their features: they average out biases, they reduce the variance and they usually generalize better than single models. Despite these advantages, building ensembles of GP models is not a well-developed topic in the evolutionary computation community. To fill this gap, we propose a strategy that blends individuals produced by standard syntax-based GP and individuals produced by geometric semantic genetic programming, one of the newest semantics-based methods developed in GP. In fact, recent literature showed that combining syntax and semantics could improve the generalization ability of a GP model. Additionally, to improve the diversity of the GP models used to build up the ensemble, we propose different pruning criteria that are based on correlation and entropy, a commonly used measure in information theory. Experimental results, obtained over different complex problems, suggest that the pruning criteria based on correlation and entropy could be effective in improving the generalization ability of the ensemble model and in reducing the computational burden required to build it.
