Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018
DOI: 10.18653/v1/d18-1189

SemRegex: A Semantics-Based Approach for Generating Regular Expressions from Natural Language Specifications

Abstract: Recent research proposes syntax-based approaches to address the problem of generating programs from natural language specifications. These approaches typically train a sequence-to-sequence learning model using a syntax-based objective: maximum likelihood estimation (MLE). Such syntax-based approaches do not effectively address the goal of generating semantically correct programs, because these approaches fail to handle Program Aliasing, i.e., semantically equivalent programs may have many syntactically differe…

Citations: cited by 29 publications (36 citation statements)
References: 23 publications (30 reference statements)
“…Instead of using input-output examples, there are other approaches that synthesize regexes solely from natural language [9,12,27]. We see these approaches as orthogonal to ours and expect that Forest can be improved by hints provided by a natural language component such as was done in Regel.…”
Section: Related Work (mentioning)
confidence: 90%
“…Form validations often rely on complex regexes which require programming skills that not all users possess. To help users write regexes, prior work has proposed to synthesize regular expressions from natural language [1,9,12,27] or from positive and negative examples [1,7,10,26]. Even though these techniques assist users in writing regexes for search and replace operations, they do not specifically target digital form validation and do not take advantage of the structured format of the data.…”
Section: Introduction (mentioning)
confidence: 99%
“…The average accuracy of 10 evaluations is given. The distinguishing test cases method is based on the membership test of samples for the case when an oracle is not available and is described in (Zhong et al, 2018a). The accuracy of SoftRegex is similar to or better than SemRegex (Oracle) and always better than Deep Regex and SemRegex (Distinguishing Test Cases).…”
Section: Model Performance (mentioning)
confidence: 99%
“…Recently, Locascio et al (2016) designed the Deep-Regex model based on the sequence-to-sequence (Seq2Seq) model (Sutskever et al, 2014) using minimal domain knowledge during the learning phase while still accurately predicting regular expressions from NLs. Later, Zhong et al (2018a) improved the performance by training on not only syntactic content of the expressions (i.e. the exact textual representation of the expression that was used), but also the semantic content (the regular language described by the expression).…”
Section: Introduction (mentioning)
confidence: 99%
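
The distinction drawn in the statement above, between the syntactic content of a regex (its exact text) and its semantic content (the regular language it describes), can be illustrated with a small sketch. It is not the method of any cited paper: the function names, the alphabet "ab", and the length bound are all assumptions chosen for illustration, and comparing membership on a finite set of strings can only approximate true language equivalence.

```python
import re
from itertools import product


def bounded_language(pattern, alphabet="ab", max_len=4):
    """Strings of length <= max_len over `alphabet` that fully match `pattern`.

    Hypothetical helper: a finite sample of the regex's language,
    obtained by plain membership tests.
    """
    compiled = re.compile(pattern)
    matched = set()
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            candidate = "".join(chars)
            if compiled.fullmatch(candidate):
                matched.add(candidate)
    return matched


def same_syntax(p, q):
    """Syntactic comparison: the two patterns are the same text."""
    return p == q


def same_semantics_bounded(p, q, alphabet="ab", max_len=4):
    """Approximate semantic comparison: same matches on the bounded sample."""
    return bounded_language(p, alphabet, max_len) == bounded_language(q, alphabet, max_len)


def distinguishing_strings(p, q, alphabet="ab", max_len=4):
    """Strings accepted by exactly one of the two patterns, shortest first."""
    diff = bounded_language(p, alphabet, max_len) ^ bounded_language(q, alphabet, max_len)
    return sorted(diff, key=lambda s: (len(s), s))


r1 = "(a|b)*"  # written with alternation
r2 = "[ab]*"   # written with a character class; describes the same language
r3 = "a*b*"    # a different language: every a must precede every b

print(same_syntax(r1, r2), same_semantics_bounded(r1, r2))
print(same_semantics_bounded(r1, r3), distinguishing_strings(r1, r3)[:1])
```

Here r1 and r2 differ as text but agree on every sampled string, which is the aliasing situation the abstract describes, while r1 and r3 are separated by a short string that only one of them accepts. A bounded check like this can miss differences that only appear in longer strings, so it should be read as a sketch rather than an equivalence test.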