“…We assess our proposal on several datasets representative of possible applications of our similarity learning method (the name of each dataset describes the nature of the data and the type of the entities to be extracted): HTML-href [14,13,11], Log-MAC+IP [14,13,11], Email-Phone [14,13,11,8,7], Bills-Date [14,12], Web-URL [14,13,11,7], Twitter-URL [14,13,11]. Each dataset consists of a text annotated with all and only the snippets that should be extracted.…”
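As an illustration of the annotation scheme described in the excerpt, the following is a minimal sketch of how one such dataset could be represented in memory: a text together with the character spans of the snippets to be extracted. The class name `AnnotatedText`, the half-open span convention, and the sample text are assumptions made for this example only; the excerpt does not specify the storage format, and the sample is not drawn from any of the listed datasets.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotatedText:
    """A text plus the spans of the snippets to extract (hypothetical layout)."""

    text: str
    # Each span is a half-open character range [start, end) into `text`.
    spans: list[tuple[int, int]] = field(default_factory=list)

    def snippets(self) -> list[str]:
        """Return the annotated snippets in the order their spans are listed."""
        return [self.text[start:end] for start, end in self.spans]


def make_example() -> AnnotatedText:
    """Build a toy Email-Phone style example (invented sample text)."""
    text = "Call 555-0100 or write to ann@example.org today."
    spans = []
    for target in ("555-0100", "ann@example.org"):
        start = text.index(target)  # locate the snippet in the text
        spans.append((start, start + len(target)))
    return AnnotatedText(text, spans)


example = make_example()
print(example.snippets())
```

Storing spans rather than the snippet strings themselves keeps the annotation unambiguous when the same string occurs more than once in the text, which matters if the annotation is meant to cover all and only the desired occurrences.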