Proceedings of the 9th Web as Corpus Workshop (WaC-9) 2014
DOI: 10.3115/v1/w14-0403

Less Destructive Cleaning of Web Documents by Using Standoff Annotation

Abstract: Standoff annotation, that is, the separation of primary data and markup, can be an interesting option for annotating web pages, since it does not require removing the annotations already present in them. We present a standoff serialization that allows well-formed web pages to be annotated with multiple annotation layers in a single instance, easing processing and analysis of the data.
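
The general idea of standoff annotation can be illustrated with a minimal Python sketch. It is not the serialization the paper presents; it only assumes that annotations are stored as character offsets into an untouched primary text, with several layers side by side. The names `Span`, `covered_text`, and `spans_in_layer`, the layer names, and the sample sentence are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """One standoff annotation: it points into the primary text by offset."""
    layer: str  # hypothetical layer name, e.g. "pos" or "chunk"
    start: int  # inclusive character offset into the primary text
    end: int    # exclusive character offset into the primary text
    label: str  # the annotation value for this span


# Primary data: stored once and never modified by any annotation layer.
primary = "Web pages keep their markup."

# Two independent layers over the same primary data (illustrative values).
annotations = [
    Span("pos", 0, 3, "NOUN"),
    Span("pos", 4, 9, "NOUN"),
    Span("pos", 10, 14, "VERB"),
    Span("chunk", 0, 9, "NP"),
    Span("chunk", 15, 27, "NP"),
]


def covered_text(text: str, span: Span) -> str:
    """Return the stretch of primary text that a span refers to."""
    return text[span.start:span.end]


def spans_in_layer(spans: list[Span], layer: str) -> list[Span]:
    """Select all spans belonging to one annotation layer."""
    return [s for s in spans if s.layer == layer]


if __name__ == "__main__":
    for s in spans_in_layer(annotations, "chunk"):
        print(s.label, repr(covered_text(primary, s)))
```

Because each layer only references offsets, layers can overlap or be added and dropped without altering the primary text or any markup it already carries.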
