2017
DOI: 10.1007/978-3-319-67162-8_15

Evaluating Reference String Extraction Using Line-Based Conditional Random Fields: A Case Study with German Language Publications

Abstract: The extraction of individual reference strings from the reference section of scientific publications is an important step in the citation extraction pipeline. Current approaches divide this task into two steps by first detecting the reference section areas and then grouping the text lines in such areas into reference strings. We propose a classification model that considers every line in a publication as a potential part of a reference string. By applying line-based conditional random fields rather than constr…

Citations: cited by 10 publications (8 citation statements)
References: references 10 publications (16 reference statements)
“…The integrated search system is developed following the user-centered design process according to ISO 9241-201:2010 17 . Table 3 provides an overview of the performed user studies to understand the context of use, to specify user requirements, to evaluate design decisions and usability.…”
Section: User-centered Design Process
Citation type: mentioning
confidence: 99%
“…This is done by parsing the references that are used within a table. For this, we use the state-of-the-art PDF extraction tool GROBID [13]. GROBID focuses specifically on extracting bibliographic data from scholarly articles [19].…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Instead of SVM, the approaches proposed by Patrice Lopez [13] and Körner et al. [11] use CRF to extract reference strings in view of its capability to model decision boundaries among different classes. A popular reference extracting tool, called ParsCit [5], uses a set of heuristics to identify the references by scanning the entire document for section headers such as "Reference", "Bibliography", "Notes" or any possible variations.…”
Section: Reference Extraction
Citation type: mentioning
confidence: 99%
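
The header-scanning heuristic described in the last citation statement can be illustrated with a minimal Python sketch. This is not ParsCit's code: the pattern `HEADER_RE` and the helpers `find_reference_header` and `lines_after_header` are hypothetical names, and the accepted header variants are assumptions based only on the three examples quoted above.

```python
import re

# Hypothetical pattern: an optional section number followed by one of the
# header words named in the quoted statement. Real tools handle more variants.
HEADER_RE = re.compile(
    r"^\s*(?:\d+\.?\s+)?(references?|bibliography|notes)\s*:?\s*$",
    re.IGNORECASE,
)


def find_reference_header(lines):
    """Return the index of the last line that looks like a reference-section
    header, or None if no such line exists."""
    found = None
    for i, line in enumerate(lines):
        if HEADER_RE.match(line):
            found = i
    return found


def lines_after_header(lines):
    """Return the non-empty lines that follow the detected header."""
    idx = find_reference_header(lines)
    if idx is None:
        return []
    return [line for line in lines[idx + 1:] if line.strip()]


document = [
    "1 Introduction",
    "Some body text.",
    "References",
    "[1] A. Author. A title. 2017.",
    "",
    "[2] B. Author. Another title. 2016.",
]
candidates = lines_after_header(document)
```

A sketch like this depends entirely on finding a recognizable header, which is the limitation the abstract's line-based approach targets by treating every line in the publication as a potential part of a reference string.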