2022
DOI: 10.3389/fmars.2022.940844

Automating the Curation Process of Historical Literature on Marine Biodiversity Using Text Mining: The DECO Workflow

Abstract: Historical biodiversity documents comprise an important link to the long-term data life cycle and provide useful insights on several aspects of biodiversity research and management. However, because of their historical context, they present specific challenges, chiefly that curating their data is time- and effort-consuming. The data rescue process requires a multidisciplinary effort involving four tasks: (a) document digitisation; (b) transcription, which involves text recognition and correction; and (c) Information Ext…

Cited by 1 publication (2 citation statements)
References 105 publications
“…The growing volume of scientific literature on biodiversity has led to a focus on the development of computational methods for extracting meaningful information from unstructured textual data (Farrell et al, 2022; Paragkamian et al, 2022). This computational task is known as text mining, and it has been used to identify trends, patterns, and relationships that would otherwise be difficult to detect.…”
Section: Related Work (mentioning)
confidence: 99%
“…Information extraction (IE) is an umbrella term for tasks that seek to automatically extract structured information from unstructured text. With the exponential growth of digitized literature over the years, IE has become increasingly pertinent, due to its role in (semi-)automatically populating databases with content (Ravikumar et al, 2015; Lee et al, 2018; Paragkamian et al, 2022). Relation extraction (RE) is an IE task that is concerned with the identification of semantic relationships between entities or concepts in text.…”
Section: Introduction (mentioning)
confidence: 99%