2019
DOI: 10.18352/lq.10285

Annif: DIY automated subject indexing using multiple algorithms

Abstract: Manually indexing documents for subject-based access is a labour-intensive process. We propose using metadata gathered from bibliographic databases to train algorithms that assist librarians in that work. We have developed Annif, an open source tool and microservice for automated subject indexing. After training it with a subject vocabulary and existing metadata, Annif can be used to assign subject headings for new documents. We have tested Annif with different document collections including scientific papers,…
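
The train-then-suggest workflow described in the abstract can be illustrated with a toy sketch. This is not Annif's code or API: the function names (`train`, `suggest`), the sample records, and the simple cosine scoring over word counts are all assumptions made for illustration, standing in for whatever algorithms the tool actually uses.

```python
from collections import Counter, defaultdict
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(records):
    """Build one term-count profile per subject from (text, subjects) pairs."""
    profiles = defaultdict(Counter)
    for text, subjects in records:
        tokens = tokenize(text)
        for subject in subjects:
            profiles[subject].update(tokens)
    return profiles


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest(profiles, text, limit=3):
    """Rank subjects by similarity to a new document, best first."""
    doc = Counter(tokenize(text))
    scored = [(cosine(doc, profile), subject) for subject, profile in profiles.items()]
    scored = [(score, subject) for score, subject in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(subject, round(score, 3)) for score, subject in scored[:limit]]


# Hypothetical training metadata: (document text, assigned subject headings).
TRAINING = [
    ("neural networks for image recognition", ["machine learning"]),
    ("library catalogues and subject headings", ["cataloguing"]),
    ("training classifiers on bibliographic metadata", ["machine learning", "cataloguing"]),
]

profiles = train(TRAINING)
suggestions = suggest(profiles, "automated subject headings for library documents")
print(suggestions)
```

In this sketch the existing metadata plays the role of training data, and each suggestion carries a score that a librarian could use to accept or reject it. A real system would also need a controlled vocabulary, language-aware text analysis, and far more training records.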

Cited by 27 publications (29 citation statements)
References 10 publications

“…Annif, an automated subject indexing tool currently being tested and implemented at the National Library of Finland, is also comparable to our approach (Suominen, 2019). Annif annotates terms from different subject vocabularies and thesauri to documents based on textual information, such as abstracts and/or titles.…”
Section: Using Textual Data and Machine Learning To Cluster Or Classify (Social Science) Publications
Type: mentioning
confidence: 99%
“…Tools have been developed to assist with subject indexing, one of which is Annif, the subject of this article (Suominen 2019). Annif is based on a range of tools and algorithms that draw on language technology and machine learning and that can be used separately or in combination with one another.…”
Section: Annif, a Tool for Automated Subject Indexing
Type: unclassified
“…1. Automated subject indexing of mathematical library inventories with the toolkit annif [13]. The optimal input format for classifications and vocabularies is a Turtle serialisation.…”
Section: Msc 2020 Skosification
Type: mentioning
confidence: 99%