Purpose: This paper presents a technical perspective on implementing the Slovenian open access infrastructure, which consists of four institutional repositories and a national portal. The portal aggregates content from the repositories in order to provide a common search engine, recommendations of similar documents, and similar text detection. Design/methodology/approach: During the project, the necessary legal background and processes for mandatory submission of final study works, research publications and research data were established, as well as processes for data exchange between the institutional repositories and the national portal, and processes for similar text detection. Findings: The consortium consisted of four Slovenian universities that differ significantly in size, organisation, and workflows. It was anticipated that exactly the same legal background and software would be used for all four repositories, but complete unification turned out to be impossible because of these differences. Practical implications: The national open access infrastructure will improve the visibility of Slovenian research organisations and supports compliance with funders' open access mandates. The established infrastructure enables the depositing and archiving of approximately eighty percent of the peer-reviewed scientific publications published annually by Slovenian researchers. At the same time, the majority of final study works from Slovenian higher education institutions are available in full-text format. Originality/value: This paper describes a technical perspective on setting up a national open access infrastructure, which has not previously been described in the literature.
This paper presents a hybrid document recommender system intended for use in the digital libraries and institutional repositories that are part of the Slovenian Open Access Infrastructure. The system recommends similar documents across different digital libraries and institutional repositories, with the aim of connecting researchers and improving collaboration. As its primary method, it provides content-based recommendations using document processing techniques, document metadata, and the BM25 similarity ranking function. As its secondary method, it applies collaborative filtering within a cascade hybrid recommendation technique. We also analyse feedback collected from real-world use of the recommender on an academic digital repository in order to identify suitable time frames for direct feedback collection during the year.
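The cascade described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `co_views` structure (how often two documents were viewed together, standing in for the collaborative signal), the score bucket width, and the BM25 parameters `k1` and `b` are all assumptions made for illustration. BM25 produces the primary ranking, and the collaborative signal only reorders candidates whose BM25 scores fall into the same bucket.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for real document processing)."""
    return text.lower().split()


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / n
    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in docs_tokens:
        df.update(set(d))
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        score = 0.0
        for term in set(query_tokens):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = 1 - b + b * len(d) / avgdl
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


def recommend(source_id, docs, co_views, top_n=3, bucket=0.5):
    """Cascade hybrid: BM25 ranks first, co-view counts refine within score buckets.

    docs:     hypothetical mapping of document id -> text (e.g. title + abstract)
    co_views: hypothetical mapping of frozenset({id_a, id_b}) -> co-view count
    """
    candidates = [doc_id for doc_id in docs if doc_id != source_id]
    if not candidates:
        return []
    scores = bm25_scores(
        tokenize(docs[source_id]),
        [tokenize(docs[doc_id]) for doc_id in candidates],
    )
    ranked = []
    for doc_id, score in zip(candidates, scores):
        if score <= 0:
            continue  # no shared terms, so not a content-based match
        views = co_views.get(frozenset((source_id, doc_id)), 0)
        # Sort key: coarse BM25 bucket first, then collaborative signal, then id.
        ranked.append((-math.floor(score / bucket), -views, doc_id))
    ranked.sort()
    return [doc_id for _, _, doc_id in ranked[:top_n]]
```

In this sketch the collaborative signal can never promote a document past one with a clearly higher content score, which is the defining property of a cascade hybrid. The bucket width controls how close two BM25 scores must be before the secondary method is allowed to decide the order.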
The OpenScience Slovenia metadata dataset contains metadata entries for Slovenian public domain academic documents, including undergraduate and postgraduate theses, research and professional articles, and other academic document types. The data was collected as part of the establishment of the Slovenian Open-Access Infrastructure, which defined a unified document collection and cataloguing process for the repositories of Slovenian universities within the infrastructure. The data was gathered from several already established but separate library systems in Slovenia and merged into a single metadata scheme using metadata deduplication and merging techniques. It consists of text and numerical fields representing attributes that describe documents. These attributes include document titles, keywords, abstracts, typologies, authors, issue years and other identifiers such as URL and UDC. The dataset is especially suited to text mining and text classification tasks, and it can also be used to develop or benchmark content-based recommender systems on real-world data.
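The abstract does not say which deduplication and merging techniques were used. The following is only a sketch of one common approach, under the assumption that records from different library systems can be matched on a normalized title plus issue year; the field names (`title`, `year`, `keywords`, `url`) and both function names are hypothetical.

```python
import re
import unicodedata


def normalize_title(title):
    """Fold case, strip diacritics and punctuation so near-identical titles match."""
    decomposed = unicodedata.normalize("NFKD", title)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", without_marks.lower()).strip()


def merge_records(records):
    """Merge records that share a normalized title and year into one entry.

    The first non-empty value seen for a field is kept; keyword lists are
    unioned while preserving their original order.
    """
    merged = {}
    for record in records:
        key = (normalize_title(record["title"]), record.get("year"))
        target = merged.setdefault(key, {})
        for field, value in record.items():
            if value in (None, "", []):
                continue  # skip empty values so a later source can fill the gap
            if field == "keywords":
                existing = target.setdefault("keywords", [])
                for keyword in value:
                    if keyword not in existing:
                        existing.append(keyword)
            elif field not in target:
                target[field] = value
    return list(merged.values())
```

Exact matching on a normalized key is deliberately conservative: it will miss duplicates whose titles differ by more than case, diacritics, or punctuation, and a production pipeline would likely add fuzzy matching or identifier-based matching on top.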