Purpose – The paper proposes a tool that generates authority files to be integrated with Linked Data by means of learning rules. AUTHORIS is software developed to enhance authority control and information exchange among bibliographic and non-bibliographic entities.
Design/methodology/approach – The article analyzes methods previously developed for authority control, as well as IFLA and ALA standards for managing bibliographic records; Semantic Web technologies are also evaluated. AUTHORIS relies on Drupal and incorporates the Dublin Core, SIOC, SKOS and FOAF vocabularies. The tool also takes into account the obsolescence of MARC and its replacement by FRBR and RDA. Its effectiveness was evaluated by applying a learning test proposed by RDA (2011); over 80 per cent of the actions were carried out correctly.
Findings – The use of learning rules and the facilities of Linked Data make it easier for information organizations to reuse products for authority control and to distribute them fairly and efficiently.
Research limitations/implications – The ISAD-G records presented the most errors, with EAD second in the number of errors produced. The remaining formats (MARC 21, Dublin Core, FRAD, RDF, OWL, XBRL and FOAF) showed fewer than 20 errors in total.
Practical implications – AUTHORIS offers institutions a means of sharing data with a high level of stability, helping to detect duplicate records and contributing to lexical disambiguation and data enrichment.
Originality/value – The software combines the facilities of Linked Data, the power of algorithms for converting bibliographic data, and the precision of learning rules.
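A minimal sketch of the kind of Linked Data authority record the abstract describes, assuming Python and the rdflib library; the URIs, the sample heading and the helper name build_authority_record are illustrative and are not AUTHORIS code.

```python
# Sketch: an authority record expressed with the SKOS and FOAF
# vocabularies named in the abstract. All URIs and labels are examples.
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import FOAF, SKOS

def build_authority_record(uri: str, preferred: str, variants: list[str]) -> Graph:
    g = Graph()
    g.bind("skos", SKOS)
    g.bind("foaf", FOAF)
    person = URIRef(uri)
    g.add((person, FOAF.name, Literal(preferred)))
    g.add((person, SKOS.prefLabel, Literal(preferred)))
    for variant in variants:
        # Variant headings become skos:altLabel, which is what lets a
        # consumer detect duplicates and disambiguate names.
        g.add((person, SKOS.altLabel, Literal(variant)))
    return g

record = build_authority_record(
    "http://example.org/authority/cervantes",  # hypothetical URI
    "Cervantes Saavedra, Miguel de, 1547-1616",
    ["Cervantes, Miguel de", "Saavedra, Miguel de Cervantes"],
)
print(record.serialize(format="turtle"))
```

Serializing to Turtle (or RDF/XML) is what allows such records to be exchanged between bibliographic and non-bibliographic systems.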
Purpose – Information from Current Research Information Systems (CRIS) is stored in different formats, on incompatible platforms, or even in independent networks. A well-defined methodology for processing management data from a single site would make it possible to link disperse data held in different systems, platforms, sources and/or formats. Based on the functionalities and materials of the VLIR project, the purpose of this paper is to present a model that provides interoperability by means of semantic alignment techniques and metadata crosswalks, and facilitates the fusion of information stored in diverse sources.
Design/methodology/approach – After reviewing the state of the art regarding the diverse mechanisms for achieving semantic interoperability, the paper analyzes: the specific coverage of the data sets (type of data, thematic coverage and geographic coverage); the technical specifications needed to retrieve and analyze a distribution of the data set (format, protocol, etc.); the conditions of re-utilization (copyright and licenses); and the “dimensions” included in the data set, together with their semantics (the syntax and the taxonomies of reference). The semantic interoperability framework presented here implements semantic alignment and metadata crosswalks to convert information from three different systems (ABCD, Moodle and DSpace) and integrate all the databases in a single RDF file.
Findings – The paper includes an evaluation that compares, by means of recall and precision calculations, the proposed model with identical queries made over Open Archives Initiative and SQL, in order to estimate its efficiency. The results were satisfactory: semantic interoperability facilitates the exact retrieval of information.
Originality/value – The proposed model enhances management of the syntactic and semantic interoperability of the CRIS system designed, and achieves very positive results in a real setting of use.
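The metadata-crosswalk step can be illustrated with a short sketch, again assuming Python and rdflib; the field mappings below are invented stand-ins, not the actual ABCD/Moodle/DSpace crosswalk tables from the paper.

```python
# Sketch: field names from three source systems are aligned to Dublin
# Core terms and merged into one RDF graph, mirroring the idea of
# integrating all the databases in a single RDF file.
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS

CROSSWALK = {
    "abcd":   {"titulo": DCTERMS.title, "autor": DCTERMS.creator},
    "moodle": {"fullname": DCTERMS.title, "teacher": DCTERMS.creator},
    "dspace": {"dc.title": DCTERMS.title, "dc.contributor.author": DCTERMS.creator},
}

def merge(records: list[tuple[str, str, dict]]) -> Graph:
    """records: (source_system, record_uri, {source_field: value})."""
    g = Graph()
    for system, uri, fields in records:
        subject = URIRef(uri)
        for field, value in fields.items():
            predicate = CROSSWALK[system].get(field)
            if predicate is not None:  # unmapped fields are skipped
                g.add((subject, predicate, Literal(value)))
    return g
```

Because every source field is rewritten to the same target vocabulary, identical queries can then be run over the merged graph regardless of which system a record came from.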
The aim of this work is to develop and evaluate the results of implementing ontology-based software capable of generating automatic summaries in the field of Port and Coastal Engineering. The tool draws on several techniques from discourse analysis, as well as cognitive techniques, to generate rules for processing texts. An ontology is also built to support the tagging processes, exploiting the capabilities of the Resource Description Framework and Extensible Markup Language. A set of agents that operate on the ontology is constructed, and the ontology's main elements are described. The resulting product is Puertotex, a piece of software for building ontology-based automatic summaries. Evaluation of the generated summaries reflects the quality of the system, whose only limitation is that it works solely within the domain under investigation.
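As an illustration of the general idea (not Puertotex itself), the sketch below scores sentences by how many domain-ontology concepts they mention and keeps the top-ranked ones; the tiny term set stands in for the real Port and Coastal Engineering ontology.

```python
# Sketch: ontology-guided extractive summarization. Sentences that
# mention more domain concepts score higher and are kept.
import re

ONTOLOGY_TERMS = {"breakwater", "dredging", "wave height", "berth", "quay"}

def summarize(text: str, max_sentences: int = 2) -> str:
    sentences = re.split(r"(?<=[.!?])\s+", text)

    def score(sentence: str) -> int:
        lower = sentence.lower()
        return sum(term in lower for term in ONTOLOGY_TERMS)

    ranked = sorted(sentences, key=score, reverse=True)[:max_sentences]
    # Preserve the original sentence order so the summary reads coherently.
    return " ".join(s for s in sentences if s in ranked)
```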
A model for ontology evaluation. The case of Onto-Satcol
Abstract: The theoretical and conceptual underpinnings of ontology evaluation are analyzed in order to understand the procedures used to evaluate such systems and to establish new guidelines for calibrating the ontological system used by the Satcol program, which specializes in Port and Coastal Engineering. The paper describes the characteristics of the Onto-Satcol ontology and evaluates it using several indicators (lexical, information retrieval and syntactic structure). Through an experiment carried out by six experts, with the help of the Protex tool, semantic and structural inconsistencies, as well as errors in the knowledge organization of this ontology, are identified.
Keywords: Ontologies, ontology evaluation, domain ontologies, analysis, Port and Coastal Engineering.
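One plausible reading of a “lexical indicator” is vocabulary coverage: the fraction of domain-corpus terms that appear as ontology labels. The sketch below is a generic illustration of that idea, not the exact indicator set used on Onto-Satcol.

```python
# Sketch: a simple lexical-coverage indicator for ontology evaluation.
# Terms and values are invented examples.
def lexical_coverage(ontology_labels: set[str], corpus_terms: set[str]) -> float:
    if not corpus_terms:
        return 0.0
    covered = ontology_labels & corpus_terms
    return len(covered) / len(corpus_terms)

print(lexical_coverage({"quay", "berth", "dredging"},
                       {"quay", "berth", "tide", "dredging"}))  # 0.75
```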
Purpose – The purpose of this paper is to look into the latest advances in ontology-based text summarization systems, with emphasis on the methodologies of a socio-cognitive approach, structural discourse models and ontology-based text summarization systems.
Design/methodology/approach – The paper analyzes the main literature in this field and presents the structure and features of Texminer, software that facilitates the summarization of texts on Port and Coastal Engineering. Texminer combines several techniques, including socio-cognitive user models, natural language processing, disambiguation and ontologies. After processing a corpus, the system was evaluated using as a reference various clustering evaluation experiments conducted by Arco (2008) and Hennig et al. (2008). The results were checked with a support vector machine, ROUGE metrics, the F-measure, and calculations of precision and recall.
Findings – The experiment illustrates the superiority of abstracts obtained through the assistance of ontology-based techniques.
Originality/value – The authors corroborated that the summaries obtained using Texminer are more efficient than those derived from systems whose summarization models do not use ontologies. Thanks to ontologies, the main sentences can be selected within a broad rhetorical structure, especially for a specific knowledge domain.
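For reference, the precision, recall and F-measure used in the evaluation are standard; the sketch below writes them out in Python with made-up counts, and is not code from the Texminer pipeline.

```python
# Sketch: textbook precision, recall and F1 from raw counts.
def precision_recall_f1(true_pos: int, false_pos: int, false_neg: int):
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical counts for illustration only.
print(precision_recall_f1(true_pos=40, false_pos=10, false_neg=20))
```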
Management of archival materials with Linked Data and federated queries
Abstract: This paper summarizes the major Semantic Web technologies that may be useful for archives management. Several local and international projects that generate ontologies from standardized ISAD-G descriptions are examined, along with LIAM (Linked Archival Metadata), which facilitates the transformation of archive records into RDF (Resource Description Framework) format. Furthermore, we analyze how Linked Data enables interoperability between information systems and faceted search over OWL (Web Ontology Language), SKOS (Simple Knowledge Organization System) and Dublin Core records. The authors propose the use of a CMS (Content Management System) compatible with SIOC (Semantically-Interlinked Online Communities) and OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) to manage archive records and improve the exchange and retrieval of information. We specifically describe the technologies used to develop CoroArchivo, a system assessed through an experiment that automatically generates ontologies from ISAD-G descriptions stored in DSpace. The tool lets users perform federated queries based on the disjoint and equivalent classes of the OWL vocabulary.
Keywords: Linked Data; ontologies; archives; repositories; information services; Drupal; federated search; DSpace.
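A federated query of the kind described could look like the sketch below, assuming Python with the SPARQLWrapper library; the endpoint URLs and the owl:equivalentClass pattern are illustrative, not CoroArchivo's actual configuration (owl:disjointWith could be used analogously for the disjointness case).

```python
# Sketch: a federated SPARQL query that follows owl:equivalentClass
# links and pulls matching records from a second repository via SERVICE.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://example.org/sparql")  # hypothetical endpoint
sparql.setQuery("""
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dc:  <http://purl.org/dc/elements/1.1/>
SELECT ?record ?title WHERE {
  ?localClass owl:equivalentClass ?remoteClass .
  SERVICE <http://example.org/remote/sparql> {   # second repository
    ?record a ?remoteClass ; dc:title ?title .
  }
}
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["record"]["value"], row["title"]["value"])
```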