Delivered at the CRIS2014 Conference in Rome; published in Procedia Computer Science 33 (Jul 2014). Contains conference paper (8 pages) and presentation (12 slides).
Research information, i.e., data about research projects, organisations, researchers or research outputs such as publications or patents, is spread across the web, usually residing on institutional and personal web pages or in semi-open databases and information systems. While there is a wealth of unstructured information, the limited amount of structured data is often exposed through proprietary or less-established schemas and interfaces. As a result, a holistic view of research information across organisational and national boundaries is not feasible, and the available information is inconsistent and incomplete. On the other hand, web crawling and information extraction techniques have matured over the last decade, enabling automated approaches to harvesting, extracting and consolidating research information into a more coherent knowledge graph. In particular, the Linked Data community has provided a range of techniques, schemas and vocabularies that make it possible to represent and interlink research information in a more coherent manner. In this work, we give an overview of the current state of the art in research information sharing on the web and present initial ideas towards a more holistic approach for bootstrapping research information from available web sources.
INTRODUCTION: THE CONCEPT OF THE MEDIOSTRUCTURE
In this paper we attempt to characterize the interrelation between the meaning of lexical items and their syntagmatic combinatory potential. We start from the hypothesis that correlating meaning with the syntagmatic level allows us to develop criteria for describing the so-called semantic mediostructure (medioestructura semántica; Wotjak, in press). By mediostructure we mean the set of microstructures that share the same morphophonological form or, from a more lexicographic point of view, the set of senses listed under a given lemma. Microstructures constitute "semantic invariants, sememized, systemic, usualized [...] and socialized, that is, shared virtually by all speakers of the same linguistic and/or communicative community [...]" (Wotjak, in press). We regard microstructures as the basic units of lexical description. They are equivalent to the unida-
Using the State and University Library Bremen (SuUB) as an example, this paper outlines how academic libraries with digitization activities (hereinafter referred to as digitizing libraries) could establish even closer ties to CLARIN in the future. After describing SuUB's past and current CLARIN-related activities (especially full-text transfers to a CLARIN-D centre), we suggest that this collaboration could be expanded by providing advice and training for researchers in the Digital Humanities as potential CLARIN users. Equally important, from our point of view, is the discussion about future structural options at the level of research infrastructures. We suggest that digitizing libraries collaborate to agree jointly upon standards for data quality, file formats, interfaces and web services. We also discuss the foundation of local CLARIN contact points that refer scholars and researchers to the appropriate CLARIN contact or service. The relevance to CLARIN activities, resources, tools or services is described at the end of each section. The conclusions make one point clear to the reader: it is the right time for change.