Purpose: Publishing research data for reuse has become good practice in recent years. However, little is known about how researchers actually find such data. In this exploratory study, we observe the information-seeking behaviour of social scientists searching for research data to reveal impediments and identify opportunities for data search infrastructure. Methods: We asked 12 participants to search for research data and observed them in their natural environment. The sessions were recorded. Afterwards, we conducted semi-structured interviews to get a thorough understanding of how they search. From the recordings, we extracted the participants' interaction behaviour and analysed what was said during both the search task and the interview by creating affinity diagrams. Results: We found that literature search is more closely intertwined with dataset search than previous literature suggests. Both the search itself and the relevance assessment are very complex, and many different strategies are employed, including the creative “misuse” of existing tools, since appropriate tools either do not exist or are unknown to the participants. Conclusion: Many of the issues we found relate directly or indirectly to the application of the FAIR principles, but some, like a greater need for dataset search literacy, go beyond that. Both infrastructure and tools offered for dataset search could be tailored more tightly to the observed work processes, particularly by offering more interconnectivity between datasets, literature, and other relevant materials.
In this paper, we present the results of a user study on exploratory search activities in a social science digital library. The study involved 32 participants with a social sciences background (16 postdoctoral researchers and 16 students), who were asked to search for related work on a given topic. The exploratory search task was performed in a 10-minute time slot. The use of certain search activities is measured and compared to gaze data recorded with an eye tracking device. We use a novel tree graph representation to visualise the users' search patterns and introduce a way to combine multiple search session trees. The tree graph representation is capable of creating one single tree for multiple users and of identifying common search patterns. In addition, the information behaviour of students and postdoctoral researchers is compared. The results show that search activities on the stratagem level are frequently utilised by both user groups. The most heavily used search activities were keyword search, followed by browsing through references and citations, and author searching. The eye tracking results showed an intense examination of document metadata, especially on the level of citations and references. When comparing the groups of students and postdoctoral researchers, we found significant differences in gaze data on the area of the journal name of the seed document. In general, we found a tendency for the postdoctoral researchers to examine the metadata records more intensively with regard to dwell time and the number of fixations. By creating combined session trees and deriving subtrees from those, we were able to identify common patterns like economic (explorative) and exhaustive (navigational) behaviour. Our results show that participants utilised multiple search strategies starting from the seed document, which means that they examined different paths to find related publications.
In this paper, we investigate the retrievability of datasets and publications in a real-life Digital Library (DL). The measure of retrievability was originally developed to quantify the influence that a retrieval system has on access to information. Retrievability can also enable DL engineers to evaluate their search engine and determine how easily the content in the collection can be accessed. Following this methodology, we propose a system-oriented approach for studying dataset and publication retrieval. A particular focus of this paper is measuring the accessibility biases of various types of DL items and including a metric of usefulness. Among other metrics, we use Lorenz curves and Gini coefficients to visualize the differences between the two retrievable document types (specifically datasets and publications). Empirical results reported in the paper show a distinguishable diversity in the retrievability scores among documents of different types.
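To make the Lorenz curve and Gini coefficient mentioned in this abstract concrete, the following is a minimal Python sketch of how both can be computed from a list of per-document retrievability scores. It is not the authors' implementation: the function names and the example scores are hypothetical, and the scores are invented purely for illustration. A Gini coefficient of 0 means every document is equally retrievable; values closer to 1 mean that a few documents account for most of the retrievability.

```python
from typing import List, Tuple


def lorenz_curve(scores: List[float]) -> List[Tuple[float, float]]:
    """Return Lorenz curve points as (share of documents, share of total score).

    Documents are sorted from least to most retrievable, so the curve
    starts at (0, 0) and ends at (1, 1) whenever the total score is positive.
    """
    ordered = sorted(scores)
    n = len(ordered)
    total = sum(ordered)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, score in enumerate(ordered, start=1):
        running += score
        share = running / total if total else 0.0
        points.append((i / n, share))
    return points


def gini(scores: List[float]) -> float:
    """Gini coefficient of non-negative scores (0 = perfectly equal)."""
    ordered = sorted(scores)
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total == 0:
        return 0.0
    # Standard rank-based formula: sum_i (2i - n - 1) * x_i / (n * sum x)
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(ordered, start=1))
    return weighted / (n * total)


# Hypothetical retrievability scores, invented for illustration only.
dataset_scores = [0.0, 1.0, 1.0, 2.0, 12.0]
publication_scores = [3.0, 3.0, 4.0, 4.0, 5.0]

if __name__ == "__main__":
    print("Gini (datasets):    ", round(gini(dataset_scores), 3))
    print("Gini (publications):", round(gini(publication_scores), 3))
    print("Lorenz (datasets):  ", lorenz_curve(dataset_scores))
```

Computing the coefficient separately for each document type, as in the usage lines above, is one way to compare how unevenly a search system exposes datasets versus publications; plotting the two Lorenz curves against the diagonal shows the same comparison visually.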