Structured data such as databases, spreadsheets and web tables is becoming critical in every domain and professional role. Yet we still do not know much about how people interact with it. Our research focuses on the information seeking behaviour of people looking for new sources of structured data online, including the task context in which the data will be used, data search, and the identification of relevant datasets from a set of possible candidates. We present a mixed-methods study covering in-depth interviews with 20 participants with various professional backgrounds, supported by the analysis of search logs of a large data portal. Based on this study, we propose a framework for human structured-data interaction and discuss challenges people encounter when trying to find and assess data that helps their daily work. We provide design recommendations for data publishers and developers of online data platforms such as data catalogs and marketplaces. These recommendations highlight important questions for HCI research to improve how people engage with and make use of this incredibly useful online resource.
Large amounts of data are becoming increasingly available online. In order to benefit from it we need tools to retrieve the most relevant datasets that match one's data needs. Several vocabularies have been developed to describe datasets in order to increase their discoverability, but for data publishers it is costly and cumbersome to annotate datasets using all of them, leading to the question of which properties are most important. In this work we contribute a systematic study of the patterns and specific attributes that data consumers use to search for data and how this compares with general web search. We performed a query log analysis based on logs from four national open data portals and conducted a qualitative analysis of user data requests issued to one of them. Search queries issued on data portals differ from those issued to web search engines in their length, topic, and structure. Based on our findings we hypothesise that portals' search functionalities are currently used in an exploratory manner, rather than to retrieve a specific resource. In our study of data requests we found that geospatial and temporal attributes, as well as information on the required granularity of the data, are the most common features. The findings of both analyses suggest that these features are of higher importance in dataset retrieval than in general web search, suggesting that efforts of dataset publishers should focus on generating dataset descriptions that include them.
Summarising data as text helps people make sense of it. It also improves data discovery, as search algorithms can match this text against keyword queries. In this paper, we explore the characteristics of text summaries of data in order to understand what meaningful summaries look like. We present two complementary studies: a data-search diary study with 69 students, which offers insight into the information needs of people searching for data; and a summarisation study, with a lab and a crowdsourcing component with 80 data-literate participants overall, which produced summaries for 25 datasets. In each study we carried out a qualitative analysis to identify key themes and commonly mentioned dataset attributes, which people consider when searching for and making sense of data. The results helped us design a template to create more meaningful textual representations of data, alongside guidelines for improving the data-search experience overall.
Universities, like cities, have embraced novel technologies and data-based solutions to improve their campuses with ‘smart’ becoming a welcomed concept. Campuses in many ways are small-scale cities. They increasingly seek to address similar challenges and to deliver improved experiences to their users. How can data be used in making this vision a reality? What can we learn from smart campuses that can be scaled up to smart cities? A short research study was conducted over a three-month period at a public university in the United Kingdom, employing stakeholder interviews and user surveys, which aimed to gain insight into these questions. Based on the study, the authors suggest that making data publicly available could bring many benefits to different groups of stakeholders and campus users. These benefits come with risks and challenges, such as data privacy and protection and infrastructure hurdles. However, if these challenges can be overcome, then open data could contribute significantly to improving campuses and user experiences, and potentially set an example for smart cities.