2011
DOI: 10.1177/1473871611415994

Research directions in data wrangling: Visualizations and transformations for usable and credible data

Abstract: In spite of advances in technologies for working with data, analysts still spend an inordinate amount of time diagnosing data quality issues and manipulating data into a usable form. This process of ‘data wrangling’ often constitutes the most tedious and time-consuming aspect of analysis. Though data cleaning and integration are longstanding issues in the database community, relatively little research has explored how interactive visualization can advance the state of the art. In this article, we review the cha…

Cited by 278 publications (191 citation statements)
References 53 publications (61 reference statements)
“…Researchers have also advocated the use of visualization across more phases of the analysis life-cycle [22]. Our analysis corroborates this suggestion.…”
Section: Related Work (supporting)
confidence: 81%
“…Such data wrangling, munging, or cleaning [22] involves parsing text files, manipulating data layout and integrating multiple data sources. This process, whether managed by IT staff or by analysts, was often time consuming and tedious.…”
Section: Wrangling (mentioning)
confidence: 99%
“…The main functions that can be utilized is the family of functions, get_stations, get_timeseries, get_data, etc., to easily download JSON and TXT files as tidy data frames (Wickham 2014). The internal databases of the package can be used to run queries on the available stations and time series, reducing the time needed for downloading and data wrangling (Kandel et al 2011), as these data are rarely modified.…”
Section: Discussion (mentioning)
confidence: 99%
“…Firstly, primarily as a visual data cleaning tool it has commonalities with the recent direction in visualisation research towards data wrangling [3]. This is the visually-aided process of transforming raw or problematic data into a more usable form, which includes data cleaning.…”
Section: Related Work (mentioning)
confidence: 99%
“…In their further work, [3] outlines a number of directions in which data wrangling can be applied, and amongst these are visualising raw data, removal of errors, and visualization of missing data, all of which apply to our situation. There is no requirement for any structural data transformation, as in our case the output data necessarily has to be in the same format as the input data.…”
Section: Related Workmentioning
confidence: 99%