Despite being popular end-user tools, spreadsheets are notoriously error-prone. In software engineering, testing has been proposed as a way to address errors. It is therefore important to know whether spreadsheet users also test, how they test, and to what extent, especially since most spreadsheet users have no training in, or experience with, software engineering principles. Towards this end, we conduct a two-phase mixed-methods study: first, a qualitative phase, in which we interview 12 spreadsheet users, and second, a quantitative phase, in which we conduct an online survey completed by 72 users. The outcome of the interviews, organized into four categories, consists of an overview of testing practices, perceptions of spreadsheet users about testing, a set of preventive measures for avoiding errors, and an overview of maintenance practices for ensuring the correctness of spreadsheets over time. The survey adds to these findings by providing quantitative estimates indicating that ensuring correctness is an important concern and that a large fraction of users do test their spreadsheets. However, their techniques are largely manual and lack formalism; tools and automated support are rarely used.
Spreadsheets are popular end-user computing applications, and one reason behind their popularity is the large degree of freedom they offer users in structuring their data. However, this flexibility also makes spreadsheets difficult to understand. Textual documentation can address this issue, yet an important prerequisite for automatically generating textual documentation is extracting the metadata inside spreadsheets. Distinguishing between data and metadata is challenging, however, due to the lack of universally accepted structural patterns in spreadsheets. Two existing approaches for automatic extraction of spreadsheet metadata have not been evaluated on large datasets of user inputs. Hence, in this paper, we describe the collection of a large number of user responses regarding the identification of spreadsheet metadata from participants of a MOOC. We use this large dataset to understand how users identify metadata in spreadsheets and to evaluate the two existing approaches for automatic metadata extraction. The insights gained about user perception of metadata point to directions for improving metadata extraction approaches. We also learn on which types of spreadsheet patterns the existing approaches perform well and on which they perform poorly, and thus which problem areas to focus on for improvement.
Automatically inferred invariants have been found to be successful in detecting regression faults in traditional software, but their application has not been explored in the context of spreadsheets. In this paper, we investigate the effectiveness of automatically inferred invariants in detecting regression faults in spreadsheets. We conduct an exploratory empirical study on eight spreadsheets taken from the VEnron and EUSES corpora: we apply automatic invariant inference to them, create tests based on the inferred invariants, and finally seed the sheets with faults. Results indicate that the effectiveness of the inferred invariants, in terms of accuracy of fault detection, varies considerably from spreadsheet to spreadsheet. The effectiveness is affected by the formulas and data contained in the spreadsheets, and also by the type of faults to be detected.