Context: Computational materials science (CMS) focuses on
in silico experiments to compute the properties of known and
novel materials, for which the community uses many software packages.
The NOMAD Laboratory lets researchers store the input and output files in its
FAIR data repository. Since the file formats of these software packages
are non-standardized, parsers are used to provide the results in a
normalized format. Objective: The main goal of this article is
to report our experience and findings from applying grammar-based fuzzing to
these parsers. Method: We have constructed an input grammar for
four common software packages in the CMS domain and performed an
experimental evaluation on the capabilities of grammar-based fuzzing to
detect failures in the NOMAD parsers. Results: With our
approach, we were able to identify three unique critical bugs concerning
the service availability, as well as several additional syntactic,
semantic, logical, and downstream bugs in the investigated NOMAD
parsers. We reported all issues to the developer team prior to
publication. Conclusion: Based on the experience gained, we
recommend grammar-based fuzzing for other research software
packages as well, to increase trust in the correctness of the results
they produce.
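
To illustrate the general idea of grammar-based fuzzing described above, the following is a minimal sketch and not the setup used in the study. The grammar, the key names (`ecut`, `kpoints`, `natoms`), and the functions `generate`, `toy_parse`, and `fuzz` are all hypothetical; the toy parser contains a deliberately planted bug and is unrelated to the NOMAD parsers.

```python
import random

# Hypothetical toy grammar for a made-up "key = value" input format.
# It does not describe the file format of any real CMS package.
GRAMMAR = {
    "<file>": [["<line>"], ["<line>", "<file>"]],
    "<line>": [["<key>", " = ", "<value>", "\n"]],
    "<key>": [["ecut"], ["kpoints"], ["natoms"]],
    "<value>": [["<int>"], ["-", "<int>"]],
    "<int>": [["<digit>"], ["<digit>", "<int>"]],
    "<digit>": [[d] for d in "0123456789"],
}


def generate(symbol, rng, depth=0, max_depth=8):
    """Expand a grammar symbol into a random string."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit as-is
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, take the shortest alternative so that
        # recursive rules are guaranteed to terminate.
        alternatives = [min(alternatives, key=len)]
    expansion = rng.choice(alternatives)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in expansion)


def toy_parse(text):
    """Toy parser with a planted bug: it divides by the parsed value."""
    result = {}
    for line in text.splitlines():
        key, value = line.split(" = ")
        result[key] = 1.0 / int(value)  # fails when the value is zero
    return result


def fuzz(parser, runs=200, seed=0):
    """Feed grammar-generated inputs to the parser and collect failures."""
    rng = random.Random(seed)
    failures = []
    for _ in range(runs):
        sample = generate("<file>", rng)
        try:
            parser(sample)
        except Exception as exc:  # any uncaught exception counts as a failure
            failures.append((sample, type(exc).__name__))
    return failures


if __name__ == "__main__":
    found = fuzz(toy_parse)
    print(f"{len(found)} failing inputs out of 200 runs")
```

Because every generated input is syntactically valid under the grammar, failures of this kind point to defects in the parser's handling of well-formed files rather than to trivially rejected garbage, which is the main appeal of the grammar-based approach over purely random input.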