2022
DOI: 10.25080/majora-212e5952-01b

Design of a Scientific Data Analysis Support Platform

Abstract: Software data analytic workflows are a critical aspect of modern scientific research and play a crucial role in testing scientific hypotheses. A typical scientific data analysis life cycle in a research project must include several steps that may not be fundamental to testing the hypothesis, but are essential for reproducibility. This includes tasks that have analogs to software engineering practices such as versioning code, sharing code among research team members, maintaining a structured codebase, and track…

citations
Cited by 2 publications
(2 citation statements)
references
References 9 publications
“…Various solutions have been implemented to address some aspects of research experiment management, and in prior work, the authors explored some of these solutions and detailed a set of attributes to consider for comparing such systems: orchestration, parameterization, caching, provenance, portability, reporting, and scalability (Martindale et al, 2022). We briefly explore three of those here, Kedro (Linux Foundation AI & Data, 2019) and MLFlow (LLC, 2018), as well as Pachyderm (Pachyderm, 2014) to highlight what distinguishes Curifactory and the types of problems it is suited for.…”
Section: Statement Of Need (mentioning)
confidence: 99%
“…Curifactory [MHSA22] is a Python API and CLI tool for organizing, tracking, reproducing, and exporting computational research experiments and data analysis workflows. It is intended primarily for smaller teams conducting research, rather than production-level or large-scale ML projects.…”
Section: Curifactory (mentioning)
confidence: 99%