2015
DOI: 10.1007/978-3-319-16462-5_12

Looking Inside the Black-Box: Capturing Data Provenance Using Dynamic Instrumentation

Abstract: Knowing the provenance of a data item helps in ascertaining its trustworthiness. Various approaches have been proposed to track or infer data provenance. However, these approaches either treat an executing program as a black-box, limiting the fidelity of the captured provenance, or require developers to modify the program to make it provenance-aware. In this paper, we introduce DataTracker, a new approach to capturing data provenance based on taint tracking, a technique widely used in the security an…

Cited by 34 publications (35 citation statements)
References 26 publications
“…3) Dynamic Instruction-Level Solution - DataTracker: DataTracker [12] is a tool that captures provenance using Dynamic Taint Analysis (DTA). The analysis is applied as Dynamic Binary Instrumentation (DBI) using the Intel Pin [13] and libdft [14] frameworks.…”
Section: B Provenance Collection Methods and Reporters
Citation type: mentioning
confidence: 99%
“…In [12], sets of <file descriptor, offset> pairs are used for tracking the provenance of each memory location. In this work, we instead opted to use bitsets, where each bit represents a file descriptor.…”
Section: B Provenance Collection Methods and Reporters
Citation type: mentioning
confidence: 99%
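The two tag representations contrasted in this statement can be sketched briefly. This is an illustration only, not code from either paper: the class names `PairTags` and `BitsetTags`, the addresses, and the descriptor numbers are hypothetical, and a Python integer stands in for a fixed-width bitset.

```python
from collections import defaultdict


class PairTags:
    """Shadow memory: each address maps to a set of (fd, offset) pairs."""

    def __init__(self):
        self.tags = defaultdict(set)

    def taint(self, addr, fd, offset):
        self.tags[addr].add((fd, offset))

    def propagate(self, dst, *srcs):
        # The destination's provenance is the union of its sources' provenance.
        self.tags[dst] = set().union(*(self.tags[s] for s in srcs))


class BitsetTags:
    """Shadow memory: each address maps to an int used as a bitset (bit i = fd i)."""

    def __init__(self):
        self.tags = defaultdict(int)

    def taint(self, addr, fd):
        self.tags[addr] |= 1 << fd

    def propagate(self, dst, *srcs):
        merged = 0
        for s in srcs:
            merged |= self.tags[s]  # merging is a single bitwise OR per source
        self.tags[dst] = merged

    def fds(self, addr):
        bits = self.tags[addr]
        return {i for i in range(bits.bit_length()) if (bits >> i) & 1}


# Hypothetical scenario: address 0x20 is computed from bytes read via fd 3 and fd 4.
pairs = PairTags()
pairs.taint(0x10, fd=3, offset=0)
pairs.taint(0x11, fd=4, offset=7)
pairs.propagate(0x20, 0x10, 0x11)

bits = BitsetTags()
bits.taint(0x10, fd=3)
bits.taint(0x11, fd=4)
bits.propagate(0x20, 0x10, 0x11)
```

In this sketch the pair representation retains which byte offsets contributed, while the bitset records only which descriptors did, in exchange for a fixed-size tag and a cheaper merge.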
“…Finally in the reproducibility discussion, three papers were presented: [85] is an approach to capturing data provenance based on taint tracking. [89] generates electronic notebook documentation from multienvironment workflows by using provenance represented in the W3C PROV model.…”
Section: Provenance
Citation type: mentioning
confidence: 99%
“…In noWorkflow [24] authors analyse Python scripts to extract function-call hierarchies, which they use to create a view over provenance traces collected by run-time instrumentation of the functions' reads and writes to the file system. In [25] authors employ a taint tracking framework, which instruments programme executions and records which computations are affected by tainted data sources. Here the authors use names of files read by programmes as taint marks and show how such an approach can create fine grained lineage among files, where the programme during its execution writes to one file the data read from another file.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
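The idea of using file names as taint marks can be shown with a toy, value-level sketch. It is an assumption-laden illustration rather than the cited tool's method, which instruments program executions: the `Tainted` wrapper, `read_file`, `write_file`, and the in-memory `files` dictionary are all hypothetical stand-ins.

```python
class Tainted:
    """A value carrying the set of file names (taint marks) it derives from."""

    def __init__(self, value, marks=frozenset()):
        self.value = value
        self.marks = frozenset(marks)

    def __add__(self, other):
        # Combining two values unions their taint marks.
        return Tainted(self.value + other.value, self.marks | other.marks)


def read_file(files, name):
    """Hypothetical read: the returned data is marked with the file's name."""
    return Tainted(files[name], {name})


def write_file(files, lineage, name, data):
    """Hypothetical write: record which input files the written data came from."""
    files[name] = data.value
    lineage.setdefault(name, set()).update(data.marks)


files = {"a.txt": "hello ", "b.txt": "world"}  # in-memory stand-in for a file system
lineage = {}

combined = read_file(files, "a.txt") + read_file(files, "b.txt")
write_file(files, lineage, "out.txt", combined)
write_file(files, lineage, "copy.txt", read_file(files, "a.txt"))
```

After these calls, `lineage` maps each written file to the names of the files its contents were derived from, which is the file-to-file lineage the statement describes.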