“…Dynamic malware analysis systems like Anubis [8], CWSandbox [50] and others [16,22,27,36,42] have proven invaluable in generating ground truth characterizations of malware behavior. The anti-malware community regularly applies these ground truths in scientific experiments, for example to evaluate malware detection technologies [2,10,17,19,24,26,30,33,44,48,52,53,54], to disseminate the results of large-scale malware experiments [6,11,42], to identify new groups of malware [2,5,38,41], or as training datasets for machine learning approaches [20,34,35,38,40,41,47,55]. However, while analysis of malware execution clearly holds importance for the community, the data collection and subsequent analysis processes face numerous potential pitfalls.…”