Roberto Baldoni scite author profile

Many security and software testing applications require checking whether certain properties of a program hold for any possible usage scenario. For instance, a tool for identifying software vulnerabilities may need to rule out the existence of any backdoor to bypass a program's authentication. One approach would be to test the program using different, possibly random inputs. As the backdoor may only be hit for very specific program workloads, automated exploration of the space of possible inputs is of the essence. Symbolic execution provides an elegant solution to the problem, by systematically exploring many possible execution paths at the same time without necessarily requiring concrete inputs. Rather than taking on fully specified input values, the technique abstractly represents them as symbols, resorting to constraint solvers to construct actual instances that would cause property violations. Symbolic execution has been incubated in dozens of tools developed over the last four decades, leading to major practical breakthroughs in a number of prominent software reliability applications. The goal of this survey is to provide an overview of the main ideas, challenges, and solutions developed in the area, distilling them for a broad audience.If you are considering citing this survey, we would appreciate if you could use the following BibTeX entry: http://goo.gl/Hf5Fvc 0:2 R. Baldoni et al.

show abstract

Survey of machine learning techniques for malware analysis

Ucci

Aniello

Baldoni

2019

Computers & Security

395

189

View full text Add to dashboard Cite

Coping with malware is getting more and more challenging, given their relentless growth in complexity and volume. One of the most common approaches in literature is using machine learning techniques, to automatically learn models and patterns behind such complexity, and to develop technologies to keep pace with malware evolution. This survey aims at providing an overview on the way machine learning has been used so far in the context of malware analysis in Windows environments, i.e. for the analysis of Portable Executables. We systematize surveyed papers according to their objectives (i.e., the expected output), what information about malware they specifically use (i.e., the features), and what machine learning techniques they employ (i.e., what algorithm is used to process the input and produce the output). We also outline a number of issues and challenges, including those concerning the used datasets, and identify the main current topical trends and how to possibly advance them. In particular, we introduce the novel concept of malware analysis economics, regarding the study of existing trade-offs among key metrics, such as analysis accuracy and economical costs.

show abstract

SAFE: Self-Attentive Function Embeddings for Binary Similarity

et al. 2019

View full text Add to dashboard Cite

The binary similarity problem consists in determining if two functions are similar by only considering their compiled form. Advanced techniques for binary similarity recently gained momentum as they can be applied in several fields, such as copyright disputes, malware analysis, vulnerability detection, etc., and thus have an immediate practical impact. Current solutions compare functions by first transforming their binary code in multi-dimensional vector representations (embeddings), and then comparing vectors through simple and efficient geometric operations. However, embeddings are usually derived from binary code using manual feature extraction, that may fail in considering important function characteristics, or may consider features that are not important for the binary similarity problem. In this paper we propose SAFE, a novel architecture for the embedding of functions based on a self-attentive neural network. SAFE works directly on disassembled binary functions, does not require manual feature extraction, is computationally more efficient than existing solutions (i.e., it does not incur in the computational overhead of building or manipulating control flow graphs), and is more general as it works on stripped binaries and on multiple architectures. We report the results from a quantitative and qualitative analysis that show how SAFE provides a noticeable performance improvement with respect to previous solutions. Furthermore, we show how clusters of our embedding vectors are closely related to the semantic of the implemented algorithms, paving the way for further interesting applications (e.g. semantic-based binary function search) 1 https://www.statista.com/statistics/266210/number-of-available-applications-in-the-google-play-store/ 2 https://www.cvedetails.com/browse-by-date.php 1 arXiv:1811.05296v3 [cs.CR]

show abstract

Adaptive online scheduling in storm

2013

View full text Add to dashboard Cite

Today we are witnessing a dramatic shift toward a data-driven economy, where the ability to efficiently and timely analyze huge amounts of data marks the difference between industrial success stories and catastrophic failures. In this scenario Storm, an open source distributed realtime computation system, represents a disruptive technology that is quickly gaining the favor of big players like Twitter and Groupon. A Storm application is modeled as a topology, i.e. a graph where nodes are operators and edges represent data flows among such operators. A key aspect in tuning Storm performance lies in the strategy used to deploy a topology, i.e. how Storm schedules the execution of each topology component on the available computing infrastructure. In this paper we propose two advanced generic schedulers for Storm that provide improved performance for a wide range of application topologies. The first scheduler works offline by analyzing the topology structure and adapting the deployment to it; the second scheduler enhance the previous approach by continuously monitoring system performance and rescheduling the deployment at run-time to improve overall performance. Experimental results show that these algorithms can produce schedules that achieve significantly better performances compared to those produced by Storm's default scheduler. Copyright © 2013 ACM

show abstract

A communication-induced checkpointing protocol that ensures rollback-dependency trackability

Baldoni

Hélary

Mostéfaoui

et al.

View full text Add to dashboard Cite

Considering an application in which processes take local checkpoints independently (called basic checkpoints), this paper develops a protocol that forces them to take some additional local checkpoints (called forced checkpoints) in order that the resulting checkpoint and communication pattern satisfies the Rollback Dependency Trackability (RDT) property. This property states that all dependencies between local checkpoints are on-line trackable by using a transitive dependency vectol:Compared to other protocols ensuring the RDT property, the proposed protocol is less conservative in the sense that it takes less additional local checkpoints. It attains this goal by a subtle tracking of causal dependencies on already taken checkpoints; this tracking is then used to prevent the occurrence of hidden dependencies. As indicated by simulation study, the proposed protocol compares favorably with other protocols; moreovec it additionally associates on-the-jly with each local checkpoint G the minimum global checkpoint to which C belongs.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Roberto Baldoni

A Survey of Symbolic Execution Techniques

Survey of machine learning techniques for malware analysis

SAFE: Self-Attentive Function Embeddings for Binary Similarity

Adaptive online scheduling in storm

A communication-induced checkpointing protocol that ensures rollback-dependency trackability

Contact Info

Product

Resources

About