The binary similarity problem consists in determining if two functions are similar by only considering their compiled form. Advanced techniques for binary similarity recently gained momentum as they can be applied in several fields, such as copyright disputes, malware analysis, vulnerability detection, etc., and thus have an immediate practical impact. Current solutions compare functions by first transforming their binary code in multi-dimensional vector representations (embeddings), and then comparing vectors through simple and efficient geometric operations. However, embeddings are usually derived from binary code using manual feature extraction, that may fail in considering important function characteristics, or may consider features that are not important for the binary similarity problem. In this paper we propose SAFE, a novel architecture for the embedding of functions based on a self-attentive neural network. SAFE works directly on disassembled binary functions, does not require manual feature extraction, is computationally more efficient than existing solutions (i.e., it does not incur in the computational overhead of building or manipulating control flow graphs), and is more general as it works on stripped binaries and on multiple architectures. We report the results from a quantitative and qualitative analysis that show how SAFE provides a noticeable performance improvement with respect to previous solutions. Furthermore, we show how clusters of our embedding vectors are closely related to the semantic of the implemented algorithms, paving the way for further interesting applications (e.g. semantic-based binary function search) 1 https://www.statista.com/statistics/266210/number-of-available-applications-in-the-google-play-store/ 2 https://www.cvedetails.com/browse-by-date.php 1 arXiv:1811.05296v3 [cs.CR]
In this paper we investigate the use of graph embedding networks, with unsupervised features learning, as neural architecture to learn over binary functions. We propose several ways of automatically extract features from the control flow graph (CFG) and we use the structure2vec graph embedding techniques to translate a CFG to a vectors of real numbers. We train and test our proposed architectures on two different binary analysis tasks: binary similarity, and, compiler provenance. We show that the unsupervised extraction of features improves the accuracy on the above tasks, when compared with embedding vectors obtained from a CFG annotated with manually engineered features (i.e., ACFG proposed in [39]). We additionally compare the results of graph embedding networks based techniques with a recent architecture that do not make use of the structural information given by the CFG, and we observe similar performances. We formulate a possible explanation of this phenomenon and we conclude identifying important open challenges.
Shape formation (or pattern formation) is a basic distributed problem for systems of computational mobile entities. Intensively studied for systems of autonomous mobile robots, it has recently been investigated in the realm of programmable matter, where entities are assumed to be small and with severely limited capabilities. Namely, it has been studied in the geometric Amoebot model, where the anonymous entities, called particles, operate on a hexagonal tessellation of the plane and have limited computational power (they have constant memory), strictly local interaction and communication capabilities (only with particles in neighboring nodes of the grid), and limited motorial capabilities (from a grid node to an empty neighboring node); their activation is controlled by an adversarial scheduler. Recent investigations have shown how, starting from a well-structured configuration in which the particles form a (not necessarily complete) triangle, the particles can form a large class of shapes. This result has been established under several assumptions: agreement on the clockwise direction (i.e., chirality), a sequential activation schedule, and randomization (i.e., particles can flip coins to elect a leader).In this paper we obtain several results that, among other things, provide a characterization of which shapes can be formed deterministically starting from any simply connected initial configuration of n particles. The characterization is constructive: we provide a universal shape formation algorithm that, for each feasible pair of shapes (S 0 , S F ), allows the particles to form the final shape S F (given in input) starting from the initial shape S 0 , unknown to the particles. The final configuration will be an appropriate scaled-up copy of S F depending on n.If randomization is allowed, then any input shape can be formed from any initial (simply connected) shape by our algorithm, provided that there are enough particles.Our algorithm works without chirality, proving that chirality is computationally irrelevant for shape formation. Furthermore, it works under a strong adversarial scheduler, not necessarily sequential.We also consider the complexity of shape formation both in terms of the number of rounds and the total number of moves performed by the particles executing a universal shape formation algorithm. We prove that our solution has a complexity of O(n 2 ) rounds and moves: this number of moves is also asymptotically optimal.
In the graph exploration problem, a team of mobile computational entities, called agents, arbitrarily positioned at some nodes of a graph, must cooperate so that each node is eventually visited by at least one agent. In the literature, the main focus has been on graphs that are static; that is, the topology is either invariant in time or subject to localized changes.The few studies on exploration of dynamic graphs have been almost all limited to the centralized case (i.e., assuming complete a priori knowledge of the changes and the times of their occurrence).We investigate the decentralized exploration of dynamic graphs (i.e., when the agents are unaware of the location and timing of the changes) focusing, in this paper, on dynamic systems whose underlying graph is a ring.We first consider the fully-synchronous systems traditionally assumed in the literature; i.e., all agents are active at each time step. We then introduce the notion of semi-synchronous systems, where only a subset of agents might be active at each time step (the choice of the subset is made by an adversary); this model is common in the context of mobile agents in continuous spaces but has never been studied before for agents moving in graphs. Our main focus is on the impact that the level of synchrony as well as other factors such as anonymity, knowledge of the size of the ring, and chirality (i.e., common orientation) have on the solvability of the problem, focusing on the minimum number of agents necessary.We draw an extensive map of feasibility, and of complexity in terms of minimum number of agent movements. All our sufficiency proofs are constructive, and almost all our solution protocols are asymptotically optimal. Graph ExplorationThe problem of graph exploration requires a team of mobile computational entities, usually called agents or robots, located at the nodes of a graph and capable of moving from node to neighbouring node, to explore the graph, with the requirement that each node is eventually visited by at least one agent.This classical problem has been extensively investigated, starting from the pioneering work of Shannon [44]. In the vast literature on the subject (e.g., see [1,17,18,23,28,32,43]), a wide spectrum of different assumptions have been made and examined e.g. with regard to: the computational power of the agent(s); the structure of the graph and its properties; the level of topological knowledge available to the agents; whether or not the network is anonymous (i.e., the nodes lack distinct identifiers); whether the nodes can be marked (e.g., by leaving a pebble); the level of synchronization. In case of multiple agents, different types of means of communication have been considered, including: face-to-face, where agents communicate when they are on * University of Ottawa, gdiluna@uottowa.ca, g.a.the same node; wireless, where the agents are able to communicate even when they are on different nodes; with whiteboard, where the agents communicate by writing on and reading from whiteboards present in each node.Regardless ...
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.