Formal methods for software development have made great strides in the last two decades, to the point that their application in safety-critical embedded software is an undeniable success. Their extension to non-critical software is one of the notable forthcoming challenges. For example, C programmers regularly use inline assembly for low-level optimizations and system primitives. This usually results in rendering state-ofthe-art formal analyzers developed for C ineffective. We thus propose TINA, the first automated, generic, verification-friendly and trustworthy lifting technique turning inline assembly into semantically equivalent C code amenable to verification, in order to take advantage of existing C analyzers. Extensive experiments on real-world code (including GMP and ffmpeg) show the feasibility and benefits of TINA.Assuming no side-effect beyond those mentioned in output operands'. 1 # 54 "/usr/include/i386-linux-gnu/sys/select.h" 2 typedef long int __ fd _ mask; 3 4 # 64 "/usr/include/i386-linux-gnu/sys/select.h" 5 typedef struct { 6 __ fd _ mask __ fds _ bits[1024 / (8 * sizeof( __ fd _ mask))]; 7 } fd _ set; 8 9 # 1074 "socklib.c" 10 int udpc _ prepareForSelect 11 (int * socks, int nr, fd _ set * read _ set) 12 { 13 / * [...] * / 14 int maxFd; 15 do { 16 int __ d0, __ d1; 17 __ asm __ __ volatile __ 18 ("cld; rep; " "stosl" 19: "=c" ( __ d0), "=D" ( __ d1) 20 : "a" (0), 21 "0" (sizeof(fd _ set) / sizeof( __ fd _ mask)), 22 void sub _ median _ pred _ mmxext(uint8 _ t * dst, uint8 _ t const * src1,
Inline assembly is still a common practice in lowlevel C programming, typically for efficiency reasons or for accessing specific hardware resources. Such embedded assembly codes in the GNU syntax (supported by major compilers such as GCC, Clang and ICC) have an interface specifying how the assembly codes interact with the C environment. For simplicity reasons, the compiler treats GNU inline assembly codes as blackboxes and relies only on their interface to correctly glue them into the compiled C code. Therefore, the adequacy between the assembly chunk and its interface (named compliance) is of primary importance, as such compliance issues can lead to subtle and hard-to-find bugs. We propose RUSTIn A, the first automated technique for formally checking inline assembly compliance, with the extra ability to propose (proven) patches and (optimization) refinements in certain cases. RUSTIn A is based on an original formalization of the inline assembly compliance problem together with novel dedicated algorithms. Our prototype has been evaluated on 202 Debian packages with inline assembly (2656 chunks), finding 2183 issues in 85 packages -986 significant issues in 54 packages (including major projects such as ffmpeg or ALSA), and proposing patches for 92% of them. Currently, 38 patches have already been accepted (solving 156 significant issues), with positive feedback from development teams.
Inline assembly is still a common practice in lowlevel C programming, typically for efficiency reasons or for accessing specific hardware resources. Such embedded assembly codes in the GNU syntax (supported by major compilers such as GCC, Clang and ICC) have an interface specifying how the assembly codes interact with the C environment. For simplicity reasons, the compiler treats GNU inline assembly codes as blackboxes and relies only on their interface to correctly glue them into the compiled C code. Therefore, the adequacy between the assembly chunk and its interface (named compliance) is of primary importance, as such compliance issues can lead to subtle and hard-to-find bugs. We propose RUSTINA, the first automated technique for formally checking inline assembly compliance, with the extra ability to propose (proven) patches and (optimization) refinements in certain cases. RUSTINA is based on an original formalization of the inline assembly compliance problem together with novel dedicated algorithms. Our prototype has been evaluated on 202 Debian packages with inline assembly (2656 chunks), finding 2183 issues in 85 packages -986 significant issues in 54 packages (including major projects such as ffmpeg or ALSA), and proposing patches for 92% of them. Currently, 38 patches have already been accepted (solving 156 significant issues), with positive feedback from development teams.1 Microsoft inline assembly is different and has no interface, see Sec. IX-C. 2 From the llvm-dev mailing list [3]: "GCC-style inline assembly is notoriously hard to write correctly".3 Note that syntactically incorrect assembly instructions are caught during the translation from assembly to machine code.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.