Abstract. This paper describes highly-optimized AES-{128, 192, 256}-CTR assembly implementations for the popular ARM Cortex-M3 and M4 embedded microprocessors. These implementations are about twice as fast as existing implementations. Additionally, we provide the fastest bitsliced constant-time and masked implementations of AES-128-CTR to protect against timing attacks, power analysis and other (first-order) sidechannel attacks. All implementations, including an architecture-specific instruction scheduler and register allocator, which we use to minimize expensive loads, are released into the public domain.
Recently a lot of attention is paid to the search for efficiently implementable MDS matrices for lightweight symmetric primitives. Most previous work concentrated on locally optimizing the multiplication with single matrix elements. Separate from this line of work, several heuristics were developed to find shortest linear straightline programs. Solving this problem actually corresponds to globally optimizing multiplications by matrices. In this work we combine those, so far largely independent lines of work. As a result, we achieve implementations of known, locally optimized, and new MDS matrices that significantly outperform all implementations from the literature. Interestingly, almost all previous locally optimized constructions behave very similar with respect to the globally optimized implementation. As a side effect, our work reveals the so far best implementation of the Aes Mix- Columns operation with respect to the number of XOR operations needed.
Masking is the best-researched countermeasure against side-channel analysis attacks. Even though masking was introduced almost 20 years ago, its efficient implementation continues to be an active research topic. Many of the existing works focus on the reduction of randomness requirements since the production of fresh random bits with high entropy is very costly in practice. Most of these works rely on the assumption that only so-called online randomness results in additional costs. In practice, however, it shows that the distinction between randomness costs to produce the initial masking and the randomness to maintain security during computation (online) is not meaningful. In this work, we thus study the question of minimum randomness requirements for first-order Boolean masking when taking the costs for initial randomness into account. We demonstrate that first-order masking can in theory always be performed by just using two fresh random bits and without requiring online randomness. We first show that two random bits are enough to mask linear transformations and then discuss prerequisites under which nonlinear transformations are securely performed likewise. Subsequently, we introduce a new masked AND gate that fulfills these requirements and which forms the basis for our synthesis tool that automatically transforms an unmasked implementation into a first-order secure masked implementation. We demonstrate the feasibility of this approach by implementing AES in software with only two bits of randomness, including the initial masking. Finally, we use these results to discuss the gap between theory and practice and the need for more accurate adversary models.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.