Santosh Pande scite author profile

We present a novel approach which combines compiler, instruction set, and microarchitecture support to turn off functional units that are idle for long periods of time for reducing static power dissipation by idle functional units using power gating [2,9]. The compiler identifies program regions in which functional units are expected to be idle and communicates this information to the hardware by issuing directives for turning units off at entry points of idle regions and directives for turning them back on at exits from such regions. The microarchitecture is designed to treat the compiler directives as hints ignoring a pair of off and on directives if they are too close together. The results of experiments show that some of the functional units can be kept off for over 90% of the time at the cost of minimal performance degradation of under 1%.

show abstract

Storage assignment optimizations to generate compact and efficient code on embedded DSPs

Rao

Pande

1999

View full text Add to dashboard Cite

DSP architectures typically provide dedicated memory address generation units and indirect addressing modes with auto-increment and auto-decrement that subsume address arithmetic calculation.The heavy use of auto-increment and auto-decrement indirect addressing require DSP compilers to perform a careful placement of variables in storage to minimize address arithmetic instructions to generate compact and efficient DSP code. Liao et al. formulated the problem of storage assignment as the simple offset assignment problem (SOA) and the general offset assignment problem (GOA), and proposed heuristic solutions.The storage allocation of variables critically depends on the sequence of variable accesses. In this paper we present techniques to optimize the access sequence of variables by applying algebraic transformations (such as commutativity and associativity) on expression trees to obtain the least cost offset assignment. We develop a new formulation of this problem as the least cost access sequence problem (LCAS). Based on the proposed framework, we develop heuristic algorithms that determine empirically near-optimal solutions resulting in fewer address arithmetic instructions. We have implemented the proposed heuristic algorithms by extending the storage assignment optimization in the SPAM compiler back-end targeted for the TMS320C25 DSP. In the case of SOA, experimental results for programs from the DSPstone benchmark suite show an average improvement of 3.36% in static code size and an average relative speed-up of 7.28% over results obtained using existing SOA algorithms.The average code size reduction over code compiled with a naive storage assignment algorithm is 7.04%. The proposed framework has also been applied to the GOA problem and shows average code size reductions of 2.04% over results obtained using existing GOA algorithms , and average code size reductions of 10.84% over a naive GOA algorithm. Code size reduction and improvement in dynamic instruction counts could be valuable given limited memory and real-time response requirements placed on embedded systems.

show abstract

Hardware assisted control flow obfuscation for embedded processors

Zhuang

Zhang

Lee

et al. 2004

View full text Add to dashboard Cite

+With more applications being deployed on embedded platforms, software protection becomes increasingly important. This problem is crucial on embedded systems like financial transaction terminals, pay-TV access-control decoders, where adversaries may easily gain full physical accesses to the systems and critical algorithms must be protected from being cracked. However, as this paper points out that protecting software with either encryption or obfuscation cannot completely preclude the control flow information from being leaked. Encryption has been widely studied and employed as a traditional approach for software protection, however, the control flow information is not 100% hidden with solely encrypting the code. On the other hand, pure software-based obfuscation has been proved inefficient to protect software due to its lack of theoretical foundation and considerable performance overhead introduced by complicated transformations. Moreover, even though obfuscation can prevent static reverse engineering, attacker can still successfully bypass the obfuscation by monitoring the dynamic program execution.To address all of these shortcomings, this paper presents a hardware assisted obfuscation technique that is capable of obfuscating the control flow information dynamically. Dynamic obfuscation changes memory access sequence on-the-fly and conceals recurrent instruction access sequences from being identified. Our scheme makes it provably difficult for the attacker to extract any useful information. Our results show that a highlevel security protection is possible with only minor performance penalty. Finally, we show that our scheme can be implemented on embedded systems with very little hardware overhead.

show abstract

Anomalous path detection with hardware support

Zhang

Zhuang

Pande

et al. 2005

View full text Add to dashboard Cite

Embedded systems are being deployed as a part of critical infrastructures and are vulnerable to malicious attacks due to internet accessibility. Intrusion detection systems have been proposed to protect computer systems from unauthorized penetration. Detecting an attack early on pays off since further damage is avoided and in some cases, resilient recovery could be adopted. This is especially important for embedded systems deployed in critical infrastructures such as Power Grids etc. where a timely intervention could save catastrophes. An intrusion detection system monitors dynamic program behavior against normal program behavior and raises an alert when an anomaly is detected. The normal behavior is learnt by the system through training and profiling.However, all current intrusion detection systems are purely software based and thus suffer from large performance degradation due to constant monitoring operations inserted in application code. Due to the potential performance overheads, software based solutions cannot monitor program behavior at a very fine level of granularity, thus leaving potential security holes as shown in the literature. Another important drawback of such methods is that they are unable to detect intrusions in near real time and the time lag could prove disastrous in real time embedded systems. In this paper, we propose a hardware-based approach to verify program execution paths of target applications dynamically and to detect anomalous executions. With hardware support, our approach offers multiple advantages over software based solutions including minor performance degradation, much stronger detection capability (a larger variety of attacks get detected) and zero-latency reaction upon an anomaly for near real time detection and thus much better security.

show abstract

F2C2-STM: Flux-Based Feedback-Driven Concurrency Control for STMs

Ravichandran

Pande

2014

View full text Add to dashboard Cite

Software Transactional Memory (STM) systems provide an easy to use programming model for concurrent code and have been found suitable for parallelizing many applications providing performance gains with minimal programmer effort. With increasing core counts on modern processors one would expect increasing benefits. However, we observe that running STM applications on higher core counts is sometimes, in fact, detrimental to performance. This is due to the larger number of conflicts that arise with a larger number of parallel cores. As the number of cores available on processors steadily rise, a larger number of applications are beginning to exhibit these characteristics.In this paper we propose a novel dynamic concurrency control technique which can significantly improve performance (up to 50%) as well as resource utilization (up to 85%) for these applications at higher core counts. Our technique uses ideas borrowed from TCP's network congestion control algorithm and uses selfinduced concurrency fluctuations to dynamically monitor and match varying concurrency levels in applications while minimizing global synchronization. Our flux-based feedback-driven concurrency control technique is capable of fully recovering the performance of the best statically chosen concurrency specification (as chosen by an oracle) regardless of the initial specification for several real world applications. Further, our technique can actually improve upon the performance of the oracle chosen specification by more than 10% for certain applications through dynamic adaptation to available parallelism.We demonstrate our approach on the STAMP benchmark suite while reporting significant performance and resource utilization benefits. We also demonstrate significantly better performance when comparing against state of the art concurrency control and scheduling techniques. Further, our technique is programmer friendly as it requires no changes to application code and no offline phases.

show abstract

Compiler optimizations for real time execution of loops on limited memory embedded systems

Anantharaman

Pande²

View full text Add to dashboard Cite

BlankIt library debloating: getting what you want instead of cutting what you don’t

Porter

Mururu

Barua

et al. 2020

View full text Add to dashboard Cite

Modern software systems make extensive use of libraries derived from C and C++. Because of the lack of memory safety in these languages, however, the libraries may suffer from vulnerabilities, which can expose the applications to potential attacks. For example, a very large number of return-oriented programming gadgets exist in glibc that allow stitching together semantically valid but malicious Turing-complete and -incomplete programs. While CVEs get discovered and often patched and remedied, such gadgets serve as building blocks of future undiscovered attacks, opening an ever-growing set of possibilities for generating malicious programs. Thus, significant reduction in the quantity and expressiveness (utility) of such gadgets for libraries is an important problem.In this work, we propose a new approach for handling an application's library functions that focuses on the principle of łgetting only what you want.ž This is a significant departure from the current approaches that focus on łcutting what is unwanted. ž Our approach focuses on activating/deactivating library functions on demand in order to reduce the dynamically linked code surface, so that the possibilities of constructing malicious programs diminishes substantially. The key idea is to load only the set of library functions that will be used at each library call site within the application at runtime. This approach of demand-driven loading relies on an input-aware oracle that predicts a near-exact set of library

show abstract

Compilation techniques for parallel systems

et al. 1999

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

334 Leonard St

Brooklyn, NY 11211

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Santosh Pande

Optimizing Static Power Dissipation by Functional Units in Superscalar Processors

Storage assignment optimizations to generate compact and efficient code on embedded DSPs

Hardware assisted control flow obfuscation for embedded processors

Anomalous path detection with hardware support

F2C2-STM: Flux-Based Feedback-Driven Concurrency Control for STMs

Compiler optimizations for real time execution of loops on limited memory embedded systems

BlankIt library debloating: getting what you want instead of cutting what you don’t

Compilation techniques for parallel systems

Contact Info

Product

Resources

About