Kyohei Yamaguchi scite author profile

2013

It is difficult to improve the single-thread performance of a processor in memory-intensive programs because processors have hit the memory wall, i.e., the large speed discrepancy between the processors and the main memory. Exploiting memory-level parallelism (MLP) is an effective way to overcome this problem. One scheme for exploiting MLP is aggressive out-of-order execution. To achieve this, large instruction window resources (i.e., the reorder buffer, the issue queue, and the load/store queue) are required; however, simply enlarging these resources degrades the clock cycle time. While pipelining these resources can solve this problem, this leads to instruction issue delays, which prevents instruction-level parallelism (ILP) from being exploited effectively. As a result, the performance of compute-intensive programs is degraded dramatically. This paper proposes an adaptive dynamic instruction window resizing scheme that enlarges and pipelines the window resources only when MLP is exploitable, and shrinks and de-pipelines the resources when ILP is exploitable. Our scheme changes the size of the window resources by predicting whether MLP is exploitable based on the occurrence of last-level cache misses. Our scheme is very simple, and the hardware change is accommodated within the existing processor organization; it is thus very practical. Evaluation results using the SPEC2006 benchmark programs show that, for all programs, our dynamic instruction window resizing scheme achieves performance levels similar to the best performance achieved with fixed-size resources. On average, our scheme produces a performance improvement of 21% in comparison with that of a conventional processor, with an additional cost of only 6% of the conventional processor core or 3% of the entire processor chip, thus achieving a significantly better cost/performance ratio that is far beyond the level that can be achieved based on Pollack's law. The evaluation results also show an 8% better energy efficiency in terms of 1/EDP (energy-delay product).

* Presently with Transportation Bureau, City of Nagoya
† Presently with Renesas Electronics Corporation

Evaluation of issue queue delay: Banking tag RAM and identifying correct critical path

2011

Quencher-free molecular beacon tethering 7-hydroxycoumarin detects targets through protonation/deprotonation

Kashida

Bioorganic & Medicinal Chemistry

Hara

et al. 2012

MLP-Aware Dynamic Instruction Window Resizing in Superscalar Processors for Adaptively Exploiting Available Parallelism

IEICE Trans. Inf. & Syst.

2014

SUMMARY Single-thread performance has not improved much over the past few years, despite an ever-increasing transistor budget. One of the reasons for this is that there is a speed gap between the processor and main memory, known as the memory wall. A promising method to overcome this memory wall is aggressive out-of-order execution by extensively enlarging the instruction window resources to exploit memory-level parallelism (MLP). However, simply enlarging the window resources lengthens the clock cycle time. Although pipelining the resources solves this problem, it in turn prevents instruction-level parallelism (ILP) from being exploited because issuing instructions requires multiple clock cycles. This paper proposes a dynamic scheme that adaptively resizes the instruction window based on the predicted available parallelism, either ILP or MLP. Specifically, if the scheme predicts that MLP is available during execution, the instruction window is enlarged and the window resources are pipelined, thereby exploiting MLP. Conversely, if the scheme predicts that less MLP is available, that is, ILP is exploitable for improved performance, the instruction window is shrunk and the window resources are de-pipelined, thereby exploiting ILP. Our evaluation results using the SPEC2006 benchmark programs show that the proposed scheme achieves nearly the best performance possible with fixed-size resources. On average, our scheme realizes a performance improvement of 21% over that of a conventional processor, with an additional cost of only 6% of the area of the conventional processor core or 3% of that of the entire processor chip. The evaluation results also show 8% better energy efficiency in terms of 1/EDP (energy-delay product).
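
The resizing idea in this summary can be illustrated with a toy controller. This is only a minimal sketch under stated assumptions, not the policy from the paper: the class name `WindowResizer`, the window sizes, and the rule "shrink after `quiet_cycles` cycles without a last-level cache miss" are all hypothetical choices made for illustration. The source states only that MLP is predicted from the occurrence of last-level cache misses.

```python
class WindowResizer:
    """Toy MLP-aware instruction-window controller (illustrative only)."""

    def __init__(self, small_size=64, large_size=256, quiet_cycles=300):
        # All three parameters are hypothetical values, not taken from the paper.
        self.small_size = small_size      # entries when tuned for ILP
        self.large_size = large_size      # entries when tuned for MLP
        self.quiet_cycles = quiet_cycles  # miss-free cycles before shrinking
        self.size = small_size            # start small and de-pipelined
        self.pipelined = False
        self.cycles_since_miss = 0

    def tick(self, llc_miss: bool) -> None:
        """Advance one cycle; llc_miss says whether an LLC miss occurred."""
        if llc_miss:
            # A miss is taken as a hint that MLP is exploitable:
            # enlarge the window and pipeline its resources.
            self.cycles_since_miss = 0
            self.size = self.large_size
            self.pipelined = True
        else:
            self.cycles_since_miss += 1
            if self.cycles_since_miss >= self.quiet_cycles:
                # No recent misses: assume ILP dominates,
                # so shrink and de-pipeline.
                self.size = self.small_size
                self.pipelined = False


resizer = WindowResizer(quiet_cycles=3)
sizes = []
for miss in [False, True, False, False, False, False]:
    resizer.tick(miss)
    sizes.append(resizer.size)
print(sizes)  # [64, 256, 256, 256, 64, 64]
```

In this toy trace the window grows on the single miss and falls back to the small configuration once three miss-free cycles have passed.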

Optimality of passivity-based controls for distributed port-Hamiltonian systems

Nishida

IFAC Proceedings Volumes

Sakamoto

2013

Development of a Control System and Interface Design Based on an Electric Wheelchair

Woo

Ohyama

2021

JACIII

Recently, personal mobility devices have been researched and developed to make short-distance travel within the community more comfortable and convenient. However, joystick-operated devices pose problems for the user, such as difficulty in picking up items while shopping and the inability to use the hands freely. Accordingly, because the speed of a personal mobility device can be controlled by foot, as with an accelerator pedal, we developed an electric wheelchair system whose speed is controlled by pedal operation. Furthermore, we developed a control system that considers the ride quality of the pedal-controlled electric wheelchair. In this study, the proposed method is detailed in three parts. First, to develop the pedal mechanism, a potentiometer was used to detect the angle of the pedal, and a spring mechanism was designed to return the pedal to its original position after it is pushed. Next, we propose a feedback control system that considers the ride quality of the operator. In addition, we integrated the system with a smart device-based robot system to realize mobility as a service (MaaS). Finally, we present several examples of the system and discuss its applicability.

Delay Evaluation of Issue Queue in Superscalar Processors with Banking Tag RAM and Correct Critical Path Identification

IEICE Trans. Inf. & Syst.

2012

SUMMARY This paper evaluates the delay of the issue queue in a superscalar processor to aid microarchitectural design, where quick quantification of the complexity of the issue queue is needed to consider the tradeoff between clock cycle time and instructions per cycle. Our study covers two aspects. First, we introduce banking of the tag RAM, which is part of the issue queue, to reduce the delay. Unlike with normal RAM, this is not straightforward because of the uniqueness of the issue queue organization. Second, we explore and identify the correct critical path in the issue queue. In a previous study, the critical path of each component in the issue queue was summed to obtain the issue queue delay, but this does not give the correct delay of the issue queue, because the critical paths of the components are not connected logically. In the evaluation assuming 32-nm LSI technology, we obtained the delays of issue queues with eight to 128 entries. Banking the tag RAM and identifying the correct critical path reduce the delay by up to 20% and 23% for 4- and 8-issue widths, respectively, compared with not banking the tag RAM and simply summing the critical path delay of each component.
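
The point about critical paths that are "not connected logically" can be shown with a small numeric sketch. The segment names, the delays in picoseconds, and the connectivity below are hypothetical and are not taken from the paper; the sketch only shows why summing each component's worst-case delay can overestimate the delay of the longest path that is actually connected end to end.

```python
from functools import lru_cache

# Hypothetical path segments inside three issue-queue components, delays in ps.
DELAY_PS = {
    "wakeup_fast": 40, "wakeup_slow": 70,
    "select_fast": 50, "select_slow": 90,
    "tagram_fast": 30, "tagram_slow": 60,
}

# Hypothetical connectivity: the slowest segment of one component does not
# always feed the slowest segment of the next component.
SUCCESSORS = {
    "wakeup_slow": ["select_fast"],
    "wakeup_fast": ["select_slow"],
    "select_fast": ["tagram_slow"],
    "select_slow": ["tagram_fast"],
    "tagram_fast": [],
    "tagram_slow": [],
}


def summed_component_delay() -> int:
    """Naive estimate: add up the worst segment of every component."""
    worst = {}
    for segment, delay in DELAY_PS.items():
        component = segment.split("_")[0]
        worst[component] = max(worst.get(component, 0), delay)
    return sum(worst.values())


@lru_cache(maxsize=None)
def longest_from(segment: str) -> int:
    """Longest delay of any connected path that starts at `segment`."""
    tail = max((longest_from(nxt) for nxt in SUCCESSORS[segment]), default=0)
    return DELAY_PS[segment] + tail


def connected_critical_path() -> int:
    """Delay of the longest path that is actually connected."""
    return max(longest_from(segment) for segment in DELAY_PS)


print(summed_component_delay())   # 220
print(connected_critical_path())  # 180
```

With these made-up numbers the naive sum is 220 ps, while the longest connected path is 180 ps, because the 90 ps select segment is never reached from the 70 ps wakeup segment.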

Swing-up and Stabilization Control of a Flexible Rotational Inverted Pendulum by Nonlinear Frequency Optimal Control

Transactions of the Society of Instrument and Control Engineers

Sakamoto

2015

The flexible rotational inverted pendulum is an extension of the Furuta inverted pendulum in which the pendulum is no longer a rigid body and vibrates due to the elasticity of the flexible beam. In this paper, we design a nonlinear controller based on nonlinear frequency optimal control for swinging up and stabilizing the flexible beam with a single feedback control. The controller is obtained via the stable manifold approach, which was recently proposed for solving the Hamilton-Jacobi equation.