Shuichi Sakai scite author profile

An architecture of a dataflow single chip processor

A highly parallel (more than a thousand) dataflow machine EM-4 is now under development. The EM-4 design principle is to construct a high performance computer using a compact architecture by overcoming several defects of dataflow machines. Constructing the EM-4, it is essential to fabricate a processing element (PE) on a single chip for reducing operation speed, system size, design complexity and cost. In the EM-4, the PE, called EMC-R, has been specially designed using a 50,000-gate gate array chip. This paper focuses on an architecture of the EMC-R. The distinctive features of it are: a strongly connected arc dataflow model; a direct matching scheme; a RISC-based design; a deadlock-free on-chip packet switch; and an integration of a packet-based circular pipeline and a register-based advanced control pipeline. These features are intensively examined, and the instruction set architecture and the configuration architecture which exploit them are described.

An object detection method for describing soccer games from video

Utsumi, Miura, Ide, et al.

Register Cache System Not for Latency Reduction Purpose

Shioya, Horio, Goshima, et al. 2010

A register cache has been proposed to solve the problems of the huge register files of recent superscalar processors. The register cache reduces the effective access latency of the register file for IPC improvement, simplifies the bypass network, and reduces the ports of the main register file. Though the primary purpose of the previous works is to improve IPC, the misses on the register cache may degrade the IPC. We propose Non-Latency-Oriented Register Cache System (NORCS). Though the effects of NORCS are the same as the conventional systems, it is free from register cache miss penalties that the conventional systems suffer from. In NORCS, the register cache itself is not different from that of the conventional systems. The difference is that the instruction pipeline has stages to read the main register file, which all instructions go through regardless of register cache hit / miss. Therefore, the instruction pipeline of NORCS is not immediately disturbed by the register cache misses. For a realistic 4-way superscalar processor, NORCS can simplify the bypass network to the same complexity as a 1-cycle-latency register file, and reduce the ports of the main register file from 12 to 4. CACTI simulation shows that the area and power consumption are reduced to 24.9% and 31.9% compared to the baseline model without register cache. Though these results are not different from the conventional systems, IPCs differ greatly. IPC of the conventional system decreases to 83.1% because of the cache miss penalties, while that of NORCS is retained at 98.0%.
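The contrast drawn in this abstract, between a pipeline that stalls on a register cache miss and one in which every instruction passes through the main register file read stages, can be illustrated with a toy cycle count. This is only a sketch under strong assumptions that are not from the paper: a single-issue pipeline, a fixed stall per miss, and enough main register file ports to serve every miss. The function names, the trace, and the parameter values are hypothetical.

```python
from typing import Sequence


def conventional_cycles(hits: Sequence[bool], miss_penalty: int) -> int:
    """Cycles for a single-issue pipeline that assumes a register cache hit.

    Each instruction takes one issue slot; a miss stalls the pipeline for
    `miss_penalty` extra cycles while the main register file is read.
    """
    return sum(1 + (0 if hit else miss_penalty) for hit in hits)


def norcs_cycles(hits: Sequence[bool], rf_stages: int) -> int:
    """Cycles when every instruction passes through the main register file
    read stages, hit or miss.

    The extra stages lengthen the pipeline by `rf_stages` (paid once, as fill
    latency), but a miss adds no stall. Assumes the main register file has
    enough ports to serve every miss, which is a simplification.
    """
    if not hits:
        return 0
    return len(hits) + rf_stages


if __name__ == "__main__":
    trace = [True] * 8 + [False] * 2  # hypothetical trace: 8 hits, 2 misses
    print(conventional_cycles(trace, miss_penalty=2))
    print(norcs_cycles(trace, rf_stages=2))
```

The model ignores effects a real evaluation would include, such as port contention when several misses coincide and the higher branch misprediction cost of a deeper pipeline, so it should not be read as reproducing the IPC figures quoted above.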

Structural analysis of cooking preparation steps in Japanese

Hamada, Ide, Sakai, et al. 2000
