[1988] Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21 1988
DOI: 10.1109/micro.1988.639255

Hardware Support For Large Atomic Units in Dynamically Scheduled Machines

Citations: cited by 42 publications (13 citation statements)
References: 3 publications
“…Second, DIF does not use an out-of-order engine but only a simple "primary engine" alongside its VLIW engine, and envisions that most code should execute on the VLIW engine. Atomic block-based cores: Melvin et al [39,38], and later Hao et al [24] and Sprangle et al [51], propose a core design in which the compiler provides atomic blocks at the ISA level. These works note multiple advantages of using atomic blocks: the core has a higher instruction fetch rate, and can also use a small local register file to reduce register pressure on a global register file [51].…”
Section: Related Work (mentioning)
confidence: 99%
“…In addition, it requires dynamic scheduling hardware in the main data path of the machine, which can have a negative effect on the clock cycle time. Franklin and Smotherman [13] proposed the use of a fill unit [25] to compact a dynamic stream of scalar instructions. Their fill unit accepts decoded instructions from the machine decoder, compacts them into a long instruction (the term used in the rest of this paper to refer to VLIW instructions), and saves this into a shadow cache.…”
Section: Tackling the VLIW Object Code Compatibility Problem (mentioning)
confidence: 99%
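
The fill-unit behaviour described in the statement above (accept decoded instructions, compact them into a long instruction, save the result in a shadow cache) can be illustrated with a minimal sketch. This is not the mechanism from any of the cited papers. The `Op` record, the `compact` function, the 4-slot width, and the rule of closing a long instruction on a read-after-write or write-after-write conflict are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Op:
    """Hypothetical decoded instruction: address, destination, sources."""
    pc: int
    dest: Optional[str]
    srcs: Tuple[str, ...] = ()


def compact(ops, width=4):
    """Greedily pack ops, in program order, into long instructions.

    Assumed policy: a long instruction is closed when it is full or when
    the next op reads or rewrites a register written by an op already in it.
    Returns a dict standing in for the shadow cache, keyed by the address
    of the first op in each long instruction.
    """
    shadow_cache = {}
    current, written = [], set()
    for op in ops:
        raw = any(s in written for s in op.srcs)
        waw = op.dest is not None and op.dest in written
        if current and (len(current) == width or raw or waw):
            shadow_cache[current[0].pc] = tuple(current)
            current, written = [], set()
        current.append(op)
        if op.dest is not None:
            written.add(op.dest)
    if current:
        shadow_cache[current[0].pc] = tuple(current)
    return shadow_cache
```

Calling `compact` on a short stream of `Op` records returns one cache entry per long instruction. A real fill unit would also have to handle branches and memory ordering, which this sketch ignores.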
“…Blocks of instructions are preprocessed before being put in the trace cache, which greatly simplifies processing after they are fetched. Preprocessing can include capturing data dependence relationships, combining and reordering instructions, or determining instruction resource requirements [5], all of which can be reused. To support precise interrupts, information about the original instruction order must also be saved with the trace.…”
Section: Instruction Preprocessing (mentioning)
confidence: 99%
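
One of the preprocessing steps named in the statement above, capturing data dependence relationships within a block, can be sketched as follows. This is an illustrative assumption rather than the cited design. The `(dest, srcs)` tuple encoding and the `capture_dependences` name are hypothetical, and the original instruction order is kept simply as the list index.

```python
def capture_dependences(block):
    """Record, for each instruction, which earlier instruction in the
    same block produces each of its source registers.

    block: list of (dest, srcs) tuples in original program order, where
    dest is a register name or None and srcs is a tuple of register names.
    Returns a list of dicts mapping each source register to the index of
    its most recent producer in the block, or None if the value comes
    from outside the block.
    """
    last_writer = {}
    deps = []
    for index, (dest, srcs) in enumerate(block):
        deps.append({s: last_writer.get(s) for s in srcs})
        if dest is not None:
            last_writer[dest] = index
    return deps
```

Because the result is computed once per block, it could be stored alongside the block and reused on every later fetch, which is the reuse benefit the statement describes.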