
This page will make more sense if you read Chapter 4 of HENNESSY & PATTERSON beforehand.

PIPELINE VARIETY

(There is a vast amount of material on Pentium Pro processors and pipelining; we have a special link to some of these pages. LINK.)

THE AIM

The aim of the Pentium Pro processor was to utilize the CPU to its full extent. While CPU speeds have increased 10-fold over the past 10 years, the speed of main memory devices has only increased by 60 percent (1). This increasing memory latency, relative to the CPU core speed, is a fundamental problem that the Pentium Pro processor set out to solve. There are various ways to speed up the CPU, and some of the methods are discussed in Chapters 4 and 5 of H&P. One approach would be to place the burden of this problem onto the chipset, but a high-performance CPU that needs very high speed, specialized support components is not a good solution for a volume production system (1). Another approach would be to increase the size of the L2 cache to reduce the miss ratio. While effective, this is a very expensive solution in terms of cost and space. So how did the Pentium Pro processor solve this problem?
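To put the two figures quoted above in perspective, here is a small back-of-the-envelope calculation. It only uses the "10-fold" and "60 percent" numbers from the text; the 10-cycle starting latency is a made-up value for illustration, not a measured one.

```python
# Figures quoted in the text above.
cpu_speedup = 10.0     # CPU speed: "10-fold" increase
memory_speedup = 1.6   # main memory speed: "increased by 60 percent"


def relative_latency_growth(cpu: float, mem: float) -> float:
    """How much longer a memory access looks, measured in CPU clock cycles."""
    return cpu / mem


growth = relative_latency_growth(cpu_speedup, memory_speedup)
print(f"Memory latency in CPU cycles grew by a factor of {growth:.2f}")

# Hypothetical example: an access that used to cost 10 CPU cycles.
old_cycles = 10  # assumed value, for illustration only
print(f"{old_cycles} cycles then -> about {old_cycles * growth:.1f} cycles now")
```

In other words, even though memory got faster in absolute terms, a memory access costs roughly six times as many CPU cycles as it did before, and that is the gap the rest of this page is about.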

THE PENTIUM PRO SOLUTION

As discussed in the Introduction, the Pentium Pro processor designers achieved this goal by implementing Dynamic Execution in the Pentium Pro.

FIG 1. The P6 is implemented as three independent engines that communicate using an instruction pool.

The old Pentium processor's superscalar microarchitecture (EXPECT THIS ON FINAL), with its ability to execute two instructions per clock, would be difficult to exceed without a new approach. The new approach used by the Pentium Pro processor removes the constraint of linear instruction sequencing (kind of like DLX) between the traditional "fetch" and "execute" phases, and opens up a wide instruction window using an instruction pool, hence Dynamic Execution. See page 354 of H&P (Historical Perspective and References) for the major advances in compiler technology, advanced pipelining, and multiple-issue processors.

The following units make up the Pentium Pro processor pipeline.

FETCH/DECODE UNIT: An IN-ORDER unit that takes as input the user program instruction stream from the instruction cache and decodes it into a series of micro-operations (uops) that represent the dataflow of that stream. The uops are placed in the instruction pool (see the figure above). This is the stage where speculation is used: the unit predicts which instructions will be needed next and fetches them before it is certain that they will execute.

DISPATCH/EXECUTE UNIT: An OUT-OF-ORDER unit that accepts the dataflow stream, schedules execution of the uops subject to data dependencies and resource availability, and temporarily stores the results of these speculative executions.

THE RETIRE UNIT: An IN-ORDER unit that knows how and when to store ("retire") the temporary, speculative results to permanent architectural state.

THE BUS INTERFACE UNIT: A partially ordered unit responsible for connecting the three internal units to the real world. The bus interface unit communicates directly with the L2 cache.
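The interplay of the first three units can be sketched as a toy simulation. This is only an illustration of the in-order fetch, out-of-order execute, in-order retire idea, not a model of the real P6: the names (Uop, simulate, pool_size, issue_width), the latencies and the four-instruction program are all made up, and the sketch ignores branch prediction and register renaming (every uop here writes a different register).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Uop:
    seq: int                       # position in program order
    dest: str                      # register written
    srcs: Tuple[str, ...]          # registers read
    latency: int                   # cycles from dispatch to result (assumed)
    finish: Optional[int] = None   # cycle the result is ready; None = not dispatched


def simulate(program: List[Uop], pool_size: int = 4, issue_width: int = 2):
    """Return (execution order, retirement order) as lists of seq numbers."""
    pool: List[Uop] = []           # the instruction pool, kept in program order
    next_fetch = 0
    exec_order: List[int] = []
    retire_order: List[int] = []
    cycle = 0
    while len(retire_order) < len(program):
        # RETIRE (in order): only the oldest uop may leave, once it is finished.
        while pool and pool[0].finish is not None and pool[0].finish <= cycle:
            retire_order.append(pool.pop(0).seq)
        # FETCH/DECODE (in order): refill the pool in program order.
        while next_fetch < len(program) and len(pool) < pool_size:
            pool.append(program[next_fetch])
            next_fetch += 1
        # DISPATCH/EXECUTE (out of order): start any uop whose inputs are ready.
        started = 0
        for u in pool:
            if started == issue_width:
                break
            if u.finish is not None:
                continue           # already dispatched
            waiting = any(
                older.dest in u.srcs
                and (older.finish is None or older.finish > cycle)
                for older in pool
                if older.seq < u.seq
            )
            if not waiting:
                u.finish = cycle + u.latency
                exec_order.append(u.seq)
                started += 1
        cycle += 1
    return exec_order, retire_order


def demo_program() -> List[Uop]:
    return [
        Uop(0, "r1", ("r0",), latency=5),       # slow load (think: cache miss)
        Uop(1, "r2", ("r1", "r1"), latency=1),  # needs the load result
        Uop(2, "r3", ("r4", "r5"), latency=1),  # independent of the load
        Uop(3, "r6", ("r3", "r3"), latency=1),  # needs uop 2 only
    ]


exec_order, retire_order = simulate(demo_program())
print("executed:", exec_order)    # executed: [0, 2, 3, 1]
print("retired: ", retire_order)  # retired:  [0, 1, 2, 3]
```

While uop 1 waits for the slow load, uops 2 and 3 go ahead and execute, but their results are not retired until uops 0 and 1 have retired first. That is the whole point of the instruction pool: the CPU keeps doing useful work during a long memory access, yet the program still appears to run in its original order.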

SOME USEFUL LINKS.
Dynamic Execution.
EXTENSIVE CPU STUFF.
CPU BENCHMARK.
CPU FAQ.
Pentium(R) Pro Processor Technical Glossary
Pentium Pro FAQ.
More Pentium Pro Stuff.
SuperPipelining