Chapter 1
-
1.1 Introduction (page 2)
-
Computer technology is growing fast
-
RISC (Reduced Instruction Set Computer)
-
1.2 The Changing Face of Computing and the Task of the Computer Designer
(page 4)
-
Three different computing markets: Desktop Computing, Servers, Embedded
Computers
-
Clock rate is not the best way to measure CPU performance
-
availability - the system can reliably and effectively provide a service
-
reliability - a measure of continuous service accomplishment, i.e. how long the
system runs between failures (no real system never fails)
-
Instruction Set Architecture - the actual programmer-visible instruction
set which serves as the boundary between the software and the hardware.
-
1.3 Technology Trends (page 11)
-
Integrated circuit logic technology - transistors
-
Semiconductor DRAM - dynamic random-access memory
-
Magnetic disk technology - hard drives and other rotating storage
-
Network technology - switches and bandwidth
-
Distributing the power, removing the heat, and preventing hot spots have
become increasingly difficult challenges.
-
1.4 Cost, Price, and Their Trends (page 14)
-
Wafer - a circular slice of silicon (commonly 8" in diameter) on which many
dies (chips) are fabricated at once
-
Cost of integrated circuit =

    cost of die + cost of testing die + cost of packaging and final test
    ---------------------------------------------------------------------
                             final test yield
-
Cost of die =

           cost of wafer
    ----------------------------
     dies per wafer * die yield
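As a quick sanity check, the two cost formulas above can be sketched in Python. Every number below is an invented illustrative value, not a figure from the text:

```python
def die_cost(wafer_cost, dies_per_wafer, die_yield):
    # Cost of die = cost of wafer / (dies per wafer * die yield)
    return wafer_cost / (dies_per_wafer * die_yield)

def ic_cost(cost_of_die, test_cost, package_cost, final_test_yield):
    # Cost of IC = (die + testing + packaging and final test) / final test yield
    return (cost_of_die + test_cost + package_cost) / final_test_yield

# Hypothetical wafer: $5000, 200 dies per wafer, 50% die yield.
c_die = die_cost(5000.0, 200, 0.5)
print(c_die)                                      # 50.0
# Hypothetical $3 test cost, $2 packaging cost, 95% final test yield.
print(round(ic_cost(c_die, 3.0, 2.0, 0.95), 2))   # 57.89
```

Note how a lower die yield raises per-die cost directly: fewer good dies have to absorb the whole wafer's cost.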
-
1.5 Measuring & Reporting Performance (page 24)
-
Execution time - the time between the start and completion of an event.
Also called response time. This term is used a lot when talking about
personal computers.
-
Throughput - total amount of work done in a given time. This term
is used a lot when talking about servers.
-
Because performance and execution time are reciprocals:

    ExecutionTime_Y     1 / Performance_Y     Performance_X
    ---------------  =  -----------------  =  -------------
    ExecutionTime_X     1 / Performance_X     Performance_Y
-
    ExecutionTime_Y
    ---------------  =  n   means X is n times faster than Y
    ExecutionTime_X
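For example, with made-up timings for the same program on two machines:

```python
# Hypothetical measurements: machine Y takes 15 s, machine X takes 10 s.
time_x = 10.0
time_y = 15.0
n = time_y / time_x
print(n)   # 1.5 -> X is 1.5 times faster than Y
```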
-
Programs used to evaluate performance: real applications, modified (or
scripted) applications, kernels, toy benchmarks, synthetic benchmarks
-
SPEC (Standard Performance Evaluation Corporation) is a nonprofit organization
that creates standardized benchmark application suites. http://www.spec.org
-
Ziff Davis: http://www.etestinglabs.com/benchmarks/
-
Transaction processing benchmarks (for servers): http://www.tpc.org
-
It is important to be able to reproduce performance results. When
reporting performance results you should list everything another experimenter
would need to duplicate the results.
-
Weighted execution time is a way of summarizing performance. You assign
weights to the different programs a test runs, so that a program used more
often than another counts more, giving a more realistic summary. It's like
how tests, quizzes, homework, and class participation are all weighted in
calculating your performance, i.e. your grade, for a class. Good for
predictions, not good for comparisons.
-
Normalized execution time is when you divide each program's execution time
by its time on a reference machine, then summarize the resulting ratios
with a geometric mean. Good for comparisons, not good for predictions.
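A small sketch of both summaries; the times, weights, and reference-machine times are all invented:

```python
import math

times = [2.0, 10.0]      # execution times (s) of two benchmark programs
weights = [0.8, 0.2]     # assumed relative frequency of each program

# Weighted execution time: weighted arithmetic mean of the raw times.
weighted = sum(w * t for w, t in zip(weights, times))
print(weighted)          # 0.8*2 + 0.2*10 = 3.6

# Normalized execution time: divide by a reference machine's times,
# then summarize the ratios with a geometric mean.
ref_times = [4.0, 5.0]
ratios = [t / r for t, r in zip(times, ref_times)]
geo_mean = math.prod(ratios) ** (1.0 / len(ratios))
print(geo_mean)          # sqrt(0.5 * 2.0) = 1.0
```

The geometric mean of ratios has the nice property that it gives the same machine ranking no matter which machine is used as the reference.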
-
1.6 Quantitative Principles of Computer Design (page 39)
-
Make the common case fast
-
Amdahl's Law (page 40)
-
Speedup =   Performance for entire task using the enhancement when possible
            ---------------------------------------------------------------
             Performance for entire task without using the enhancement
-
Speedup =    Execution time for entire task without using the enhancement
            --------------------------------------------------------------------
            Execution time for entire task using the enhancement when possible
-
Execution time_new = Execution time_old *
    ( (1 - Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced )
-
Speedup_overall = Execution time_old / Execution time_new
                = 1 / ( (1 - Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced )
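A minimal sketch of the overall-speedup formula; the 40%/10x numbers are just an invented example:

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    # Speedup_overall = 1 / ((1 - F) + F / S)
    return 1.0 / ((1.0 - fraction_enhanced)
                  + fraction_enhanced / speedup_enhanced)

# If 40% of the task can be sped up 10x, the whole task gets only ~1.56x faster:
print(amdahl_speedup(0.4, 10.0))   # ~1.5625
# Even a near-infinite enhancement is capped by the unenhanced fraction:
print(amdahl_speedup(0.4, 1e12))   # ~1.667 (the limit is 1 / 0.6)
```

This is why "make the common case fast" matters: the fraction you can't enhance bounds the overall speedup.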
-
The CPU Performance Equation (page 42)
-
CPU time = CPU clock cycles for a program * Clock cycle time
-
CPU time = CPU clock cycles for a program / Clock rate
-
IC - instruction count
-
CPI - clocks per instruction
-
CC - clock cycle time (the inverse of the clock rate, CR)
-
IPC - instructions per clock (inverse of CPI)
-
CPU time = IC * CC * CPI
-
CPU time = ( IC * CPI ) / CR
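The two forms of the equation can be sketched as follows; the instruction count, CPI, and clock numbers are hypothetical:

```python
def cpu_time_from_cycle_time(ic, cpi, clock_cycle_time_s):
    # CPU time = IC * CPI * clock cycle time
    return ic * cpi * clock_cycle_time_s

def cpu_time_from_rate(ic, cpi, clock_rate_hz):
    # CPU time = (IC * CPI) / clock rate
    return ic * cpi / clock_rate_hz

# Hypothetical program: 10^9 instructions, CPI of 2, on a 1 GHz machine.
print(cpu_time_from_rate(1e9, 2.0, 1e9))          # 2.0 seconds
print(cpu_time_from_cycle_time(1e9, 2.0, 1e-9))   # 2.0 seconds (1 ns cycle)
```

Both forms agree because clock cycle time is just 1 / clock rate.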
-
It's hard to change one parameter in complete isolation from the others because
the basic technologies involved in changing each characteristic are interdependent.
-
Clock cycle time - based on hardware technology and organization
-
CPI - based on organization and instruction set architecture
-
Instruction count - based on instruction set architecture and compiler
technology
-
Principle of Locality
-
Programs tend to reuse data and instructions they have used recently.
A widely held rule of thumb is that a program spends 90% of its execution
time in only 10% of the code.
-
Temporal locality - Recently accessed items are likely to be accessed in
the near future.
-
Spatial locality - Items whose addresses are near one another tend to be
referenced close together in time.
-
1.7 Putting it All Together: Performance and Price Performance
-
1.8 Another View: Power Consumption and Efficiency as the Metric
-
Just because a CPU is fast, that doesn't mean that it will be the most
efficient in terms of power consumption
-
1.9 Fallacies and Pitfalls
-
The relative performance of two processors with the same instruction set
architecture cannot be judged by clock rate or by the performance
of a single benchmark suite. They may have different pipeline structures
or memory systems.
-
Benchmarks do not remain valid indefinitely.
-
Don't neglect Amdahl's Law any more than you would Murphy's Law.
-
Synthetic benchmarks do not always predict performance for real programs.
They are not real programs and so they may not reflect program behavior
for factors not measured. See Whetstone and Dhrystone.
-
1.10 Concluding Remarks
-
Make the common case fast
-
1.11 Historical Perspective and References
-
Everything is John von Neumann's fault
Alex Baglione @UMCP
CMSC 411 Summer 2002
Hennessy and Patterson, Computer Architecture: A Quantitative
Approach; Third Edition
Chapter 1 Notes - for educational use only