Performance Prediction for Large Scale
Data Intensive Applications on Large Scale Parallel Machines

In recent years, I/O-intensive parallel programs have emerged as one of the leading consumers of cycles on parallel machines. Examples include satellite data processing, medical image databases, high performance relational databases, data mining, and detailed scientific modeling of complex phenomena. It is critical that future parallel machines be designed to accommodate the characteristics of these applications. To do this, hardware designers need tools that accurately predict and analyze the performance of alternative designs for these applications. Conversely, application developers need a straightforward way to predict the performance of their applications on existing and future parallel machines, so that they can be assured of good performance not just on existing machines but also for the foreseeable future.

Performance prediction of applications on parallel machines is a widely studied area, but previous work has mainly focused on compute intensive scientific applications. Performance prediction for data intensive (I/O-intensive) applications on existing and future parallel machines poses several challenges. The vast amount of data these applications process requires expensive hardware configurations, which makes direct experimentation on the target machine virtually impossible. It also rules out detailed simulation techniques, because simulations of large parallel configurations and large datasets take too long to run. Finally, the complexity of these applications hinders the use of analytical methods.

In this work, we are developing a simulation-based framework to predict the performance of data intensive applications on existing and future parallel machines. Our framework consists of two components: application emulators and a suite of simulators. Application emulators accurately capture the behavior of data intensive applications and make it easy to experiment with critical application components (e.g., input data partitioning, data declustering, and processing structure). Our suite of simulators models the I/O and communication subsystems of the parallel machine at a level of detail sufficient for accurate prediction of application performance, while providing relatively coarse grain models of the execution of instructions within each processor. We have developed application emulators for three I/O-intensive applications for large scale parallel machines: two satellite data processing applications and a medical image database system. We have also developed a suite of simulation models that are both sufficiently accurate and fast to execute, so that parallel machine configurations of up to thousands of processors can be simulated on a high-performance workstation. We introduce a new technique, loosely coupled simulation, that embeds the processing structure of the application into the simulator in the form of a simple dependency graph, while preserving the application workload. This technique allows accurate, yet relatively inexpensive, performance prediction for very large scale parallel machines.
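The idea of driving a simulator from a dependency graph can be illustrated with a minimal sketch. The Python below is not the framework's implementation; the task names, resource names, bandwidth figures, and the simple scheduling rule are all hypothetical and chosen only for illustration. Each task is charged a coarse grain cost on one resource (a disk, a processor, or a network link), and a task starts only when its predecessors in the graph have finished and its resource is free.

```python
from collections import defaultdict, deque

# Hypothetical coarse grain rates in bytes per second (illustrative only).
RATES = {"read": 10e6, "compute": 1e6, "send": 50e6}

def simulate(tasks, deps):
    """Return the finish time of every task.

    tasks: {name: (kind, resource, nbytes)}
    deps:  list of (before, after) edges of the dependency graph
    """
    successors = defaultdict(list)
    predecessors = defaultdict(list)
    indegree = {name: 0 for name in tasks}
    for before, after in deps:
        successors[before].append(after)
        predecessors[after].append(before)
        indegree[after] += 1

    resource_free = defaultdict(float)  # time at which each resource is idle
    finish = {}
    ready = deque(name for name in tasks if indegree[name] == 0)

    # Visit tasks in topological order (Kahn's algorithm).
    while ready:
        name = ready.popleft()
        kind, resource, nbytes = tasks[name]
        # Wait for all predecessors and for the resource itself.
        start = max([resource_free[resource]] +
                    [finish[p] for p in predecessors[name]])
        finish[name] = start + nbytes / RATES[kind]
        resource_free[resource] = finish[name]
        for nxt in successors[name]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(finish) != len(tasks):
        raise ValueError("dependency graph contains a cycle")
    return finish

# Two data chunks read from one disk, processed on one processor,
# then combined and sent over one network link.
tasks = {
    "read_a": ("read", "disk0", 20e6),
    "read_b": ("read", "disk0", 10e6),
    "proc_a": ("compute", "cpu0", 4e6),
    "proc_b": ("compute", "cpu0", 2e6),
    "send":   ("send", "net0", 5e6),
}
deps = [("read_a", "proc_a"), ("read_b", "proc_b"),
        ("proc_a", "send"), ("proc_b", "send")]

finish = simulate(tasks, deps)
print("predicted completion time:", max(finish.values()), "seconds")
```

In this sketch the workload (how many bytes each task moves or processes) stays fixed while the machine parameters in `RATES` can be varied, which loosely mirrors the separation between application behavior and machine model described above.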


Last Updated:  06/07/99