Tuning the Performance of I/O-Intensive Parallel Applications

Anurag Acharya, Mustafa Uysal, Robert Bennett, Assaf Mendelson,
Fourth Annual Workshop on I/O in Parallel and Distributed Systems, Philadelphia, Pennsylvania, May 27, 1996

Abstract

Getting good I/O performance from parallel programs is a critical problem for many application domains. In this paper, we report our experience tuning the I/O performance of four application programs from the areas of sensor data processing and linear algebra. After tuning, three of the four applications achieve effective I/O rates of over 100 MB/s on 16 processors. The total volume of I/O required by the programs ranged from about 75 MB to over 200 GB. We report the lessons learned in achieving high I/O performance from these applications, including the need for code restructuring, local disks on every node, and overlapping I/O with computation. We also report our experience on achieving high performance on peer-to-peer configurations. Finally, we comment on the necessity of complex I/O interfaces like collective I/O and strided requests to achieve high performance.

Postscript (compressed 137K)
A previous version appeared as CRPC TR95632-S (compressed 153K)
Last Updated: 03/01/99