Click here to remove the abstracts
University of Maryland Department of Computer Science and UMIACS Technical Report Numbers CS-TR-3070.1 and UMIACS-TR-93-36.1, Dec 1993.
There exists a large class of scientific applications that are composed of irregularly coupled regular mesh (ICRM) computations. These problems are often referred to as block structured or multiblock, problems and include the Block Structured Navier-Stokes solver developed as NASA Langley called TLNS3D.
Primitives are presented that are designed to help users to efficiently program such problems on distributed memory machines. These primitives are also designed for use by compilers for distributed memory multiprocessors. Communications patterns are captured at runtime, and the appropriate send and received messages are automatically generated. The primitives are also useful for parallelizing regular computations, since block structured computations also require all the runtime support necessary for regular computations.
To appear in IEEE Transactions on Parallel and Distributed Systems in 1995.
Scientific and engineering applications often involve structured meshes. These meshes may be nested (for multigrid codes) and/or irregularly coupled (called multiblock or irregularly coupled regular mesh problems). In this paper, we present a combined runtime and compile-time approach for parallelizing these applications on distributed memory parallel machines in an efficient and machine-independent fashion. We have designed and implemented a runtime library which can be used to port these applications on distributed memory machines. The library is currently implemented on several different systems. Since the design of the library is machine independent, it can be easily ported on other distributed memory machines and environments which support message passing. To further ease the task of application programmers, we have developed methods for integrating this runtime library with compilers for HPF-like parallel programming languages. We discuss how we have integrated this runtime library with the Fortran 90D compiler being developed at Syracuse University. We present experimental results to demonstrate the efficacy of our approach. We have experimented with a multiblock Navier-Stokes solver template and a multigrid code. Our experimental results show that our primitives have low runtime communication overheads. Further, the compiler parallelized codes perform within 20% of the code parallelized by manually inserting calls to the runtime library.
Published in Supercomputing '93 in Nov 1993.
Scientific and engineering applications often involve structured meshes. These meshes may be nested (for multigrid or adaptive codes) and/or irregularly coupled (called Irregularly Coupled Regular Meshes). We have designed and implemented a runtime library for parallelizing this general class of applications on distributed memory parallel machines in an efficient and machine independent manner. In this paper we present how this runtime library can be integrated with compilers for High Performance Fortran (HPF) style parallel programming languages. We discuss how we have integrated this runtime library with the Fortran 90D compiler being developed at Syracuse University and provide experimental data on a block structured Navier-Stokes solver template and a small multigrid example parallelized using this compiler and run on an Intel iPSC/860. We show that the compiler parallelized code performs within 20% of the code parallelized by inserting calls to the runtime library manually.
Published in Computing Systems in Engineering in 1992.
This paper describes a set of primitives (PARTI) developed to efficiently execute unstructured and block structured problems on distributed memory parallel machines. We present experimental data from a 3-D unstructured Euler solver run on the Intel Touchstone Delta to demonstrate the usefulness of out methods.
To appear in 1995 International Parallel Processing Symposium in Apr 1995.
For better utilization of computing resources, it is important to consider parallel programming environments in which the number of available processors varies at runtime. In this paper, we discuss runtime support for data parallel programs in such an adaptive environment. Executing data parallel programs in an adaptive environment requires determining new loop bounds and communication patterns for the new set of processors. We have developed a runtime library to provide this support. We discuss how the runtime library can be used by compilers to generate code for an adaptive environment. We also present performance results for a multiblock Navier-Stokes solver run on a newtork of workstations using PVM for message passing. Our experiments show that if the number of processors is not varied frequently, the cost of data redistribution is not signifigant compared to the time required for the actual computations.
To appear in the Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing in Feb 1995.
One class of scientific and engineering applications involves structured meshes. One example of a code in this class is a flame modelling code developed at the Naval Research LAboratory (NRL). The numerical model used in the NRL flame code is predominantly based on structured finite volume methods. The chemistry process of the reactive flow is modeled by a system of ordinary differential equations which is solved independantly at each grid point. Thus, though the model uses a mesh structure, the workload at each grid point can vary considerably. It is this feature that requires the use of both structured and unstructured methods in the same code. We have applied the Multiblock PARTI and CHAOS runtime support libraries to parallelize the NRL flame code with minimal changes to the sequential code. We have also developed parallel algorithms to carry out dynamic load balancing. It has been observed that the overall performance scales reasonably up to 256 Paragon processors and that the total runtime on a 256-node Paragon is about half that of a single processor Cray C90.