Department of Computer Science  
University of Maryland  
College Park, Maryland 20742

R. P. Nance, R. G. Wilmouth, B. Moon, H. A. Hassan, J. Saltz

AIAA Journal of Thermophysics and Heat Transfer, Pages 471-477, Volume 9, Number 3, July 1995

University of Maryland Technical Report: CR-TR-3425 and UMIACS-TR-95-25

This paper describes a parallel implementation of the direct simulation Monte Carlo method. Runtime library support is used for scheduling and execution of communication between nodes, and domain decomposition is performed dynamically to maintain a favorable load balance. Performance tests are conducted using the code to evaluate various remapping and remapping-interval policies, and it is shown that a one-dimensional chain-partitioning method works best for the problems considered. The parallel code is then used to simulate the Mach 20 nitrogen flow over a finite-thickness flat plate. It will be shown that the parallel algorithm produces results which are very similar to previous DSMC results, despite the increased resolution available. However, it yields significantly faster execution times than the scalar code, as well as very good load-balance and scalability characteristics.
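
As a rough illustration of the one-dimensional chain-partitioning idea mentioned in this abstract, the sketch below splits a chain of per-cell workloads into contiguous segments of roughly equal load using prefix sums. It is not the authors' code: the function name `chain_partition`, the greedy placement of cuts at prefix-sum targets, and the use of plain Python lists are all assumptions made for illustration.

```python
from bisect import bisect_left
from itertools import accumulate


def chain_partition(loads, num_parts):
    """Split a chain of per-cell loads into contiguous, roughly balanced parts.

    Hypothetical helper for illustration only. Returns a list of half-open
    (start, end) index ranges, one per part.
    """
    if not loads or num_parts < 1:
        raise ValueError("need a non-empty load list and at least one part")
    prefix = list(accumulate(loads))  # prefix[i] = total load of cells 0..i
    total = prefix[-1]
    bounds = [0]
    for k in range(1, num_parts):
        # Place the k-th cut right after the first cell whose running total
        # reaches k/num_parts of the overall load.
        target = total * k / num_parts
        cut = bisect_left(prefix, target) + 1
        cut = min(max(cut, bounds[-1]), len(loads))  # keep cuts ordered
        bounds.append(cut)
    bounds.append(len(loads))
    return [(bounds[i], bounds[i + 1]) for i in range(num_parts)]


if __name__ == "__main__":
    # Eight cells along the chain, with load concentrated in the middle.
    cell_loads = [1, 1, 4, 6, 6, 4, 1, 1]
    for start, end in chain_partition(cell_loads, 4):
        print((start, end), sum(cell_loads[start:end]))
```

Because every part is a contiguous run of cells, repartitioning only moves the cut positions, which is one plausible reason such a scheme can be inexpensive to reapply when the load shifts.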

B. Moon, M. Uysal, J. Saltz

Proceedings of the Ninth International Parallel Processing Symposium, Pages 812-819, April 1995

Current research in parallel programming is focused on closing the gap between globally indexed algorithms and the separate address spaces of processors on distributed memory multicomputers. A set of index translation schemes has been implemented as part of the CHAOS runtime support library, so that the library functions can be used for implementing a global index space across a collection of separate local index spaces. These schemes include two software-cached translation schemes aimed at adaptive irregular problems as well as a distributed translation table technique for statically irregular problems. To evaluate and demonstrate the efficiency of the software-cached translation schemes, experiments have been performed with an adaptively irregular loop kernel and a full-fledged 3D DSMC code from NASA Langley on the Intel Paragon and Cray T3D. This paper also discusses and analyzes the operational conditions under which each scheme can produce optimal performance.
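
The following single-process sketch illustrates the general idea of translating a global index into an (owner, local offset) pair through a block-distributed translation table, with a per-processor software cache in front of it. It does not use or reproduce the CHAOS API: the class `TranslationTable`, its method `dereference`, and the `remote_lookups` counter are hypothetical names, and real interprocessor communication is only simulated by counting lookups that fall outside the caller's own table page.

```python
class TranslationTable:
    """Toy global-to-local index translation (illustrative, not CHAOS)."""

    def __init__(self, owner_of, num_procs):
        # owner_of[g] is the processor that owns global index g; the
        # distribution may be irregular. Assumes owner_of is non-empty.
        self.block = -(-len(owner_of) // num_procs)  # ceiling division
        next_local = [0] * num_procs
        entries = []
        for owner in owner_of:
            # Local offsets are handed out in order of global index.
            entries.append((owner, next_local[owner]))
            next_local[owner] += 1
        # Page p of the table lives on processor p and covers the global
        # indices [p * block, (p + 1) * block).
        self.pages = [entries[p * self.block:(p + 1) * self.block]
                      for p in range(num_procs)]
        self.cache = [dict() for _ in range(num_procs)]
        self.remote_lookups = 0

    def dereference(self, proc, g):
        """Return (owner, local_offset) for global index g as seen by proc."""
        hit = self.cache[proc].get(g)
        if hit is not None:
            return hit
        page = g // self.block
        if page != proc:
            self.remote_lookups += 1  # stands in for a message to `page`
        entry = self.pages[page][g % self.block]
        self.cache[proc][g] = entry  # software cache: later lookups are local
        return entry


if __name__ == "__main__":
    table = TranslationTable(owner_of=[0, 1, 1, 0, 1, 0], num_procs=2)
    print(table.dereference(0, 4), table.remote_lookups)
    print(table.dereference(0, 4), table.remote_lookups)  # served from cache
```

In an adaptive problem the ownership map changes over time, so a cache like this would need invalidation or refresh when data is remapped; the sketch leaves that out.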

Bongki Moon and Joel Saltz.

Proceedings of the Scalable High Performance Computing Conference 1994, Pages 176-183, May 1994

In highly adaptive irregular problems such as many Particle-In-Cell (PIC) codes and Direct Simulation Monte Carlo (DSMC) codes, data access patterns may vary from time step to time step. This fluctuation may hinder efficient utilization of distributed memory parallel computers because of the resulting overhead for data redistribution and dynamic load balancing. It may also hinder efficient utilization of runtime pre-processing because the pre-processing requirements are sensitive to perturbations in the data access patterns. To efficiently parallelize such adaptive irregular problems on distributed memory parallel computers, several issues such as effective methods for domain partitioning, efficient index dereferencing and fast data transportation must be addressed. This paper presents efficient runtime support methods for such problems. These new runtime support primitives have recently been implemented and added to the CHAOS library. A new domain partitioning algorithm is introduced. A simple one-dimensional domain partitioning method is implemented and compared with unstructured mesh partitioners such as recursive coordinate bisection and recursive inertial bisection. A remapping decision policy has been investigated for dynamic load balancing on 3-dimensional DSMC codes. Performance results are presented.
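
For readers unfamiliar with recursive coordinate bisection, one of the partitioners this abstract compares against, the sketch below shows a textbook form of the technique: repeatedly sort the points along the axis of largest extent and cut them into two groups sized in proportion to the number of parts each side must produce. The function name `rcb` and the equal-weight treatment of points are assumptions made for illustration; this is not the implementation evaluated in the paper.

```python
def rcb(points, num_parts):
    """Recursive coordinate bisection over equally weighted points.

    Illustrative sketch only. `points` is a list of coordinate tuples;
    the result is a list of `num_parts` lists of point indices.
    """
    def split(ids, parts):
        if parts == 1:
            return [ids]
        if not ids:
            return [[] for _ in range(parts)]
        dims = len(points[0])
        # Cut along the axis with the largest spatial extent.
        axis = max(
            range(dims),
            key=lambda d: max(points[i][d] for i in ids)
            - min(points[i][d] for i in ids),
        )
        ordered = sorted(ids, key=lambda i: points[i][axis])
        left_parts = parts // 2
        # Size the left group in proportion to the parts it must yield.
        cut = len(ordered) * left_parts // parts
        return (split(ordered[:cut], left_parts)
                + split(ordered[cut:], parts - left_parts))

    return split(list(range(len(points))), num_parts)


if __name__ == "__main__":
    # A 4 x 2 grid of cell centres, divided among four processors.
    grid = [(x, y) for y in range(2) for x in range(4)]
    for part in rcb(grid, 4):
        print(part, [grid[i] for i in part])
```

Unlike a one-dimensional chain partition, each bisection step here requires a sort along a freshly chosen axis, which gives a sense of why a simpler scheme may be attractive when partitions must be recomputed often.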

 

 
