Trace Distribution
Compressed, ASCII format (543KB)
Applications for Measurement and Benchmarking of I/O on Parallel Computers
This application extracts association rules from retail data, in particular
buying patterns that characterize the shopping behavior of retail customers.
It performs I/O using synchronous read() operations.
A detailed description of this application can be found in:
Andreas Mueller. Fast Sequential and Parallel Algorithms for Association
Rule Mining: A Comparison. Technical Report, CS-TR-3515, University of Maryland,
College Park, August 1995.
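As a rough illustration of what I/O through synchronous read() operations looks like, the sketch below reads a file front to back with blocking calls. It is not the application's code: the function name read_synchronously and the 64 KB chunk size are assumptions made for this example, and the traced request sizes may differ.

```python
import os

def read_synchronously(path, chunk_size=64 * 1024):
    """Read a file front to back with blocking read() calls.

    Returns (total_bytes, number_of_read_calls).
    The name and the default chunk size are hypothetical, chosen for illustration.
    """
    total_bytes = 0
    calls = 0
    # O_BINARY exists only on Windows; on POSIX systems the flag is simply 0.
    fd = os.open(path, os.O_RDONLY | getattr(os, "O_BINARY", 0))
    try:
        while True:
            # os.read() does not return until data is available or
            # end of file is reached, so each call is synchronous.
            chunk = os.read(fd, chunk_size)
            calls += 1
            if not chunk:  # an empty result signals end of file
                break
            total_bytes += len(chunk)
    finally:
        os.close(fd)
    return total_bytes, calls
```

Because each call returns only after the data has arrived, the program cannot overlap computation with I/O unless it adds threads or asynchronous requests of its own.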
Input Dataset
We used a database of 50 million transactions, with an average
transaction size of 10 items and a maximal potentially frequent set size of 3.
The synthetic data was generated based on the following retail data model:
R. Agrawal and R. Srikant. Fast Algorithms for Mining Association Rules
in Large Databases. Proc. of 20th Int'l Conf. on Very Large Databases (VLDB),
Santiago, Chile, September 1994.
The dataset for this program was 4 GB in size and was partitioned into 8 files,
one per processor.
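The figures above imply some per-file and per-transaction sizes, worked out in the short calculation below. Two assumptions are made here that the description does not state: that 4 GB means 4 * 2**30 bytes, and that the transactions are split evenly across the 8 files.

```python
# Figures taken from the dataset description.
NUM_TRANSACTIONS = 50_000_000
AVG_ITEMS_PER_TRANSACTION = 10
NUM_FILES = 8

# Assumption: "4 GB" is read as 4 * 2**30 bytes (binary gigabytes).
TOTAL_BYTES = 4 * 2**30

# Assumption: data and transactions are divided evenly among the files.
bytes_per_file = TOTAL_BYTES // NUM_FILES
transactions_per_file = NUM_TRANSACTIONS // NUM_FILES

# Average storage per transaction and per item, including any per-record overhead.
bytes_per_transaction = TOTAL_BYTES / NUM_TRANSACTIONS
bytes_per_item = bytes_per_transaction / AVG_ITEMS_PER_TRANSACTION

print(f"{bytes_per_file / 2**20:.0f} MB per file")
print(f"{transactions_per_file:,} transactions per file")
print(f"{bytes_per_transaction:.1f} bytes per transaction")
print(f"{bytes_per_item:.1f} bytes per item")
```

Under these assumptions each processor reads a 512 MB file holding 6.25 million transactions, at roughly 86 bytes per transaction. With decimal gigabytes the average would be 80 bytes per transaction instead.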
Workload
We used the "Find all rules" query, which extracts all possible association
rules in the transaction database.
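To make the idea of extracting all association rules concrete, here is a minimal brute-force sketch. It is not the algorithm the application uses (the technical report cited above compares several), and the function name find_all_rules and the min_support and min_confidence thresholds are assumptions introduced for illustration; the description does not give the thresholds used in the traced run.

```python
from itertools import combinations

def find_all_rules(transactions, min_support, min_confidence, max_set_size=3):
    """Return every rule (antecedent, consequent, support, confidence)
    that meets both thresholds. Brute-force illustration only."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    # Count how many transactions contain each itemset of up to max_set_size items.
    counts = {}
    for t in transactions:
        items = sorted(t)
        for size in range(1, max_set_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] = counts.get(itemset, 0) + 1

    # Keep only the itemsets that occur in a large enough share of transactions.
    frequent = {s: c for s, c in counts.items() if c / n >= min_support}

    rules = []
    for itemset, count in frequent.items():
        if len(itemset) < 2:
            continue  # a rule needs at least one item on each side
        for k in range(1, len(itemset)):
            for antecedent in combinations(itemset, k):
                consequent = tuple(i for i in itemset if i not in antecedent)
                # confidence = support(itemset) / support(antecedent)
                confidence = count / counts[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, consequent, count / n, confidence))
    return sorted(rules)

# Hypothetical toy data, far smaller than the 50 million transaction dataset.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rules = find_all_rules(baskets, min_support=0.5, min_confidence=0.9)
```

On this toy input the only rule that passes both thresholds is butter => bread, which holds in every basket containing butter. Enumerating every subset of every transaction is practical only for tiny inputs; on a dataset of the size described here, the cost of repeatedly scanning the transaction files is what makes the workload I/O intensive.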