Trace Distribution
Compressed, ASCII format (66.2 MB)
Related Information
E. Katz, M. Butler, and R. McGrath.
A scalable HTTP server: The NCSA prototype.
Computer Networks and ISDN Systems, pages 240-249, November 1994.
Applications for Measurement and Benchmarking of I/O on Parallel Computers
This application uses a parallel web server based on the round-robin DNS
scheme described by Katz et al. Similar schemes are used by most busy
commercial web sites. We used the Apache 1.2 server as the base web
server, replicated on each of the participating hosts. The application
uses multiple processes per processor to implement multiple threads of
control. Over the course of a day it creates a large number of processes
(about 2,000), most of which terminate soon after starting; at any given
time there are no more than ten active processes.
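To illustrate how a round-robin DNS scheme spreads client load across
replicated servers, the following Python sketch rotates connections over a
pool of server addresses. The addresses, pool size, and the resolve/fetch
helpers are assumptions made for illustration; they are not part of the
original setup.

    import itertools
    import socket

    # Hypothetical pool of replicated web-server hosts; in a real round-robin
    # DNS setup these would be the A records published for a single hostname.
    SERVER_ADDRESSES = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]

    # Cycle through the addresses so successive lookups hit successive
    # replicas, mimicking the rotation a round-robin DNS server performs.
    _rotation = itertools.cycle(SERVER_ADDRESSES)

    def resolve(hostname: str) -> str:
        # Return the next replica address for the replicated hostname.
        return next(_rotation)

    def fetch(hostname: str, path: str) -> bytes:
        # Issue a minimal HTTP/1.1 GET against whichever replica the
        # rotation hands back.
        address = resolve(hostname)
        with socket.create_connection((address, 80), timeout=10) as sock:
            request = ("GET {} HTTP/1.1\r\n"
                       "Host: {}\r\n"
                       "Connection: close\r\n\r\n").format(path, hostname)
            sock.sendall(request.encode("ascii"))
            chunks = []
            while True:
                data = sock.recv(65536)
                if not data:
                    break
                chunks.append(data)
        return b"".join(chunks)

In an actual deployment the rotation is performed by the DNS server, which
returns the A records for a single hostname in a different order on each
lookup, so unmodified clients are spread across the replicas.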
Input Dataset
We used NASA Kennedy Space Center's httpd logs for August 1995 to create
the document hierarchy as well as to drive the application. To account
for the explosive growth in web accesses since 1995, we collapsed the
request stream for the entire month into a single day, taking care to
preserve the time-of-day variations. That is, we merged the month's worth
of daily logs into a single day, preserving the time of day of each
request. The dataset served was 524 MB in size, stored in 13,457 files.
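A minimal sketch of this collapsing step is shown below, assuming the logs
are in the Common Log Format used by the NASA httpd traces (timestamps of
the form [01/Aug/1995:00:00:01 -0400]). The regular expression, the target
date, and the function name are illustrative assumptions, not the tool
actually used.

    import re

    # Common Log Format timestamp: [01/Aug/1995:00:00:01 -0400].
    # The regex and target day below are assumptions for this sketch.
    TIMESTAMP_RE = re.compile(
        r"\[(\d{2})/(\w{3})/(\d{4}):(\d{2}:\d{2}:\d{2}) ([+-]\d{4})\]")
    TARGET_DAY = "01/Aug/1995"

    def collapse_to_single_day(in_path: str, out_path: str) -> None:
        # Rewrite every log entry onto one calendar day, keeping its time
        # of day, then sort the merged stream so it replays as a single
        # 24-hour trace.
        entries = []
        with open(in_path, "r", errors="replace") as fin:
            for line in fin:
                m = TIMESTAMP_RE.search(line)
                if not m:
                    continue
                time_of_day = m.group(4)
                offset = m.group(5)
                merged = TIMESTAMP_RE.sub(
                    "[{}:{} {}]".format(TARGET_DAY, time_of_day, offset), line)
                entries.append((time_of_day, merged))
        # Lexicographic order on HH:MM:SS is chronological within one day.
        entries.sort(key=lambda e: e[0])
        with open(out_path, "w") as fout:
            fout.writelines(line for _, line in entries)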
Workload
There were seven participating hosts in the experiment. We used four
different hosts as clients to drive the experiment, with each client
responsible for making all of the HTTP requests to a single server.
Client requests were made using the HTTP/1.1 protocol, and servers
delivered data to the clients according to the size of each request.
Clients were connected to the servers via an ATM switch, so the timestamps
of the requests as seen by the web servers were accurate. The experiment
was run over a 24-hour period; in total, about 1.5 million HTTP requests
were served, delivering over 36 GB of data.
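A hedged sketch of what one replay client might look like follows, taking
the collapsed single-day log as input. The log parsing, server URL, and
pacing logic are assumptions for illustration, not the actual driver used
in the experiment.

    import time
    import urllib.request
    from datetime import datetime

    def replay(trace_path: str, server: str) -> None:
        # Replay a single-day trace against one server, issuing each GET at
        # the time of day recorded for it.  Assumes Common Log Format lines
        # rewritten onto a single day; the server argument is a placeholder.
        start_wall = time.monotonic()
        for line in open(trace_path, errors="replace"):
            try:
                # Request line looks like: "GET /path HTTP/1.0"
                path = line.split('"')[1].split()[1]
                # Timestamp looks like: 01/Aug/1995:HH:MM:SS
                stamp = line.split("[")[1].split()[0]
                t = datetime.strptime(stamp.split(":", 1)[1], "%H:%M:%S")
            except (IndexError, ValueError):
                continue
            # Seconds since midnight in the merged trace.
            offset = t.hour * 3600 + t.minute * 60 + t.second
            delay = offset - (time.monotonic() - start_wall)
            if delay > 0:
                time.sleep(delay)  # wait until the request's time of day
            try:
                urllib.request.urlopen(
                    "http://{}{}".format(server, path), timeout=30).read()
            except OSError:
                pass  # a real driver would record failures here

Each client would run a loop of this form against its assigned server, so
that request arrivals at the server follow the time-of-day pattern of the
original trace.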