Parallel Web Server
Description
This application uses a parallel web-server based on the round-robin
DNS scheme described by Katz [1]. Similar schemes are used by most
busy commercial web sites. We used the
Apache 1.2 server as the base web server which is replicated on
the participating hosts. This application uses multiple processes per
processor to implement multiple threads of control. Over the course of
a day, it creates about 2000 processes, most of which are short-lived.
At any given time, no more than ten processes are active.
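The round-robin idea can be illustrated with a minimal sketch. It is not the NCSA or Apache implementation: the class name `RoundRobinResolver` and the addresses are hypothetical, and a real deployment would do the rotation inside the DNS server rather than in application code. Each lookup returns the same set of replica addresses, rotated by one position, so successive clients start with a different host.

```python
class RoundRobinResolver:
    """Toy stand-in for a round-robin DNS server: one name, several replicas."""

    def __init__(self, addresses):
        if not addresses:
            raise ValueError("need at least one address")
        self._addresses = list(addresses)
        self._next = 0  # index of the replica to list first on the next lookup

    def resolve(self):
        """Return all addresses, rotated so each lookup starts at a different host."""
        start = self._next
        self._next = (self._next + 1) % len(self._addresses)
        return self._addresses[start:] + self._addresses[:start]


# Hypothetical replica addresses, for illustration only.
resolver = RoundRobinResolver(["10.0.0.1", "10.0.0.2", "10.0.0.3"])

# Clients typically contact the first address in the answer.
first_choices = [resolver.resolve()[0] for _ in range(6)]
print(first_choices)
```

Over six lookups, each of the three hypothetical replicas is chosen first exactly twice, which is the load-spreading effect the scheme relies on.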
Input Dataset
We used NASA Kennedy Space Center's httpd logs for August 1995 to
create the document hierarchy as well as to drive the application. To
account for the explosive growth in web accesses since 1995, we
collapsed the request stream for the entire month into a single day,
taking care to preserve the time-of-day variations. That is, we merged
every day's worth of data into a single day, preserving the time-of-day timestamp
of each request. The dataset served was 524 MB in size, stored in
13,457 files.
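The collapsing step might look roughly like the sketch below. It is an assumption about the procedure, not the script actually used: it presumes the log timestamps follow the Common Log Format (for example `01/Aug/1995:00:00:01 -0400`), and the function names and sample URLs are hypothetical. Each request keeps its hour, minute, and second but loses its calendar date, and the merged stream is then ordered by time of day.

```python
from datetime import datetime

# Common Log Format timestamp layout (assumed for the input logs).
CLF_TIME = "%d/%b/%Y:%H:%M:%S %z"


def seconds_into_day(clf_timestamp):
    """Drop the calendar date and keep only the offset from local midnight."""
    t = datetime.strptime(clf_timestamp, CLF_TIME)
    return t.hour * 3600 + t.minute * 60 + t.second


def collapse_to_one_day(requests):
    """Merge (timestamp, url) pairs from many days into one day's schedule.

    Returns (offset_seconds, url) pairs sorted by time of day. The sort is
    stable, so requests in the same second keep their input order.
    """
    merged = [(seconds_into_day(ts), url) for ts, url in requests]
    return sorted(merged, key=lambda item: item[0])


# Hypothetical sample entries from three different days.
sample = [
    ("02/Aug/1995:09:30:00 -0400", "/b.html"),
    ("01/Aug/1995:23:59:59 -0400", "/c.html"),
    ("03/Aug/1995:00:00:01 -0400", "/a.html"),
]
schedule = collapse_to_one_day(sample)
print(schedule)
```

Because only the date is discarded, the daily peaks and troughs of the original logs survive, while the request rate at any time of day becomes the sum over all merged days.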
Workload
There were seven participating hosts in the experiment. We used four
different hosts as clients to drive the experiment. Each client was
responsible for making all the HTTP requests to a single
server. Client requests were made using the HTTP/1.1 protocol, and
servers delivered the data to the clients based on the size of the
request. Clients were connected to the servers via an ATM switch,
hence the timestamps of the requests as seen by the Web server were
accurate. The experiment ran over a 24-hour period and served a total
of about 1.5 million HTTP requests, delivering over 36 GB of data.
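For a rough sense of scale, the figures quoted above can be turned into averages. This is back-of-the-envelope arithmetic only: it assumes decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes), treats "about 1.5 million" and "over 36 GB" as exact, and ignores the time-of-day variation that the trace deliberately preserves.

```python
# Figures taken from the text; unit convention (decimal MB/GB) is an assumption.
total_requests = 1_500_000          # "about 1.5 million HTTP requests"
bytes_delivered = 36 * 10**9        # "over 36 GB", so this is a lower bound
duration_seconds = 24 * 3600        # 24-hour experiment
dataset_bytes = 524 * 10**6         # 524 MB document hierarchy
dataset_files = 13_457

avg_request_rate = total_requests / duration_seconds   # requests per second
avg_bytes_per_request = bytes_delivered / total_requests
avg_file_size = dataset_bytes / dataset_files

print(f"average request rate: {avg_request_rate:.1f} requests/s")
print(f"average bytes per request: {avg_bytes_per_request:.0f}")
print(f"average file size: {avg_file_size:.0f} bytes")
```

Under these assumptions the servers together handled roughly 17 requests per second on average and delivered about 24 KB per request, against a mean stored file size of about 39 KB.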
Traces
You can download the trace files in the following formats:
References
[1] E. Katz, M. Butler, and R. McGrath. A scalable HTTP server:
The NCSA prototype. Computer Networks and ISDN Systems, pages
240-249, November 1994.
Last updated on Tue May 27 12:37:44 EDT 1997
by Mustafa Uysal (uysal@cs.umd.edu).