Modeling and Performance Evaluation Methodology

The basic approaches to performance evaluation of computer systems include analytical modeling, simulation, and measurements. My work focuses on analytical modeling. An example model is a multiclass multiserver queueing system whose attributes include the system's workload in the form of N job arrival processes each with its own arrival rate, the system's resources in the form of a queue and K servers each with its own service rate, and a service scheduling discipline. In general, arrival and service rates may be a function of the current model state, and not all servers need be activated at all times (where activation of more servers improves the performance but may increase cost). Given such models, common performance metrics of interest include expected job response time, server utilization, throughput, and so on.

Performance evaluation of real systems is complex, suffers from scalability problems (such as the so-called ``state space explosion'' problem), and usually requires advanced model solution techniques. Often, such techniques are based on exploitation of ``special structure'' in the models. With large and complex models, these special structures are usually too expensive to expose automatically (as that may involve searching through a combinatorial number of permutations). My work on analytical modeling focuses on discovery of special structure in models which results in order(s) of magnitude improvement in computational efficiency.

Threshold-based Systems

We have developed Markovian models and efficient solutions of systems that employ threshold-based techniques with hysteresis behavior to dynamically adjust the amount of resources available for job service [A4, D10, D20]. A threshold-based approach is useful in systems which incur significant server setup, usage, and removal costs. It is also a good approach to dynamic management of a pool of resources between multiple workload classes using the same system. The hysteresis is needed to guard against momentary fluctuations in workload. For instance, such techniques are used in Novell file servers in managing their buffer pool. However, performance prediction for threshold-based systems is often difficult as such systems exhibit anomalous response time behavior, as observed in our work as well as that of others.

There has been much interest in developing efficient performance evaluation techniques for such systems. However, past results have been obtained only for fairly restricted models which include limitations on one or more of the following: (a) number of servers in the system (often restricted to 2), (b) size of the waiting room, (c) heterogeneity of servers, (d) number of job classes (usually 1), and (e) absence of hysteresis. We have developed a divide-and-conquer type methodology (using stochastic complementation) which allows extremely efficient computation of performance metrics for a variety of threshold-based models with hysteresis without such restrictions, handling heterogeneous servers, bulk arrivals, non-instantaneous server activation, and multi-class workloads, all without restrictions on the number of servers or the size of the waiting room.

For a variety of single-class models [A4, D10] we derived either: (1) an exact closed-form expression, or (2) an exact algorithmic solution, or (3) provable (through sample path analysis) and tight performance bounds, which, for all practical purposes, is as good as having an exact solution (e.g., experiments in [D10] show less than 1.5% spread between upper and lower bounds with an order of magnitude improvement in computational cost over the exact solution). Moreover, allowing non-instantaneous resource activation in a model, while difficult to analyze accurately, is important because activation time is often the distinguishing characteristic of a resource management policy. No other work addresses this problem with more than 2 servers.

The efficiency is obtained by breaking the model into smaller pieces, solving each piece separately, and then using these solutions in constructing the exact solution to the original model. For cases where it was not possible to apply the divide-and-conquer approach directly, we developed modified models which were suited for the divide-and-conquer approach and produced provable and tight bounds on performance metrics of the original model [D10]. Due to the divide-and-conquer nature of our approach, the reduction in computation grows as the number of servers in the system grows.

Based on the success with single-class models, we have developed an efficient method for multi-class models [D20]. In systems with multiple job classes it is desirable to find good approaches to sharing resources without statically partitioning them among the job classes. Threshold-based techniques constitute one category of such approaches. There have been no prior analytical results for the multi-class version of this model, which corresponds to an important characteristic of real systems.

Solving this model exactly is difficult because it is infinite in multiple dimensions and appears to lack sufficient structure. We developed [D20] an approximate iterative solution method. This method ``breaks up'' the original N-class model into N single class models, which are ``coupled'' through a set of blocking probabilities that express the interaction between classes. Empirical comparison against simulation indicates that this method is fast and fairly accurate. Most of our test cases are within 5% of the simulation results with more than two orders of magnitude improvement in computation time. Although this is an approximate solution technique, our experiments indicate that good threshold settings result in much higher performance improvements (e.g., 50%) than the loss in accuracy due to the approximation. Hence, our solution technique is a promising approach to evaluating multi-class threshold-based systems.

Mixed-workload Multimedia Systems

This work focuses on modeling and analysis of storage servers that serve a variety of applications requesting video, image, audio, and text data with vastly different performance and quality-of-service (QoS) requirements as well as resource demand characteristics. For instance, the QoS requirement of continuous media (CM) workload (e.g., video) is to meet specific deadlines in data delivery with high probability (e.g., greater than .99), while the non-CM workload (e.g., images and text) requires fast response time with no errors. Hence, we model multimedia servers which share resources dynamically among the different types of workloads while satisfying their performance requirements.

We first designed a family of disk scheduling algorithms [A12, D16] in order to explore the possible design choices and tradeoffs of interest. Such algorithms (ours as well as those of other researchers) often use cycle-based scheduling of the CM workload which results in models that are non-Markovian (because each chunk of CM workload must be completed by the end of a specified time unit) and non-work-conserving (because early service carries penalties in the form of increased buffer space requirements). These algorithms mostly differ in: (1) how they ``interleave'' service of CM and non-CM workloads, (2) how they order the service of non-CM requests, and (3) how work-conserving they are with respect to CM workload.

We developed a family of analytical non-Markovian models which represent this class of scheduling algorithms and have tractable analytical solutions [A5]. Prior approaches to modeling such systems include M/G/1 queue with vacation models. Our model, which can be viewed as a polling-type system, more faithfully captures the behavior of scheduling algorithms, is extensible to a wider class of such algorithms, and allows computation of a larger number of important performance metrics, for both types of workloads.

In this work [A5], each scheduling algorithm being modeled is captured through construction of appropriate service time distributions for CM and non-CM workloads, where the CM workload distribution is then matched with an Erlang and the non-CM with a load-dependent exponential. (Parameters for these distributions can be obtained by benchmarking the disk and extracting the seek function, transfer times, and so on.) Hence, all the events in the model are exponential, except for the deterministic time unit during which a chunk of the CM workload should be completed. Our solution of the models is based on an existing methodology, by de Souza e Silva, H.R. Gail, and R.R. Muntz¹, of using embedded Markov chains to analyze non-Markovian stochastic processes. It involves a combination of steady state and transient analysis.

¹E. de Souza e Silva, H.R. Gail, and R.R. Muntz, ``Efficient Solutions for a Class of Non-Markovian Models'', in Proceedings of the 2nd International Workshop on the Numerical Solution of Markov Chains, 1994.

[Last updated Sat Jul 29 2000]