In this synopsis, we summarize the presentations and the discussion of open problems.
THE NATURE OF THE INTERNET
Andrew Odlyzko (AT&T) opened the session by drawing the contrast between modeling of traditional networks (such as telephone and electrical networks) and modeling of the Internet. He noted that telephone service networks are well defined: for good service, we require open channels, no interruptions, and low noise. Modeling is well understood, and the characteristics are relatively static. One of the few significant changes in U.S. voice traffic is that Mother's Day has been replaced by television call-ins (such as "Who Wants to Be a Millionaire") as the driver of peak phone usage.
The Internet, in contrast, while driven by a simple protocol, has dynamics that are much more complex. Walter Willinger (AT&T) and others helped to explain the basics of Internet terminology and operations. The host computers (sometimes called nodes) and the routers (switches) can be considered as the nodes of a graph. The edges in the graph are links between nodes, some wired and some wireless. Within the Internet there are autonomous systems, which are subnetworks under the control of a single administrative authority.
The Internet is constantly changing and growing, and the nodes and messages are both heterogeneous.
Message traffic is governed by several protocols. The underlying protocol layer is the Internet protocol (IP). Odlyzko likened it to sending postcards that may or may not reach their destination, and some likelihood of delivery failure is tolerated. There are usually multiple paths through the network, and routing between a particular source-destination pair can change suddenly, although it tends to stay constant. Long messages are partitioned into several packets for transmission, with the maximum packet size determined by the underlying physical link (wire or wireless). A typical web page might contain 20,000 bytes, which might be partitioned into roughly 15 packets for transmission. The distribution of packet sizes has three peaks, at around 50, 500, and 1500 bytes.
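As a rough check of this packetization arithmetic, the sketch below (a hypothetical Python example assuming a 1500-byte maximum packet size and 40 bytes of header overhead per packet, values not taken from the workshop) recovers a packet count close to the one quoted above.

    import math

    page_bytes = 20_000        # size of the example web page
    max_packet = 1500          # assumed maximum packet size on the link
    header = 40                # assumed IP + TCP header overhead per packet
    payload = max_packet - header

    packets = math.ceil(page_bytes / payload)
    print(packets)             # 14 full-sized packets, close to the ~15 quoted above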
On top of the IP layer are other protocols, the most popular being TCP. In this protocol, the sending host starts transmitting message packets, initially at a low rate. If the recipient sends acknowledgements, then the rate (the "window," or number of packets allowed to be pending acknowledgement) is gradually increased; but if no acknowledgement is received within a reasonable amount of time, then the packet is considered lost, the rate is cut in half, and the lost packets are retransmitted.
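The window dynamics can be caricatured in a few lines of code. The sketch below illustrates only the additive-increase/multiplicative-decrease idea in the description above, not the actual TCP state machine; the loss probability and the number of rounds are invented for illustration.

    import random

    def aimd_window(rounds=50, loss_prob=0.05, seed=0):
        """Caricature of the TCP sending rate: grow the window by one
        packet per acknowledged round, halve it when a loss occurs."""
        random.seed(seed)
        window = 1.0                          # start at a low rate
        history = []
        for _ in range(rounds):
            if random.random() < loss_prob:
                window = max(1.0, window / 2) # loss: cut the rate in half
            else:
                window += 1.0                 # acknowledged: increase the rate
            history.append(window)
        return history

    print(aimd_window()[-5:])   # window sizes over the last few rounds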
Although TCP is the most common protocol, others are also used. For example, UDP is commonly used in applications such as voice and video, where the retransmissions and the frequent drops in data rate of TCP would not be tolerable. UDP does not guarantee delivery of packets, and losses are dealt with in other ways. For example, the data can be encoded into multiple classes, so that a picture can be reconstructed at low resolution from packets delivered reliably via TCP, with the remaining packets sent via UDP.
Thus, the control of traffic in the Internet is distributed, and feedback affects traffic. Empirical results show that traffic is bursty and exhibits fractal (self-similar) behavior over a wide range of timescales. This was discovered by analysis of data at Bellcore in 1992 and has been verified by a number of subsequent studies. The cause of this self-similarity is the heavy tails of the distribution of message lengths. (In a heavy-tailed distribution, the probability that the length of a message is at least x behaves like x^(-k), where k is a positive constant.) This type of distribution is not easy for classical Markovian queuing theory to handle, and calls for non-Markovian models.
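The consequence of heavy tails can be seen in a small simulation. The sketch below (with an invented tail exponent) draws message lengths from a Pareto distribution, for which the probability of exceeding x is exactly x^(-k), and compares them with exponentially distributed lengths: the largest heavy-tailed message dwarfs the mean, which is the root of the burstiness.

    import random, statistics

    def message_lengths(n, heavy=True, k=1.5, seed=1):
        """Draw message lengths from a heavy-tailed Pareto distribution,
        P(length >= x) = x**(-k) for x >= 1, or from a light-tailed
        exponential distribution for comparison."""
        random.seed(seed)
        if heavy:
            return [random.paretovariate(k) for _ in range(n)]
        return [random.expovariate(1.0) for _ in range(n)]

    for label, heavy in (("heavy-tailed", True), ("exponential ", False)):
        lengths = message_lengths(100_000, heavy)
        ratio = max(lengths) / statistics.mean(lengths)
        print(label, "largest message / mean message:", round(ratio, 1))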
PREDICTIVE MODELS OF THE INTERNET
Currently, it is within reach to make good temporal models to study performance at a single node or router. It would be desirable to make spatio-temporal models to study performance of the network as a whole.
Problem 1: Create spatio-temporal-layered models that work with various communication levels (application, transport, physical layer, etc.) and have useful predictive behavior about the Internet.
So far, modeling of a single TCP session has been done as a stationary ergodic loss process. One recent model assumes that a single queue is the bottleneck, with several TCP sessions passing through it, creating a fluid-like model. A first-in-first-out (FIFO) queuing policy cannot be modeled yet. Willinger also noted that there has been some study of a new queue management policy: randomly drop packets, at a linearly increasing rate, when the queue length grows too large, even if buffer space is still available. Mitra noted that if the cost is additive, and the treatment at the next node does not depend on the treatment at past ones, the process is a Markov decision process. Stochastic differential equations would also be an approach to modeling router behavior.
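The drop policy Willinger described can be made concrete with a toy single-queue simulation, sketched below. The thresholds, arrival rate, and service rate are invented; this is not a model of any deployed router.

    import random

    def early_drop_prob(queue_len, low=20, high=60, max_prob=0.2):
        """Drop probability rises linearly from 0 to max_prob as the queue
        grows from `low` to `high` packets, even if buffer space remains."""
        if queue_len <= low:
            return 0.0
        if queue_len >= high:
            return max_prob
        return max_prob * (queue_len - low) / (high - low)

    def simulate(steps=10_000, arrival=0.95, service=0.9, seed=2):
        random.seed(seed)
        queue, dropped = 0, 0
        for _ in range(steps):
            if random.random() < arrival:                   # a packet arrives
                if random.random() < early_drop_prob(queue):
                    dropped += 1                            # dropped early, on purpose
                else:
                    queue += 1
            if queue > 0 and random.random() < service:     # a packet is served
                queue -= 1
        return dropped

    print(simulate())    # number of packets dropped before the buffer is full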
If the characteristics of routers (e.g., buffer capacity, out-degree, protocols) are determined by poor models, then they are likely either to be more expensive than necessary or to yield poor performance. The cost implications are significant, due to the number of routers and the expense of wasting wireless bandwidth.
NEW PROTOCOL DESIGNS
There was some controversy over whether the study of protocols such as TCP is worth a mathematician's time, or whether the effort should instead go into improved protocols that are also easier to model (Jitendra Malik), but Willinger believed that other protocols would lead to similar models and that TCP will be around for many years. One participant noted the need for control theory expertise in the creation of new protocols, and Malik proposed a related problem:
Problem 2: Is there an effective scalable protocol with no global coordination for network traffic? What are necessary and sufficient conditions for scalability? How should we measure complexity (Hamming window size) of a protocol?
SECURITY ON THE INTERNET AND UNDERSTANDING TRAFFIC PATTERNS
Sugih Jamin (University of Michigan) discussed some Internet security issues. The Internet was designed to be resilient, available, and scalable. Only recently has security become a priority. There are two kinds of attacks: those against information content (violating the secrecy, the integrity, or the authentication of content) and those against the infrastructure (by intrusion or denial of service). There are well-developed tools to deal with attacks on information, including encryption, electronic signatures, and firewalls. But there are no really effective countermeasures for denial-of-service attacks.
Denial-of-service attacks are accomplished by inundating a victim computer with so many bogus message packets that useful work is brought to a halt. This can be accomplished by taking over other machines, termed zombies, which then all send messages. For instance, they can all request a connection with the victim by sending synchronization messages (a SYN flood) with a "spoofed" (false) return address. Alternatively, they can request that the victim echo a message to a spoofed address, a "smurfing" attack. Currently, the only defenses are good computer hygiene, to keep your machine from being taken over as a zombie, and prosecution of offenders under the criminal justice system (for example, the US Computer Fraud and Abuse Act). Neither of these options prevents harm to the victim of these distributed denial-of-service (DDOS) attacks.
Problem 3: Can we detect correlated traffic streams across the Internet in real time, transparent to legitimate uses, so that we can detect DDOS attacks before they shut down victims?
In addition, there are even reasons for detecting legitimate overloads, such as the run that shut down the Victoria's Secret website, so that at least some customers can be served.
One possible approach to the problem is to model the traffic incoming to a server as a hidden Markov model and try to find the parameters that characterize DDOS attacks, but this can only be done at the autonomous system level. John Baras (University of Maryland) noted that IBM has a downloadable software system for analyzing log files in order to detect DDOS attacks (similar to a system for identifying subtexts, or subsequences, in DNA). But thus far no general solution is available.
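As a concrete, much-simplified version of the hidden-Markov-model idea, one could posit two hidden states, normal and attack, each with its own mean packet rate, and run the forward (filtering) recursion on observed per-second packet counts. The rates, transition probabilities, and counts below are all invented for illustration.

    import math

    # Two hidden states: 0 = normal traffic, 1 = attack.  All numbers invented.
    rates = [50.0, 500.0]                  # mean packets per second per state
    trans = [[0.99, 0.01],                 # state-transition probabilities
             [0.05, 0.95]]
    start = [0.99, 0.01]

    def poisson_logpmf(count, rate):
        return count * math.log(rate) - rate - math.lgamma(count + 1)

    def logsumexp(values):
        values = list(values)
        m = max(values)
        return m + math.log(sum(math.exp(v - m) for v in values))

    def attack_probability(counts):
        """Forward recursion: posterior probability that the most recent
        observation was generated in the attack state."""
        alpha = [math.log(start[s]) + poisson_logpmf(counts[0], rates[s])
                 for s in range(2)]
        for c in counts[1:]:
            alpha = [logsumexp(alpha[p] + math.log(trans[p][s]) for p in range(2))
                     + poisson_logpmf(c, rates[s])
                     for s in range(2)]
        return math.exp(alpha[1] - logsumexp(alpha))

    print(attack_probability([48, 52, 55, 47, 50, 53]))      # close to 0
    print(attack_probability([48, 52, 480, 510, 495, 505]))  # close to 1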
Problem 4: Define and detect the signatures of various applications such as Napster and DDOS.
Napster is a way of harvesting audio files from a widely distributed set of servers: the machines of all of Napster's customers. It has created tremendous traffic on some autonomous systems, such as university computer systems, and in some places has been banned.
NETWORK TOMOGRAPHY
Donald Towsley (University of Massachusetts) posed an open problem concerning "network tomography":
Problem 5: From observations of point-to-point traffic characteristics, can we reconstruct characteristics of the network?
Towsley has done some work on the topic, and discussion was lively as to whether the problem is more analogous to electrical impedance tomography (discussed in an AMS monograph) or to discrete Radon transform tomography. Signal paths are sometimes uncertain (as in electrical impedance tomography), but the discrete nature of the problem is more akin to Radon transform tomography, leading Tony Chan to ask whether there is a Radon transform theory for graphs.
There are relevant results from statisticians at Bell Labs on identifiability (Scott Vander Wiel, Bin Yu, Chin Chou, Andrew Davis). Chung, Garrett, Graham, and Shallcross have shown that it is NP-complete to reconstruct a parsimonious graph consistent with given distance (or packet round-trip) measurements.
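To give the flavor of Problem 5, consider the simplest linear version (a toy example, not Towsley's formulation): if each end-to-end delay is the sum of the delays on the links its path traverses, then the measured path delays y, the unknown link delays x, and the 0-1 routing matrix A satisfy y = A x, and one can ask when x is identifiable and estimate it by least squares. The network, delays, and noise level below are invented.

    import numpy as np

    # Toy network with 4 links; each row of A marks the links used by one
    # measured end-to-end path.  (With only the first four paths the matrix
    # has rank 3 and the link delays are not identifiable; the fifth path
    # makes the rank full.)
    A = np.array([
        [1, 1, 0, 0],    # path 1 uses links 1 and 2
        [0, 1, 1, 0],    # path 2 uses links 2 and 3
        [0, 0, 1, 1],    # path 3 uses links 3 and 4
        [1, 0, 0, 1],    # path 4 uses links 1 and 4
        [1, 0, 1, 0],    # path 5 uses links 1 and 3
    ], dtype=float)

    true_delay = np.array([5.0, 2.0, 7.0, 3.0])               # ms per link
    rng = np.random.default_rng(0)
    y = A @ true_delay + rng.normal(0, 0.1, size=A.shape[0])  # noisy path delays

    estimate, *_ = np.linalg.lstsq(A, y, rcond=None)
    print(np.round(estimate, 2))    # close to [5, 2, 7, 3]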
THE MODELING PROBLEM
Ingrid Daubechies (Princeton) and David Donoho (Stanford) had the difficult task of synthesizing the presentations and discussion into open problems. Daubechies posed additional interconnected problems, some of them close to the problems listed above.
Problem 6: Develop interesting graph theoretical results and multiresolution approaches to the layered Internet. In the multiresolution approach, each node has structure. It would be desirable to make layers within the Internet correspond to the multiresolution, but a different decomposition might be more fruitful. The topology of the network and the bandwidth are both relevant, and Tony Chan noted that there may be connections with the automatic aggregation procedures in algebraic multigrid. There are opportunities for experts in graph theory, statistics, numerical analysis, and graphics to collaborate with network experts.
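One very simple way to experiment with this idea (a toy illustration on an invented six-node graph, not a procedure proposed at the workshop) is greedy pairwise aggregation, in the spirit of the aggregation step in algebraic multigrid: merge each node with an unmatched neighbor to form super-nodes, yielding a coarser graph whose nodes have internal structure.

    from collections import defaultdict

    def coarsen(adjacency):
        """Greedy pairwise aggregation: merge each node with its first
        unmatched neighbor, returning the node-to-supernode map and the
        adjacency structure of the coarser graph."""
        group, next_id = {}, 0
        for node in adjacency:
            if node in group:
                continue
            partner = next((n for n in adjacency[node] if n not in group), None)
            group[node] = next_id
            if partner is not None:
                group[partner] = next_id
            next_id += 1
        coarse = defaultdict(set)
        for node, nbrs in adjacency.items():
            for n in nbrs:
                if group[node] != group[n]:
                    coarse[group[node]].add(group[n])
        return group, dict(coarse)

    # An invented network: hosts a-f forming two loosely connected clusters.
    graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
             "d": ["c", "e"], "e": ["d", "f"], "f": ["e"]}
    groups, coarse_graph = coarsen(graph)
    print(groups)        # which super-node each host belongs to
    print(coarse_graph)  # edges between super-nodes at the coarser level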
Problem 7: Statisticians and probabilists need to build "structural" models, simpler than the TCP model, trying to mimic the important characteristic of the Internet that the loss probability is a function of the entire network. There are opportunities for experts in statistics, probability, mathematical modeling, and statistical mechanics to collaborate with network experts.
One simple model that Willinger and Daubechies are studying is a Poisson distribution of packets from each task, with the probability of transmission through a router being a function of traffic at that router at the previous time ("previous" perhaps meaning the minimum time between transmission and acknowledgement of a message). Simulations show that this has some of the characteristics of real networks, but questions remain concerning the form of the function.
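A minimal simulation of a model of this kind might look like the sketch below. The specific transmission function (success probability decreasing linearly with the previous load) and all parameter values are assumptions for illustration, not the form Willinger and Daubechies actually use.

    import numpy as np

    def simulate(steps=5000, tasks=50, rate=0.2, seed=3):
        """Each task emits a Poisson number of packets per time step; each
        packet gets through the router with a probability that depends on
        the router's load at the previous step (an assumed functional form)."""
        rng = np.random.default_rng(seed)
        prev_load, delivered = 0, []
        for _ in range(steps):
            arrivals = rng.poisson(rate, size=tasks).sum()
            p_transmit = max(0.1, 1.0 - prev_load / 15.0)   # invented form
            delivered.append(rng.binomial(arrivals, p_transmit))
            prev_load = arrivals
        return np.array(delivered)

    out = simulate()
    print(out.min(), round(out.mean(), 2), out.max())   # variability of throughput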
Problem 8: Statisticians and probabilists could study data to characterize "invariants" of Internet traffic in order to be able to evaluate models. This is partly a data mining problem.
Donoho summarized the underlying themes as discovering invariants, discovering anomalies, and designing the Internet in a rational way. He noted some cultural differences between network researchers and mathematical scientists: the short time scale in which problems are interesting, massive statistical data sets that can never be completely understood, and the lack of generations of older results to build upon. But he noted how fascinating the Internet is to a statistician accustomed to dealing with naturally-occurring systems: the Internet is a humanly designed system whose performance, at least in principle, can be fully measured. He can imagine larger systems evolving in the near future: sensors on every car or on every person would result in a network with some similar characteristics. The traffic from such ad hoc networks would be poorly structured, just as Internet traffic is, so the Internet is a particularly apt model system. Problems of this sort should also influence statistical curricula, where instruction often focuses on how to get maximal information from small data sets rather than how to extract information from huge amounts of ill-structured data.
The modeling attempts also point up a compelling need for new approaches to stochastic modeling, and for possible involvement from operations research and economic analysis.
In the future, the Internet may provide various classes of services, at different prices, creating whole new modeling problems:
Problem 9: What is the simplest mechanism, requiring the least cooperation from routers (in order to ensure scalability), for computing shadow costs (Lagrange multipliers) upon which we can base traffic pricing and priorities?
Some work on such issues has been done by Lowe, a Berkeley group, and Lion and Eckhert.
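One standard way to think about Problem 9 (not necessarily the mechanism studied by the authors just cited) is the primal-dual view of congestion pricing: each link raises its price when demand exceeds its capacity, each user chooses the rate that maximizes its utility minus the price along its path, and the equilibrium prices are exactly the Lagrange multipliers of a network-wide optimization. The sketch below runs such a dual-ascent iteration for users with logarithmic utilities; the topology, capacities, and step size are invented.

    import numpy as np

    # Routing matrix: rows are links, columns are users; entry 1 if the
    # user's path crosses that link.  Topology and capacities are invented.
    R = np.array([[1, 1, 0],
                  [0, 1, 1]], dtype=float)
    capacity = np.array([10.0, 8.0])

    price = np.zeros(2)      # shadow cost (Lagrange multiplier) per link
    step = 0.01

    for _ in range(20_000):
        # A user with utility log(x) facing path price q sends at rate 1/q.
        path_price = np.maximum(R.T @ price, 1e-6)
        rate = np.minimum(1.0 / path_price, 100.0)   # cap keeps early steps bounded
        # Each link adjusts its price according to excess demand (dual ascent).
        load = R @ rate
        price = np.maximum(0.0, price + step * (load - capacity))

    print(np.round(price, 3))    # equilibrium shadow costs per link
    print(np.round(rate, 3))     # user rates supported at those prices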
SUMMARY
As Ingrid Daubechies noted, work in this field must be interdisciplinary, and there will not be much progress from mathematicians working in isolation. The NSF Information Technology Initiative is one funding program that recognizes this reality and encourages such collaboration.
Copyright 2000, Dianne P. O'Leary
oleary@cs.umd.edu
I'm grateful for the comments of several participants that improved a draft of this document, but all remaining errors are my fault alone.