
 

Faculty Profile

 

This appendix summarizes the research interests of our faculty.  While the research program is highly interdisciplinary, we have organized this section according to the major core computer science research areas in which we have critical mass.  These include: 

 

Algorithms and Theory of Computation

Artificial Intelligence

Databases

Computer Vision and Graphics

Scientific Computing

Programming Languages and Software Engineering

Human-Computer Interaction

Systems and Networks

 

As some faculty conduct research in multiple areas, and as there is widespread collaboration within and between areas, the descriptions of some faculty members' research are split across multiple areas below for clarity.

 

Algorithms and Theory of Computing

 

Faculty: Gasarch, JaJa, Katz, Khuller, Kruskal, Mount, Smith, Srinivasan and Vishkin

 

This group's research interests are broadly in the areas of algorithms, logic, complexity theory, cryptography, computational geometry, data structures, learning theory and parallel computing. Several faculty in this group have been working with other faculty in application areas such as graphics, security, networking, image processing and information retrieval. Over the last five years, we have hired two new faculty (Katz and Srinivasan), and published numerous journal and conference papers, including ten papers in the most prestigious and highly competitive FOCS and STOC conferences. In addition, many papers have been published in other highly selective conferences such as SODA, ICALP, CRYPTO and Computational Geometry.  Many of our theory faculty have given invited talks on this research at conferences and at other universities.

For the first time, Maryland has two faculty members on the ACM STOC 2003 program committee. In addition, faculty have recently chaired the program committees of several other conferences (the ACM Computational Geometry Conference, the Workshop on Algorithm Engineering and Experiments, and the Workshop on Approximation Algorithms).

 

Bill Gasarch studies primarily complexity theory and games.  Complexity theory's ultimate goal is, given a problem, to determine how hard it is under a variety of measures; this involves logic and combinatorics.  Games are of interest for math and computer science education, as they can be used to illustrate many concepts in a fun way.

 

Joseph JaJa's research focuses on the development of efficient algorithms and indexing schemes for large-scale spatial OLAP, on the design of efficient, scalable algorithms that are portable across different multiprocessor architectures, and on the development of efficient data structures and algorithms for range-search queries on spatio-temporal data.

 

Jonathan Katz studies cryptography, network/computer security, and distributed algorithms. One recent representative project is the design of efficient cryptographic algorithms which remain secure in the face of key exposure. He has explored cryptosystems in which the secret key evolves over time, and for which exposures of the secret key corresponding to some time periods have minimal effect on the security of the system at all other times. As part of this work, he has constructed the first forward-secure public-key encryption scheme (resolving a problem that had been open for 5 years), introduced the notion of "key-insulated security" and designed encryption/signature schemes secure in this sense, and designed the first intrusion-resilient public-key encryption scheme. Other work in this area has included the development of new threshold cryptosystems and consideration of "forward secrecy" in password-authenticated key exchange.
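
To give a flavor of the key-evolution idea, the sketch below is a minimal symmetric-key analogue (not one of Katz's public-key constructions): the current key is pushed through a one-way hash at the end of each period and the old value is erased, so an attacker who exposes the key for one period learns nothing about the keys used in earlier periods. The function names and the number of periods are illustrative.

    import hashlib

    def evolve_key(key: bytes) -> bytes:
        """Derive the next period's key via a one-way function; the caller
        discards the old key, so earlier keys cannot be recovered later."""
        return hashlib.sha256(b"key-update" + key).digest()

    def keys_for_periods(initial_key: bytes, periods: int):
        """Yield the per-period keys k_0, k_1, ..., dropping each old value."""
        k = initial_key
        for t in range(periods):
            yield t, k
            k = evolve_key(k)   # the previous key is no longer referenced

    # An attacker who learns the key for period 3 cannot invert SHA-256 to
    # recover the keys used in periods 0-2 (forward security).
    for t, k in keys_for_periods(b"\x00" * 32, 5):
        print(t, k.hex()[:16])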

 

Samir Khuller studies efficient algorithms for solving NP-hard problems.  These algorithms produce sub-optimal solutions; the goal is to show that the cost of the solution produced is guaranteed to be close to optimal. Such analysis often leads to a better understanding of the problem as well as the development of general algorithmic methods for optimization. The specific problems have to do with issues of load balancing in storage servers, as well as data movement and re-organization in Storage Area Networks, which are very large storage servers that are widely used for serving multimedia data.  Algorithms have been developed and analyzed both from the worst-case point of view and in terms of average behavior as observed experimentally.

Several methods have been developed and patent applications are currently pending. In addition, Dr. Khuller has been exploring data movement applications  over the Internet as well, via the use of network flow based methods in the context of the Bistro Project (a system to upload data over wide area networks such as the Internet).
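
As a generic illustration of the kind of guarantee involved (a textbook example, not one of the group's specific algorithms), the greedy rule below assigns each item to the currently least-loaded server; the resulting maximum load is provably within a factor of two of the best possible assignment.

    import heapq

    def greedy_balance(loads, num_servers):
        """Assign each item to the currently least-loaded server; the final
        maximum load is at most twice the optimal maximum load."""
        heap = [(0.0, s) for s in range(num_servers)]   # (current load, server id)
        heapq.heapify(heap)
        assignment = {}
        for item, load in enumerate(loads):
            current, server = heapq.heappop(heap)
            assignment[item] = server
            heapq.heappush(heap, (current + load, server))
        return assignment, max(load for load, _ in heap)

    assignment, max_load = greedy_balance([7, 4, 6, 3, 9, 2], num_servers=3)
    print(assignment, max_load)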

 

Clyde Kruskal studies parallel computers and parallel algorithms. In addition, he has been working on discrete geometry problems related to coloring the plane.

 

David Mount investigates the design of efficient algorithms and data structures for geometric problems.  His principal interest has been problems with applications in areas such as image processing, pattern recognition, information retrieval, and computer graphics.  The major thrust of his recent work has been in approximation algorithms.  This work is motivated by applications in which existing exact algorithms are unacceptably slow or use excessive space.  He has shown that by accepting a small error relative to exact solutions, it is possible to achieve significantly faster algorithms using relatively simple algorithms and data structures.  Examples of problems for which such efficient solutions have been found include nearest-neighbor searching in multidimensional Euclidean spaces, clustering multidimensional data sets, performing pattern matching and registration in digital images, and robustly fitting lines and curves to noisy data.  These projects have resulted in theoretical advancements, and they have also been publicly released as software systems, such as ANN, a C++ library for approximate nearest neighbor searching.
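
The trade-off behind approximate nearest-neighbor searching can be demonstrated with SciPy's kd-tree (used here for convenience rather than the ANN library itself): allowing a (1+eps) error bound lets the search prune much more of the tree. The data sizes below are arbitrary.

    import numpy as np
    from scipy.spatial import cKDTree

    rng = np.random.default_rng(0)
    points = rng.random((100_000, 8))       # data set in 8-dimensional space
    queries = rng.random((5, 8))

    tree = cKDTree(points)

    # Exact nearest neighbors.
    d_exact, i_exact = tree.query(queries, k=1)

    # Approximate search: the reported neighbor is guaranteed to lie within a
    # factor (1 + eps) of the true nearest distance, which lets the search
    # prune far more of the tree.
    d_approx, i_approx = tree.query(queries, k=1, eps=0.5)

    print(np.max(d_approx / d_exact))       # never exceeds 1 + eps = 1.5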

 

Carl Smith studies algorithmic learning theory, the process whereby computers learn from examples; discovery science, the process by which computers can automatically discover interesting facts from huge data sets; and models of computation, a study of computation from a variety of viewpoints.

 

Aravind Srinivasan conducts research on the design and (theoretical/experimental) analysis of algorithms with applications in networking, combinatorial optimization, information retrieval, and related areas, with an emphasis on probabilistic methods. He has developed rigorous approaches for the design and analysis of approximation algorithms through randomization; the focus here has been on efficiently solving a ``relaxation" of a given (difficult) combinatorial optimization problem, and using randomization to restore the violated constraints. This has led to the current-best approximation algorithms for low-congestion packet routing and scheduling, scheduling broadcasts in pull-based systems, minimal-redundancy storage in distributed networks, etc. He has also developed algorithms for security and multicast in peer-to-peer networks. His interests also include distributed algorithms: a representative recent result is on fast distributed algorithms for constructing ``backbones" for communication in wireless ad hoc networks.
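
A minimal sketch of the relax-and-round paradigm on weighted set cover, assuming SciPy for the LP relaxation; it illustrates the general technique rather than any of Srinivasan's specific algorithms. The instance, the number of rounding rounds, and the greedy patch-up step are illustrative choices.

    import numpy as np
    from scipy.optimize import linprog

    def randomized_set_cover(sets, costs, universe, seed=0):
        """Solve the LP relaxation of weighted set cover, then round randomly:
        pick set j with probability x_j in each of O(log n) rounds, and patch
        any element the rounding missed with the cheapest set containing it."""
        n, m = len(universe), len(sets)
        # Coverage constraints: for every element e, the sum of x_j over sets
        # containing e must be at least 1 (written as -A x <= -1 for linprog).
        A = np.zeros((n, m))
        for j, s in enumerate(sets):
            for e in s:
                A[universe.index(e), j] = 1.0
        lp = linprog(c=costs, A_ub=-A, b_ub=-np.ones(n), bounds=[(0, 1)] * m)
        x = lp.x

        rng = np.random.default_rng(seed)
        chosen = set()
        for _ in range(max(1, int(2 * np.log(n + 1)))):
            chosen |= {j for j in range(m) if rng.random() < x[j]}
        for i, e in enumerate(universe):            # greedy patch-up step
            if not any(e in sets[j] for j in chosen):
                chosen.add(int(np.argmax(A[i] / np.maximum(costs, 1e-12))))
        return chosen

    sets = [{1, 2, 3}, {2, 4}, {3, 4, 5}, {5}]
    print(randomized_set_cover(sets, costs=[3.0, 1.0, 2.0, 1.0],
                               universe=[1, 2, 3, 4, 5]))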

 

Uzi Vishkin studies parallel algorithms and architectures. The ongoing PRAM-On-Chip project is a direct outgrowth of theoretical work.  The project offers a concrete agenda for challenging the 1946 von Neumann architecture by aligning the massive knowledge base developed by the parallel algorithms community with the roadmap for CMOS VLSI.

 

Artificial Intelligence

Faculty: Dorr, Getoor, Hendler, Nau, Perlis, Reggia, Subrahmanian

 

The AI research group at the University of Maryland has major strengths in several diverse areas.  These include multilingual processing, AI planning, software agents, ontologies, logic in AI, machine learning, biologically-inspired systems, and uncertainty management.

 

Bonnie Dorr and her colleagues focus on several areas of broadscale multilingual processing, e.g., machine translation, scalable translingual document detection, and cross-language information retrieval.  They have investigated the problem of creating new statistical models that are linguistically informed, leading to higher quality output for a wide range of languages while still being practical to train and use. She has developed the technique ``divergence unraveling", i.e., the detection and resolution of language-pair mismatches where the realization of a concept is distributed differently in different languages.  This technique is currently being explored as a vehicle for translating into English from languages as diverse as Arabic, Chinese, and Spanish.  This approach (and sub-components thereof) is being tested in a wide range of applications including statistical machine translation, interactive cross-lingual browsing, and headline generation.

 

Dana Nau, Jim Hendler and V. Subrahmanian are working in AI planning. The motivation for this research  has been to develop  synergy between the theory and practice of planning: better theories of planning can lead to more useful planning algorithms for practical applications, and experience developing planning systems for practical applications can lead to better theories of planning.   This work has produced award-winning results in many different areas.  For example, development of planning techniques for the Bridge Baron computer program enabled it to win the 1997 world championship of computer bridge, our SHOP2 planning system won one of the top four awards at the AIPS-2002 International Planning Competition, and our paper on the complexity of plan adaptation using derivational analogy won the award for the best research paper at the ECCBR-2002 conference.

 

V. Subrahmanian and Jim Hendler are conducting research on software agents.  The primary goal is to develop an infrastructure for the creation and management of large scale agent applications. The IMPACT project has developed such a platform where agents can be built on top of legacy code.  This work has also developed a firm theoretical foundation for such agents, and has extended this theory to handle agents that reason about time, uncertainty, the beliefs and possible actions of other agents, and the security and survivability properties of agents.

This work is closely tied to the ontology and uncertainty work listed below, and the planning work listed above. IMPACT has been used in several DoD applications.

 

Jim Hendler is one of the creators of the Semantic Web  vision and is leading an active research group working on ontologies.  This vision focuses on making more of the content on the web machine-readable, and indexing that content against ontologies that enable humans and machines to more precisely define shared terminologies.   Beyond ontology, the Semantic Web focuses on logics, proofs, and the development of models whereby Semantic Web agents will be able to determine the trustworthiness of information found on the web and/or communicated from other agents.

 

Don Perlis works in the areas of common sense reasoning, cognitive modeling, and computational theories of the conscious mind. The underlying methods that he brings to bear on all of these are principally those of metareasoning and time-sensitive reasoning.  Together these two provide very powerful tools for tackling many real-world aspects of intelligence, including reasoning in the presence of contradictory information, and with applications to natural-language human-computer dialog, planning with tight deadlines, and recognition and repair of mistakes.

 

Lise Getoor is investigating the use of structured statistical models for discovering link patterns in graphs, extracting information from semistructured sources, and schema discovery and reformulation.  She has done work in learning statistical models from both relational and semistructured data.  She has successfully shown their effectiveness for selectivity estimation for database query optimization.  She has also shown how simple models of link distributions in graphs can be used to improve classification accuracy.  She is currently investigating richer models of link distributions, with applications to mining graphs such as the World Wide Web, citation graphs and social networks.  She is working with researchers at ISI/USC on the problem of wrapper induction using structured statistical models.  She is also currently involved in a NIMA project on visualizing changing patterns in large graphs.

 

Jim Reggia and his students are working on three aspects of biologically-inspired computation and AI. First, in computational neuroscience, his research group is studying neural models with a goal of understanding their self-organization, response to sudden damage, the processing of temporal sequences, and neural network learning in general.  Second, in evolutionary computation, this group has been using genetic algorithms and related methods to evolve neural networks (as a means of better understanding biological neural circuits and architectures) and multi-agent communication systems in artificial worlds.  Finally, cellular automata models of self-replicating machines have been developed to study the fundamental information processing principles involved in self-replication, and how self-replication might arise spontaneously from non-replicating components. 
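
A toy illustration of the evolutionary-computation thread (not one of the group's actual models): a simple genetic algorithm evolves the weights of a tiny feedforward network to compute XOR. The network size, population size, and mutation rate are arbitrary choices.

    import numpy as np

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([0, 1, 1, 0], dtype=float)            # XOR target

    def forward(w, x):
        """Tiny 2-2-1 network; w packs both weight matrices and biases (9 values)."""
        W1, b1 = w[:4].reshape(2, 2), w[4:6]
        W2, b2 = w[6:8], w[8]
        h = np.tanh(x @ W1 + b1)
        return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))

    def fitness(w):
        return -np.mean((forward(w, X) - y) ** 2)       # higher is better

    rng = np.random.default_rng(0)
    pop = rng.normal(size=(60, 9))                      # population of weight vectors
    for generation in range(300):
        scores = np.array([fitness(w) for w in pop])
        parents = pop[np.argsort(scores)[-20:]]         # keep the fittest third
        children = []
        for _ in range(len(pop) - len(parents)):
            a, b = parents[rng.integers(20)], parents[rng.integers(20)]
            mask = rng.random(9) < 0.5                  # uniform crossover
            children.append(np.where(mask, a, b) + rng.normal(scale=0.1, size=9))
        pop = np.vstack([parents, children])

    best = max(pop, key=fitness)
    print(np.round(forward(best, X)))                   # should approximate [0, 1, 1, 0]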

 

V. Subrahmanian is interested in uncertainty management.  He has developed logic programming languages to handle uncertainty, as well as uncertainty and time. In addition, he has developed extensions of the relational algebra to handle uncertain reasoning in relational DBMSs and object oriented DBMSs. Database support for temporal uncertainty has also been studied. More recently, Lise Getoor and V. Subrahmanian developed XML models that incorporate uncertainty, and Dana Nau and V. Subrahmanian developed methods both to plan in the presence of uncertainty as well as to provide database support for monitoring plans, and creating new plans in the presence of existing corpora of plans.
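
A minimal sketch of one way an extended relational algebra can carry uncertainty, assuming tuple-level probabilities and independence between relations; this is illustrative only and does not reproduce the specific models developed by the group.

    # Each relation is a list of (tuple, probability) pairs; tuples are dicts.
    def p_select(rel, pred):
        """Selection keeps a surviving tuple's probability unchanged."""
        return [(t, p) for t, p in rel if pred(t)]

    def p_join(r, s, key):
        """Natural join on `key`; under an independence assumption the joined
        tuple's probability is the product of the input probabilities."""
        return [({**t1, **t2}, p1 * p2)
                for t1, p1 in r for t2, p2 in s if t1[key] == t2[key]]

    employees = [({"name": "ann", "dept": "db"}, 0.9),
                 ({"name": "bob", "dept": "ai"}, 0.6)]
    locations = [({"dept": "db", "bldg": "AVW"}, 0.8),
                 ({"dept": "ai", "bldg": "CSIC"}, 1.0)]

    print(p_join(p_select(employees, lambda t: t["dept"] == "db"), locations, "dept"))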

 

Databases

Faculty: Chawathe, Getoor, Raschid, Roussopoulos, Samet, Subrahmanian

 

Sudarshan Chawathe's primary research interest is semistructured data, which is data whose structure is irregular, incomplete, and frequently changing.  Examples of such data include structured documents (e.g., memos, legal briefs, forms) and data obtained by integrating disparate information sources (e.g., Web sites).  With the adoption of XML, semistructured data is growing in both quantity and variety.  Semistructured data (and its XML serialization) is modeled as an edge- and node-labeled rooted graph. His SEuS data mining system for efficiently determining frequent substructures in graph data is based on using a concise summary of  data as a filter to prune the search for frequent structures.  This approach is several orders of magnitude faster than comparable methods, and can efficiently process gigabytes of disk-resident data. He is also working on methods for processing streaming data, which is data that is accessible only in a serial order determined by the data source.  Each data item is presented only once.  It is not possible to seek forward or backward in the data stream, and data cannot be recalled unless explicitly buffered.  He has developed automaton-based methods for evaluating XPath queries on streaming XML data, and implemented them in the XSQ system.  He is currently working on expanding this work to more powerful query languages for streaming data. He is also working on differencing, summarization, and visualization of graph-structured data.
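
A minimal sketch of the streaming flavor of this work (not the XSQ system itself): a SAX handler evaluates a simple child-axis path such as /log/entry/msg against a stream of parse events, keeping only a stack of open elements rather than building the document in memory. The document and path are illustrative.

    import io
    import xml.sax

    class SimplePathMatcher(xml.sax.ContentHandler):
        """Report the text of elements matching a simple child-axis path such
        as /log/entry/msg, processing the document as a stream of events."""
        def __init__(self, path):
            super().__init__()
            self.steps = path.strip("/").split("/")
            self.stack = []              # names of the currently open elements
            self.buffer = None

        def startElement(self, name, attrs):
            self.stack.append(name)
            if self.stack == self.steps:
                self.buffer = []

        def characters(self, content):
            if self.buffer is not None:
                self.buffer.append(content)

        def endElement(self, name):
            if self.buffer is not None and self.stack == self.steps:
                print("match:", "".join(self.buffer).strip())
                self.buffer = None
            self.stack.pop()

    doc = b"<log><entry><msg>ok</msg></entry><entry><msg>fail</msg></entry></log>"
    xml.sax.parse(io.BytesIO(doc), SimplePathMatcher("/log/entry/msg"))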

 

Louiqa Raschid's interests include architectures for wide-area  computation with heterogeneous information servers; publishing and  locating sources based on quality and content metadata using the  Web and XML; and query planning and optimization. Wide-area applications utilize the wide-area network to connect hundreds of servers with thousands of clients. Such applications face significant challenges.  This dynamic network may result in a wide variability in end-to-end latency.  Similarly, as cached resources become obsolete, the staleness of delivered information may vary. Her research is undertaking a comprehensive study of the changing behavior of resources.  Her objective is to develop appropriate resource profiles to characterize this behavior and to use these profiles to customize service and information delivery to clients. She is also working on managing the rapidly growing and diverse datasets available to the biological enterprise.  Such data presents significant opportunities and challenges for data integration and seamless access.  Her research is applying prior expertise with developing data integration architectures based on wrappers and mediators to provide seamless access to heterogeneous Web accessible sources. She is developing techniques from areas such as query optimization, adaptive query evaluation, machine learning and schema mapping and integration of heterogeneous databases, to solve problems of data integration.

 

Nick Roussopoulos' recent research focuses on data warehousing, dynamic Web content, and network intrusion detection. This work has developed storage architectures and indexing for efficient computation and management of data cubes generated from very large multidimensional data sets.  Data cubes provide summaries and aggregations of all possible views of data, and their sizes grow exponentially.  The latest result of this storage technology, the Dwarf Cube, achieves data reduction of up to one million to one. This technology is being evaluated by the largest database companies. The WebViews project aims at improving the performance of database-backed Web servers, which are commonly used to generate dynamic content on the Web today.  Such content drains Web server resources and inhibits scalability.  This work has shown that by using our WebView technology, servers can be scaled up by up to two orders of magnitude without sacrificing the timeliness of the served information. Nick Roussopoulos is also studying data acquisition for network intrusion detection. The objective of his effort is to design an efficient, adaptive, and decentralized event data acquisition system for tactical intrusion detection.

Techniques are being developed in three basic areas: (a) online correlation and value dependency discovery over stream data using a limited amount of memory and instructions per data item, (b) compression of data cubes for log data, and (c) delivery of compressed aggregated/correlated data under limited bandwidth.
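
To make the cube blow-up concrete (this illustrates the problem that the Dwarf structure compresses, not the Dwarf structure itself), the sketch below naively materializes one group-by per subset of the dimensions, i.e., 2^d aggregates. The tiny relation and dimension names are invented for the example.

    from itertools import combinations
    from collections import defaultdict

    rows = [
        {"store": "MD", "product": "cpu", "year": 2002, "sales": 10},
        {"store": "MD", "product": "ram", "year": 2002, "sales": 4},
        {"store": "VA", "product": "cpu", "year": 2003, "sales": 7},
    ]
    dims = ("store", "product", "year")

    # A full cube materializes one aggregate per subset of the dimensions,
    # i.e. 2^d group-bys -- the exponential blow-up that compressed cube
    # representations are designed to tame.
    cube = {}
    for k in range(len(dims) + 1):
        for group in combinations(dims, k):
            agg = defaultdict(int)
            for r in rows:
                agg[tuple(r[d] for d in group)] += r["sales"]
            cube[group] = dict(agg)

    print(cube[()])                    # grand total
    print(cube[("store",)])            # totals per store
    print(cube[("store", "year")])     # totals per (store, year)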

 

Hanan Samet's primary research focuses on spatial databases. A significant part of this research effort deals with the integration of spatial and nonspatial data into a database management system.  The SAND Spatial Browser is an example of such work; it enables users to pose queries that combine spatial and nonspatial data, with the spatial components of the queries being specified graphically.  A Java-based version of the SAND Spatial Browser designed to enable its use over the Internet has been made available to the outside world, and is also being investigated in a peer-to-peer environment.  The ultimate goal of the SAND Spatial Browser project is the creation of a spatial spreadsheet. One of the most prominent features of the SAND Browser is the ability to perform ranking.  This operation retrieves data in order of spatial distance from another spatial object, optionally restricting the domain from which the data is drawn.  The approach used differs from conventional approaches in that it is incremental: once the k nearest items have been obtained, the next nearest can be obtained directly.  This approach works for arbitrary geometric objects rather than being restricted to points, and also works in a metric space.  It has been shown that this algorithm is I/O optimal. This algorithm has also been used in the VASCO System of Java applets, which enables the visualization of over 30 different spatial data structures and algorithms for basic spatial operations that make use of them.
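
A much simplified sketch of the incremental ranking idea, assuming a one-level grid of cells in place of a full hierarchical spatial index and points in place of arbitrary geometric objects: a single priority queue holds both index cells (keyed by their minimum possible distance to the query) and points (keyed by exact distance), so whenever a point reaches the front of the queue it is guaranteed to be the next nearest, and the caller can stop at any k.

    import heapq
    import math

    def incremental_nn(query, cells):
        """Yield points in increasing distance from `query`.  `cells` maps a
        cell's bounding box (xmin, ymin, xmax, ymax) to the points it holds."""
        def min_dist_to_box(box):
            cx = min(max(query[0], box[0]), box[2])
            cy = min(max(query[1], box[1]), box[3])
            return math.dist(query, (cx, cy))

        heap = [(min_dist_to_box(box), i, ("cell", box))
                for i, box in enumerate(cells)]
        heapq.heapify(heap)
        counter = len(heap)
        while heap:
            d, _, (kind, item) = heapq.heappop(heap)
            if kind == "cell":
                # Expand the cell: its points now compete on exact distance.
                for p in cells[item]:
                    counter += 1
                    heapq.heappush(heap, (math.dist(query, p), counter, ("point", p)))
            else:
                yield d, item          # next nearest; the caller may stop anytime

    cells = {(0, 0, 5, 5): [(1, 1), (4, 4)], (5, 0, 10, 5): [(6, 2), (9, 4)]}
    ranking = incremental_nn((2, 2), cells)
    print(next(ranking))               # nearest point
    print(next(ranking))               # second nearest, obtained incrementally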

 

V.S. Subrahmanian is working on several major topics in databases as well as AI. First, he is developing ontology-based models for semantic integration of diverse, distributed heterogeneous databases.  Second, he is developing models of databases (relational, object oriented and semistructured) to represent and manipulate time, uncertainty and spatial data.  Third, he is developing models of databases to support sophisticated AI planning applications.  Fourth, he is developing extensions of the relational algebra to query and summarize multimedia data sets.  Finally, he is focusing on efficiently scaling heterogeneous databases via a variety of techniques.

 

Computer Vision, Computational Geometry and Graphics

Faculty: Aloimonos, Davis, Jacobs, Samet, Varshney

 

Faculty in this area perform research in computer vision, computer graphics and data structures for geometric objects. The Department has traditionally been very strong in the area of Computer Vision. The Computer Vision Laboratory is one of the oldest and largest laboratories of its kind in the world, performing research in almost all computational aspects of visual processing, both in theory and applications. In addition, the group draws strength from its association with other faculty, such as David Mount (Computer Science), who works on Computational Geometry, and Rama Chellappa (Electrical Engineering), who works in Signal Processing and many aspects of Computer Vision.

 

Yiannis Aloimonos' research during the last five years has concentrated on the problem of visual motion, and specifically on the recovery of three-dimensional models of the world from multiple views. One basic result obtained is the relationship of the error in building 3D models from video to the field of view of the camera. As a result, sensors with a full field of view can better estimate 3D motion and thus acquire better 3D models. This has led to several efforts in building a variety of panoramic sensors. In Aloimonos' lab, the study of the geometry of eye design has led to an understanding of, and the development of, two principles governing eye design as it relates to the ability to recover 3D models from the processing of the images acquired by the eye. An outcome of this study was the understanding of geometric constraints characterizing the moving plenoptic function. Finally, a new mathematical framework was introduced recently under the heading of Harmonic Computational Geometry as a tool for addressing the correspondence problem (matching) by relating properties of the signal to the 3D geometry. This work bridges the gap between harmonic analysis (signal processing) and geometry.

 

Larry Davis's principal research area is computer vision, with a focus on visual surveillance.  He and his colleagues have been developing new vision algorithms and systems for detection and tracking of people from collections of fixed surveillance cameras, and for analysis of their actions and interactions. His most recent work in surveillance involved the development of  kernel density estimation techniques for detection using background models and tracking using combined spatial-color models; multiperspective Bayesian methods for 3D detection and tracking of people in cluttered environments; methods for fitting 3D density models of articulated objects to 3D volumetric reconstructions; and various methodologies for detection of people from moving camera platforms.
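
A minimal single-channel sketch of the kernel density estimation idea for background modeling (the actual work uses joint spatial-color models and considerably more sophisticated machinery): each pixel keeps its last N observed values, and a new value is flagged as foreground when its estimated density under those samples is too low. The bandwidth, threshold, and synthetic data are illustrative.

    import numpy as np

    def kde_foreground(samples, frame, bandwidth=10.0, threshold=1e-3):
        """Per-pixel background model: `samples` holds the last N gray values
        seen at each pixel (shape N x H x W).  A pixel of `frame` is flagged
        as foreground when its Gaussian kernel density estimate under those
        samples falls below the threshold."""
        diff = frame[None, :, :] - samples                    # N x H x W
        kernel = np.exp(-0.5 * (diff / bandwidth) ** 2)
        density = kernel.mean(axis=0) / (bandwidth * np.sqrt(2 * np.pi))
        return density < threshold                            # boolean foreground mask

    rng = np.random.default_rng(1)
    background = 100 + rng.normal(scale=3.0, size=(20, 4, 4)) # 20 past frames
    frame = background.mean(axis=0)
    frame[0, 0] = 200                                         # a "new object" appears
    print(kde_foreground(background, frame))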

 

David Jacobs has primarily worked on object recognition, given that he sees perceptual organization, object identification, and visual memory as all part of this process.  His work on perceptual organization has focused on developing Markov models of shape to assist perceptual completion, and studying human perceptual organization.  Work on object identification has focused on understanding how changing lighting affects the appearance of objects, and on how changing pose affects their apparent shape. Work on visual memory has attempted to model human memory, and build retrieval systems for 3D shapes. Some key results on these problems over the last five years are: an analytic explanation for the fact that the images of a Lambertian object lie in a low-dimensional linear subspace, and the experimental use of this fact in a face recognition system; development of a photometric stereo algorithm for images with complex, unknown lighting; methods of comparing images that judge how similar they are in a way that is insensitive to lighting changes; a novel model of human word memory, with experiments on human subjects to judge its applicability; methods for finding the pose of objects using correspondences between regions, rather than local geometric features; an affine structure-from-motion system that handles points that appear or disappear from view in the middle of the motion sequence; and development of a search engine that uses shape similarity to find 3D computer graphics models.
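
For context, the sketch below is classic calibrated photometric stereo under a Lambertian model with known, distant lights, solved per pixel by least squares; Jacobs' results address the substantially harder setting where the lighting is complex and unknown. The synthetic example is illustrative.

    import numpy as np

    def photometric_stereo(images, light_dirs):
        """Recover per-pixel surface normals and albedo for a Lambertian
        surface from images under known distant lights: I = L (albedo * n),
        solved per pixel by linear least squares."""
        k, h, w = images.shape
        L = np.asarray(light_dirs, dtype=float)               # k x 3
        I = images.reshape(k, -1)                             # k x (h*w)
        G, *_ = np.linalg.lstsq(L, I, rcond=None)             # 3 x (h*w)
        albedo = np.linalg.norm(G, axis=0)
        normals = G / np.maximum(albedo, 1e-12)
        return normals.reshape(3, h, w), albedo.reshape(h, w)

    # Synthetic check: a flat surface facing the camera with albedo 0.5.
    lights = np.array([[0, 0, 1], [0.5, 0, 0.87], [0, 0.5, 0.87]])
    true_n = np.array([0.0, 0.0, 1.0])
    images = np.stack([np.full((2, 2), 0.5 * l @ true_n) for l in lights])
    normals, albedo = photometric_stereo(images, lights)
    print(normals[:, 0, 0], albedo[0, 0])                     # ~[0 0 1] and ~0.5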

 

A. Varshney's research in the Graphics and Visual Informatics Laboratory deals with a range of issues in visual computing. These include multiresolution modeling and rendering including view-dependent and view-independent simplification hierarchies as well as topology-preserving and topology-reducing hierarchies. The goal here is graphics acceleration with minimal visual degradation as well as enabling 3D graphics over low-bandwidth networks.  Recent work also includes developing a lighting model to incorporate subsurface scattering effects within the local illumination framework.  In the area of computational biology, he is developing visual informatics tools and technologies that will give scientists deeper insights in understanding the relationships between form and function in various biological proteins. He is developing new methods for efficiently computing and displaying electrostatic potentials by explicitly generating and incorporating the solvent interface. Among many factors involved in protein-protein interactions, shape  complementarity is of major importance. The goal of ongoing research in shape complementarity is to develop fast and reliable methods for finding docking sites and corresponding transformations to align the two molecules into complementary conformations. In research on display technologies, he has built a tiled-display system that achieves geometric alignment for 3D graphics applications by pre-warping 3D objects. An ultrasonic tracker is used for user interactions  with the displayed objects. 

 

Hanan Samet has developed a map recognition system that combines image, locational, and nonspatial data in a database.  The image data in his case consists of a tourist map of Finland, and the goal is to locate the various tourist sites by use of their representative symbols on the map.  This work involved preprocessing the data and inserting it into the database so that it can be retrieved.  Two approaches are used.  The first classifies all of the symbols a priori and inserts a classification into the database.  The second stores a feature vector for each symbol.  Thus the distinction is somewhat like static versus dynamic binding. Hanan Samet has also developed a method of graphically specifying the queries that uses pictorial query trees.  The leaves of a pictorial query tree correspond to individual pictorial queries that specify which objects should appear in the target image, spatial constraints on the distance between them and their relative position, and the minimum required certainty of matching between query-image objects and database-image objects.  This work has been generalized to enable users to specify the query shapes instead of being restricted to predefined symbols such as those found in a map's legend.  In particular, the ability to handle rectangle, polygon, ellipse, and B-spline shapes was added. The retrieval process makes use of an abstraction of the contour of the shape, based on Fourier descriptors, which is invariant against translation, scale, rotation, and starting point.  These abstractions are used in a system to locate logos in an image database.
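
One standard way to obtain such an invariant contour signature from Fourier descriptors is sketched below (the details of the actual system may differ): dropping the DC term removes translation, keeping only coefficient magnitudes removes rotation and the choice of starting point, and normalizing by the first harmonic removes scale. The square contour is an invented test case.

    import numpy as np

    def fourier_descriptor(contour, num_coeffs=16):
        """Contour signature invariant to translation, scale, rotation, and
        starting point."""
        z = np.asarray([x + 1j * y for x, y in contour])
        coeffs = np.fft.fft(z)[1:num_coeffs + 1]   # drop DC -> translation invariance
        mags = np.abs(coeffs)                       # magnitudes -> rotation/start invariance
        return mags / mags[0]                       # normalize -> scale invariance

    # A square contour and a rotated, scaled, shifted copy give the same signature.
    square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
    theta = 0.7
    moved = [(3 * (np.cos(theta) * x - np.sin(theta) * y) + 5,
              3 * (np.sin(theta) * x + np.cos(theta) * y) - 2) for x, y in square]
    print(np.allclose(fourier_descriptor(square), fourier_descriptor(moved)))  # True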

 

Scientific Computing

Faculty: Elman, O'Leary, Stewart

 

Numerical analysis at Maryland has been an extraordinarily strong area since the 1960s and remains quite strong.  Werner Rheinboldt, an internationally known expert in the numerical solution of nonlinear equations, was director of the Computer Center and a founding member of the Computer Science Department. Our focus involves algorithms and applications in the fields of numerical linear algebra, partial differential equations, and optimization. Our faculty are active participants in the Applied Mathematics Program at Maryland, which was ranked 11th in the 2002 US News & World Report review of graduate programs. The Mathematics Department at Maryland also supports numerical analysis, with a focus primarily on the numerical solution of partial differential equations.  Faculty there include John Osborn, C. David Levermore, Ricardo Nochetto, Tobias von Petersdorff, Bo Li, and Jian Guo Liu. Our relations are close, with most important decisions being made by a joint field committee. We run a weekly seminar in cooperation with Mathematics, IPST, and UMIACS, and we compensate for our small size by hosting visitors and postdocs.  We have research funding from ONR and NSF.  During the last 5 years we have received a best paper award, an outstanding poster award, and Stewart was the recipient of the Bauer prize from the Technical University of Munich.  We are members of the editorial boards of the top journals in the area:

 

·                    Linear Algebra and Its Applications,

·                    SIAM Journal on Matrix Analysis and Applications,

·                    SIAM Review,

·                    Numerische Mathematik,

·                    Mathematics of Computation,

·                    SIAM Journal on Scientific Computing.

 

Pete Stewart has published two volumes of a projected 5-volume series on Matrix Algorithms: Basic Decompositions (1998) and Eigensystems (2001). We have published 38 journal articles during 1998-2002, with 6 more scheduled for publication.

 

Howard Elman, Diane O'Leary and Pete Stewart are all investigating various aspects of Krylov subspace methods for solving linear equations and eigenvalue problems. Krylov subspace methods (e.g., the conjugate gradient and GMRES methods for solving linear systems and the Lanczos and Arnoldi methods for the eigenproblem) are the workhorse algorithms for large matrices, and our group has contributed to the understanding of these methods. Elman and O'Leary have studied problems for which GMRES makes no progress in its initial iterations.  The tool for analysis was a nonlinear system of equations, the stagnation system, that characterizes this behavior, and we developed several new results on when and why matrices stagnate. O'Leary and Stewart are pursuing some new phenomena in the convergence of Krylov subspaces with error.  These new techniques promise to drastically reduce the work involved in some variants of these methods.
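
For readers unfamiliar with these methods, the sketch below is a textbook conjugate gradient iteration (one of the Krylov methods named above) for a symmetric positive definite system; it illustrates how each iteration enlarges the Krylov subspace, and is not connected to the stagnation analysis itself. The test matrix is random and illustrative.

    import numpy as np

    def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
        """Solve A x = b for symmetric positive definite A.  Each iteration
        adds one vector to the Krylov subspace span{b, Ab, A^2 b, ...} and
        minimizes the A-norm of the error over that subspace."""
        x = np.zeros(len(b))
        r = b - A @ x
        p = r.copy()
        rs = r @ r
        for _ in range(max_iter or len(b)):
            Ap = A @ p
            alpha = rs / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            rs_new = r @ r
            if np.sqrt(rs_new) < tol:
                break
            p = r + (rs_new / rs) * p
            rs = rs_new
        return x

    rng = np.random.default_rng(0)
    M = rng.normal(size=(50, 50))
    A = M @ M.T + 50 * np.eye(50)          # a random SPD test matrix
    b = rng.normal(size=50)
    print(np.linalg.norm(A @ conjugate_gradient(A, b) - b))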

 

Elman, O'Leary, and colleagues have developed a multigrid algorithm for numerical solution of acoustic scattering problems that overcomes the breakdown of the standard algorithm as the wavenumber in the Helmholtz equation increases. They also developed efficient numerical methods for solving the Helmholtz equation when the data or the forcing function is stochastic.

 

O'Leary, Stewart, and their students have studied algorithms for computing regularized solutions to ill-posed problems, with emphasis on the restoration of blurred images.  Recent work has focused on blind deconvolution (where the blurring function is not known exactly) and on tools for computing and displaying the uncertainty in reconstructed images (joint work with James Nagy at Emory).
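
A minimal sketch of the regularization idea on a 1-D deblurring problem, assuming a known (non-blind) Gaussian blur and standard Tikhonov regularization; the blind-deconvolution and uncertainty-visualization work goes well beyond this. The signal, blur width, noise level, and regularization parameter are illustrative.

    import numpy as np

    def gaussian_blur_matrix(n, sigma=2.0):
        """Dense matrix that applies a Gaussian blur to a length-n signal."""
        i = np.arange(n)
        A = np.exp(-0.5 * ((i[:, None] - i[None, :]) / sigma) ** 2)
        return A / A.sum(axis=1, keepdims=True)

    def tikhonov(A, b, lam):
        """Regularized solution: minimize ||A x - b||^2 + lam ||x||^2 by
        solving (A^T A + lam I) x = A^T b."""
        n = A.shape[1]
        return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

    rng = np.random.default_rng(0)
    x_true = np.zeros(100)
    x_true[40:60] = 1.0                                   # a simple piecewise "image"
    A = gaussian_blur_matrix(100)
    b = A @ x_true + 0.01 * rng.normal(size=100)          # blurred and noisy data

    x_naive = np.linalg.lstsq(A, b, rcond=None)[0]        # typically swamped by noise
    x_reg = tikhonov(A, b, lam=1e-3)                      # stays close to x_true
    print(np.linalg.norm(x_naive - x_true), np.linalg.norm(x_reg - x_true))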

 

Howard Elman and his collaborators have developed new algorithms for solving the incompressible Navier-Stokes equations. The approach is based on preconditioning methodologies that take advantage of the saddle point structure of the problem and account for couplings of physical quantities (velocities and pressure) while maintaining  computational efficiency.

 

Diane O'Leary and her collaborators have worked on document retrieval and summarization through linear algebra.  For retrieval, she developed the semi-discrete matrix decomposition for use in latent semantic indexing (LSI). For summarization, she uses hidden Markov models plus a pivoted QR decomposition.
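
A sketch of the latent semantic indexing pipeline using an ordinary truncated SVD; the semi-discrete decomposition she developed plays the same role but restricts the factor entries to -1, 0, and 1 to reduce storage. The toy term-document matrix and query are invented.

    import numpy as np

    # Toy term-document matrix: rows are terms, columns are documents.
    terms = ["matrix", "algorithm", "retrieval", "protein"]
    X = np.array([[2, 1, 0, 0],
                  [1, 2, 1, 0],
                  [0, 1, 2, 0],
                  [0, 0, 0, 3]], dtype=float)

    # A rank-k truncated SVD maps terms and documents into a small latent space.
    k = 2
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Uk, sk, Vk = U[:, :k], s[:k], Vt[:k, :]
    doc_vectors = (np.diag(sk) @ Vk).T               # one k-dim vector per document

    def query_scores(words):
        """Fold a query into the latent space and rank documents by cosine similarity."""
        q = np.array([1.0 if t in words else 0.0 for t in terms])
        q_hat = np.linalg.pinv(np.diag(sk)) @ Uk.T @ q
        norms = np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q_hat) + 1e-12
        return (doc_vectors @ q_hat) / norms

    print(query_scores({"algorithm"}))               # documents about algorithms rank highest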

 

Pete Stewart and a student are developing a Fortran95 wrapper called Matwrap for matrix operations that should make it easy to turn code from matrix oriented languages, such as MATLAB, into highly efficient programs.

 

Programming Languages, Software Engineering 

Faculty: Basili, Bederson, Foster, Guimbretiere, Hicks, Memon, Porter, Pugh, Shneiderman, Tseng, Zelkowitz

 

These three inter-related areas are covered by a single field committee and include a diverse set of activities, from work on compiler optimization, to reconfiguring software on the fly, to the empirical evaluation of software processes and products, to the development of visualization techniques.  There are several faculty collaborations, research groups, and laboratories, such as the Human Computer Interaction Laboratory and the Experimental Software Engineering Group.

 

Victor Basili's primary interest is an empirical understanding of the relationship between software processes and products. He has studied the application of techniques in software development organizations and used the data collected to build models that characterize, evaluate, predict and improve the techniques and their effects (Goal/Question/Metric Approach, Quality Improvement Paradigm). He is involved in the development of methods and tools that support the analysis, synthesis, and feedback of project information to support organizational learning (Experience Factory). Current activities involve the development of an experience base of empirical studies, the development of stakeholder-based models for dependable systems, and the evaluation of the maturity of technologies for application in building dependable systems. He has developed families of techniques for abstracting information from software artifacts of various kinds, including requirements, OO designs, and code. 

 

Jeff Foster's research focuses on programming languages and program analysis with applications to software engineering.  His goal is to help programmers increase the safety and reliability of software while simultaneously making programs easier to write and maintain.  His most recent work proposes type qualifiers as a lightweight, specification-based mechanism for improving the quality of software. Using novel, constraint-based type inference algorithms allows type qualifier systems to scale to analysis of hundreds of thousands of lines of code.  As part of this research, type qualifiers have been used to find security vulnerabilities in C programs and to find deadlocks in the Linux kernel.
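
A toy sketch of the flavor of qualifier inference (not CQual or the actual type system): qualifiers such as "tainted" flow along assignment constraints, and a fixpoint pass over the constraint graph reports where tainted data can reach a location that must be untainted, as in a format-string vulnerability. The variables and constraints are invented.

    # Toy flow-insensitive qualifier inference: each constraint "a -> b" says
    # the value of variable a flows into b (an assignment or call argument).
    constraints = [("user_input", "s"), ("s", "buf"), ("buf", "fmt"),
                   ("constant", "msg")]
    tainted_sources = {"user_input"}      # qualified `tainted` at the source
    untainted_sinks = {"fmt"}             # e.g. a printf-style format argument

    # Propagate the `tainted` qualifier to a fixpoint over the constraint graph.
    tainted = set(tainted_sources)
    changed = True
    while changed:
        changed = False
        for src, dst in constraints:
            if src in tainted and dst not in tainted:
                tainted.add(dst)
                changed = True

    for sink in untainted_sinks:
        if sink in tainted:
            print(f"warning: tainted data may reach untainted sink '{sink}'")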

 

Michael Hicks' research is oriented primarily towards learning how to develop more flexible, reliable, and secure software. His work bridges the areas of ``systems" and programming languages, in that he has frequently applied or developed language-based technology to solve systems problems, particularly in networking and distributed systems.  His most recent research emphasizes on-the-fly software reconfigurability, programmable networking, garbage collection and memory management, and the design and implementation of safe low-level programming languages.

 

Atif Memon's work focuses primarily on the development of techniques for state-based testing of event-driven systems such as network protocol systems, graphical user interfaces (GUIs), and web interfaces. He has developed techniques to test GUIs and used case studies to show that the GUI testing techniques are both practical and effective.  He plans to conduct detailed experiments to provide further empirical evidence of the strengths and weaknesses of the techniques. To this end, he has packaged the techniques into a comprehensive tool for GUI testing, GUITAR.  The tool will be available to researchers and practitioners, and its deployment will provide opportunities for further research, including the development of new testing paradigms for event-based systems. He is also investigating the tailoring of these techniques for protocol testing, web testing, security testing, and configuration space reduction.
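
A minimal sketch of one ingredient of event-based GUI testing, generating candidate test cases as event sequences from a graph of which events may follow which (an event-flow-graph style model); the events and the graph here are invented, and GUITAR itself does considerably more, including modeling the GUI and checking the results.

    # A small "what may follow what" model of GUI events (an event-flow graph).
    follows = {
        "open_dialog": ["type_name", "cancel"],
        "type_name":   ["click_ok", "cancel"],
        "click_ok":    [],
        "cancel":      [],
    }

    def event_sequences(start, length):
        """Enumerate all event sequences of the given length that respect the
        graph; each sequence is a candidate GUI test case."""
        def extend(seq):
            if len(seq) == length:
                yield seq
                return
            for nxt in follows[seq[-1]]:
                yield from extend(seq + [nxt])
        yield from extend([start])

    for case in event_sequences("open_dialog", 3):
        print(" -> ".join(case))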

 

Adam Porter's research interests include empirical methods for identifying and eliminating bottlenecks in industrial development processes, the experimental evaluation of fundamental software engineering hypotheses, and development of tools that demonstrably improve the software development process. The goal of a new 5-year, multi-million dollar NSF grant involving a multidisciplinary team from five universities and research institutions is to enable the dynamic analyses of software systems, around the world and around the clock, leveraging fielded resources during local, off-peak hours. To do this, the team is devising analysis techniques that are both highly distributed and lightweight from the perspective of individual system users, and that are incremental and adaptive in the sense that they change their behavior over time based on earlier results. This approach is expected to give software developers unprecedented insight into the behavior of their systems as they actually run in the field.

 

Bill Pugh works on tools for helping make software reliable, including static tools for finding errors in object oriented designs and runtime tools that provide a "flight recorder for software". Thus when crashes or undesired behavior occurs, there is more useful information than a stack trace or a core dump. He also works on frameworks and semantics for multithreaded, distributed and real-time middleware. This includes work on the semantics of Java threads and a pure Java message passing framework that allowed a Java implementation of the NAS CG benchmark to substantially outperform the Fortran/MPI implementation.

 

Chau-Wen Tseng works on software support for high-performance computing, specifically in developing compilation and run-time techniques to efficiently exploit architectural features found in modern microprocessors.  His research focuses on two fundamental issues: parallelism and locality. By applying sophisticated analyses and transformations, compilers can automatically generate programs with good performance for many advanced architectures. He has developed optimization techniques in a variety of areas, ranging from reducing inter-processor communication for message-passing multiprocessors to exploiting multi-level caches on high-performance microprocessors.  Improvements are demonstrated through prototype implementations and experimental results from representative programs and standard benchmark suites.
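
As an illustration of the kind of locality transformation involved, the sketch below restructures a matrix-multiply loop nest by tiling (blocking) so that blocks of the operands are reused while they are still cache-resident. Such transformations are applied automatically by optimizing compilers; the hand-written Python version here only demonstrates the restructuring, since the performance benefit appears in compiled code interacting with hardware caches. The tile size is illustrative.

    import numpy as np

    def matmul_naive(A, B):
        n = A.shape[0]
        C = np.zeros((n, n))
        for i in range(n):
            for j in range(n):
                for k in range(n):
                    C[i, j] += A[i, k] * B[k, j]
        return C

    def matmul_tiled(A, B, tile=32):
        """The same triple loop restructured into blocks so that each tile of
        A, B, and C is reused while it is still resident in the cache."""
        n = A.shape[0]
        C = np.zeros((n, n))
        for ii in range(0, n, tile):
            for jj in range(0, n, tile):
                for kk in range(0, n, tile):
                    for i in range(ii, min(ii + tile, n)):
                        for k in range(kk, min(kk + tile, n)):
                            a = A[i, k]
                            for j in range(jj, min(jj + tile, n)):
                                C[i, j] += a * B[k, j]
        return C

    A = np.random.rand(64, 64)
    B = np.random.rand(64, 64)
    print(np.allclose(matmul_tiled(A, B), A @ B))      # same result, different loop order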

 

Marvin Zelkowitz is looking at the problem of understanding new software development technologies and specifically how those technologies get transferred into industrial practices. All too often new technology is promoted by hype with little empirical data supporting it. A scientific approach to technology validation often needs experimental approaches to validating this new technology. Recently completed activities include a retrospective on 25 years of experience with the NASA GSFC Software Engineering Laboratory, from 1976 through 2001, as well as a study of the return on investment from independent verification and validation (IV&V) in the NASA space shuttle program. A new 5-year research project involves understanding high dependability within the NASA domain. He is also Chief Scientist of the Fraunhofer Center Maryland, where he interacts with the staff there on related empirical studies.

 

 

Human-Computer Interaction

Faculty: Bederson, Guimbretiere, Shneiderman

 

Ben Shneiderman works on the development and application of information visualization tools, including treemaps and starfield displays. Many of these tools have had commercial success in a variety of applications; e.g., treemaps have been applied to stock market and business analysis, and starfield displays spawned a commercial product called Spotfire. He continues to develop new algorithms for treemaps, refining the techniques of dynamic queries and extending them to new domains such as time series data.  Current projects include TimeSearcher, a general purpose tool for exploration and pattern identification in time series data, and Microarray, a set of experiments used to examine changes in gene expression over time, where data sets are analyzed using clusters, self-organizing maps, heat maps, and other standard microarray analysis tools. TimeSearcher is based on the use of timeboxes - rectangular, direct-manipulation queries - to support interactive exploration via dynamic queries, and provides overviews of query results and drag-and-drop support for query-by-example.
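
A minimal sketch of the original slice-and-dice treemap layout for a two-level hierarchy: each node receives a rectangle whose area is proportional to its total size, and the split direction alternates with depth. It computes coordinates only (no drawing), uses an invented toy hierarchy, and does not reflect the newer layout algorithms mentioned above.

    def slice_and_dice(tree, x, y, w, h, depth=0):
        """Lay out a nested {name: size-or-subtree} hierarchy as a treemap:
        each node gets a rectangle proportional to its total size, and the
        split direction alternates between vertical and horizontal with depth."""
        def total(node):
            if isinstance(node, dict):
                return sum(total(v) for v in node.values())
            return node

        rects, offset, whole = {}, 0.0, total(tree)
        for name, child in tree.items():
            frac = total(child) / whole
            if depth % 2 == 0:                     # slice vertically
                rect = (x + offset * w, y, frac * w, h)
            else:                                  # slice horizontally
                rect = (x, y + offset * h, w, frac * h)
            offset += frac
            if isinstance(child, dict):
                rects.update(slice_and_dice(child, *rect, depth + 1))
            else:
                rects[name] = rect
        return rects

    portfolio = {"tech": {"AAPL": 40, "MSFT": 30}, "energy": {"XOM": 20}, "cash": 10}
    for label, (rx, ry, rw, rh) in slice_and_dice(portfolio, 0, 0, 100, 100).items():
        print(f"{label:5s} x={rx:5.1f} y={ry:5.1f} w={rw:5.1f} h={rh:5.1f}")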

 

Francois Guimbretiere is investigating novel interaction techniques for interactive surfaces. While his previous work focused on large vertical interactive surfaces such as the whiteboard-like Stanford Interactive Mural, his current project will explore horizontal interactive surfaces such as digital tables, tablet computers and digital paper systems such as the Anoto pen. Over the next few years, he will implement and compare interfaces designed for each of these systems to understand how to narrow the gap between computers and paper.

 

Ben Bederson works on interaction and visualization techniques, focusing on three areas: Zoomable User Interfaces (ZUIs), mobile devices, and interfaces for children.  ZUIs are dynamic contextual information displays that use spatial representations and smooth zooming for navigation.  They use animated zooming to present information in context.  ZUIs have been applied to tree browsing, web history navigation, presentations, and photo browsing.  There are general-purpose ZUI toolkits (Jazz, Piccolo) that are broadly used.  ZUIs are useful in many contexts where there is more information than fits on the screen, e.g., mobile devices with small displays.  He has applied ZUIs and other techniques to PDAs for applications such as calendars, menu selection, and photo browsing.  He has applied ZUIs as a base technology to a broad set of applications for children - the most recent of which is the International Children's Digital Library, a collaborative effort to provide access to thousands of books from around the world to children.

 

Systems and Networks

Faculty: Agrawala, Arbaugh, Bhattacharjee, Hollingsworth, Iftode, Keleher, Shankar, Sussman

 

Faculty in this group are involved in many major aspects of systems and network research. Systems faculty have a well established tradition of collaborative work  on projects that address system design and methodologies, system building, prototype implementations and testbeds, modeling and performance evaluation, measurements and empirical studies, and basic and fundamental studies. The current research efforts involve a variety of systems including distributed systems, pervasive systems, mobile computing, GRID computing, resource-aware computing, high performance computing, operating systems, real time systems, wireless networking, network protocols, and system security.

 

Ashok Agrawala and Udaya Shankar are studying primarily networking systems and network performance, including location-based systems, and have developed systems like NetDyn and Rover. NetDyn has been used to monitor end-to-end behavior of packet losses and round trip delays and has uncovered irregular behavior of network components such as routers.  As a part of project Mentor, scalable monitoring techniques based on Active Networking have been developed, along with new fast-timescale control techniques.  Current results include new migration algorithms for MPLS networks, and distributed monitoring and rerouting of flows around network ``hot spots''.  Rover is a software infrastructure that treats time and location as first class entities.  It uses signal strength-based location-determination techniques to create radio maps, which are used to support location-dependent computing.  The base Rover protocols have been implemented and demonstrated over PDAs and laptops using 802.11 networks. A system, DRACO, based on the Rover design but with rapid deployment capabilities, is being developed to support first responders.  This work involves developing techniques and an architecture for location-based services and spontaneous services on mobile platforms, particularly the integration of WLAN and traditional wired networking.

 

Udaya Shankar is also studying the performance of large networking systems, involving the development of efficient techniques for performance evaluation.  This approach is based on the Z-iteration method, which is applicable to time-dependent queueing systems and yields the time evolution of various instantaneous probabilistic measures (e.g., blocking probabilities, average number of customers at a resource and in service, etc.) several orders of magnitude faster than numerical or simulation approaches.   He is developing a compositional design and analysis framework for the analysis and testing of correctness properties of concurrent systems, including real-time and security properties.  The approach used is layered compositionality and assertional techniques.  The testing framework is in Java and is being applied in undergraduate networking classes.

 

Bobby Bhattacharjee and Pete Keleher are studying multi-party security. They have developed new techniques for securing large group communications over the Internet.  The work includes scalable re-keying techniques, as well as TerraDir, a distributed peer-to-peer directory protocol that can be used as the basis for implementing customized directories for Internet applications. Bobby Bhattacharjee is also working on other network security projects.

For example, NICE is a cooperative framework for scalably implementing distributed applications over the Internet. Applications in NICE are cooperative: they devote a part of their own resources to be used by any member of a cooperative group. The goal of NICE is to show that cooperative applications can achieve overall better performance than applications that do not cooperate.  They have developed a set of protocols for application-layer multicast, application-layer distance estimation, and secure multicast within the NICE framework. 

 

Jeff Hollingsworth is studying high performance computing.  Two of his projects are systems called Dyninst and Active Harmony. The normal cycle of developing a program is to edit source code, compile it, and then execute the resulting binary.  With Dyninst, an Application Program Interface (API) has been created to permit the insertion of code into a running program. The goal of this API is to provide a machine independent interface to permit the creation of tools and applications that use runtime code patching.  With Active Harmony, a software architecture that supports distributed execution of computational objects in dynamic execution environments, with automatic application adaptation and shared data interfaces, is being studied.

The unique aspect of the Active Harmony work is the emphasis on adapting to heterogeneous and changing environments. The primary result of this research will be an infrastructure and a set of algorithms that permit global resource optimization under changing conditions.

 

Alan Sussman is also working in the area of high performance computing.  Some of his work focuses on runtime and compiler support for data-intensive applications. The goal of this research is to build a common set of software tools and infrastructure that can support the development of many classes of parallel and distributed data intensive applications.  He is addressing the problem of coordinating the various components of the application, namely computation, I/O and interprocessor communication.  He has been investigating these issues in both tightly coupled parallel environments and the distributed heterogeneous Computational Grid environment across multiple application domains, and has built both an object-oriented framework and a component-based software environment for creating high performance data intensive applications.

Sussman is also studying interoperability of data parallel programs. While sequential programs can use simple abstractions for moving data between address spaces, such as sockets, pipes or shared memory segments, no such facilities have existed for parallel programs.  He has been working on a meta-library approach to solving the problem, and has built prototype software that enables the exchange of data between separate (sequential or parallel) programs, and can also be used to allow data transfers between data managed by different data parallel regions in the same program.

 

Bill Arbaugh is working primarily in the area of computer security. One of his projects is wireless mobility and security.  In this effort the areas being investigated include fast hand-offs, probabilistically based ad-hoc routing, and ad-hoc service discovery. The research to date has resulted in several widely used software artifacts, and widely read technical reports and publications. Bill Arbaugh is also working on platform security and configuration management.

Rather than take the standard approach to securing platforms, i.e., trusted operating systems, this work is focusing on improving systems management by providing a dynamic and independent auditing capability that is OS independent.

 

Liviu Iftode works in the area of operating systems and distributed computing.  His Split-OS is a new operating system architecture for the next generation of Internet servers built as clusters of intelligent devices. The lab hosts several projects related to Split-OS covering networking and file system issues as well as highly-available services.  In another project, called Smart Messages, he is developing a system architecture to support computation diffusion over ad-hoc networks. He is also actively pursuing research in the area of massive networks of embedded devices such as networks of sensors.

 
