- SWORD: Workload-aware Data Placement and Replica Selection for Cloud Data Management Systems;
Ashwin Kumar Kayyoor, Abdul Quamar, Amol Deshpande, Samir Khuller;
VLDB Journal (Special Issue on Data-Intensive Cloud Infrastructure), 23(6): 845-870, 2014.
- PStore: An Efficient Storage Framework for Managing Scientific Data;
Souvik Bhattacherjee, Amol Deshpande, and Alan Sussman;
SSDBM 2014.
In this paper, we present the design, implementation, and evaluation of PStore, a no-overwrite storage framework for managing large volumes of array data
generated by scientific simulations. PStore comprises two modules, a data ingestion module and a query processing module, that respectively address two of the
key challenges in scientific simulation data management. The data ingestion module is geared toward handling the high volumes of simulation data generated at a
very rapid rate, which often makes it impossible to offload the data onto storage devices; the module is responsible for selecting an appropriate compression
scheme for the data at hand, chunking the data, and then compressing it before sending it to the storage nodes. On the other hand, the query processing module is
in charge of efficiently executing different types of queries over the stored data; in this paper, we specifically focus on slice (also called range) queries.
PStore provides a suite of compression schemes that leverage existing techniques while extending some of them to provide support for diverse scientific
simulation data. To efficiently execute queries over such compressed data, PStore adopts and extends a two-level chunking scheme by incorporating the effect of
compression, and hides expensive disk latencies for long-running range queries by exploiting chunk prefetching. In addition, we parallelize the query
processing module to further speed up execution. We evaluate PStore on a 140 GB dataset obtained from real-world simulations using the regional climate model
CWRF. In this paper, we use both 3D and 4D datasets and demonstrate high performance through extensive experiments.
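To make the chunk-then-compress design concrete, here is a minimal sketch (illustrative only; the chunk shape, zlib codec, and function names are assumptions, not PStore's actual schemes): a 3D array is split into fixed-size chunks, each chunk is compressed independently, and a slice (range) query decompresses only the chunks that intersect the requested region.
```python
import zlib
import numpy as np

CHUNK = (16, 16, 16)  # hypothetical chunk shape

def chunk_and_compress(arr):
    """Split a 3D array into chunks and compress each chunk independently."""
    store = {}
    for i in range(0, arr.shape[0], CHUNK[0]):
        for j in range(0, arr.shape[1], CHUNK[1]):
            for k in range(0, arr.shape[2], CHUNK[2]):
                block = arr[i:i+CHUNK[0], j:j+CHUNK[1], k:k+CHUNK[2]]
                store[(i, j, k)] = (block.shape, zlib.compress(block.tobytes()))
    return store

def slice_query(store, dtype, lo, hi):
    """Assemble arr[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]], decompressing only
    the chunks that overlap the requested range."""
    out = np.zeros(tuple(h - l for l, h in zip(lo, hi)), dtype=dtype)
    for origin, (shape, blob) in store.items():
        if any(o + s <= l or o >= h for o, s, l, h in zip(origin, shape, lo, hi)):
            continue  # chunk lies entirely outside the query box
        block = np.frombuffer(zlib.decompress(blob), dtype=dtype).reshape(shape)
        src = tuple(slice(max(l - o, 0), min(h - o, s))
                    for o, s, l, h in zip(origin, shape, lo, hi))
        dst = tuple(slice(max(o - l, 0), max(o - l, 0) + (sl.stop - sl.start))
                    for o, l, sl in zip(origin, lo, src))
        out[dst] = block[src]
    return out

data = np.random.rand(64, 64, 64).astype(np.float32)
store = chunk_and_compress(data)
result = slice_query(store, np.float32, (10, 0, 20), (30, 16, 40))
assert np.array_equal(result, data[10:30, 0:16, 20:40])
```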
- EAGr: Supporting Continuous Ego-centric Aggregate Queries over Large Dynamic Graphs;
Jayanta Mondal and Amol Deshpande;
SIGMOD 2014 (also CoRR Technical Report arXiv:1404.6570).
In this work, we present EAGr, a system for supporting large numbers of
continuous neighborhood-based ("ego-centric") aggregate queries over large,
highly dynamic, and rapidly evolving graphs. Examples of such queries include
computation of personalized, tailored trends in social networks, anomaly/event
detection in financial transaction networks, local search and alerts in
spatio-temporal networks, to name a few. Key challenges in supporting such
continuous queries include high update rates typically seen in these
situations, large numbers of queries that need to be executed simultaneously,
and stringent low latency requirements. We propose a flexible, general, and
extensible in-memory framework for executing different types of ego-centric
aggregate queries over large dynamic graphs with low latencies. Our framework
is built around the notion of an aggregation overlay graph, a pre-compiled data
structure that encodes the computations to be performed when an update/query is
received. The overlay graph enables sharing of partial aggregates across
multiple ego-centric queries (corresponding to the nodes in the graph), and
also allows partial pre-computation of the aggregates to minimize the query
latencies. We present several highly scalable techniques for constructing an
overlay graph given an aggregation function, and also design incremental
algorithms for handling structural changes to the underlying graph. We also
present an optimal, polynomial-time algorithm for making the pre-computation
decisions given an overlay graph, and evaluate an approach to incrementally
adapt those decisions as the workload changes. Although our approach is
naturally parallelizable, we focus on a single-machine deployment and show that
our techniques can easily handle graphs of size up to 320 million nodes and
edges, and achieve update/query throughputs of over 500K/s using a single,
powerful machine.
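The overlay idea can be illustrated with a toy in-memory sketch (assumed names and a SUM aggregate; this is not EAGr's overlay-construction algorithm): several ego-centric queries read a shared, pre-computed partial aggregate, so an update to a node's value touches the shared partial once rather than once per query.
```python
from collections import defaultdict

class Overlay:
    def __init__(self):
        self.value = defaultdict(float)            # node -> current value
        self.partial_sum = defaultdict(float)      # partial id -> cached sum
        self.query_inputs = defaultdict(list)      # query node -> partial ids it reads
        self.node_to_partials = defaultdict(list)  # node -> partials containing it

    def add_partial(self, pid, members):
        for n in members:
            self.node_to_partials[n].append(pid)

    def register_query(self, q, partial_ids):
        self.query_inputs[q] = list(partial_ids)

    def update(self, node, new_value):
        delta = new_value - self.value[node]
        self.value[node] = new_value
        for pid in self.node_to_partials[node]:  # push delta to shared partials once
            self.partial_sum[pid] += delta

    def query(self, q):
        # each ego-centric query just sums its pre-computed partials
        return sum(self.partial_sum[pid] for pid in self.query_inputs[q])

ov = Overlay()
ov.add_partial("p1", ["b", "c"])      # neighbors shared by the queries at a and d
ov.add_partial("p2", ["e"])
ov.register_query("a", ["p1", "p2"])  # a's neighborhood: {b, c, e}
ov.register_query("d", ["p1"])        # d's neighborhood: {b, c}
ov.update("b", 3.0)
ov.update("c", 1.5)
print(ov.query("a"), ov.query("d"))   # 4.5 4.5
```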
- Optimization Techniques for "Scaling Down" Hadoop on Multi-Core, Shared-Memory Systems;
K. Ashwin Kumar, Jonathan Gluck, Amol Deshpande, Jimmy Lin;
EDBT 2014.
The underlying assumption behind Hadoop and, more generally, the need for distributed processing is that the data to be analyzed cannot be held in memory on a
single machine. Today, this assumption needs to be re-evaluated. Although petabyte-scale datastores are increasingly common, it is unclear whether "typical"
analytics tasks require more than a single high-end server. Additionally, we are seeing increased sophistication in analytics, e.g., machine learning, which
generally operates over smaller and more refined datasets. To address these trends, we propose "scaling down" Hadoop to run on shared-memory machines. This
paper presents a
prototype runtime called Hone, intended to be both API and binary compatible with standard (distributed) Hadoop. That is, Hone can take an existing Hadoop jar
and run it, without modification, on a multi-core shared-memory machine. This allows us to take existing Hadoop algorithms and find the most suitable runtime
environment for execution on datasets of varying sizes. Our experiments show that Hone is an order of magnitude faster than Hadoop pseudo-distributed mode (PDM); on
dataset sizes that fit into memory, Hone outperforms a fully-distributed 15-node Hadoop cluster in some cases as well.
- Subgraph Pattern Matching over Uncertain Graphs with Identity Linkage Uncertainty;
Walaa Eldin Moustafa, Angelika Kimmig, Amol Deshpande, Lise Getoor;
ICDE 2014 (also CoRR Technical Report arXiv:1305.7006).
There is a growing need for methods which can capture uncertainties and answer queries over graph-structured data. Two common types of uncertainty are uncertainty over the attribute values of nodes and uncertainty over the existence of edges. In this paper, we
combine those with identity uncertainty. Identity uncertainty represents uncertainty over the mapping from objects mentioned in the data, or references, to the underlying real-world entities. We propose the notion of a probabilistic entity graph (PEG), a
probabilistic graph model that defines a distribution over possible graphs at the entity level. The model takes into account node attribute uncertainty, edge existence uncertainty, and identity uncertainty, and thus enables us to systematically reason about all
three types of uncertainties in a uniform manner. We introduce a general framework for constructing a PEG given uncertain data at the reference level and develop highly efficient algorithms to answer subgraph pattern matching queries in this setting. Our
algorithms are based on two novel ideas: context-aware path indexing and reduction by join-candidates, which drastically reduce the query search space. A comprehensive experimental evaluation shows that our approach outperforms baseline implementations by orders
of magnitude.
- Approximation Algorithms for Stochastic Boolean Function Evaluation and Stochastic Submodular Set Cover;
Amol Deshpande, Lisa Hellerstein, and Devorah Kletenik;
SODA 2014 (also CoRR Technical Report arXiv:1303.0726).
Stochastic Boolean Function Evaluation is the problem of determining the value of a given Boolean function f on an unknown input x, when each bit x_i
of x can only be determined by paying an associated cost c_i. The assumption is that x is drawn from a given product distribution, and the goal
is to minimize the expected cost. This problem has been studied in Operations Research, where it is known as "sequential testing" of Boolean
functions. It has also been studied in learning theory in the context of learning with attribute costs. We consider the general problem of developing
approximation algorithms for Stochastic Boolean Function Evaluation. We give a 3-approximation algorithm for evaluating Boolean linear threshold
formulas. We also present an approximation algorithm for evaluating CDNF formulas (and decision trees) achieving a factor of O(log kd), where k is
the number of terms in the DNF formula, and d is the number of clauses in the CNF formula. In addition, we present approximation algorithms for
simultaneous evaluation of linear threshold functions, and for ranking of linear functions.
Our function evaluation algorithms are based on reductions to the Stochastic Submodular Set Cover (SSSC) problem. This problem was introduced by
Golovin and Krause. They presented an approximation algorithm for the problem, called Adaptive Greedy. Our main technical contribution is a new
approximation algorithm for the SSSC problem, which we call Adaptive Dual Greedy. It is an extension of the Dual Greedy algorithm for Submodular Set
Cover due to Fujito, which is a generalization of Hochbaum's algorithm for the classical Set Cover Problem. We also give a new bound on the
approximation achieved by the Adaptive Greedy algorithm of Golovin and Krause.
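For intuition, the simplest special case of sequential testing, evaluating a conjunction of independent bits, already has a clean closed-form rule: probe bits in non-decreasing order of c_i / Pr[x_i = 0]. The sketch below (assumed function names; it requires each p_i < 1) implements only this textbook special case, not the paper's algorithms for threshold or CDNF formulas.
```python
def and_evaluation_order(costs, probs):
    """Probe order minimizing expected cost for evaluating an AND of independent bits."""
    n = len(costs)
    return sorted(range(n), key=lambda i: costs[i] / (1.0 - probs[i]))

def expected_cost(order, costs, probs):
    """Expected total cost: bit i is probed only if all earlier bits were 1."""
    total, reach_prob = 0.0, 1.0
    for i in order:
        total += reach_prob * costs[i]
        reach_prob *= probs[i]  # continue only if x_i turned out to be 1
    return total

costs = [4.0, 1.0, 2.0]
probs = [0.9, 0.5, 0.8]      # Pr[x_i = 1]
order = and_evaluation_order(costs, probs)
print(order, expected_cost(order, costs, probs))   # [1, 2, 0] 3.6
```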
- SPARSI: Partitioning Sensitive Data amongst Multiple Adversaries;
Theodoros Rekatsinas, Amol Deshpande, and Ashwin Machanavajjhala;
PVLDB 2013, to be presented at VLDB 2014 (also CoRR Technical Report arXiv:1302.6556).
We present SPARSI, a theoretical framework for partitioning sensitive data across multiple non-colluding adversaries. Most work in privacy-aware
data sharing has considered disclosing summaries where the aggregate information about the data is preserved, but sensitive user information is
protected. Nonetheless, there are applications, including online advertising, cloud computing and crowdsourcing markets, where detailed and
fine-grained user-data must be disclosed. We consider a new data sharing paradigm and introduce the problem of privacy-aware data partitioning,
where a sensitive dataset must be partitioned among k untrusted parties (adversaries). The goal is to maximize the utility derived by partitioning
and distributing the dataset, while minimizing the amount of sensitive information disclosed. The data should be distributed so that an adversary,
without colluding with other adversaries, cannot draw additional inferences about the private information, by linking together multiple pieces of
information released to her. The assumption of no collusion is both reasonable and necessary in the above application domains that require release
of private user information. SPARSI enables us to formally define privacy-aware data partitioning using the notion of sensitive properties for
modeling private information and a hypergraph representation for describing the interdependencies between data entries and private information. We
show that solving privacy-aware partitioning is, in general, NP-hard, but for specific information disclosure functions, good approximate solutions
can be found using relaxation techniques. Finally, we present a local search algorithm applicable to generic information disclosure functions. We
apply SPARSI together with the proposed algorithms on data from a real advertising scenario and show that we can partition data with no disclosure
to any single advertiser.
- Hone: "Scaling Down" Hadoop on Shared-Memory Systems;
K. Ashwin Kumar, Jonathan Gluck, Amol Deshpande, Jimmy Lin;
VLDB Demo 2013.
- Data Placement and Replica Selection for Improving Co-location in Distributed Environments;
K. Ashwin Kumar, Amol Deshpande, and Samir Khuller;
CoRR Technical Report arXiv:1302.4168.
Increasing need for large-scale data analytics in a number of application domains has led to a dramatic rise in the number of distributed data
management systems, both parallel relational databases, and systems that support alternative frameworks like MapReduce. There is thus an increasing
contention on scarce data center resources like network bandwidth; further, the energy requirements for powering the computing equipment are also
growing dramatically. As we show empirically, increasing the execution parallelism by spreading out data across a large number of machines may
achieve the intended goal of decreasing query latencies, but in most cases, may increase the total resource and energy consumption significantly. For
many analytical workloads, however, minimizing query latencies is often not critical; in such scenarios, we argue that we should instead focus on
minimizing the average query span, i.e., the average number of machines that are involved in processing of a query, through colocation of data items
that are frequently accessed together. In this work, we exploit the fact that most distributed environments need to use replication for fault
tolerance, and we devise workload-driven replica selection and placement algorithms that attempt to minimize the average query span. We model a
historical query workload trace as a hypergraph over a set of data items, and formulate and analyze the problem of replica placement by drawing
connections to several well-studied graph theoretic concepts. We develop a series of algorithms to decide which data items to replicate, and where to
place the replicas. We show the effectiveness of our proposed approach by presenting results on a collection of synthetic and real workloads. Our
experiments show that careful data placement and replication can dramatically reduce the average query spans, resulting in significant reductions in
resource consumption.
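As a concrete illustration of the metric (not the paper's algorithms), the sketch below computes an approximate, greedy query span for each query given a replica placement and averages it over the workload; all names are assumptions.
```python
def query_span(query_items, placement):
    """Greedy estimate of the number of machines needed to cover one query."""
    remaining = set(query_items)
    machines_used = 0
    while remaining:
        # pick the machine whose replicas cover the most still-uncovered items
        best = max({m for item in remaining for m in placement[item]},
                   key=lambda m: sum(1 for it in remaining if m in placement[it]))
        remaining -= {it for it in remaining if best in placement[it]}
        machines_used += 1
    return machines_used

def average_span(workload, placement):
    return sum(query_span(q, placement) for q in workload) / len(workload)

# placement: item -> set of machines holding a replica of that item
placement = {"A": {1}, "B": {1, 2}, "C": {2}, "D": {3}}
workload = [["A", "B"], ["B", "C"], ["A", "B", "C", "D"]]
print(average_span(workload, placement))   # (1 + 1 + 3) / 3 = 1.67
```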
- HiNGE: Enabling Temporal Network Analytics at Scale;
Udayan Khurana and Amol Deshpande;
SIGMOD Demo 2013.
- SWORD: Scalable Workload-Aware Data Placement for Transactional Workloads;
Abdul Quamar, K. Ashwin Kumar, and Amol Deshpande;
EDBT 2013.
[pdf]
In this paper, we address the problem of transparently scaling out transactional (OLTP)
workloads on relational databases, to support "database-as-a-service" in a cloud computing
environment. The primary challenges in supporting such workloads include choosing how to
"partition" the data across a large number of machines, minimizing the number of
"distributed transactions", providing high data availability, and tolerating failures gracefully.
Capturing and modeling the transactional workload over a period of time, and then exploiting
that information for data placement and replication has been shown to provide
significant benefits in performance, both in terms of transaction latencies and overall
throughput. However, such workload-aware data placement approaches can incur
very high overheads, and further, may perform worse than naive approaches if the workload changes.
In this work, we propose SWORD, a scalable workload-aware data partitioning and placement approach for
OLTP workloads, that incorporates a suite of novel techniques to significantly reduce the overheads
incurred both during the initial placement, and during query execution at runtime.
We model the workload as a hypergraph over the data items, and propose using a "hypergraph compression"
technique to reduce the overheads of partitioning. To deal with workload changes,
we propose an incremental data repartitioning technique that modifies data placement in small steps without
resorting to complete workload repartitioning.
We have built a workload-aware "active replication" mechanism in
SWORD to increase availability and enable load balancing. We propose the use of "fine-grained quorums" defined at the level of
groups of tuples to control the cost of distributed updates, improve throughput, and provide adaptability to different workloads.
To our knowledge, SWORD is the first system that uses fine-grained quorums in this context. The results of our experimental evaluation on
SWORD deployed on an Amazon EC2 cluster show that our techniques result in orders-of-magnitude reductions in the partitioning and
book-keeping overheads, and improve
tolerance to failures and workload changes; we also show that choosing quorums based on the query access patterns enables us to better handle query workloads
with different read and write access patterns.
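For illustration, a minimal sketch of the hypergraph-compression step (the hash-based grouping and the virtual-node budget are assumptions; SWORD's actual grouping is workload-driven): tuples are mapped to a small number of virtual nodes, and each transaction's hyperedge is rewritten over virtual nodes before the partitioner runs.
```python
import zlib

NUM_VIRTUAL_NODES = 1000   # illustrative budget

def virtual_node(tuple_key):
    # deterministic hash-based grouping of tuples into virtual nodes
    return zlib.crc32(tuple_key.encode()) % NUM_VIRTUAL_NODES

def compress_workload(transactions):
    """Rewrite each transaction's hyperedge over virtual nodes."""
    compressed = []
    for txn in transactions:
        edge = frozenset(virtual_node(k) for k in txn)
        if len(edge) > 1:   # single-virtual-node edges cannot span partitions
            compressed.append(edge)
    return compressed

txns = [{"cust:17", "order:901"}, {"cust:17", "order:902"}, {"item:5"}]
print(compress_workload(txns))   # hyperedges over virtual-node ids
```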
- Efficient Snapshot Retrieval over Historical Graph Data;
Udayan Khurana and Amol Deshpande;
ICDE 2013 (also CoRR Technical Report arXiv:1207.5777).
We address the problem of managing historical data for large evolving information networks like
social networks or citation networks, with the goal of enabling temporal and evolutionary queries and
analysis. We present the design and architecture of a distributed graph database system that stores
the entire history of a network and provides support for efficient retrieval of multiple graphs from
arbitrary time points in the past, in addition to maintaining the current state for ongoing updates.
Our system exposes a general programmatic API to process and analyze the retrieved snapshots. We
introduce DeltaGraph, a novel, extensible, highly tunable, and distributed hierarchical index
structure that enables compactly recording the historical information, and that supports efficient
retrieval of historical graph snapshots for single-site or parallel processing. Along with the
original graph data, DeltaGraph can also maintain and index auxiliary information; this
functionality can be used to extend the structure to efficiently execute queries like subgraph
pattern matching over historical data. We develop analytical models for both the storage space
needed and the snapshot retrieval times to aid in choosing the right parameters for a specific
scenario. In addition, we present strategies for materializing portions of the historical graph
state in memory to further speed up the retrieval process. We also present an in-memory graph
data structure called GraphPool that can maintain hundreds of historical graph instances in main
memory in a non-redundant manner. We present a comprehensive experimental evaluation that
illustrates the effectiveness of our proposed techniques at managing historical graph information.
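A simplified, single-level sketch of the snapshot-retrieval idea (DeltaGraph itself is a hierarchical, distributed index; all names here are illustrative, and events are assumed to arrive in time order): keep the history as timestamped edge events, materialize occasional snapshots, and rebuild an arbitrary snapshot from the nearest earlier materialized one plus the deltas up to the requested time.
```python
import bisect

class HistoricalGraph:
    def __init__(self, materialize_every=1000):
        self.events = []             # (time, op, edge) with op in {"add", "del"}
        self.snap_times = [0]        # times at which full snapshots are stored
        self.snapshots = {0: set()}  # time -> materialized edge set
        self.every = materialize_every

    def record(self, time, op, edge):
        self.events.append((time, op, edge))
        if len(self.events) % self.every == 0:   # periodically materialize
            snap = self.snapshot(time)
            self.snap_times.append(time)
            self.snapshots[time] = snap

    def snapshot(self, t):
        """Edge set as of time t: nearest earlier snapshot plus replayed deltas."""
        base_t = self.snap_times[bisect.bisect_right(self.snap_times, t) - 1]
        edges = set(self.snapshots[base_t])
        for time, op, edge in self.events:       # replay deltas in (base_t, t]
            if base_t < time <= t:
                if op == "add":
                    edges.add(edge)
                else:
                    edges.discard(edge)
        return edges

g = HistoricalGraph(materialize_every=2)
g.record(1, "add", ("a", "b"))
g.record(2, "add", ("b", "c"))
g.record(3, "del", ("a", "b"))
print(g.snapshot(2))   # {('a', 'b'), ('b', 'c')}
print(g.snapshot(3))   # {('b', 'c')}
```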
- Efficiently Adapting Graphical Models for Selectivity Estimation;
Kostas Tzoumas, Amol Deshpande, and Christian Jensen;
VLDB Journal (Special Issue: Best Papers of VLDB 2011), 22(1): 3-27, February 2013.
Query optimizers rely on statistical models that succinctly describe the underlying data. Models are used to derive cardinality estimates for intermediate relations, which in turn guide the optimizer to choose the best query execution plan. The
quality of the resulting plan is highly dependent on the accuracy of the statistical model that represents the data. It is well known that small errors in the model estimates propagate
exponentially through joins, and may result in the choice of a highly sub-optimal query execution plan. Most commercial query optimizers make the attribute value independence assumption: all
attributes are assumed to be statistically independent. This reduces the statistical model of the data to a collection of one-dimensional synopses (typically in the form of histograms), and it
permits the optimizer to estimate the selectivity of a predicate conjunction as the product of the selectivities of the constituent predicates. However, this independence assumption is more
often than not wrong, and is considered to be the most common cause of sub-optimal query execution plans chosen by modern query optimizers. We take a step towards a principled and practical
approach to performing cardinality estimation without making the independence assumption. By carefully using concepts from the field of graphical models, we are able to factor the joint
probability distribution over all the attributes in the database into small, usually two-dimensional distributions, without a significant loss in estimation accuracy. We show how to efficiently
construct such a graphical model from the database using only two-way join queries, and we show how to perform selectivity estimation in a highly efficient manner. We integrate our algorithms
into the PostgreSQL DBMS. Experimental results indicate that estimation errors can be greatly reduced, leading to orders of magnitude more efficient query execution plans in many cases.
Optimization time is kept in the range of tens of milliseconds, making this a practical approach for industrial-strength query optimizers.
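A toy numeric sketch of the core estimation idea (hand-made distributions, not the paper's model-construction algorithm): factor a three-attribute joint as P(a) P(b|a) P(c|b) along a chain and compare the selectivity estimate of a conjunctive predicate against the plain independence product.
```python
from itertools import product

P_a  = {0: 0.5, 1: 0.5}
P_ba = {(0, 0): 0.9, (1, 0): 0.1,   # P(b | a): keys are (b, a)
        (0, 1): 0.2, (1, 1): 0.8}
P_cb = {(0, 0): 0.7, (1, 0): 0.3,   # P(c | b): keys are (c, b)
        (0, 1): 0.1, (1, 1): 0.9}

def sel_chain(pred):
    """Selectivity of pred(a, b, c) under the chain factorization a - b - c."""
    return sum(P_a[a] * P_ba[(b, a)] * P_cb[(c, b)]
               for a, b, c in product((0, 1), repeat=3) if pred(a, b, c))

def sel_independent(pred):
    """Same predicate under full attribute-value independence."""
    P_b = {b: sum(P_a[a] * P_ba[(b, a)] for a in (0, 1)) for b in (0, 1)}
    P_c = {c: sum(P_b[b] * P_cb[(c, b)] for b in (0, 1)) for c in (0, 1)}
    return sum(P_a[a] * P_b[b] * P_c[c]
               for a, b, c in product((0, 1), repeat=3) if pred(a, b, c))

pred = lambda a, b, c: a == 1 and b == 1 and c == 1
print(sel_chain(pred), sel_independent(pred))   # 0.36 vs 0.12825
```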
- Parallel Pipelined Filter Ordering with Precedence Constraints;
Amol Deshpande, Lisa Hellerstein;
ACM Transactions on Algorithms, 8(4), Article 41, September 2012, 28 pages.
In the parallel pipelined filter ordering problem, we are given a set of n
filters that run in parallel. The filters need to be applied to a stream of
elements, to determine which elements pass all filters. Each filter has a
"rate limit" r_i on the number of elements it can process per unit time, and a
"selectivity" p_i, which is the probability that a random element will pass the
filter. The goal is to maximize throughput. This problem appears naturally in
a variety of settings, including parallel query optimization in databases and
query processing over Web services. We present an O(n^3) algorithm for the
above problem, given tree-structured precedence constraints on the filters.
This extends work of Condon et al. and Kodialam, who presented algorithms for
solving the problem without precedence constraints. Our algorithm is
combinatorial and produces a sparse solution. Motivated by join operators in
database queries, we also give algorithms for versions of the problem in which
"filter" selectivities may be greater than or equal to 1. We prove a strong
connection between the more classical problem of minimizing total work in
sequential filter ordering (A), and the parallel pipelined filter ordering
problem (B). More precisely, we prove that A is solvable in polynomial
time for a given class of precedence constraints if and only if B is as
well. This equivalence allows us to show that B is NP-Hard in the
presence of arbitrary precedence constraints (since A is known to be NP-Hard
in that setting).
- Lightweight Graphical Models for Selectivity Estimation Without Independence Assumptions;
Kostas Tzoumas, Amol Deshpande, and Christian Jensen;
VLDB 2011.
As a result of decades of research and industrial development, modern query
optimizers are complex software artifacts. However, the quality of the query
plan chosen by an optimizer is largely determined by the quality of the
underlying statistical summaries. Small selectivity estimation errors,
propagated exponentially, can lead to severely sub-optimal plans. Modern
optimizers typically maintain one-dimensional statistical summaries and make the
attribute value independence and join uniformity assumptions for efficiently
estimating selectivities. Therefore, selectivity estimation errors in today's
optimizers are frequently caused by missed correlations between attributes. We
present a selectivity estimation approach that does not make the independence
assumptions. By carefully using concepts from the field of graphical models, we
are able to factor the joint probability distribution of all the attributes in
the database into small, usually two-dimensional distributions. We describe
several optimizations that can make selectivity estimation highly efficient, and
we present a complete implementation inside PostgreSQL's query optimizer.
Experimental results indicate an order of magnitude better selectivity
estimates, while keeping optimization time in the range of tens of milliseconds.
- Sharing-Aware Horizontal Partitioning for Exploiting Correlations during Query Processing;
Kostas Tzoumas, Amol Deshpande, and Christian Jensen;
VLDB 2010.
[pdf]
Optimization of join queries based on average selectivities is sub-optimal in
highly correlated databases. In such databases, relations are naturally divided
into partitions, each partition having substantially different statistical
characteristics. It is very compelling to discover such data partitions during
query optimization and create multiple plans for a given query, one plan being
optimal for a particular combination of data partitions. This scenario calls for
the sharing of state among plans, so that common intermediate results are not
recomputed. We study this problem in a setting with a routing-based query
execution engine based on eddies. Eddies naturally encapsulate horizontal
partitioning and maximal state sharing across multiple plans. We define the
notion of a conditional join plan, a novel representation of the search space
that enables us to address the problem in a principled way. We present a
low-overhead greedy algorithm that uses statistical summaries based on graphical
models. Experimental results suggest an order-of-magnitude faster execution
time over traditional optimization for high correlations, while maintaining the
same performance for low correlations.
- Minimizing Communication Cost in Distributed Multi-query Processing;
Jian Li, Amol Deshpande, and Samir Khuller;
ICDE 2009.
[pdf]
Increasing prevalence of large-scale distributed monitoring and computing
environments such as sensor networks, scientific federations, Grids etc., has
led to a renewed interest in the area of distributed query processing and
optimization. In this paper we address a general, distributed multi-query
processing problem motivated by the need to minimize the communication cost in
these environments. Specifically we address the problem of optimally sharing
data movement across the communication edges in the communication network given
a set of overlapping queries and query plans for them (specifying the join
orders to be used to execute the queries). Most variations of our general
problem can be shown to be NP-Hard via a reduction from the Steiner
tree problem. However, we show that the problem can be solved optimally if the
communication network is a tree, and present a novel algorithm for finding the
optimal sharing plan. For general communication networks, we present efficient
approximation algorithms for several variations of the problem. Finally, we
present a preliminary experimental study over synthetic datasets showing both
the need for exploiting sharing of data movement and the effectiveness of our
algorithms at finding such plans.
- Algorithms for Distributional and Adversarial Pipelined Filter Ordering Problems;
Anne Condon, Amol Deshpande, Lisa Hellerstein, and Ning Wu;
ACM Transactions on Algorithms, 5(2), Article 24, March 2009.
Pipelined filter ordering is a central problem in database query optimization. The problem is to
determine the optimal order in which to apply a given set of commutative filters (predicates) to a
set of elements (the tuples of a relation), so as to find, as efficiently as possible, the tuples
that satisfy all of the filters. Optimization of pipelined filter ordering has recently received
renewed attention in the context of environments such as the web, continuous high-speed data
streams, and sensor networks. Pipelined filter ordering problems are also studied in areas such as
fault detection and machine learning under names such as learning with attribute costs, minimum-sum
set cover, and satisficing search. We present algorithms for two natural extensions of the
classical pipelined filter ordering problem: (1) a "distributional type" problem where the filters
run in parallel and the goal is to maximize throughput, and (2) an "adversarial type" problem where
the goal is to minimize the expected value of "multiplicative regret". We present two related
algorithms for solving (1), both running in time O(n^2), which improve on the O(n^3 log(n))
algorithm of Kodialam. We use techniques from our algorithms for (1) to obtain an algorithm for (2).
- Flow Algorithms for Parallel Query Optimization;
Amol Deshpande, Lisa Hellerstein;
ICDE 2008.
[pdf] [talk]
In this paper we address the problem of minimizing the response time of a
multi-way join query using pipelined (inter-operator) parallelism, in a parallel
or a distributed environment. We observe that in order to fully exploit the
parallelism in the system, we must consider a new class of "interleaving" plans,
where multiple query plans are used simultaneously to minimize the response time
of a query (or maximize the tuple-throughput of the system). We cast the query
planning problem in this environment as a "flow maximization problem", and
present polynomial-time algorithms that (statically) find the optimal set of
plans to use for a large class of multi-way join queries. Our proposed
algorithms also naturally extend to query optimization over web services.
Finally we present an extensive experimental evaluation that demonstrates both
the need to consider such plans in parallel query processing and the
effectiveness of our proposed algorithms.
- Network-Aware Join Processing in Global-Scale Database Federations;
Xiaodan Wang, Randal Burns, Andreas Terzis, Amol Deshpande;
ICDE 2008.
[pdf]
We introduce join scheduling algorithms that employ a balanced network
utilization metric to optimize the use of all network paths in a global-scale
database federation. This metric allows algorithms to exploit excess capacity in
the network, while avoiding narrow, long-haul paths. We give a two-approximate,
polynomial-time algorithm for serial (left-deep) join schedules. We also present
extensions to this algorithm that explore parallel schedules, reduce resource
usage, and define trade-offs between computation and network utilization. We
evaluate these techniques within the SkyQuery federation of Astronomy databases
using spatial-join queries submitted by SkyQuery's users. Experiments show that
our algorithms realize near-optimal network utilization with minor computational
overhead.
- Adaptive Query Processing;
Amol Deshpande, Zack Ives, Vijayshankar Raman;
Foundations and Trends in Databases, 1(1), 2007.
(A "greener" version of the pdf.)
As the data management field has diversified to consider settings in which
queries are increasingly complex, statistics are less available, or data is
stored remotely, there has been an acknowledgment that the traditional
optimize-then-execute paradigm is insufficient. This has led to a plethora of
new techniques, generally placed under the common banner of adaptive query
processing, that focus on using runtime feedback to modify query processing in a
way that provides better response time or more efficient CPU
utilization.
In this survey paper, we identify many of the common
issues, themes, and approaches that pervade this work, and the settings in which
each piece of work is most appropriate. Our goal with this paper is to be a
"value-add" over the existing papers on the material, providing not only a brief
overview of each technique, but also a basic framework for understanding the
field of adaptive query processing in general. We focus primarily on intra-query
adaptivity of long-running, but not full-fledged streaming, queries. We conclude
with a discussion of open research problems that are of high importance.
- Flow Algorithms for Two Pipelined Filter Ordering Problems;
Anne Condon, Amol Deshpande, Lisa Hellerstein, and Ning Wu;
PODS 2006.
[pdf] [talk]
Pipelined filter ordering is a central problem in database query optimization,
and has received renewed attention in recent years in the context of
environments such as the web, continuous high-speed data streams and sensor
networks. In this paper we present efficient algorithms for two natural
extensions of the classical pipelined filter ordering problem: (1) a
"distributional type" problem in a parallel environment where the filters run in
parallel and the goal is to maximize the throughput of the system, and (2) an
"adversarial type" problem where the goal is to minimize the expected value of
"multiplicative regret". We show that both problems can be solved using similar
flow algorithms, which find an optimal ordering scheme in time O(n^2), where
n is the number of filters. Our algorithm for (1) improves on an earlier
O(n^3 log n) algorithm of Kodialam.
- Exploiting Correlated Attributes in Acquisitional Query Processing;
Amol Deshpande, Carlos Guestrin, Sam Madden, Wei Hong;
ICDE 2005.
[pdf] [talk]
Sensor networks and other distributed information systems (such as the Web) must
frequently access data that has a high per-attribute "acquisition" cost, in
terms of energy, latency, or computational resources. When executing queries
that contain several predicates over such expensive attributes, we observe that
it can be beneficial to use correlations to automatically introduce low-cost
attributes whose observation will allow the query processor to better estimate
the selectivity of these expensive predicates. In particular, we show
how to build "conditional plans" that branch into one or more
sub-plans, each with a different ordering for the expensive query
predicates, based on the runtime observation of low-cost
attributes. We frame the problem of constructing the optimal
conditional plan for a given user query and set of candidate low-cost
attributes as an optimization problem. We describe an exponential
time algorithm for finding such optimal plans, and describe a
polynomial-time heuristic for identifying conditional plans that
perform well in practice. We also show how to compactly model
conditional probability distributions needed to identify correlations
and build these plans. We evaluate our algorithms against several
real-world sensor-network data sets, showing several-times performance
increases for a variety of queries versus traditional optimization techniques.
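A tiny sketch of a conditional plan (the attribute names, thresholds, and branch orderings are all invented for illustration): a cheap attribute is observed first and used to choose between different orderings of the expensive predicates, so the predicate most likely to fail in that branch is evaluated first and short-circuits the rest.
```python
def expensive_temp_pred(tup):    # imagine each call incurs an acquisition cost
    return tup["temperature"] > 30

def expensive_light_pred(tup):
    return tup["light"] > 500

CONDITIONAL_PLAN = {
    # cheap attribute value -> predicate order most selective for that branch
    "server_room": [expensive_light_pred, expensive_temp_pred],
    "greenhouse":  [expensive_temp_pred, expensive_light_pred],
}
DEFAULT_ORDER = [expensive_temp_pred, expensive_light_pred]

def evaluate(tup):
    order = CONDITIONAL_PLAN.get(tup["room"], DEFAULT_ORDER)
    for pred in order:           # short-circuit at the first failing predicate
        if not pred(tup):
            return False
    return True

print(evaluate({"room": "greenhouse", "temperature": 25, "light": 800}))  # False
```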
- Lifting the Burden of History from Adaptive Query Processing;
Amol Deshpande, Joseph M. Hellerstein;
VLDB 2004.
[pdf]
Adaptive query processing schemes attempt to reoptimize query plans
during the course of query execution. A variety of techniques for
adaptive query processing have been proposed, varying in the
granularity at which they can make decisions. The eddy
is the most aggressive of these techniques,
with the flexibility to choose tuple-by-tuple how to
order the application of operators.
In this paper we identify and address a fundamental limitation of the
original eddies proposal: the "burden of history" in routing. We
observe that routing decisions have long-term effects on the state of
operators in the query, and can severely constrain the ability of the
eddy to adapt over time.
We then propose a
mechanism we call STAIR that allows the query engine to manipulate
the state stored inside the operators and undo the
effects of past routing decisions. We demonstrate that eddies with
STAIR achieve both high adaptivity and good performance in the
face of uncertainty, outperforming prior eddy proposals by orders of magnitude.
- An Initial Study of Overheads of Eddies;
Amol Deshpande;
SIGMOD Record, March 2004.
[pdf]
An eddy is a highly adaptive query processing operator that continuously
reoptimizes a query in response to changing runtime conditions. It does this by
treating query processing as routing of tuples through operators and making
per-tuple routing decisions. The benefits of such adaptivity can be significant,
especially in highly dynamic environments such as data streams, sensor query
processing, web querying, etc. Various parties have asserted that the cost of
making per-tuple routing decisions is prohibitive. We have implemented eddies in
the PostgreSQL open source database system in the context of the TelegraphCQ
project. In this paper, we present an "apples-to-apples" comparison of
PostgreSQL query processing overhead with and without eddies. Our results show
that with some minor tuning, the overhead of the eddy mechanism is negligible.
- Cache-and-Query for Wide Area Sensor Databases;
Amol Deshpande, Suman Nath, Phil Gibbons, Srini Seshan;
SIGMOD 2003.
[pdf] [talk] (IrisNet Project Web Page)
Webcams, microphones, pressure gauges and other sensors provide exciting new
opportunities for querying and monitoring the physical world. In this paper we
focus on querying wide area sensor databases, containing (XML) data derived from
sensors spread over tens to thousands of miles. We present the first scalable
system for executing XPATH queries on such databases. The system maintains the
logical view of the data as a single XML document, while physically the data is
fragmented across any number of host nodes. For scalability, sensor data is
stored close to the sensors, but can be cached elsewhere as dictated by the
queries. Our design enables self-starting distributed queries that jump directly
to the lowest common ancestor of the query result, dramatically reducing query
response times. We present a novel query-evaluate-gather technique (using XSLT)
for detecting (1) which data in a local database fragment is part of the query
result, and (2) how to gather the missing parts. We define partitioning and
cache invariants that ensure that even partial matches on cached data are
exploited and that correct answers are returned, despite our dynamic
query-driven caching. Experimental results demonstrate that our techniques dramatically increase query throughputs and decrease query response
times in wide area sensor databases.
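The "jump directly to the lowest common ancestor" idea can be sketched in a few lines (the hierarchy paths are invented for illustration): given the paths selected by a query, route the query to the owner of their lowest common ancestor instead of the root.
```python
import posixpath

def lowest_common_ancestor(paths):
    """Hierarchy node that owns every path in the query result."""
    return posixpath.commonpath(paths)

query_result_paths = [
    "/usa/pa/pittsburgh/oakland/cam12",
    "/usa/pa/pittsburgh/shadyside/cam03",
]
print(lowest_common_ancestor(query_result_paths))   # /usa/pa/pittsburgh
```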
- Using State Modules for Adaptive Query Processing;
Vijayshankar Raman, Amol Deshpande, Joe Hellerstein;
ICDE 2003.
[pdf]
We present a query architecture in which join operators are decomposed into
their constituent data structures (State Modules, or SteMs), and dataflow among
these SteMs is managed adaptively by an Eddy routing operator. Breaking the
encapsulation of joins serves two purposes. First, it allows the Eddy to observe
multiple physical operations embedded in a join algorithm, allowing for better
calibration and control of these operations. Second, the SteM on a relation
serves as a shared materialization point, enabling multiple competing access
methods to share results, which can be leveraged by multiple competing join
algorithms. Our architecture extends prior work significantly, allowing
continuously adaptive decisions for most major aspects of traditional query
optimization: choice of access methods and join algorithms, ordering of
operators, and choice of a query spanning tree. SteMs introduce significant
routing flexibility to the Eddy, enabling more opportunities for adaptation, but
also introducing the possibility of incorrect query results. We present
constraints on Eddy routing through SteMs that ensure correctness while
preserving a great deal of flexibility. We also
demonstrate the benefits of our architecture via experiments in the Telegraph dataflow system. We show that even a simple routing policy allows
significant flexibility in adaptation, including novel effects like the automatic "hybridization" of multiple algorithms for a single join.
- TelegraphCQ: An Architectural Status Report;
Sailesh Krishnamurthy, Sirish Chandrasekaran, Owen Cooper, Amol Deshpande, Michael J. Franklin, Joseph M. Hellerstein, Wei Hong, Samuel Madden, Frederick Reiss, Mehul A. Shah;
IEEE Data Engineering Bulletin 26(1): 11-18 (2003).
- TelegraphCQ: Continuous Dataflow Processing for an Uncertain World;
Sirish Chandrasekaran, Owen Cooper, Amol Deshpande, Mike Franklin, Joe Hellerstein, Wei Hong, Sailesh Krishnamurthy, Sam Madden, Vijayshankar Raman, Fred Reiss, Mehul Shah;
CIDR 2003.
- Decoupled Query Optimization in Federated Databases;
Amol Deshpande, Joe Hellerstein;
ICDE 2002.
[pdf] [talk]
We study the problem of query optimization in federated relational database
systems. The nature of federated databases explicitly decouples many aspects of
the optimization process, often making it imperative for the optimizer to
consult underlying data sources while doing cost-based optimization. This not
only increases the cost of optimization, but also changes the trade-offs
involved in the optimization process significantly. The dominant cost in the
decoupled optimization process is the "cost of costing" that traditionally has
been considered insignificant. The optimizer can only afford a few rounds of
messages to the underlying data sources and hence the optimization techniques in
this environment must be geared toward gathering all the required cost
information with minimal communication. In this paper, we explore the design
space for a query optimizer in this environment and demonstrate the need for
decoupling various aspects of the optimization process. We present
minimum-communication decoupled variants of various query optimization
techniques, and discuss tradeoffs in their performance in this scenario. We have
implemented these techniques in
the Cohera federated database system and our experimental results, somewhat surprisingly, indicate that a simple two-phase optimization scheme performs
fairly well as long as the physical database design is known to the optimizer, though more aggressive algorithms are required otherwise.
- Independence is Good: Dependency-Based Histogram Synopses for High-Dimensional Data;
Amol Deshpande, Minos Garofalakis, Rajeev Rastogi;
SIGMOD 2001.
[pdf] [talk]
Approximating the joint data distribution of a multi-dimensional data set
through a compact and accurate histogram synopsis is a fundamental problem
arising in numerous practical scenarios, including query optimization and
approximate query answering. Existing solutions either rely on simplistic
independence assumptions or try to directly approximate the full joint data
distribution over the complete set of attributes. Unfortunately, both approaches
are doomed to fail for high-dimensional data sets with complex correlation
patterns between attributes. In this paper, we propose a novel approach to
histogram-based synopses that employs the solid foundation of statistical
interaction models to explicitly identify and exploit the statistical
characteristics of the data. Abstractly, our key idea is to break the synopsis
into (1) a statistical interaction model that accurately captures significant
correlation and independence patterns in data, and (2) a collection of
histograms on low-dimensional marginals that, based on the model, can provide
accurate approximations of the overall joint data distribution. Extensive
experimental results with several real-life data sets
verify the effectiveness of our approach. An important aspect of our general, model-based methodology is that it can be used to enhance the performance
of other synopsis techniques that are based on data-space partitioning (e.g., wavelets) by providing an effective tool to deal with the
"dimensionality curse".
- Adaptive Query Processing: Technology in Evolution;
Joe Hellerstein, Mike Franklin, Sirish Chandrasekaran, Amol Deshpande, Kris Hildrum, Sam Madden, Vijayshankar Raman, Mehul Shah;
IEEE Data Engineering Bulletin 23(2): 7-18 (2000).