|
Query Processing Coordinator (QPC)
The Query Processing Coordinator (QPC),
shown in Figure 3, is the middle-tier component that controls the execution
of all queries received from the client applications. In effect, the QPC
is the integration server of the system, and therefore, it is
implemented as a multi-threaded Java server. The QPC can be reached by
the clients through a well-known URL. QPC provides services such as
query parsing, query validation, query optimization, plan decomposition,
metadata management, query execution and coordination, and error
management. The QPC has a catalog which holds all the metadata about
the user-defined types, methods, and data sites available for use by the
users. All metadata in this catalog are encoded using XML and RDF (Resource
Description Framework), which provide MOCHA with a standardized and
platform-independent solution for metadata exchange. In addition to the
catalog, the QPC also has a code repository which stores the
compiled Java classes that implement the various user-defined data types
and operators (methods) that are available to the user.
Figure 3: Query Processing Coordinator
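As a rough illustration of how the catalog and the code repository
relate to each other, the following Java sketch keeps both in memory.
It is not MOCHA's implementation: the class names QpcCatalogSketch and
MethodEntry, their fields, and the method signatures are hypothetical,
and the real catalog stores its metadata as XML and RDF documents rather
than Java maps.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical catalog entry: records which Java class implements a
// user-defined method. The real catalog holds XML/RDF metadata instead.
final class MethodEntry {
    final String methodName;
    final String implementingClass;

    MethodEntry(String methodName, String implementingClass) {
        this.methodName = methodName;
        this.implementingClass = implementingClass;
    }
}

// Hypothetical in-memory stand-in for the QPC catalog and code repository.
final class QpcCatalogSketch {
    // Catalog: method name -> metadata about the method.
    private final Map<String, MethodEntry> methods = new HashMap<>();
    // Code repository: class name -> compiled class bytes.
    private final Map<String, byte[]> codeRepository = new HashMap<>();

    // Registers the metadata and the compiled class together.
    void registerMethod(String methodName, String className, byte[] compiledClass) {
        methods.put(methodName, new MethodEntry(methodName, className));
        codeRepository.put(className, compiledClass.clone());
    }

    // Looks up the metadata for a method; empty if it is not in the catalog.
    Optional<MethodEntry> lookupMethod(String methodName) {
        return Optional.ofNullable(methods.get(methodName));
    }

    // Returns a copy of the compiled class bytes; empty if the class is unknown.
    Optional<byte[]> fetchClass(String className) {
        byte[] bytes = codeRepository.get(className);
        if (bytes == null) {
            return Optional.empty();
        }
        return Optional.of(bytes.clone());
    }
}
```

In this sketch, a lookup by method name yields the name of the
implementing class, and the compiled bytes for that class can then be
fetched from the repository for shipping.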
When QPC receives a query from a client, it carries
out the following tasks:
- Query Parsing - QPC parses the SQL string
representing the query and generates a tree representing the
tables, attributes, and expressions that form part of the query.
- Resource Discovery - QPC accesses the
catalog in search of the metadata describing the tables, types,
methods, and data sources that are needed to solve the query. At
this stage the query is validated to make sure it is syntactically and
semantically correct.
- Query Optimization - The query optimizer
embedded in the QPC generates an efficient execution plan to solve the
query. This optimizer is based on the System-R dynamic programming
paradigm, enhanced with various heuristics to help it place the
execution of user-defined methods at either the QPC or the DAPs
associated with the data sources to be accessed. Such heuristics
attempt to reduce the volume of data transmitted over the network
during query execution, and are based on a new metric that we call the
Volume Reduction Factor (VRF).
- Metadata and Control Exchange - The QPC
decomposes the query plan produced by the optimizer into the
components to be executed by each DAP and those to be executed by itself.
Then QPC sends each participating DAP the metadata and the
fragment of the plan each must execute. The metadata and query plan
are sent to the DAPs as XML documents, which makes metadata and
control exchange in MOCHA platform-independent.
- Code Deployment Phase - From the metadata
and query plan, the QPC infers which Java classes are needed to
handle the query and where these classes should be deployed. Then the
QPC fetches all required Java classes from the code repository and
ships them to the client and each DAP. We call this process the code
deployment phase; it is fully automatic and is carried out by the QPC
without any human
involvement. In effect, the QPC extends the capabilities of the
client, each participating DAP and itself, in order to adapt to
the requirements of the query being solved. For this reason we call
MOCHA a self-extensible middleware system.
- Query Execution - Once the code deployment
phase is completed, the QPC signals each DAP to start executing the
query plan that each one received. Each DAP accesses the data in the data
source and processes them according to the query plan. All these
results are then sent to QPC for further processing, and then QPC
forwards the final values to the client application for visualization
purposes.
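
The operator placement and code deployment steps above can be sketched
in Java as follows. The text names the Volume Reduction Factor without
defining it, so this sketch assumes that VRF is the ratio of the data
volume an operator produces to the data volume it consumes, and that
operators with a VRF below 1 are placed at the DAP while all others run
at the QPC. The class names OperatorSketch and PlacementSketch, the size
estimates, and the "DAP@site" labels are all hypothetical, and the sketch
ignores the classes shipped to the client.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical description of one user-defined operator in a query plan.
final class OperatorSketch {
    final String name;        // operator (method) name
    final String implClass;   // Java class that implements the operator
    final String dataSite;    // data site whose DAP could run the operator
    final long inputBytes;    // estimated bytes consumed by the operator
    final long outputBytes;   // estimated bytes produced by the operator

    OperatorSketch(String name, String implClass, String dataSite,
                   long inputBytes, long outputBytes) {
        if (inputBytes <= 0 || outputBytes < 0) {
            throw new IllegalArgumentException("invalid size estimates");
        }
        this.name = name;
        this.implClass = implClass;
        this.dataSite = dataSite;
        this.inputBytes = inputBytes;
        this.outputBytes = outputBytes;
    }

    // ASSUMED definition: VRF = output volume / input volume.
    double vrf() {
        return (double) outputBytes / inputBytes;
    }
}

final class PlacementSketch {
    static final String QPC = "QPC";

    // Assumed heuristic: data-reducing operators (VRF < 1) run at the DAP,
    // so less data crosses the network; all other operators run at the QPC.
    static String placeOperator(OperatorSketch op) {
        return op.vrf() < 1.0 ? "DAP@" + op.dataSite : QPC;
    }

    // Groups implementing classes by the site that must receive them,
    // mimicking the inference made before the code deployment phase.
    static Map<String, Set<String>> deploymentPlan(List<OperatorSketch> ops) {
        Map<String, Set<String>> plan = new LinkedHashMap<>();
        for (OperatorSketch op : ops) {
            plan.computeIfAbsent(placeOperator(op), site -> new LinkedHashSet<>())
                .add(op.implClass);
        }
        return plan;
    }
}
```

Under these assumptions, an aggregate that turns 1000 bytes into 10
bytes would be deployed to the DAP of its data site, while an operator
that inflates 100 bytes into 400 bytes would stay at the QPC.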
|