Seminar:
Speaker:Lisa Hellerstein,
Title:Flow
Algorithms for Two Pipelined Filtering Problems
Abstract: We give related
flow algorithms for two pipelined filter-ordering problems, both motivated by
query optimization in databases. This work is also connected
to sequential testing, minimum-sum set cover, and learning with attribute
costs. The first problem is a throughput maximization problem for filter
ordering. The second problem also concerns filter ordering, and can be
described in terms of finding an optimal strategy for a player in a particular
two-person zero-sum game.
In the first problem, there
are n predicates with independent selectivities p_1, ..., p_n. Each predicate s_i is evaluated by a separate processor
which can process r_i tuples
per second.
Each tuple
is routed through the processors until either
(1) it is found not to
satisfy a predicate or (2) it is found to satisfy all predicates. The problem
is to route tuples through processors so as to maximize the throughput of tuple
processing. We present an algorithm that solves this problem in time O(n^2), improving on an earlier O(n^3 \log n) algorithm of Kodialam.
In the second problem,
there are n predicates s_i, each with a cost c_i. The problem is to determine the (randomized) order in
which to apply the predicates to a single tuple until
the tuple has been found to
fail a predicate. The goal is to minimize multiplicative regret—the ratio
between the cost incurred in processing the tuple,
and the cost that would have been incurred if the predicate that the tuple fails (assuming there is one) had been tested first.
A variant of the algorithm for the first problem solves this problem in time O(n^2).
This is joint work with
Anne Condon, Amol Deshpande,
and Ning Wu.