Seminar:November 1, 2007

 

Speaker:Lisa Hellerstein, Polytechnic University, Brooklyn, NY

 

Title:Flow Algorithms for Two Pipelined Filtering Problems

 

Abstract: We give related flow algorithms for two pipelined filter-ordering problems, both motivated by query optimization in databases. This work is also connected to sequential testing, minimum-sum set cover, and learning with attribute costs. The first problem is a throughput maximization problem for filter ordering. The second problem also concerns filter ordering, and can be described in terms of finding an optimal strategy for a player in a particular two-person zero-sum game.

In the first problem, there are n predicates with independent selectivities p_1, ..., p_n. Each predicate s_i is evaluated by a separate processor which can process r_i tuples per second.

Each tuple is routed through the processors until either

(1) it is found not to satisfy a predicate or (2) it is found to satisfy all predicates. The problem is to route tuples through processors so as to maximize the throughput of tuple processing. We present an algorithm that solves this problem in time O(n^2), improving on an earlier O(n^3 \log n) algorithm of Kodialam.

 

In the second problem, there are n predicates s_i, each with a cost c_i. The problem is to determine the (randomized) order in which to apply the predicates to a single tuple until the tuple has been found to fail a predicate. The goal is to minimize multiplicative regret—the ratio between the cost incurred in processing the tuple, and the cost that would have been incurred if the predicate that the tuple fails (assuming there is one) had been tested first. A variant of the algorithm for the first problem solves this problem in time O(n^2).

This is joint work with Anne Condon, Amol Deshpande, and Ning Wu.