Correlation immune functions and learning

Correlation immune functions and learning

Lisa Hellerstein

A Boolean function f is correlation immune if each input variable of f is

independent of the output of f, with respect to the uniform distribution

on the domain of f. The parity function is correlation immune, as

is the Boolean function f = 1 if all input variables of f have equal values.

Correlation immune functions have been extensively studied in

cryptography; the lack of correlation between individual inputs

and output is a desirable property of an encoding function.

Correlation immune functions pose a problem to standard machine

learning algorithms, though, since this same lack of correlation makes

it more difficult to distinguish between essential (relevant) and inessential variables

of a target function.

We begin by reviewing previous results on correlation immune functions.

We then address the problem of identifying essential variables of correlation immune functions.

This is a difficult task if only uniform random examples of the target function are available.

Using a new structural lemma on Boolean functions, we show how access to examples from

non-uniform distributions can be exploited in order to efficiently accomplish the task.