Correlation immune functions and learning
Lisa Hellerstein
A Boolean function f is correlation immune if each input variable of f is
independent of the output of f, with respect to the uniform distribution
on the domain of f. The parity function is correlation immune, as
is the Boolean function f = 1 if all input variables of f have equal values.
Correlation immune functions have been extensively studied in
cryptography; the lack of correlation between individual inputs
and output is a desirable property of an encoding function.
Correlation immune functions pose a problem to standard machine
learning algorithms, though, since this same lack of correlation makes
it more difficult to distinguish between essential (relevant) and inessential variables
of a target function.
We begin by reviewing previous results on correlation immune functions.
We then address the problem of identifying essential variables of correlation immune functions.
This is a difficult task if only uniform random examples of the target function are available.
Using a new structural lemma on Boolean functions, we show how access to examples from
non-uniform distributions can be exploited in order to efficiently accomplish the task.