RUTCOR Colloquia - May 5, 2005
Speaker: Mahesh Kumar
Affiliation: Rutgers Business School, Rutgers University
Title: Clustering of Statistical Model Parameters Using Error-based Clustering
Time: 1:30 - 2:30 PM
Location: RUTCOR Building - Room 139, Rutgers University, Busch Campus, Piscataway, NJ
Abstract:
Traditional clustering methods assume that there is no measurement error, or
uncertainty, associated with data. Often, however, real-world applications
require treatment of data that have such errors. In the presence of
measurement errors, well-known clustering methods like k-means and hierarchical
clustering may not produce satisfactory results. We have developed a new
clustering method that explicitly incorporates error information in the
clustering process.
In this talk I will discuss types of clustering problems where error
information associated with the data to be clustered is readily available and
where error-based clustering is likely to be superior to clustering methods
that ignore error. I will focus on clustering of derived data (typically
parameter estimates) obtained by fitting statistical models to the observed
data. We show that, for Gaussian-distributed observed data, the optimal
error-based clusters of derived data are the same as the maximum likelihood
clusters of the observed data. I will also report briefly on a series of
empirical studies using four statistical models: (1) sample averaging,
(2) multiple linear regression, (3) ARIMA time-series, and (4) Markov chains.
Our empirical studies suggest that error-based clustering performs
significantly better than traditional clustering methods on these applications.
Joint work with Nitin R. Patel of MIT.
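
Illustrative sketch (not taken from the talk): the abstract does not spell out
the algorithm, so the following is only one plausible way error information
could enter a k-means-style procedure. It assumes each derived data point comes
with known, independent Gaussian error variances per coordinate (a diagonal
error covariance). Distances are scaled by each point's own error variances,
and cluster centers are inverse-variance weighted means. The function name
error_weighted_kmeans and its parameters are hypothetical.

```python
import numpy as np

def error_weighted_kmeans(x, var, k, n_iter=100, seed=0):
    """Hypothetical sketch: k-means-style clustering with per-point error variances.

    x   : (n, d) array of derived data (e.g. parameter estimates)
    var : (n, d) array of error variances for each coordinate of each point
    k   : number of clusters
    """
    x = np.asarray(x, dtype=float)
    var = np.asarray(var, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumption: initialise centers at k distinct data points chosen at random.
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    labels = np.full(len(x), -1)
    for _ in range(n_iter):
        # Squared distance of point i to center j, with each coordinate
        # scaled by that point's own error variance.
        diff = x[:, None, :] - centers[None, :, :]           # (n, k, d)
        dist = np.sum(diff ** 2 / var[:, None, :], axis=2)   # (n, k)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            mask = labels == j
            if mask.any():
                # Inverse-variance weighted mean: precise points count more.
                w = 1.0 / var[mask]
                centers[j] = (w * x[mask]).sum(axis=0) / w.sum(axis=0)
    return labels, centers
```

With a single cluster, the center reduces to the familiar inverse-variance
weighted mean, so a point measured with variance 1 pulls the center four times
as hard as a point measured with variance 4.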