Bayesian treatment of incomplete discrete data applied to mutual information and feature selection
Authors
Hutter, Marcus
Zaffalon, Marco
Publisher
Springer Verlag
Abstract
Given the joint chances of a pair of random variables one can
compute quantities of interest, like the mutual information. The Bayesian
treatment of unknown chances involves computing, from a second-order
prior distribution and the data likelihood, a posterior distribution of the
chances. A common treatment of incomplete data is to assume ignorability
and determine the chances by the expectation maximization (EM)
algorithm. These two methods are well established but are typically
treated separately. This paper joins the two approaches in the case of
Dirichlet priors, and derives efficient approximations for the mean, mode
and the (co)variance of the chances and the mutual information. Furthermore,
we prove the unimodality of the posterior distribution, whence the
important property of convergence of EM to the global maximum in the
chosen framework. These results are applied to the problem of selecting
features for incremental learning and naive Bayes classification. A fast
filter based on the distribution of mutual information is shown to outperform
the traditional filter based on empirical mutual information on
a number of incomplete real data sets.
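
To make the contrast between empirical mutual information and its posterior distribution concrete, here is a minimal Python sketch. It is not the method of the paper: it covers only complete data, and it uses plain Monte Carlo sampling from a Dirichlet posterior in place of the analytical approximations the abstract describes. The function names, the symmetric per-cell prior count `prior`, and the example count table are all assumptions chosen for illustration.

```python
import math
import random


def mutual_information(joint):
    """Mutual information (in nats) of a joint probability table joint[i][j]."""
    row = [sum(r) for r in joint]          # marginal of the first variable
    col = [sum(c) for c in zip(*joint)]    # marginal of the second variable
    mi = 0.0
    for i, r in enumerate(joint):
        for j, p in enumerate(r):
            if p > 0:                      # 0 * log 0 is taken as 0
                mi += p * math.log(p / (row[i] * col[j]))
    return mi


def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def posterior_mi_samples(counts, prior=1.0, n_samples=2000, seed=0):
    """Monte Carlo samples of mutual information under a Dirichlet posterior.

    Assumption: a symmetric Dirichlet prior with `prior` pseudo-counts per
    cell, and complete data only (no missing values, hence no EM step).
    """
    rng = random.Random(seed)
    n_cols = len(counts[0])
    # Posterior Dirichlet parameters: observed count plus prior count per cell.
    alphas = [n + prior for r in counts for n in r]
    samples = []
    for _ in range(n_samples):
        flat = sample_dirichlet(alphas, rng)
        joint = [flat[k:k + n_cols] for k in range(0, len(flat), n_cols)]
        samples.append(mutual_information(joint))
    return samples


# Hypothetical 2x2 table of co-occurrence counts (feature value x class).
counts = [[30, 10], [10, 30]]
n = sum(map(sum, counts))
empirical = mutual_information([[c / n for c in r] for r in counts])
samples = posterior_mi_samples(counts)
posterior_mean = sum(samples) / len(samples)
print("empirical MI:", empirical)
print("posterior mean MI:", posterior_mean)
```

A filter in the spirit of the abstract would rank features by a summary of `samples` (for example a lower quantile) rather than by the single point estimate `empirical`; that choice of summary is again an assumption of this sketch.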
Book Title
KI 2003: Advances in Artificial Intelligence: 26th Annual German Conference on AI, KI 2003, Hamburg, Germany, September 15-18, 2003. Proceedings
Access Statement
Open Access