Cultural advice

The Australian National University acknowledges, celebrates and pays our respects to the Ngunnawal and Ngambri people of the Canberra region and to all First Nations Australians on whose traditional lands we meet and work, and whose cultures are among the oldest continuing cultures in human history.

Aboriginal and Torres Strait Islander peoples are advised that ANU Library collections may include images, names, voices, and other representations of deceased persons.

Material in the collection may contain terms, language or views that reflect the period in which the item was created and may be considered inappropriate today.

Bayesian treatment of incomplete discrete data applied to mutual information and feature selection

Loading...
Thumbnail Image

Date

Authors

Hutter, Marcus
Zaffalon, Marco

Journal Title

Journal ISSN

Volume Title

Publisher

Springer Verlag

Abstract

Given the joint chances of a pair of random variables one can compute quantities of interest, like the mutual information. The Bayesian treatment of unknown chances involves computing, from a second order prior distribution and the data likelihood, a posterior distribution of the chances. A common treatment of incomplete data is to assume ignorability and determine the chances by the expectation maximization (EM) algorithm. The two different methods above are well established but typically separated. This paper joins the two approaches in the case of Dirichlet priors, and derives efficient approximations for the mean, mode and the (co)variance of the chances and the mutual information. Furthermore, we prove the unimodality of the posterior distribution, whence the important property of convergence of EM to the global maximum in the chosen framework. These results are applied to the problem of selecting features for incremental learning and naive Bayes classification. A fast filter based on the distribution of mutual information is shown to outperform the traditional filter based on empirical mutual information on a number of incomplete real data sets.

Description

Citation

Source

Book Title

KI 2003: Advances in Artificial Intelligence: 26th Annual German Conference on AI, KI 2003, Hamburg, Germany, September 15-18, 2003. Proceedings

Entity type

Access Statement

Open Access

License Rights

Restricted until

abcd