Distribution of mutual information
Abstract
The mutual information of two random variables ı and ȷ with joint probabilities
{πij} is commonly used in learning Bayesian nets as well as in many other fields.
The chances πij are usually estimated by the empirical sampling frequency nij/n
leading to a point estimate I(nij/n) for the mutual information. To answer questions
like “is I(nij/n) consistent with zero?” or “what is the probability that the true
mutual information is much larger than the point estimate?” one has to go beyond
the point estimate. In the Bayesian framework one can answer these questions
by utilizing a (second order) prior distribution p(π) comprising prior information
about π. From the prior p(π) one can compute the posterior p(π|n), from which the
distribution p(I|n) of the mutual information can be calculated. We derive reliable
and quickly computable approximations for p(I|n). We concentrate on the mean,
variance, skewness, and kurtosis, and non-informative priors. For the mean we also
give an exact expression. Numerical issues and the range of validity are discussed.
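
The pipeline described in the abstract (counts, then posterior p(π|n), then distribution p(I|n)) can be illustrated with a brute-force Monte Carlo sketch. This is not the analytic approximation the paper derives; it only shows what quantity those approximations target. The assumptions here are a symmetric Dirichlet prior with a hypothetical pseudo-count `alpha` (1.0, i.e. uniform, chosen only for illustration), natural logarithms, and the function names, all of which are invented for this example.

```python
import math
import random


def mutual_information(p):
    """Mutual information (in nats) of a joint probability table p[i][j]."""
    row = [sum(r) for r in p]          # marginal of the first variable
    col = [sum(c) for c in zip(*p)]    # marginal of the second variable
    mi = 0.0
    for i, r in enumerate(p):
        for j, pij in enumerate(r):
            if pij > 0.0:              # 0 * log(0) is taken as 0
                mi += pij * math.log(pij / (row[i] * col[j]))
    return mi


def point_estimate(counts):
    """Plug-in estimate I(n_ij / n) from a table of counts."""
    n = sum(map(sum, counts))
    return mutual_information([[c / n for c in r] for r in counts])


def posterior_samples(counts, alpha=1.0, draws=2000, seed=0):
    """Sample I under a Dirichlet(n_ij + alpha) posterior (assumed prior).

    A Dirichlet draw is obtained by normalising independent Gamma draws.
    """
    rng = random.Random(seed)
    rows, cols = len(counts), len(counts[0])
    shape = [c + alpha for r in counts for c in r]
    samples = []
    for _ in range(draws):
        g = [rng.gammavariate(a, 1.0) for a in shape]
        total = sum(g)
        table = [[g[i * cols + j] / total for j in range(cols)]
                 for i in range(rows)]
        samples.append(mutual_information(table))
    return samples


counts = [[30, 10], [10, 30]]          # made-up example data
samples = posterior_samples(counts)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)
print("point estimate:", point_estimate(counts))
print("posterior mean:", mean, "posterior variance:", var)
```

A question such as "is I(nij/n) consistent with zero?" can then be approached by checking how much of the sampled mass lies near zero, for example `sum(s < 0.01 for s in samples) / len(samples)`, where the threshold is again an arbitrary illustrative choice.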
Book Title
Advances in Neural Information Processing Systems: Proceedings of the 2002 Conference