Distribution of Mutual Information
Title: Distribution of Mutual Information
Research Question: How can we determine the distribution of mutual information, a measure of the stochastic dependence between categorical random variables, when considering complete and incomplete data?
Methodology: The researchers used a Bayesian framework with a second-order Dirichlet prior distribution to study the posterior distribution of mutual information. They derived the exact analytical expression for its mean and analytical approximations for its variance, skewness, and kurtosis. They considered both complete and incomplete samples; for incomplete samples they developed leading-order approximations for the mean and variance.
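As an illustration of the exact mean mentioned above, the following Python sketch evaluates the closed form usually quoted for this setting with complete data: E[I] = (1/n) * sum_ij n_ij * [psi(n_ij + 1) - psi(n_i+ + 1) - psi(n_+j + 1) + psi(n + 1)], where psi is the digamma function, n_ij is the observed count plus a Dirichlet pseudo-count, and n_i+, n_+j, n are the row, column, and grand totals. The formula is an assumption brought in for the example, not something stated in this summary. The function names, the uniform pseudo-count `prior`, and the hand-written `digamma` helper are hypothetical choices.

```python
import math


def digamma(x):
    """Digamma function psi(x) for x > 0 (recurrence plus asymptotic series)."""
    if x <= 0:
        raise ValueError("x must be positive")
    result = 0.0
    # Use psi(x) = psi(x + 1) - 1/x to shift x up until the series is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def plugin_mi(counts):
    """Descriptive (plug-in) mutual information in nats from raw counts."""
    row_tot = [sum(row) for row in counts]
    col_tot = [sum(col) for col in zip(*counts)]
    n = sum(row_tot)
    mi = 0.0
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            if c > 0:
                mi += (c / n) * math.log(c * n / (row_tot[i] * col_tot[j]))
    return mi


def posterior_mean_mi(counts, prior=1.0):
    """Posterior mean of mutual information in nats (assumed closed form).

    counts: list of rows of observed joint counts.
    prior:  Dirichlet pseudo-count added to every cell (an assumption here).
    """
    n_ij = [[c + prior for c in row] for row in counts]
    row_tot = [sum(row) for row in n_ij]
    col_tot = [sum(col) for col in zip(*n_ij)]
    n = sum(row_tot)
    total = 0.0
    for i, row in enumerate(n_ij):
        for j, nij in enumerate(row):
            total += nij * (digamma(nij + 1) - digamma(row_tot[i] + 1)
                            - digamma(col_tot[j] + 1) + digamma(n + 1))
    return total / n


if __name__ == "__main__":
    balanced = [[10, 10], [10, 10]]
    print("plug-in MI:       ", plugin_mi(balanced))
    print("posterior mean MI:", posterior_mean_mi(balanced))
```

On a perfectly balanced table the plug-in estimate is exactly zero while the posterior mean is slightly positive, which shows the kind of gap between descriptive and inductive mutual information that the paper addresses.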
Results: The researchers obtained an exact expression for the mean and analytical approximations for the variance, skewness, and kurtosis of mutual information. These approximations have a guaranteed accuracy of order O(n^-3), where n is the sample size. For incomplete samples, they obtained leading-order approximations for the mean and variance.
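As a rough illustration of what such an approximation looks like in practice, here is a minimal Python sketch of only the leading-order term of the posterior variance for complete data. It assumes the form Var[I] ~ (K - J^2) / (n + 1), where J = sum_ij (n_ij/n) * log(n_ij * n / (n_i+ * n_+j)) and K is the same sum with the logarithm squared. It does not reproduce the higher-order approximations with O(n^-3) accuracy described above, nor the incomplete-data case. The function name and the `prior` pseudo-count are hypothetical.

```python
import math


def mi_variance_leading_order(counts, prior=1.0):
    """Leading-order posterior variance of mutual information (assumed form).

    counts: list of rows of observed joint counts.
    prior:  Dirichlet pseudo-count added to every cell (an assumption here).
    """
    n_ij = [[c + prior for c in row] for row in counts]
    row_tot = [sum(row) for row in n_ij]
    col_tot = [sum(col) for col in zip(*n_ij)]
    n = sum(row_tot)
    j_sum = 0.0  # J: mean of the log-ratio under the cell weights n_ij / n
    k_sum = 0.0  # K: mean of the squared log-ratio under the same weights
    for i, row in enumerate(n_ij):
        for j, nij in enumerate(row):
            if nij > 0:
                log_ratio = math.log(nij * n / (row_tot[i] * col_tot[j]))
                j_sum += (nij / n) * log_ratio
                k_sum += (nij / n) * log_ratio * log_ratio
    return (k_sum - j_sum * j_sum) / (n + 1)


if __name__ == "__main__":
    small = [[30, 10], [10, 30]]
    large = [[300, 100], [100, 300]]
    print("variance, n = 80: ", mi_variance_leading_order(small))
    print("variance, n = 800:", mi_variance_leading_order(large))
```

The leading term shrinks roughly like 1/n as the sample grows, and it vanishes when the counts look exactly independent, in which case higher-order terms would be needed.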
Implications: The derived analytical expressions allow the distribution of mutual information to be approximated reliably and quickly. This makes the distribution of mutual information (inductive mutual information) a concrete alternative to descriptive mutual information in many applications that would benefit from moving to the inductive side. The researchers also discussed prospective applications, such as filter approaches to feature selection, which could be improved by using inductive mutual information.
Link to Article: https://arxiv.org/abs/0403025v1
Authors:
arXiv ID: 0403025v1