Generalized Maximum Entropy, Convexity and Machine Learning
Abstract
This thesis identifies and extends techniques that can be linked to the principle of maximum entropy (maxent) and applied to parameter estimation in machine learning and statistics. Entropy functions based on deformed logarithms are used to construct Bregman divergences, and together these represent a generalization of relative entropy. The framework is analyzed using convex analysis to charac- terize generalized forms of exponential family distributions. Various connections to the existing machine learning literature are discussed and the techniques are applied to the problem of non-negative matrix factorization (NMF).