A Submodularity-based Clustering Algorithm for the Information Bottleneck and Privacy Funnel
Date
Authors
Ding, Ni
Sadeghi, Parastoo
Journal Title
Journal ISSN
Volume Title
Publisher
Institute of Electrical and Electronics Engineers Inc.
Access Statement
Abstract
For the relevant data S that nests in the observation X, the information bottleneck (IB) aims to encode X into X in order to maximize the extracted useful information I(S;;X) with the minimum coding rate I(X;;X). For the dual privacy tunnel (PF) problem where S denotes the sensitive/private data, the goal is to minimize the privacy leakage I(S;X) while maintain a certain level of utility I(X;;X). For both problems, we propose an efficient iterative agglomerative clustering algorithm based on the minimization of the difference of submodular functions (IAC-MDSF). It starts with the original alphabet ;X:= X and iteratively merges the elements in the current alphabet\;X that optimizes the Lagrangian function I(S;;X-λ I(X;X). We prove that the best merge in each iteration of IAC-MDSF can be searched efficiently over all subsets of ;X by the existing MDSF algorithms. By varying the value of the Lagrangian multiplier λ, we obtain the experimental results on a heart disease data set in terms of the Pareto frontier: I(S;;X) vs.-I(X;;X). We show that our IAC-MDSF algorithm outperforms the existing iterative pairwise merge approaches for both PF and IB and is computationally much less complex.
Description
Keywords
Citation
Collections
Source
Type
Book Title
2019 IEEE Information Theory Workshop, ITW 2019
Entity type
Publication