A Submodularity-based Clustering Algorithm for the Information Bottleneck and Privacy Funnel

Date

Authors

Ding, Ni
Sadeghi, Parastoo

Journal Title

Journal ISSN

Volume Title

Publisher

IEEE

Abstract

For the relevant data S that nests in the observation X, the information bottleneck (IB) aims to encode X into Xˆ in order to maximize the extracted useful information I(S; Xˆ) with the minimum coding rate I(X; Xˆ). For the dual privacy funnel (PF) problem where S denotes the sensitive/private data, the goal is to minimize the privacy leakage I(S; Xˆ) while maintain a certain level of utility I(X; Xˆ). For both problems, we propose an efficient iterative agglomerative clustering algorithm based on the minimization of the difference of submodular functions (IAC-MDSF). It starts with the original alphabet Xˆ := X and iteratively merges the elements in the current alphabet Xˆ that optimizes the Lagrangian function I(S; Xˆ)−λI(X; Xˆ). We prove that the best merge in each iteration of IAC-MDSF can be searched efficiently over all subsets of Xˆ by the existing MDSF algorithms. By varying the value of the Lagrangian multiplier λ, we obtain the experimental results on a heart disease data set in terms of the Pareto frontier: I(S; Xˆ) vs. −I(X; Xˆ). We show that our IAC-MDSF algorithm outperforms the existing iterative pairwise merge approaches for both PF and IB and is computationally much less complex

Description

Keywords

Citation

Source

Book Title

Entity type

Access Statement

License Rights

Restricted until

2099-12-31