SHIELD: A SHapley and Information-theory based framework for Equitable Learning via Dissimilar feature grouping
Abstract
Machine learning models are increasingly applied in clinical and biomedical settings because of their ability to capture intricate underlying structure, yet their complexity can obscure critical insights and risks propagating biases that disproportionately affect vulnerable populations. This thesis therefore introduces SHIELD: a SHapley and Information-theory based framework for Equitable Learning via Dissimilar feature grouping, which combines dissimilarity-driven feature grouping with interpretable latent representations to mitigate proxy bias and promote equitable learning. By constructing a dissimilarity matrix based on conditional mutual information (CMI), features are grouped so as to weaken correlations that might encode sensitive attributes, reducing redundant signals that can contribute to unfairness. This automated approach is more efficient than clustering similar features and repairing problematic groups post hoc. Group-specific autoencoders learn latent representations that summarise each group's unique information while preserving a decoder-weight mapping back to the original features. This mapping enables a precise decomposition of SHapley Additive exPlanations (SHAP) values, yielding interpretable feature-level attributions despite the dimensionality reduction. Experiments on four benchmark clinical datasets demonstrated that the proposed grouping approaches (greedy, bicriterion, and k-plus anticlustering) achieved notable improvements in fairness metrics and produced more evenly distributed feature importance than raw features and traditional baselines. On average, grouping led to a 9.47% improvement in the distance from the origin of the bias quadrant, a measure that accounts for both explanation and prediction bias; the fairness overview score, which aggregates other standard fairness metrics, improved by 2.42% when features were grouped by dissimilarity.
Although modest reductions of 3.43% in accuracy and 5.16% in F1-score were observed, they remained within acceptable limits for clinical applications, demonstrating the feasibility of this fairness-performance trade-off. Overall, SHIELD provides a principled framework that integrates dissimilarity-based grouping, latent representation learning, and explanation-level auditing to promote equitable and explainable machine learning for health informatics.
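The dissimilarity-based grouping step described in the abstract can be sketched as follows. This is a simplified illustration only, not the thesis's implementation: it uses a histogram-based estimate of unconditional mutual information in place of CMI, and the dissimilarity transform, bin count, greedy assignment rule, and all function names are assumptions chosen for the sketch.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    # Histogram-based MI estimate (in nats) for two 1-D feature arrays.
    # A stand-in for the conditional MI used in SHIELD.
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def dissimilarity_matrix(X, bins=8):
    # D[i, j] is large when features i and j share little information,
    # so grouping by high D keeps redundant (possibly proxy) features apart.
    d = X.shape[1]
    D = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            D[i, j] = D[j, i] = 1.0 / (1.0 + mutual_information(X[:, i], X[:, j], bins))
    return D

def greedy_anticluster(D, n_groups):
    # Greedy anticlustering heuristic: assign each remaining feature to the
    # group whose current members it is most dissimilar to on average,
    # maximising within-group dissimilarity.
    groups = [[g] for g in range(n_groups)]  # seed each group with one feature
    for f in range(n_groups, D.shape[0]):
        scores = [D[f, members].mean() for members in groups]
        groups[int(np.argmax(scores))].append(f)
    return groups
```

In the full framework each resulting group would then be fed to its own autoencoder, with the decoder weights retained so that SHAP values computed on the latent codes can be mapped back to the original features.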
Description
Deposited by the author on 23.03.2026.