Regional Explanations and Diverse Molecular Representations in Cheminformatics: A Comparative Study

dc.contributor.author: Wang, Xin [en]
dc.contributor.author: Barnard, Amanda S. [en]
dc.contributor.author: Li, Sichao [en]
dc.date.accessioned: 2025-12-16T13:40:36Z
dc.date.available: 2025-12-16T13:40:36Z
dc.date.issued: 2025 [en]
dc.description.abstract: In cheminformatics, the explainability of machine learning models is important for interpreting complex chemical data, deriving new chemical insights, and building trust in predictive models. However, cheminformatics datasets often exhibit clustered distributions, while traditional explanation methods might overlook intra-cluster variations and complicate the extraction of meaningful explanations. Additionally, diverse representations (tabular, sequence, image, and graph) yield divergent explanations. To address these issues, we propose a novel approach termed regional explanation, designed as an intermediate-level interpretability method that bridges the gap between local and global explanations. This approach systematically reveals how explanations and feature importance vary across data clusters. Using 2 public datasets, a graphene oxide nanoflakes dataset and QM9, with natural clustering properties, we comprehensively evaluate 4 molecular representations through tabular, sequence, image, and graph regional explanation, providing practical guidelines for representation selection. Our analysis illuminates complex, nonlinear relationships between molecular structures and predicted properties within clusters; explores the interplay among molecular features, feature importance, and target properties across distinct regions of chemical space; and advances the interpretability of machine learning models for complex molecular systems. [en]
dc.description.sponsorship: This work was supported by the Australian National University and the National Computational Infrastructure (grant number p00). [en]
dc.description.status: Peer-reviewed [en]
dc.format.extent: 15 [en]
dc.identifier.other: ORCID:/0000-0002-4784-2382/work/186928866 [en]
dc.identifier.scopus: 105007896810 [en]
dc.identifier.uri: https://hdl.handle.net/1885/733795467
dc.language.iso: en [en]
dc.provenance: Distributed under a Creative Commons Attribution License (CC BY). [en]
dc.rights: © 2025 Xin Wang et al. [en]
dc.source: Intelligent Computing [en]
dc.title: Regional Explanations and Diverse Molecular Representations in Cheminformatics: A Comparative Study [en]
dc.type: Journal article [en]
dspace.entity.type: Publication [en]
local.bibliographicCitation.lastpage: 15 [en]
local.bibliographicCitation.startpage: 1 [en]
local.contributor.affiliation: Wang, Xin; School of Computing, ANU College of Systems and Society, The Australian National University [en]
local.contributor.affiliation: Barnard, Amanda S.; School of Computing, ANU College of Systems and Society, The Australian National University [en]
local.contributor.affiliation: Li, Sichao; School of Computing, ANU College of Systems and Society, The Australian National University [en]
local.identifier.citationvolume: 4 [en]
local.identifier.doi: 10.34133/icomputing.0126 [en]
local.identifier.pure: 311f7bf5-988b-4f1d-9794-bb6e1addeb8f [en]
local.identifier.url: https://www.scopus.com/pages/publications/105007896810 [en]
local.type.status: Published [en]

Downloads

Original bundle

Name: icomputing.0126.pdf
Size: 15.19 MB
Format: Adobe Portable Document Format