An AI System Evaluation Framework for Advancing AI Safety: Terminology, Taxonomy, Lifecycle Mapping

Field | Value | Language
dc.contributor.author | Xia, Boming | en
dc.contributor.author | Lu, Qinghua | en
dc.contributor.author | Zhu, Liming | en
dc.contributor.author | Xing, Zhenchang | en
dc.date.accessioned | 2025-05-23T06:26:49Z |
dc.date.available | 2025-05-23T06:26:49Z |
dc.date.issued | 2024-07-10 | en
dc.description.abstract | The advent of advanced AI underscores the urgent need for comprehensive safety evaluations, necessitating collaboration across communities (i.e., AI, software engineering, and governance). However, divergent practices and terminologies across these communities, combined with the complexity of AI systems - of which models are only a part - and environmental affordances (e.g., access to tools), obstruct effective communication and comprehensive evaluation. This paper proposes a framework for AI system evaluation comprising three components: 1) harmonised terminology to facilitate communication across communities involved in AI safety evaluation; 2) a taxonomy identifying essential elements for AI system evaluation; 3) a mapping between AI lifecycle, stakeholders, and requisite evaluations for accountable AI supply chain. This framework catalyses a deeper discourse on AI system evaluation beyond model-centric approaches. | en
dc.description.status | Peer-reviewed | en
dc.format.extent | 5 | en
dc.identifier.isbn | 9798400706851 | en
dc.identifier.scopus | 85199903661 | en
dc.identifier.uri | http://www.scopus.com/inward/record.url?scp=85199903661&partnerID=8YFLogxK | en
dc.identifier.uri | https://hdl.handle.net/1885/733751677 |
dc.language.iso | en | en
dc.provenance | This work is licensed under a Creative Commons Attribution 4.0 International License. | en
dc.publisher | Association for Computing Machinery (ACM) | en
dc.relation.ispartof | AIware 2024 - Proceedings of the 1st ACM International Conference on AI-Powered Software, Co-located with: ESEC/FSE 2024 | en
dc.relation.ispartofseries | 1st ACM International Conference on AI-Powered Software, AIware 2024, co-located with the ACM International Conference on the Foundations of Software Engineering, FSE 2024 | en
dc.relation.ispartofseries | AIware 2024 - Proceedings of the 1st ACM International Conference on AI-Powered Software, Co-located with: ESEC/FSE 2024 | en
dc.rights | © 2024 Owner/Author. | en
dc.subject | AI Safety | en
dc.subject | AI Testing | en
dc.subject | Benchmarking | en
dc.subject | Evaluation | en
dc.subject | Responsible AI | en
dc.title | An AI System Evaluation Framework for Advancing AI Safety: Terminology, Taxonomy, Lifecycle Mapping | en
dc.type | Conference paper | en
dspace.entity.type | Publication | en
local.bibliographicCitation.lastpage | 78 | en
local.bibliographicCitation.startpage | 74 | en
local.contributor.affiliation | Xia, Boming; CSIRO | en
local.contributor.affiliation | Lu, Qinghua; CSIRO | en
local.contributor.affiliation | Zhu, Liming; CSIRO | en
local.contributor.affiliation | Xing, Zhenchang; School of Computing, ANU College of Systems and Society, The Australian National University | en
local.identifier.doi | 10.1145/3664646.3664766 | en
local.identifier.pure | 00a9c4f0-0b41-42e8-8991-784172cb62b2 | en
local.identifier.url | https://www.scopus.com/pages/publications/85199903661 | en
local.type.status | Published | en

Downloads

Original bundle

Name: 3664646.3664766.pdf
Size: 485.36 KB
Format: Adobe Portable Document Format