
Infection status outcome, machine learning method and virus type interact to affect the optimised prediction of hepatitis virus immunoassay results from routine pathology laboratory assays in unbalanced data

Richardson, Alice; Lidbury, Brett


dc.contributor.author: Richardson, Alice
dc.contributor.author: Lidbury, Brett
dc.date.accessioned: 2015-12-01T02:58:25Z
dc.date.available: 2015-12-01T02:58:25Z
dc.identifier.issn: 1471-2105
dc.identifier.uri: http://hdl.handle.net/1885/16923
dc.description.abstract: BACKGROUND Advanced data mining techniques such as decision trees have been successfully used to predict a variety of outcomes in complex medical environments. Furthermore, previous research has shown that combining the results of a set of individually trained trees into an ensemble-based classifier can improve overall classification accuracy. This paper investigates the effect of data pre-processing, the use of ensembles constructed by bagging, and a simple majority vote to combine classification predictions from routine pathology laboratory data, particularly to overcome a large imbalance of negative Hepatitis B virus (HBV) and Hepatitis C virus (HCV) cases versus HBV or HCV immunoassay positive cases. These methods were illustrated using a never before analysed data set from ACT Pathology (Canberra, Australia) relating to HBV and HCV patients. RESULTS It was easier to predict immunoassay positive cases than negative cases of HBV or HCV. While applying an ensemble-based approach rather than a single classifier had a small positive effect on the accuracy rate, this also varied depending on the virus under analysis. Finally, scaling data before prediction also has a small positive effect on the accuracy rate for this dataset. A graphical analysis of the distribution of accuracy rates across ensembles supports these findings. CONCLUSIONS Laboratories looking to include machine learning as part of their decision support processes need to be aware that the infection outcome, the machine learning method used and the virus type interact to affect the enhanced laboratory diagnosis of hepatitis virus infection, as determined by primary immunoassay data in concert with multiple routine pathology laboratory variables. This awareness will lead to the informed use of existing machine learning methods, thus improving the quality of laboratory diagnosis via informatics analyses.
dc.description.sponsorship: The project was funded by The Medical Advances Without Animals Trust (MAWA).
dc.publisher: BioMed Central
dc.rights: © 2013 Richardson and Lidbury; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
dc.source: BMC Bioinformatics
dc.subject: decision support techniques
dc.subject: decision trees
dc.subject: hepacivirus
dc.subject: hepatitis b
dc.subject: hepatitis b virus
dc.subject: hepatitis c
dc.subject: humans
dc.subject: immunoassay
dc.subject: immunologic tests
dc.subject: artificial intelligence
dc.title: Infection status outcome, machine learning method and virus type interact to affect the optimised prediction of hepatitis virus immunoassay results from routine pathology laboratory assays in unbalanced data
dc.type: Journal article
local.description.notes: Imported from ARIES
local.identifier.citationvolume: 14
dc.date.issued: 2013-06-25
local.identifier.absfor: 110309
local.identifier.absfor: 010402
local.identifier.ariespublication: f5625xPUB4161
local.type.status: Published Version
local.contributor.affiliation: Richardson, Alice M, University of Canberra, Australia
local.contributor.affiliation: Lidbury, Brett, College of Medicine, Biology and Environment, CMBE John Curtin School of Medical Research, Genome Sciences, The Australian National University
local.identifier.essn: 1471-2105
local.bibliographicCitation.issue: 1
local.bibliographicCitation.startpage: 206
local.bibliographicCitation.lastpage: 8
local.identifier.doi: 10.1186/1471-2105-14-206
local.identifier.absseo: 920109
local.identifier.absseo: 970111
dc.date.updated: 2015-12-11T08:45:50Z
local.identifier.scopusID: 2-s2.0-84879196286
local.identifier.thomsonID: 000321136900001
Collections: ANU Research Publications
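
The abstract for this record describes combining individually trained trees, built by bagging, through a simple majority vote, and comparing accuracy for positive and negative cases in imbalanced data. The following is a minimal sketch of that general idea only, not the authors' implementation: the one-split "stump" learner, the function names (fit_stump, bagged_stumps, majority_vote, per_class_accuracy), the ensemble size of 25 and the toy data are all assumptions made for illustration, and the numbers it prints say nothing about the published results.

```python
import random
from collections import Counter


def fit_stump(rows, labels):
    """Fit a one-split tree: pick the feature/threshold with the fewest training errors."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            for above in (0, 1):  # class predicted when row[f] > t
                errors = sum(
                    (above if r[f] > t else 1 - above) != y
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, above)
    return best[1:]  # (feature index, threshold, class above threshold)


def predict_stump(stump, row):
    f, t, above = stump
    return above if row[f] > t else 1 - above


def bagged_stumps(rows, labels, n_models, rng):
    """Bagging: train each stump on a bootstrap resample drawn with replacement."""
    n = len(rows)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return models


def majority_vote(models, row):
    """Combine member predictions by simple majority (odd n_models avoids ties)."""
    return Counter(predict_stump(m, row) for m in models).most_common(1)[0][0]


def per_class_accuracy(models, rows, labels):
    """Accuracy per class, so the majority class cannot mask the minority class."""
    hits, totals = Counter(), Counter()
    for row, y in zip(rows, labels):
        totals[y] += 1
        hits[y] += majority_vote(models, row) == y
    return {c: hits[c] / totals[c] for c in totals}


# Invented, imbalanced toy data: 90 negatives (label 0) and 10 positives (label 1).
rng = random.Random(0)
rows = [[rng.uniform(0, 1), rng.uniform(0, 1)] for _ in range(90)]
rows += [[rng.uniform(2, 3), rng.uniform(0, 1)] for _ in range(10)]
labels = [0] * 90 + [1] * 10

models = bagged_stumps(rows, labels, n_models=25, rng=rng)
print(per_class_accuracy(models, rows, labels))
```

Reporting accuracy separately for each class is the design choice that matters here: with a 9:1 imbalance, a classifier that always predicts "negative" would score 90% overall while missing every positive case.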

Download

File: 01_Richardson_Infection_status_outcome,_2013.pdf
Description: Published Version
Size: 365.86 kB
Format: Adobe PDF


Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.

Updated: 19 May 2020 / Responsible Officer: University Librarian / Page Contact: Library Systems & Web Coordinator