Hybrid random forests: Advantages of mixed trees in classifying text data

dc.contributor.author: Xu, Baoxun [en]
dc.contributor.author: Huang, Joshua Zhexue [en]
dc.contributor.author: Williams, Graham [en]
dc.contributor.author: Li, Mark Junjie [en]
dc.contributor.author: Ye, Yunming [en]
dc.date.accessioned: 2025-06-24T02:36:20Z
dc.date.available: 2025-06-24T02:36:20Z
dc.date.issued: 2012 [en]
dc.description.abstract: Random forests are a popular classification method based on an ensemble of a single type of decision tree. In the literature, there are many different types of decision tree algorithms, including C4.5, CART and CHAID. Each type of decision tree algorithm may capture different information and structures. In this paper, we propose a novel random forest algorithm, called a hybrid random forest. We ensemble multiple types of decision trees into a random forest, and exploit the diversity of the trees to enhance the resulting model. We conducted a series of experiments on six text classification datasets to compare our method with traditional random forest methods and some other text categorization methods. The results show that our method consistently outperforms these compared methods. [en]
dc.description.status: Peer-reviewed [en]
dc.format.extent: 12 [en]
dc.identifier.isbn: 9783642302169 [en]
dc.identifier.issn: 0302-9743 [en]
dc.identifier.other: ORCID:/0000-0001-7041-4127/work/162449858 [en]
dc.identifier.scopus: 84861442725 [en]
dc.identifier.uri: http://www.scopus.com/inward/record.url?scp=84861442725&partnerID=8YFLogxK [en]
dc.identifier.uri: https://hdl.handle.net/1885/733764648
dc.language.iso: en [en]
dc.relation.ispartof: Advances in Knowledge Discovery and Data Mining - 16th Pacific-Asia Conference, PAKDD 2012, Proceedings [en]
dc.relation.ispartofseries: 16th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2012 [en]
dc.relation.ispartofseries: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) [en]
dc.relation.isversionof: PART 1 [en]
dc.subject: Classification [en]
dc.subject: Decision Tree [en]
dc.subject: Hybrid Random Forest [en]
dc.subject: Random Forests [en]
dc.title: Hybrid random forests: Advantages of mixed trees in classifying text data [en]
dc.type: Conference paper [en]
dspace.entity.type: Publication [en]
local.bibliographicCitation.lastpage: 158 [en]
local.bibliographicCitation.startpage: 147 [en]
local.contributor.affiliation: Xu, Baoxun; Harbin Institute of Technology [en]
local.contributor.affiliation: Huang, Joshua Zhexue; Shenzhen Institute of Advanced Technology [en]
local.contributor.affiliation: Williams, Graham; Shenzhen Institute of Advanced Technology [en]
local.contributor.affiliation: Li, Mark Junjie; Shenzhen Institute of Advanced Technology [en]
local.contributor.affiliation: Ye, Yunming; Harbin Institute of Technology [en]
local.identifier.ariespublication: u3968803xPUB67 [en]
local.identifier.doi: 10.1007/978-3-642-30217-6_13 [en]
local.identifier.essn: 1611-3349 [en]
local.identifier.pure: 65999baf-5092-4bf6-8e19-52c4e6fbb0be [en]
local.identifier.url: https://www.scopus.com/pages/publications/84861442725 [en]
local.type.status: Published [en]
