Van Lissa, Caspar J.; Stroebe, Wolfgang; vanDellen, Michelle R.; Leander, N. Pontus; Agostini, Maximilian; Draws, Tim; Grygoryshyn, Andrii; Gützgow, Ben; Kreienkamp, Jannis; Vetter, Clara S.; Abakoumkin, Georgios; Abdul Khaiyom, Jamilah Hanum; Ahmedi, Vjolica; Akkas, Handan; Almenara, Carlos A.; Atta, Mohsin; Bagci, Sabahat Cigdem; Basel, Sima; Kida, Edona Berisha; Bernardo, Allan B.I.; Buttrick, Nicholas R.; Chobthamkit, Phatthanakit; Choi, Hoon Seok; Cristea, Mioara; Csaba, Sára; Damnjanović, Kaja; Danyliuk, Ivan; Dash, Arobindu; Di Santo, Daniela; Douglas, Karen M.; Enea, Violeta; Faller, Daiane Gracieli; Fitzsimons, Gavan J.; Gheorghiu, Alexandra; Gómez, Ángel; Hamaidia, Ali; Han, Qing; Helmy, Mai; Hudiyana, Joevarian; Jeronimus, Bertus F.; Jiang, Ding Yu; Jovanović, Veljko; Kamenov, Željka; Kende, Anna; Keng, Shian Ling; Thanh Kieu, Tra Thi; Koc, Yasin; Kovyazina, Kamila; Kozytska, Inna; Ryan, Michelle K.

2025-05-31
2025-05-31
ORCID:/0000-0003-1091-9275/work/177036651
http://www.scopus.com/inward/record.url?scp=85127500709&partnerID=8YFLogxK
https://hdl.handle.net/1885/733756001

Before vaccines for coronavirus disease 2019 (COVID-19) became available, a set of infection-prevention behaviors constituted the primary means to mitigate the spread of the virus. Our study aimed to identify important predictors of this set of behaviors. Whereas social and health psychological theories suggest a limited set of predictors, machine-learning analyses can identify correlates from a larger pool of candidate predictors. We used random forests to rank 115 candidate correlates of infection-prevention behavior in survey data from 56,072 participants across 28 countries, collected from March to May 2020. The machine-learning model predicted 52% of the variance in infection-prevention behavior in a separate test sample, exceeding the performance of psychological models of health behavior. The two most important predictors both related to individual-level injunctive norms. Illustrating how data-driven methods can complement theory, some of the most important predictors were not derived from theories of health behavior, and some theoretically derived predictors were relatively unimportant.

The lead author was funded by an NWO Veni grant (NWO Grant Number VI.Veni.191G.090). This research received support from the New York University Abu Dhabi (VCDSF/75-71015), the University of Groningen (Sustainable Society & Ubbo Emmius Fund), and the Instituto de Salud Carlos III (COV20/00086), co-funded by the European Regional Development Fund (ERDF) “A way to make Europe.”

14
en
Publisher Copyright: © 2022 The Author(s)

COVID-19; DSML2: Proof-of-concept: Data science output has been formulated, implemented, and tested for one domain/problem; health behaviors; machine learning; public goods dilemma; random forest; social norms

Using machine learning to identify important predictors of COVID-19 infection prevention behaviors during the early phase of the pandemic
2022-04-08
10.1016/j.patter.2022.100482
85127500709
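
The abstract describes ranking candidate predictors with random forests and reporting the variance explained in a separate test sample. The general idea can be sketched as below. This is a toy, pure-Python illustration on synthetic data, not the authors' pipeline, data, or software; every function name, parameter value, and the simulated outcome are assumptions made for the example, and permutation importance is used here as one common way to rank predictors.

```python
import random

def avg(v):
    return sum(v) / len(v)

def sse(v):
    """Sum of squared deviations from the mean."""
    m = avg(v)
    return sum((a - m) ** 2 for a in v)

def grow(X, y, rng, depth, max_depth=5, min_leaf=5):
    """Grow one regression tree; each split tries a random half of the features."""
    if depth == max_depth or len(y) < 2 * min_leaf:
        return avg(y)
    best = None
    for j in rng.sample(range(len(X[0])), max(1, len(X[0]) // 2)):
        col = [row[j] for row in X]
        for t in rng.sample(col, min(10, len(col))):  # a few candidate thresholds
            left = [v for c, v in zip(col, y) if c <= t]
            right = [v for c, v in zip(col, y) if c > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:
        return avg(y)
    _, j, t = best
    li = [i for i, row in enumerate(X) if row[j] <= t]
    ri = [i for i, row in enumerate(X) if row[j] > t]
    return (j, t,
            grow([X[i] for i in li], [y[i] for i in li], rng, depth + 1, max_depth, min_leaf),
            grow([X[i] for i in ri], [y[i] for i in ri], rng, depth + 1, max_depth, min_leaf))

def predict_tree(node, row):
    while isinstance(node, tuple):  # internal nodes are tuples, leaves are floats
        j, t, left, right = node
        node = left if row[j] <= t else right
    return node

def fit_forest(X, y, n_trees=30, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in y]  # bootstrap resample of the rows
        forest.append(grow([X[i] for i in idx], [y[i] for i in idx], rng, 0))
    return forest

def predict(forest, X):
    return [sum(predict_tree(t, row) for t in forest) / len(forest) for row in X]

def r2(y_true, y_pred):
    """Proportion of variance explained."""
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    return 1 - ss_res / sse(y_true)

def permutation_importance(forest, X, y, seed=1):
    """Drop in R^2 when one predictor's values are shuffled across rows."""
    rng = random.Random(seed)
    base = r2(y, predict(forest, X))
    drops = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        rng.shuffle(col)
        Xp = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
        drops.append(base - r2(y, predict(forest, Xp)))
    return drops

# Synthetic data (assumption): 6 candidate predictors, only the first two matter.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1) for _ in range(6)] for _ in range(400)]
y = [2 * r[0] + r[1] + data_rng.gauss(0, 0.5) for r in X]
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]

forest = fit_forest(X_train, y_train)
test_r2 = r2(y_test, predict(forest, X_test))          # variance explained out of sample
importance = permutation_importance(forest, X_test, y_test)
ranking = sorted(range(6), key=lambda j: importance[j], reverse=True)
print("held-out R^2:", round(test_r2, 2))
print("predictors ranked by importance:", ranking)
```

Evaluating both the fit and the importance scores on rows the forest never saw mirrors the abstract's use of a separate test sample. In practice an established random forest library would replace the hand-written trees.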