Evaluation of recombination detection methods for viral sequencing
| dc.contributor.author | Jaya, Frederick | |
| dc.contributor.author | Brito, Barbara P. | |
| dc.contributor.author | Darling, Aaron | |
| dc.date.accessioned | 2024-08-26T05:21:10Z | |
| dc.date.available | 2024-08-26T05:21:10Z | |
| dc.date.issued | 2023 | |
| dc.date.updated | 2024-04-28T08:16:16Z | |
| dc.description.abstract | Recombination is a key evolutionary driver in shaping novel viral populations and lineages. When unaccounted for, recombination can impact evolutionary estimations or complicate their interpretation. Therefore, identifying signals for recombination in sequencing data is a key prerequisite to further analyses. A repertoire of recombination detection methods (RDMs) has been developed over the past two decades; however, the prevalence of pandemic-scale viral sequencing data poses a computational challenge for existing methods. Here, we assessed eight RDMs, namely PhiPack (Profile), 3SEQ, GENECONV, recombination detection program (RDP) (OpenRDP), MaxChi (OpenRDP), Chimaera (OpenRDP), UCHIME (VSEARCH), and gmos, to determine whether any are suitable for the analysis of bulk sequencing data. To test the performance and scalability of these methods, we analysed simulated viral sequencing data across a range of sequence diversities, recombination frequencies, and sample sizes. Furthermore, we provide a practical example for the analysis and validation of empirical data. We find that RDMs need to be scalable, to use an analytical approach and resolution suitable for the intended research application, and to be accurate for the properties of a given dataset (e.g. sequence diversity and estimated recombination frequency). Analysis of simulated and empirical data revealed that the assessed methods exhibited considerable trade-offs between these criteria. Overall, we provide general guidelines for the validation of recombination detection results, the benefits and shortcomings of each assessed method, and future considerations for recombination detection methods for the assessment of large-scale viral sequencing data. | |
| dc.description.sponsorship | This research was supported by the Australian Government Research Training Program. Computational facilities and support were provided by the University of Technology eResearch High Performance Computer Cluster. The authors would like to thank Sebastian Duchene, Cheong Xin Chan, and two anonymous reviewers for their valuable comments and suggestions. | |
| dc.format.mimetype | application/pdf | en_AU |
| dc.identifier.issn | 2057-1577 | |
| dc.identifier.uri | https://hdl.handle.net/1885/733715962 | |
| dc.language.iso | en_AU | en_AU |
| dc.provenance | This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com | |
| dc.publisher | Oxford University Press | |
| dc.rights | © 2023 The authors | |
| dc.rights.license | Creative Commons Attribution-NonCommercial licence | |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc/4.0/ | |
| dc.source | Virus Evolution | |
| dc.subject | recombination detection methods | |
| dc.subject | recombination | |
| dc.subject | bioinformatics | |
| dc.title | Evaluation of recombination detection methods for viral sequencing | |
| dc.type | Journal article | |
| dcterms.accessRights | Open Access | |
| local.bibliographicCitation.issue | 2 | |
| local.bibliographicCitation.lastpage | 14 | |
| local.bibliographicCitation.startpage | 1 | |
| local.contributor.affiliation | Jaya, Frederick, College of Science, ANU | |
| local.contributor.affiliation | Brito, Barbara P., University of Technology Sydney | |
| local.contributor.affiliation | Darling, Aaron, University of Technology Sydney | |
| local.contributor.authoruid | Jaya, Frederick, u1070770 | |
| local.description.notes | Imported from ARIES | |
| local.identifier.absfor | 310400 - Evolutionary biology | |
| local.identifier.absfor | 310200 - Bioinformatics and computational biology | |
| local.identifier.absfor | 310700 - Microbiology | |
| local.identifier.ariespublication | u9511635xPUB2516 | |
| local.identifier.citationvolume | 9 | |
| local.identifier.doi | 10.1093/ve/vead066 | |
| local.publisher.url | https://academic.oup.com/ | |
| local.type.status | Published Version | |
| publicationvolume.volumeNumber | 9 |