Mugen-UMAP: UMAP visualization and clustering of mutated genes in single-cell DNA sequencing data

dc.contributor.author: Li, Teng [en]
dc.contributor.author: Zou, Yiran [en]
dc.contributor.author: Li, Xianghan [en]
dc.contributor.author: Wong, Thomas K.F. [en]
dc.contributor.author: Rodrigo, Allen G. [en]
dc.date.accessioned: 2025-05-23T10:21:01Z
dc.date.available: 2025-05-23T10:21:01Z
dc.date.issued: 2024 [en]
dc.description.abstract: Background: The application of Uniform Manifold Approximation and Projection (UMAP) for dimensionality reduction and visualization has revolutionized the analysis of single-cell RNA expression and population genetics. However, its potential in single-cell DNA sequencing data analysis, particularly for visualizing gene mutation information, has not been fully explored. Results: We introduce Mugen-UMAP, a novel Python-based program that extends UMAP’s utility to single-cell DNA sequencing data. This innovative tool provides a comprehensive pipeline for processing gene annotation files of single-cell somatic single-nucleotide variants and metadata to the visualization of UMAP projections for identifying clusters, along with various statistical analyses. Employing Mugen-UMAP, we analyzed whole-exome sequencing data from 365 single-cell samples across 12 non-small cell lung cancer (NSCLC) patients, revealing distinct clusters associated with histological subtypes of NSCLC. Moreover, to demonstrate the general utility of Mugen-UMAP, we applied the program to 9 additional single-cell WES datasets from various cancer types, uncovering interesting patterns of cell clusters that warrant further investigation. In summary, Mugen-UMAP provides a quick and effective visualization method to uncover cell cluster patterns based on the gene mutation information from single-cell DNA sequencing data. Conclusions: The application of Mugen-UMAP demonstrates its capacity to provide valuable insights into the visualization and interpretation of single-cell DNA sequencing data. Mugen-UMAP can be found at https://github.com/tengchn/Mugen-UMAP [en]
dc.description.sponsorship: We thank Yuantong Ding, Xia Hua, Bui Quang Minh, Imelda Forteza, and Tianshu Yang for participating in our group meetings where these results were discussed. We also thank Michael J. Campa, Elizabeth B. Gottlin, and Edward F. Patz Jr for consultation on various clinical aspects of NSCLC and helpful discussions. We acknowledge the use of New Zealand eScience Infrastructure (NeSI) high performance computing facilities. This work was supported by the start-up funds from the University of Auckland, New Zealand to AR (4020–12090). [en]
dc.description.status: Peer-reviewed [en]
dc.identifier.other: PubMed:39333868 [en]
dc.identifier.other: ORCID:/0000-0002-0580-6324/work/184099386 [en]
dc.identifier.scopus: 85205336332 [en]
dc.identifier.uri: http://www.scopus.com/inward/record.url?scp=85205336332&partnerID=8YFLogxK [en]
dc.identifier.uri: https://hdl.handle.net/1885/733752003
dc.language.iso: en [en]
dc.rights: Publisher Copyright: © The Author(s) 2024. [en]
dc.source: BMC Bioinformatics [en]
dc.subject: Clustering [en]
dc.subject: Gene mutation [en]
dc.subject: Single-cell DNA sequencing [en]
dc.subject: UMAP [en]
dc.subject: Visualization [en]
dc.title: Mugen-UMAP: UMAP visualization and clustering of mutated genes in single-cell DNA sequencing data [en]
dc.type: Journal article [en]
dspace.entity.type: Publication [en]
local.contributor.affiliation: Li, Teng; Division of Ecology and Evolution, Research School of Biology, ANU College of Science and Medicine, The Australian National University [en]
local.contributor.affiliation: Zou, Yiran; Division of Ecology and Evolution, Research School of Biology, ANU College of Science and Medicine, The Australian National University [en]
local.contributor.affiliation: Li, Xianghan; Genome Sciences and Cancer Division, John Curtin School of Medical Research, ANU College of Science and Medicine, The Australian National University [en]
local.contributor.affiliation: Wong, Thomas K.F.; School of Computing, ANU College of Systems and Society, The Australian National University [en]
local.contributor.affiliation: Rodrigo, Allen G.; Administration, Research School of Biology, ANU College of Science and Medicine, The Australian National University [en]
local.identifier.citationvolume: 25 [en]
local.identifier.doi: 10.1186/s12859-024-05928-x [en]
local.identifier.pure: d5dab0e4-4c88-4858-8acb-c168cb0705d6 [en]
local.identifier.url: https://www.scopus.com/pages/publications/85205336332 [en]
local.type.status: Published [en]
