Kmer2SNP: Reference-free SNP calling from raw reads based on matching

Authors

Li, Yanbo
Patel, Hardip
Lin, Yu

Publisher

IEEE

Abstract

SNP calling is a fundamental problem in genetic analysis and has many applications, such as gene-disease diagnosis, drug design, and ancestry inference. Prior approaches either require a high-quality reference genome or suffer from low recall, low precision, or high runtime. We develop Kmer2SNP, a reference-free algorithm that calls SNPs directly from raw reads by modeling SNP calling as a maximum weight matching problem. We benchmark Kmer2SNP against other reference-free methods, both hybrid (assembly-based) and assembly-free, on simulated and real datasets. Experimental results show that Kmer2SNP achieves better SNP calling quality while being an order of magnitude faster than the state-of-the-art methods. Kmer2SNP shows the potential of calling SNPs using only k-mers from raw reads, without assembly. The source code is freely available at https://github.com/yanboANU/Kmer2SNP.
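
As a rough illustration of the matching formulation described in the abstract, the Python sketch below counts k-mers in a few reads, links k-mers that differ only at their middle base, and selects a maximum weight matching among those links. It is not the Kmer2SNP implementation. The function names, the restriction to middle-base differences, the balanced-coverage edge weight, and the exhaustive-search solver are all assumptions made for this toy example; the abstract does not specify any of them.

```python
from collections import Counter
from itertools import combinations


def count_kmers(reads, k):
    """Count every k-mer occurring in the reads (no reverse complements)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def candidate_edges(counts, k):
    """Link k-mers that differ only at the middle base.

    The weight is a hypothetical score that favours pairs with balanced
    coverage; the weighting used by Kmer2SNP is not given in the abstract.
    """
    mid = k // 2
    groups = {}
    for kmer in counts:
        key = (kmer[:mid], kmer[mid + 1:])  # flanks around the middle base
        groups.setdefault(key, []).append(kmer)
    edges = []
    for members in groups.values():
        for a, b in combinations(sorted(members), 2):
            weight = min(counts[a], counts[b]) / max(counts[a], counts[b])
            edges.append((a, b, weight))
    return edges


def max_weight_matching(edges):
    """Exact maximum weight matching by exhaustive search (toy sizes only)."""
    edges = tuple(sorted(edges))

    def solve(i, used):
        if i == len(edges):
            return 0.0, ()
        best = solve(i + 1, used)  # option 1: skip edge i
        a, b, w = edges[i]
        if a not in used and b not in used:  # option 2: take edge i
            sub_weight, sub_pairs = solve(i + 1, used | {a, b})
            if w + sub_weight > best[0]:
                best = (w + sub_weight, ((a, b),) + sub_pairs)
        return best

    return solve(0, frozenset())


# Toy input: two well-covered alleles plus one low-coverage k-mer that
# stands in for a sequencing error.
reads = ["ACATG"] * 5 + ["ACGTG"] * 5 + ["ACCTG"]
counts = count_kmers(reads, k=5)
total_weight, snp_pairs = max_weight_matching(candidate_edges(counts, k=5))
print(total_weight, snp_pairs)
```

In this toy input, each k-mer can be matched at most once, so the matching keeps the balanced pair and leaves the low-coverage k-mer unmatched. The exhaustive solver takes exponential time in the number of edges, so real data would need a polynomial-time matching algorithm instead.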

Restricted until

2099-12-31