High Activity Target-Site Identification Using Phenotypic Independent CRISPR-Cas9 Core Functionality
Authors
Wilson, Laurence
Reti, Daniel
O'Brien, Aidan
Dunne, Robert A.
Bauer, Denis C.
Publisher
Mary Ann Liebert, Inc.
Abstract
The activity of CRISPR-Cas9 target sites can be measured experimentally through phenotypic assays or mutation rate and used to build computational models to predict activity of novel target sites. However, currently published models have been reported to perform poorly in situations other than their training conditions. In this study, we hence investigate how different sources of data influence predictive power and identify the best data set for the most robust predictive model. We use the activity of 28,606 target sites and a machine learning approach to train a predictive model of CRISPR-Cas9 activity, outperforming other published methods by an average increase in accuracy of 80% for prediction of the degree of activity and 13% for classification into active and inactive categories. We find that using data sets that measure CRISPR-Cas9 activity through sequencing provides more accurate predictions of activity. Our model, dubbed TUSCAN, is highly scalable, predicting the activity of 5000 target sites in under 7 s, making it suitable for genome-wide screens. We conclude that sophisticated machine learning methods can classify binary CRISPR-Cas9 activity; however, predicting fine-scale activity scores will require larger data sets directly measuring Indel insertion rate.
Source
The CRISPR Journal
Restricted until
2099-12-31