CRISPR Target Assessment
CRISTA is a regression model trained with the Random Forest machine learning algorithm. It can be used to
determine the propensity of a genomic site to be cleaved by a given sgRNA. CRISTA was trained on a large
dataset assembled from published data of genome-wide unbiased methods for profiling CRISPR-Cas9 cleavage
sites [1–5]. It accounts for the possibility of bulges and incorporates a wide range of features,
encompassing those that are specific to the genomic content, features that define the thermodynamics of the
sgRNA, and features concerning the pairwise similarity between the sgRNA and the genomic target. Altogether,
these form a complex model that can be used to predict the cleavage propensity of a selected genomic site.
In contrast to other tools for predicting cleavage scores [6–14], the prediction function used by CRISTA
cannot be translated to a simple function composed of the mismatched positions and the number/location of bulges.
CRISTA is now published! If you used CRISTA, please cite:
Abadi S, Yan WX, Amar D, Mayrose I (2017) A machine learning approach for predicting CRISPR-Cas9 cleavage efficiencies and patterns underlying its mechanism of action. PLoS Comput Biol 13(10): e1005807.
The website offers several functionalities:
This straightforward use of CRISTA provides the predicted cleavage score for a given pair of a nucleotide target and the corresponding sgRNA. The genomic target can be specified either by its genomic sequence (with 3 additional bases upstream of the 5’-end and 3 bases downstream of the PAM site) or by its genomic coordinates in the desired reference genome. Because a wide range of genomic features is available for the hg19 dataset, selecting a cell line makes CRISTA use a broader model. In both modes, a list of pairs can be uploaded as a CSV file.
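As an illustration of the sequence input described above, the following minimal sketch cuts the padded target window out of a longer sequence. It assumes a 20-nt protospacer followed by a 3-nt PAM, so the window is 3 + 20 + 3 + 3 = 29 nt long. The function name `extended_target` and the constants are hypothetical and are not part of CRISTA.

```python
# Hypothetical helper, not part of CRISTA: builds the padded target window.
# Assumption: 20-nt protospacer followed by a 3-nt PAM.
FLANK = 3             # extra bases required on each side of the site
PROTOSPACER_LEN = 20  # assumed length of the sgRNA-matching region
PAM_LEN = 3           # assumed PAM length (e.g. NGG)


def extended_target(sequence: str, protospacer_start: int) -> str:
    """Return 3 upstream bases + protospacer + PAM + 3 downstream bases.

    `protospacer_start` is the 0-based index of the first protospacer base.
    """
    start = protospacer_start - FLANK
    end = protospacer_start + PROTOSPACER_LEN + PAM_LEN + FLANK
    if start < 0 or end > len(sequence):
        raise ValueError("not enough flanking sequence around the target")
    return sequence[start:end].upper()
```

Calling `extended_target(seq, i)` for a site that starts at index `i` yields a 29-nt string, or raises `ValueError` when the site lies too close to either end of `seq`.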
Given an sgRNA and a genomic assembly, potential targets are detected using the Burrows-Wheeler Aligner (BWA) with the following parameters: “-N -l 20 -i 0 -n 5 -o 3 -d 3 -k 4 -M 0 -O 1 -E 0”. This identifies all targets with up to four mismatches and/or gaps in the 20-nt matching region. For each potential target, the cleavage score is predicted based on the whole collection of features. The targets are then presented in ranked order according to their CRISTA score. Please note that results are not immediate because, unlike most currently available alternatives, CRISTA considers possible bulges within the DNA site or sgRNA.
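The parameter string above can be assembled into a command line as in the sketch below. The text names only BWA, so the use of the `bwa aln` subcommand is an assumption here (the listed flags match its options). The function name and the file names in the usage note are hypothetical. The sketch only builds the argument list and does not run BWA.

```python
import shlex
from typing import List

# Parameter string quoted in the text above.
BWA_PARAMS = "-N -l 20 -i 0 -n 5 -o 3 -d 3 -k 4 -M 0 -O 1 -E 0"


def build_bwa_command(index_prefix: str, sgrna_fasta: str) -> List[str]:
    """Return an argument list for an off-target search with BWA.

    Assumption: the flags belong to the `bwa aln` subcommand, which takes
    the index prefix and the query file as positional arguments.
    """
    return ["bwa", "aln", *shlex.split(BWA_PARAMS), index_prefix, sgrna_fasta]
```

For example, `build_bwa_command("hg19", "sgrna.fa")` returns a list that could be passed to `subprocess.run`, with standard output redirected to a `.sai` file, provided BWA is installed and the genome has been indexed.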
Given a nucleotide sequence, all the potential targets within it, i.e., those that are followed by an ‘NGG’ sequence on either the forward or the reverse strand, are detected and ranked according to the cleavage score predicted by CRISTA. Note that 3 bases from the beginning and end of the sequence are disregarded since some of the features implemented in CRISTA require 3 bases from each end.
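The detection step described above can be sketched as follows. This is an illustrative scan, not CRISTA's own code: it assumes a 20-nt protospacer, reports each site's start in forward-strand coordinates, and leaves out the ranking by CRISTA score. The names `find_targets` and `reverse_complement` are hypothetical.

```python
import re

PROTOSPACER_LEN = 20  # assumed protospacer length
PAM_LEN = 3           # NGG
FLANK = 3             # bases disregarded at each end of the input

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
# Lookahead so that overlapping candidate sites are all reported.
_SITE = re.compile(r"(?=([ACGT]{20}[ACGT]GG))")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def find_targets(sequence: str):
    """Return (strand, forward_start, site) for every 20-nt + NGG site.

    Sites that leave fewer than 3 flanking bases on either side are skipped.
    """
    sequence = sequence.upper()
    n = len(sequence)
    site_len = PROTOSPACER_LEN + PAM_LEN
    hits = []
    for strand, seq in (("+", sequence), ("-", reverse_complement(sequence))):
        for match in _SITE.finditer(seq):
            start = match.start()
            end = start + site_len
            if start < FLANK or end > n - FLANK:
                continue  # not enough flanking bases for the features
            # Convert reverse-strand positions to forward-strand coordinates.
            forward_start = start if strand == "+" else n - end
            hits.append((strand, forward_start, match.group(1)))
    return hits
```

Each returned site is written 5’ to 3’ on its own strand, so a "-" hit is the reverse complement of the forward-strand bases at `forward_start`.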
The score predicted by CRISTA represents the frequency of genomic indels at a given site relative
to cleavage at the on-target site of a highly efficient sgRNA. Further details are available in the
publication cited above.
1. Ran FA, Cong L, Yan WX, Scott DA, Gootenberg JS, Kriz AJ, et al. In vivo genome editing using Staphylococcus aureus Cas9. Nature. 2015;520: 186–190. doi:10.1038/nature14299
2. Slaymaker IM, Gao L, Zetsche B, Scott DA, Yan WX, Zhang F. Rationally engineered Cas9 nucleases with improved specificity. Science. 2015; doi:10.1126/science.aad5227
3. Tsai SQ, Zheng Z, Nguyen NT, Liebers M, Topkar VV, Thapar V, et al. GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases. Nat Biotechnol. 2014;33: 187–197. doi:10.1038/nbt.3117
4. Kleinstiver BP, Prew MS, Tsai SQ, Topkar VV, Nguyen NT, Zheng Z, et al. Engineered CRISPR-Cas9 nucleases with altered PAM specificities. Nature. 2015;523: 481–485. doi:10.1038/nature14592
5. Frock RL, Hu J, Meyers RM, Ho Y, Kii E, Alt FW. Genome-wide detection of DNA double-stranded breaks induced by engineered nucleases. Nat Biotechnol. 2014;33: 179–186. doi:10.1038/nbt.3101
6. Xiao A, Cheng Z, Kong L, Zhu Z, Lin S, Gao G, et al. CasOT: a genome-wide Cas9/gRNA off-target searching tool. Bioinformatics. 2014; doi:10.1093/bioinformatics/btt764
7. Haeussler M, Schönig K, Eckert H, Eschstruth A, Mianné J, Renaud J-B, et al. Evaluation of off-target and on-target scoring algorithms and integration into the guide RNA selection tool CRISPOR. Genome Biol. 2016;17: 148. doi:10.1186/s13059-016-1012-2
8. Heigwer F, Kerr G, Boutros M. E-CRISP: fast CRISPR target site identification. Nat Methods. 2014;11: 122–123. doi:10.1038/nmeth.2812
9. Hwang WY, Fu Y, Reyon D, Maeder ML, Tsai SQ, Sander JD, et al. Efficient genome editing in zebrafish using a CRISPR-Cas system. Nat Biotechnol. 2013;31: 227–229. doi:10.1038/nbt.2501
10. O’Brien A, Bailey TL. GT-Scan: identifying unique genomic targets. Bioinformatics. 2014;30: 2673–2675. doi:10.1093/bioinformatics/btu354
11. MacPherson CR, Scherf A. Flexible guide-RNA design for CRISPR applications using Protospacer Workbench. Nat Biotechnol. 2015;33: 805. doi:10.1038/nbt.3291
12. Marraffini LA, Sontheimer EJ. CRISPR interference limits horizontal gene transfer in staphylococci by targeting DNA. Science. 2008;322: 1843–1845. doi:10.1126/science.1165771
13. Naito Y, Hino K, Bono H, Ui-Tei K. CRISPRdirect: software for designing CRISPR/Cas guide RNA with reduced off-target sites. Bioinformatics. 2015;31: 1120–1123. doi:10.1093/bioinformatics/btu743
14. Xie S, Shen B, Zhang C, Huang X, Zhang Y, Cong L, et al. sgRNAcas9: a software package for designing CRISPR sgRNA and evaluating potential off-target cleavage sites. PLoS One. 2014;9: e100448. doi:10.1371/journal.pone.0100448