Designing sgRNAs with CRISPy-web
by Kai Blin, Sang Yup Lee and Tilmann Weber (Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark) Labtimes 01/2017
Tilmann Weber’s group at the Novo Nordisk Foundation Center for Biosustainability developed a user-friendly, web server implementation of the sgRNA prediction software, CRISPy, for non-computer scientists.
The development of CRISPR/Cas9, which originates from a bacterial defense system against plasmids and phages, into a powerful genome-editing tool has been one of the major technological breakthroughs in biotechnology in recent years. With the RNA-guided endonuclease Cas9 from the Streptococcus pyogenes CRISPR system, currently the most widely used enzyme, DNA can now be edited highly efficiently in a broad variety of organisms. The method works in most organisms that allow expression of the different components.
Photo: Tilmann Weber
Simplified, Cas9 can be regarded as a programmable, blunt-cutting restriction endonuclease. It recognises its target DNA sequence by Watson-Crick base-pairing with the ~20 bp protospacer (crRNA), which is bound to the Cas9/tracrRNA complex and is complementary to the target region, and it cleaves the target DNA within this protospacer region. Cas9 and the RNAs may be used in vitro but may also be expressed within the cell.
Soon, it became evident that the genes encoding the tracrRNA and crRNA, which in the native system bind Cas9 as individual RNAs, can be artificially fused to encode a single-guide RNA (sgRNA), which still efficiently directs the Cas9 endonuclease to its target while being easy to clone and express (Science 337:816–821). Another prerequisite for cleavage by Cas9 is the presence of a Protospacer Adjacent Motif (PAM), which has to directly follow the DNA sequence targeted by the protospacer. In the case of the S. pyogenes Cas9, this PAM is NGG.
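The PAM rule can be illustrated with a short Python sketch. This is not the CRISPy-web code; the function names, the fixed 20-nt protospacer length and the 0-based coordinate convention are assumptions made for this example. It scans both strands for 20-mers that are directly followed by NGG.

```python
# Illustrative sketch only; names and conventions are assumptions,
# not taken from CRISPy or CRISPy-web.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_protospacers(seq, length=20):
    """List candidate protospacers that are directly followed by an NGG PAM.

    Returns tuples (start, strand, protospacer, pam). `start` is the 0-based
    position of the leftmost protospacer base on the forward strand;
    protospacer and PAM are written 5'->3' on the strand they lie on.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # A site needs `length` bases plus the 3-base PAM.
        for i in range(n - length - 2):
            pam = s[i + length:i + length + 3]
            if pam[1:] != "GG":  # first PAM base (N) may be anything
                continue
            # Map reverse-strand positions back to forward coordinates.
            start = i if strand == "+" else n - i - length
            hits.append((start, strand, s[i:i + length], pam))
    return hits
```

Calling find_protospacers on a sequence consisting of twenty A's followed by TGG yields a single forward-strand candidate starting at position 0.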
For a biotechnological application of this CRISPR system it is, therefore, essential to choose the target sequence for the sgRNA so that it lies directly in front of a PAM and is unique within the genome, so that Cas9 does not cleave at positions other than the desired one. In addition, Cas9 unfortunately also displays “off-target” effects, i.e., cleavage activity at positions that are not 100% identical to the protospacer sequence. These occur mostly at sequences that are similar to the protospacer but not a perfect match.
Avoiding these sequences is quite challenging when designing sgRNAs by hand, so using computational tools to find suitable sequences is highly recommended.
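To make the uniqueness and off-target criteria concrete, the following Python sketch counts how many NGG-adjacent sites in a genome resemble a candidate protospacer. It is a brute-force illustration under stated assumptions, not the scoring used by CRISPy-web: the function name, the cut-off of three mismatches and the simple mismatch count (Hamming distance) are choices made for this example only.

```python
# Illustrative sketch only; the mismatch cut-off and the scoring are
# assumptions, not the method used by CRISPy or CRISPy-web.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatch_profile(candidate, genome, max_mismatches=3):
    """Count NGG-adjacent genome sites by number of mismatches to `candidate`.

    Returns a list `counts` where counts[k] is the number of sites (on both
    strands) that differ from the candidate at exactly k positions. The
    intended target itself contributes to counts[0].
    """
    candidate = candidate.upper()
    genome = genome.upper()
    n = len(candidate)
    counts = [0] * (max_mismatches + 1)
    for strand in (genome, reverse_complement(genome)):
        for i in range(len(strand) - n - 2):
            # Only sites followed by NGG can be cleaved, so skip the rest.
            if strand[i + n + 1:i + n + 3] != "GG":
                continue
            mismatches = sum(a != b for a, b in zip(candidate, strand[i:i + n]))
            if mismatches <= max_mismatches:
                counts[mismatches] += 1
    return counts
```

Under these assumptions, a candidate is unique when counts[0] equals 1, and it becomes more attractive the fewer sites appear in the higher-mismatch entries. Scanning every position like this is slow for whole genomes; it is meant to show the principle rather than a practical search strategy.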
If you are working with model organisms, many different programmes and websites offer such functionality, for example, CHOPCHOP (Nucleic Acids Res 42: 401-07) or CCTop (PLoS ONE, 10:e0124633). However, when we started developing CRISPR technology for the organisms we work with in our lab, no easy-to-use tool existed for designing sgRNAs for such non-model organisms, i.e., a tool that allows users to provide an arbitrary genome sequence to search against.
Therefore, we have developed CRISPy-web, a web-based tool to design sgRNAs for non-model microorganisms (Synth Syst Biotechnol 1:118-21). CRISPy-web is based on the software CRISPy, a web tool to design sgRNAs for use with Chinese Hamster Ovary (CHO) cells that was previously developed at our institute (Biotechnol Bioeng 111:1604-16). CRISPy-web is freely accessible at: http://crispy.secondarymetabolites.org.
The first step in using CRISPy-web is to upload the genome sequence of the microorganism of interest in GenBank format. Alternatively, sequences can be transferred directly from the antiSMASH secondary metabolite genome mining platform (Nucleic Acids Res 43: 237-43) by entering the antiSMASH job ID instead. Once the genome to be analysed has been selected and uploaded, the next screen lets you specify the target region in which protospacer sequences are to be predicted. It can be defined by entering the positions as a range (e.g., 1234-5678), by providing the locus tag, gene name or protein ID (if annotated in the GenBank sequence), or, if the data was pre-analysed with antiSMASH, by the gene cluster number. In the latter case, the gene cluster detected by antiSMASH can also be selected directly by clicking on the respective line in the displayed table.
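The distinction between a coordinate range and an identifier can be sketched in a few lines of Python. The function below is hypothetical and only illustrates how an entry such as 1234-5678 differs from a locus tag or gene name; how CRISPy-web itself interprets the input is not described here.

```python
import re

# Hypothetical helper for illustration; not part of CRISPy-web.
RANGE_RE = re.compile(r"^\s*(\d+)\s*-\s*(\d+)\s*$")


def parse_target(text):
    """Classify a target specification as a coordinate range or an identifier.

    Returns ("range", (start, end)) for input such as "1234-5678", and
    ("identifier", name) for anything else (locus tag, gene name, protein ID).
    """
    match = RANGE_RE.match(text)
    if match:
        start, end = int(match.group(1)), int(match.group(2))
        if start > end:
            raise ValueError("range start must not exceed range end")
        return ("range", (start, end))
    return ("identifier", text.strip())
```

For example, parse_target("1234-5678") is treated as a range, whereas an entry that is not two numbers separated by a hyphen is passed on as an identifier to be looked up in the annotation.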
On pressing the “Find Targets” button, suitable protospacer sequences are identified in the selected region of the genome of interest. Depending on the size of the genome sequence, this step can take a few minutes to complete. On the top line of the screen, genes encoded within the selected region of the genome are displayed as arrows; if the region contains several genes, the user can zoom in on an individual gene by clicking on it and selecting “show results for this gene only”.
Potential protospacer motifs are indicated as little red boxes, positioned according to their DNA strand orientation (forward strand on top, reverse strand on bottom), and are listed in the table, sorted by quality (uniqueness). To select a potential protospacer sequence for export, either click on the red box or click the shopping basket in the list view.
On clicking the “checkout” button at the top right of the screen, a table containing all selected protospacer sequences is displayed. It can be exported as a CSV file for use in the individual sgRNA cloning workflows.
Last Changed: 11.02.2017