From GWAS to the fishtank

This is a guest post by Mari Niemi, an MSc student visiting the lab this year.

In the last few years, a major focus of the group has been identifying the genomic regions associated with risk of inflammatory bowel disease (IBD). Currently, 163 loci are associated with the condition, the largest number of associations for a complex disease to date; together they explain 13.6% of the total disease variance in Crohn’s disease and 7.5% in ulcerative colitis. These lists of associated loci are drawn up with some heavy-duty statistical computing, but they still leave key questions open: which genes in those regions are actually responsible for susceptibility to IBD, and what is their role in this complex plot? To understand more about the disease we need to functionally annotate these IBD candidate genes, and to do so we need to get our hands (quite literally) dirty in a laboratory!

A classical and robust method for assessing the function of a gene is to look at the effects that losing the protein’s normal function has at the morphological and molecular level. As the list of associated genes continues to grow, we decided to shortlist a set of ten genes (NOD2, CYLD, SNX20, MST1, PTGER4, IFNGR2, IFNG, IL12RB2, IL10RB and IL23R) for closer inspection by loss-of-function (LoF) analysis. Knocking out genes in human embryos and observing the outcome is rather unfeasible (and unethical), so instead we have opted to use a model system. But before we can begin our functional analysis, the first hurdle to tackle is deciding which model would best fit our purpose of assessing IBD candidate gene function.

The Sanger Institute maintains facilities for three main groups of model systems: the mouse, the zebrafish and, more recently, induced pluripotent stem cells (iPSCs). Cell lines can be useful for tissue-specific experimental design; however, using a model organism for LoF analysis allows us to assess molecular and phenotypic changes in a complete in vivo system. Mice have been shown to be an effective model for several human diseases, but generating a knockout line for each of our ten candidate genes (not to mention analysing the data) would be very expensive and require years of work. So we instead look to the zebrafish (Danio rerio), which in fact fits in rather nicely with our grand scheme of dissecting functional information about an extensive list of candidate genes.

Although we may not look much like one another, humans share ~70% of their genes with this tiny tropical vertebrate, making it possible for us to analyse many orthologous candidate genes with this model. As a model organism, zebrafish are fairly easy and cheap to maintain, and when they reach maturity at around four months of age they are able to produce large clutches of embryos in a single crossing. These embryos then develop ex utero, meaning it is easy to observe and document their development.

Partly with the aim of enabling the zebrafish as a genetic model, the Sanger has recently led the completion of its reference genome to a level comparable only to the human and mouse genomes (and if you look closely you’ll notice the Barrett lab contributed Figure 2, in an attempt to understand sex determination in zebrafish!). The Zebrafish Mutation Project (ZMP) aims to expand on this work by creating a knockout (LoF) allele for every single protein-coding gene in the 26,000-gene strong zebrafish genome through the introduction of random mutations via ENU treatment. Currently, the facility holds a catalogue of mutant alleles for 48% of the zebrafish protein-coding genes. To our benefit, it so happened that by early spring the ZMP already possessed knockout alleles for four of our genes of interest: NOD2 (nod2), CYLD (cylda), IFNG (ifng1-1, ifng1-2) and SNX20 (snx20). This meant that in February 2013, the ZMP were able to produce families of fish heterozygous for these mutant alleles for us to use in our analysis later this summer. We have now begun to cross these fish, and we expect approximately 25% of their offspring to be homozygous for the mutations, allowing us to conduct loss-of-function analysis on these genes.
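The 25% figure follows from simple Mendelian segregation when two heterozygous carriers are crossed. As a minimal sketch (the allele symbols and the `cross` helper are made up for illustration, and it assumes a single locus with no effect of the mutation on embryo survival), the expected genotype fractions can be enumerated like this:

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a, parent_b):
    """Return genotype -> expected fraction of offspring for a single-locus cross.

    Each parent is a two-character string, one character per allele.
    """
    # Every combination of one allele from each parent is equally likely.
    counts = Counter("".join(sorted(pair)) for pair in product(parent_a, parent_b))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# "+" = wild-type allele, "m" = mutant (LoF) allele; both parents are heterozygous.
offspring = cross("+m", "+m")
for genotype, fraction in sorted(offspring.items()):
    print(genotype, fraction)
```

The `mm` class is the homozygous mutant quarter we are after; half of the clutch is expected to be heterozygous and the remaining quarter wild type.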

So what about obtaining knockout alleles for the remaining genes on our candidate list? The ZMP have estimated that with the current method of random mutation, around 25% of zebrafish genes will be difficult to hit, for example because some genes are small; thus we may have to wait a while to get our hands on these alleles. However, there’s a new technique on the market that may well provide us with a quick and (relatively) easy solution.

In January 2013, Dr Joung’s lab showed that by using a new, bacterially derived, engineered CRISPR-Cas9 system it is possible to target specific loci in the zebrafish genome and induce short insertions/deletions (indels) that typically lead to frameshifts, causing premature stops. This method uses a combination of Cas9 mRNA and a so-called single-guide RNA (sgRNA) that are injected into zebrafish embryos at the one-cell stage of development. The sgRNA contains a 22nt-long targeting site, which can be designed to bind complementarily to any N20GG genomic sequence. Within the dividing cells, the Cas9 mRNA is translated into an endonuclease protein, which then forms a complex with the sgRNA at its target site. The Cas9 makes a double-stranded cut in the genome, which is then repaired via the endogenous, error-prone non-homologous end-joining (NHEJ) DNA repair system, leading to short indel changes in the gene.
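To give a feel for what "any N20GG genomic sequence" means in practice, here is a rough sketch of scanning a stretch of DNA for candidate sites. The function names and the toy sequence are invented for illustration; only the N20GG pattern comes from the description above. A real design would also need to check for off-target matches elsewhere in the genome, which this does not attempt.

```python
import re

# N20GG: any 20 bases followed by GG. The lookahead lets overlapping sites be found.
TARGET_PATTERN = re.compile(r"(?=([ACGT]{20}GG))")


def find_target_sites(sequence):
    """Return (start, site) for every N20GG match on the given strand (0-based)."""
    sequence = sequence.upper()
    return [(m.start(), m.group(1)) for m in TARGET_PATTERN.finditer(sequence)]


def reverse_complement(sequence):
    """Return the reverse complement, so the opposite strand can be scanned too."""
    return sequence.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Toy sequence, not a real zebrafish locus.
toy = "A" * 20 + "GG" + "TTT"
forward_sites = find_target_sites(toy)
reverse_sites = find_target_sites(reverse_complement(toy))
print(forward_sites)
print(reverse_sites)
```

Scanning both strands matters because a site on the reverse strand is just as usable; in the toy example only the forward strand contains a match.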

We have now designed sgRNAs to target specific sites within the nod2 (NOD2), il12rb2 (IL12RB2), il23r (IL23R) and crfb4 (IL10RB) protein-coding sequences. So far, we have tested the nod2-CRISPR system, and as you can see from the figure below, this method seems to be highly efficient in creating mutations at the cut target site! Our hope now is that some of these fish will carry a LoF mutation in their germline cells. This would mean that after we grow the fish to adulthood, some of their offspring would be carriers of the faulty gene, which would allow us to use those offspring to eventually raise heterozygous mutant lines.

Putting all of this together, once we have obtained these heterozygous mutant lines we plan to carry out a phenotypic analysis followed by transcriptomics on their homozygous offspring. This would allow us to examine how knocking out a candidate gene for IBD changes the whole organism’s gene expression profile compared to wild-type controls. This should provide us with information on the function and regulatory networks of these candidate genes, hopefully highlighting interesting new information that may be valuable to the IBD project and for refining the original list of 163 associated loci.

Zebrafish embryos were injected at the one-cell stage with Cas9 mRNA and sgRNA targeting nod2. DNA was then taken from these embryos at day 5, and a 243bp-long region spanning the sgRNA-binding site was amplified with PCR. The sgRNAs had been designed specifically so that the 22nt DNA-binding region contained a restriction site. The Cas9 protein cuts near this binding site, which can thus introduce an indel at the restriction site. When the PCR-amplified DNA fragments were digested with NcoI, fragments without a mutation were cut into 118bp+125bp, shown as the lower molecular weight band (bottom band) on a 1% agarose gel. Embryos injected with a higher (11.9 ng/µl) concentration of sgRNA+Cas9 mRNA show a prominent band of undigested, CRISPR-Cas9 modified fragments (upper band) at varying proportions compared to digested wild-type DNA in the samples. Injecting a lower concentration (5.4 ng/µl) of sgRNA also causes mutations at the restriction site, but at a lower efficiency.
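The logic of this digest assay can be sketched in a few lines. NcoI recognises CCATGG and cuts after the first C; everything else here (the `digest` helper, the flanking bases and the 2bp deletion) is made up for illustration, with the toy amplicon built only so that it is 243bp long and cuts into 118bp+125bp as in the figure.

```python
NCOI_SITE = "CCATGG"
NCOI_CUT_OFFSET = 1  # NcoI cuts between the first and second base: C^CATGG


def digest(sequence, site=NCOI_SITE, cut_offset=NCOI_CUT_OFFSET):
    """Return the fragment lengths left after cutting at every occurrence of site."""
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(len(sequence) - start)
    return fragments


# Hypothetical 243bp amplicon: placeholder flanks around a single NcoI site.
wild_type = "A" * 117 + "CCATGG" + "T" * 120
# Hypothetical 2bp deletion inside the site, as an indel from NHEJ repair might leave.
mutant = "A" * 117 + "CCGG" + "T" * 120

wild_type_fragments = digest(wild_type)
mutant_fragments = digest(mutant)
print(wild_type_fragments)
print(mutant_fragments)
```

An amplicon with an intact site falls apart into the two small fragments (the bottom band), while one whose site has been disrupted stays nearly full length (the upper band).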
