Method Special: Single Cell Analysis
by Steven Buckingham, Labtimes 04/2017
For centuries, biologists have preferred to analyse cells in large numbers, averaging out the results. But now things are beginning to change and we are starting to pay closer attention to the idiosyncratic properties of individual cells. Why is this happening – and in which direction is it taking us?
Photo: University of South Florida
It kicks in like a reflex, doesn't it? You are designing an experiment to characterise a class of cells and your instinctive, go-to methodology is to split the cells into two populations (“control” and “test”, for example), run the experiment on large numbers of cells and analyse the data. Chances are, you will average the data from the two conditions and use the mean and spread to run some sort of statistical test.
But have you thought about why you do it this way? “Of course I have!” (I hear you say). It is to take account of the inherent noise, the meaningless variations between individual cells. After all, what I am interested in is fundamental biological principles, and to get at those I have to separate the signal from the noise.
Well, that depends on whether it really is noise or not. And you could say that the emerging trend of single cell analysis represents a growing realisation that those idiosyncratic properties of individual cells aren't actually noise – they're data.
Some of the drive behind the rise of single cell analysis comes from natural developments in the way we think about biology. Scientists studying tumours, for example, found that there are tumour cells, and then there are tumour cells – depending on where in the tumour your cell sits. In other words, there are substantial and consistent differences between individual tumour cells, even within the same tumour. And that becomes very important if you are trying to target cells with a drug aimed at a particular protein, or if you are trying to count cells using a specific cell marker. And of course, you don't need to tell immunologists about the importance of individual cells – our adaptive immune system is founded on such individuality.
But there is a deeper reason why scientists are starting to look more at the things that distinguish cells than at what unites them, and that is to do with advances in technology. When your method of measurement is crude or expensive, it makes sense to look for a single factor that cells within a population share, and to design your experiments to tease that factor out from the “noise”. But new technologies are changing these economics. Sequencing has become a whole lot faster, and there has been rapid growth in the palette of genomic, transcriptomic and proteomic techniques for the skilled artist to combine and work with.
Single cell analysis really started to take off in 2009, when Fuchou Tang's group at the University of Cambridge, UK, published a paper in Nature Methods describing how they used messenger RNA sequencing (mRNA-Seq) on a single blastomere (Nature Methods, 6, 377-82). Just two years later, Nicholas Navin's team at Cold Spring Harbor Laboratory, USA, took up the theme and described, in a key Nature paper, the first case of single-cell whole-genome sequencing (Nature 472, 90-4).
Navin's approach was to take cells and lyse them to release the nuclei. Individual nuclei were then identified using a FACS machine and deposited into single wells of a 96-well plate containing lysis buffer. The liberated DNA was fragmented and subjected to whole-genome amplification. Using this technique, Navin was able to sequence 100 single cells.
The following year, Yingrui Li, Xiuqing Zhang and Jun Wang's team at BGI-Shenzhen, China, applied exome sequencing to cells taken from the person whose genome had been the first Asian genome to be sequenced. Using such a well-characterised source meant they could confirm that single-cell exome sequencing was of a similar quality to bulk genome sequencing; they then went on to apply the technique to identify new mutations that mark tumour evolution in a JAK2-negative myeloproliferative neoplasm patient.
Again, just one year later, Ernest Laue, Amos Tanay and Peter Fraser's team at Cambridge University, Cambridge's Babraham Institute and the Weizmann Institute, Israel, took the development to the level of epigenetics, by introducing a single-cell version of Hi-C, itself a variant of chromosome conformation capture (3C). In the standard, multi-cell versions of these methods, chromatin is cross-linked to capture contacts between sites that may be well separated in terms of their position along the linear genome but, due to the way the genome is packed together, actually lie close to one another in 3D space.
Fraser's team adapted the technique by taking CD4+ cells from mice, differentiating them into T-helper cells and fixing them, cross-linking the DNA while it was still inside the nuclei. They then extracted individual nuclei, from which they prepared single-cell Hi-C libraries for paired-end sequencing.
So, in the interval from 2009 to 2013 – roughly the time it takes to do a PhD – key techniques for single-cell analysis emerged at several levels: the transcriptome, the genome, the exome and the epigenome. Of these, single-cell analysis using RNA-Seq is the most advanced and the most widely used.
Given the range of DNA, RNA and protein methods and techniques that can be applied, it is no surprise that the list of single-cell methods is growing at a high rate. But they all share some basic core steps, each with its own limitations, challenges and opportunities. First of all, you have to get your single cells, and that is not as easy as you might imagine. The most obvious way, you might think, would be to take a population of cells and dilute it down until you get, on average, one cell in each well. However, labs that have tried this have invariably found it unsatisfactory.
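Why does limiting dilution disappoint? Cells land in wells independently, so the count per well follows a Poisson distribution – and even at the best possible average occupancy, only about a third of wells end up with exactly one cell. A quick back-of-the-envelope check (the numbers are illustrative, not from any particular protocol):

```python
import math

def poisson_pmf(k, lam):
    """Probability that a well receives exactly k cells when the
    average number of cells per well is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# The best case: dilute to an average of one cell per well.
lam = 1.0
p_single = poisson_pmf(1, lam)          # wells with exactly one cell
p_empty = poisson_pmf(0, lam)           # wasted, empty wells
p_multiplet = 1 - p_empty - p_single    # doublets or worse

# Roughly 37% singles, 37% empties and 26% multiplets – hence the
# appeal of FACS and microfluidic alternatives.
```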
An alternative is to use laser capture or laser micro-dissection to cut single cells out of tissue. Slow and laborious, it is not suitable for high-throughput or discovery platforms, but at least it gives you good control over where the cells come from. Perhaps the most commonly used method, though, is FACS cell sorting, which is fast and reproducible. Its versatility arises from the wide choice of proteins to which you can attach a fluorescent marker.
All these methods for single-cell isolation assume you are starting with a source rich in cells but sometimes you don't have that luxury. Isolating rare cells can be a real challenge. With a lot of single-cell analysis centring on cancer research, there is great interest in being able to harvest rare circulating tumour cells (CTCs), so much so that a number of commercial applications and kits are now available.
Indeed, the only FDA-approved kit for quantitatively harvesting CTCs is CellSearch(R), which works by attaching antibodies that target the epithelial cell adhesion molecule to magnetic nanoparticles, allowing cells to be separated in a magnetic field. Another system, DEPArray from Silicon Biosystems, uses complex electrical fields to separate cells, which are then picked out by visual identification.
The development of microfluidics has given a leg-up to the evolution of single-cell analysis, by providing a set of options for manipulating cells and by supplying tiny reaction chambers for amplification. A microfluidic chip the size of a microscope slide can automate the sorting of cells and capture them in water droplets suspended (and hence separated from each other) in oil. The droplets become tiny reactor chambers, whose small size means effectively instantaneous temperature changes and diffusion.
But single-cell methods are beset with difficulties that are only just beginning to be addressed. And these difficulties are not trivial. For one thing, they all depend on amplification, and that brings with it the demons of unequal capture and amplification bias. Unequal capture refers to the stochastic nature of the capture step – there is a real chance that a DNA or RNA molecule just won't get picked up in the first stages of the reaction and so never gets amplified.
Amplification bias is the nemesis of all amplification-based quantitative methods. The PCR reaction is, as the name tells us, a chain reaction. That means that small variations at the beginning of the reaction get amplified exponentially as the reaction proceeds. Hence, you can never be sure what the relationship is between the strength of your final signal and the abundance of the transcript in the one cell you amplified from.
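A toy simulation makes the point. The model below is not taken from any published method – each molecule is simply duplicated with a fixed per-cycle probability, with the random seed standing in for chance events in the earliest cycles – but it shows how two transcripts starting at identical abundance can end up with quite different final yields:

```python
import random

def amplify(start_copies, cycles, efficiency=0.9, seed=0):
    """Simulate a PCR-like chain reaction: in each cycle, every
    molecule present is copied with probability `efficiency`."""
    rng = random.Random(seed)
    copies = start_copies
    for _ in range(cycles):
        copies += sum(1 for _ in range(copies) if rng.random() < efficiency)
    return copies

# Two transcripts, both present at 5 copies in the same cell.
yield_a = amplify(5, cycles=15, seed=1)
yield_b = amplify(5, cycles=15, seed=2)
# After 15 cycles the yields differ purely through early-cycle chance,
# so final signal strength is an unreliable readout of starting abundance.
```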
Although you have the same problem in “traditional” RNA-Seq, in single-cell RNA-Seq it is exaggerated because you begin with such a small amount of starting material. On top of that, the starting material is particularly sparse – with a single cell you are far more likely to see individual genes that appear not to be expressed at all. So we have a small quantity of a sparse set being amplified non-linearly: you can see what we are up against.
The success of single-cell analysis depends on solving this problem. One solution is effectively to label each transcript with a unique barcode right at the beginning of the reaction. This is the approach taken by CytoSeq (Fan et al., Science, 347, 2015). Cells are incubated, one by one, each with a single bead bearing many oligonucleotide primers. All the oligos on a given bead carry a shared code identifying that bead; in addition, each oligo bears its own unique code. When the material is amplified, the bead-specific code identifies the cell, while the oligo-specific code tells you which original molecule first entered the reaction. So, rather than measuring the abundance of a transcript, you simply count the number of different oligo-specific codes.
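In software terms, the final counting step is just deduplication by barcode. Here is a minimal sketch – the read tuples, gene names and barcode sequences are invented for illustration, and are far shorter than real ones:

```python
from collections import defaultdict

# Each sequenced read carries (bead code, molecule code, gene).
# PCR copies of one original molecule all share the same molecule code.
reads = [
    ("bead01", "AACG", "geneX"),
    ("bead01", "AACG", "geneX"),  # PCR duplicate of the read above
    ("bead01", "TTGA", "geneX"),  # a second original geneX molecule
    ("bead02", "AACG", "geneY"),  # a different cell
]

def count_molecules(reads):
    """Count distinct molecule codes per (cell, gene), not raw reads."""
    codes = defaultdict(set)
    for bead, molecule, gene in reads:
        codes[(bead, gene)].add(molecule)
    return {key: len(molecules) for key, molecules in codes.items()}

counts = count_molecules(reads)
# counts[("bead01", "geneX")] is 2: three reads, two original molecules.
```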
Although most single cell work has been performed on the transcriptome, some labs have directed their attention to the next level – the proteome. Single-cell proteomics usually relies on antibody labelling. Cells are often placed in single wells and some method of restraining an antibody is used. For instance, a slide bearing immobilised antibodies can be placed onto an array of wells, each containing a single cell. Positive signals can then be quantified and compared to live-cell imaging of the same cells.
Another approach to single-cell proteomics is CyTOF (Fluidigm). Here, you label cells with an antibody but, instead of tracing the antibody with an attached fluorescent probe, you tag it with a heavy metal. Single cells then undergo time-of-flight mass spectrometry. Proof that single-cell proteomics is becoming mainstream is the fact that Fluidigm are selling a comprehensive CyTOF system (Helios).
These technologies have shifted practice away from thinking of cells en masse and towards looking directly at cell variability, bringing about a change in perspective on the way tissues work. For instance, a collaboration between several centres took 25,000 single-cell transcriptomes from retinal cells and used unsupervised clustering to reveal 15 cell types. What is particularly satisfying is that 13 of these sub-classes were already known, but two of them were new. Thus, decades of expertise on retinal cell types were recapitulated, and extended, in one (admittedly large) experiment.
Where is single-cell analysis heading in the near future? Standard procedures for single cell genomics, transcriptomics and proteomics are emerging, and there will surely be incremental improvements in solving the core challenges, such as capture and amplification bias. However, as Iain Macaulay, Chris Ponting and Thierry Voet point out in a recent review, there will arise cases where more than one “omic” in a cell needs to be analysed at the same time (Trends in Genetics 33, 155-68).
To get the full picture of what is happening in a cell, we would ideally want to compare what is happening at the DNA, RNA and protein levels. But how can you do this on the same cell? Siddharth Dey and Lennart Kester of the Royal Netherlands Academy of Arts and Sciences and the University Medical Center Utrecht showed how it can be done, by incorporating a quasi-linear amplification step – a method they call DR-seq (Nature Biotechnology 33).
An alternative approach, typified by Macaulay, Ponting and Voet's “G&T-seq”, is to separate the DNA and RNA before amplification, pulling the polyadenylated mRNA away from the DNA with oligo-dT-coated magnetic beads. Similar methods are being developed to combine transcriptomics and epigenetics in the single-cell setting. In single-cell methylome and transcriptome sequencing (scM&T-seq), for instance, the RNA is pulled away, as in G&T-seq, and the pattern of DNA methylation is determined using standard bisulphite sequencing (Angermueller et al. Nat. Methods 13).
It is no exaggeration to say that single cell analysis is a real trend. A quick PubMed search for “single cell” turns up 1,269 publications in the database for 2010, rising to 2,512 for 2016. But that doesn't mean the technique is ready for routine use in non-expert labs. Anyone taking it on must be ready to do a lot of optimisation and will probably only get the job done by setting up a multi-centre collaboration.
On top of the technical challenges involved in doing single-cell analysis there lies a deeper challenge, and that is to do with the data itself. Abandoning our reliance on the law of large numbers (the notion that if you make measurements on lots of examples, the effects of individual variation get averaged out) is a big change. We will need new statistical tools, new tricks of data visualisation and new ways of describing and interpreting data of very high dimensionality. True, some of the basic tools are already there, but they are not immediately accessible to many biologists, for whom principal component analysis, hierarchical agglomerative clustering and multidimensional scaling are arcane mysteries.
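To give a flavour of what these tools actually do, here is a bare-bones principal component analysis in plain Python, run on a made-up four-cell, three-gene expression matrix. Real pipelines use optimised libraries and thousands of genes; this is a sketch of the idea only:

```python
def leading_pc(data, iters=200):
    """First principal component of a small matrix (rows = cells,
    columns = genes), found by power iteration on the gene-gene
    covariance matrix."""
    n, g = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(g)]
    centred = [[row[j] - means[j] for j in range(g)] for row in data]
    cov = [[sum(centred[i][a] * centred[i][b] for i in range(n)) / (n - 1)
            for b in range(g)] for a in range(g)]
    v = [1.0] * g
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(g)) for a in range(g)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Two invented "cell types", separated mainly by the first gene.
cells = [[10, 1, 2], [11, 1, 3], [1, 2, 2], [2, 1, 3]]
pc1 = leading_pc(cells)
# pc1 points almost entirely along gene 0, the axis that splits the groups.
```

Methods such as hierarchical clustering and multidimensional scaling follow the same pattern: a mechanical linear-algebra core whose output still has to be interpreted biologically.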
John F. Kennedy once said, “What unites us is far greater than what divides us.” It seems biology doesn't agree.
Last Changed: 28.08.2017