On Friday, the AAI 2016 meeting in Seattle started off with a review session on fast-moving fields. The first talk, on single cell technology, was given by John T. Chang from the University of California. He focused on single cell approaches in immunology, and on gene expression analysis in particular.
The beginning of his presentation nicely summarized the different single cell (SC) approaches available to date: flow cytometry analysis, CyTOF (on which Garry P. Nolan gave further insights in the second talk), and SC gene expression analysis, which can be performed by qPCR or SC sequencing. More recently, chromatin analyses such as ATAC experiments have also been performed at the SC level.
He pointed out that, in contrast to previous study strategies such as marker-based subset analysis, SC approaches offer the possibility of “bottom up analysis”: with approaches such as SC sequencing, gene expression patterns can be analyzed in an unbiased way. This is particularly important for very heterogeneous cell populations. He gave a nice overview of the studies published so far using SC expression analysis (also mentioning our latest published work, thanks for that). By analyzing the gene expression profiles in a Principal Component Analysis (PCA), which captures the main axes of variation in the data, or in a hierarchical clustering, often used in combination with a heat map, one can identify cell subsets or populations based on their similarity in gene expression and their differences from other cell populations.
These data can then serve as a basis for identifying suitable marker genes for a given cell population by analyzing its top upregulated genes. This can ultimately lead to the identification of new markers and the refinement of established developmental maps. In this context, he also mentioned the work of Paul et al. (Cell 2016), who gave an example of currently used markers (FcgR and CD34, used to distinguish GMP, CMP and MEP among myeloid progenitor populations) that, based on expression profile clustering, do not directly reflect the expected cell population patterns.
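To make the clustering and marker steps a bit more concrete, here is a minimal Python sketch. It is not taken from the talk or from any of the studies mentioned: the count values, the gene names and all helper functions are hypothetical, and I use a simple average-linkage clustering on log-transformed counts as a stand-in for whatever pipeline a real study would use. Marker genes are ranked by the difference in mean log2 expression between a cluster and all other cells.

```python
import math
from itertools import combinations

# Toy count matrix (hypothetical values): one row per cell, one column per gene.
GENES = ["GeneA", "GeneB", "GeneC", "GeneD"]
COUNTS = {
    "cell1": [90, 5, 12, 0],
    "cell2": [110, 2, 9, 1],
    "cell3": [95, 8, 11, 0],
    "cell4": [3, 70, 10, 40],
    "cell5": [6, 85, 13, 55],
    "cell6": [1, 60, 8, 35],
}


def log_transform(counts):
    """log2(count + 1), so that highly expressed genes do not dominate."""
    return [math.log2(c + 1) for c in counts]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(cluster_a, cluster_b, profiles):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(profiles[i], profiles[j])
             for i in cluster_a for j in cluster_b]
    return sum(dists) / len(dists)


def hierarchical_clusters(profiles, n_clusters):
    """Bottom-up clustering: merge the closest pair until n_clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(
                clusters[pair[0]], clusters[pair[1]], profiles),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so index a stays valid
        del clusters[b]
    return [sorted(c) for c in clusters]


def rank_markers(cluster, profiles, genes):
    """Rank genes by mean log2 expression in the cluster minus all other cells."""
    rest = [name for name in profiles if name not in cluster]
    scores = []
    for g, gene in enumerate(genes):
        inside = sum(profiles[c][g] for c in cluster) / len(cluster)
        outside = sum(profiles[c][g] for c in rest) / len(rest)
        scores.append((gene, inside - outside))
    return sorted(scores, key=lambda item: item[1], reverse=True)


profiles = {name: log_transform(c) for name, c in COUNTS.items()}
clusters = hierarchical_clusters(profiles, n_clusters=2)
for cluster in clusters:
    top_gene, score = rank_markers(cluster, profiles, GENES)[0]
    print(cluster, "top marker:", top_gene, round(score, 2))
```

Note that no marker is chosen in advance here: the groups come out of the expression profiles alone, and the candidate markers are read off afterwards, which is the “bottom up” idea in miniature. A real analysis would of course add quality control, normalization and statistical testing on top.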
He ended by recommending some literature. For those of you who are interested, I list the references here again:
Kolodziejczyk, Molecular Cell 2015
Stegle, Teichmann, Nat Rev Genetics 2015
Liu and Trapnell, F1000 Research 2016
Grün and van Oudenaarden, Cell 2015
Satija and Shalek, Trends in Immunology 2014
In the discussion the following question came up: “How many cells does one need to sequence to get a robust dataset?” The financial limitation was rightly pointed out, and within that limit the-more-the-better holds true. Here I would like to take the opportunity to add one example. In our studies (Brennecke et al.) we applied SC seq to medullary thymic epithelial cells (mTECs), which are known to be highly heterogeneous in their gene expression. mTECs express the body's own peptides (called tissue-restricted antigens, TRAs) in a so-called mosaic fashion: each self-peptide is expressed in only about 1-3 percent of the cells, and at the population level this adds up to the complete repertoire of self-antigens presented to developing thymocytes in the thymus. When we sequenced 203 mTECs, 95% of the previously reported TRA-coding genes were detected. This surprisingly high coverage led to the conclusion that thymocytes might capture the self-repertoire already by scanning a small proportion of the mTEC population.
However, in general, as mentioned by Chang, the number of cells that need to be analyzed to get a robust data set varies according to the cell context and the degree of variation that is observed.
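As a rough back-of-the-envelope illustration of how the required cell number depends on how rare an expression event is, consider the following sketch. It rests on two simplifying assumptions of mine, not on anything shown in the talk or in our paper: each cell expresses a given gene independently with a fixed frequency, and every expressing cell is detected (no dropouts). Under those assumptions, the chance of seeing a gene at least once in n cells is 1 - (1 - frequency)^n. The function names are hypothetical.

```python
import math


def detection_probability(frequency, n_cells):
    """Chance that a gene is seen in at least one of n_cells.

    Assumes each cell expresses the gene independently with the given
    frequency and that every expressing cell is detected (no dropouts).
    """
    return 1.0 - (1.0 - frequency) ** n_cells


def cells_needed(frequency, target):
    """Smallest number of cells for which detection_probability >= target."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - frequency))


# Expression frequencies of 1-3 percent, as described for TRAs in mTECs.
for frequency in (0.01, 0.02, 0.03):
    p = detection_probability(frequency, 203)
    n = cells_needed(frequency, 0.95)
    print(f"frequency {frequency:.0%}: "
          f"P(seen in 203 cells) = {p:.3f}, cells for 95% = {n}")
```

With frequencies of 1-3 percent, this simple model gives a detection chance of roughly 0.87 to above 0.99 for 203 cells, which is broadly in line with the coverage we observed. It also shows how quickly the required cell number grows as the expressing fraction shrinks. Real data will deviate from this, since expression is not independent between genes and cells and the sensitivity of SC sequencing is limited.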
It will be interesting to see how this technology develops. As discussed, we are still facing limitations in its sensitivity, and the challenge will be to establish optimized data analysis approaches that get the most out of the generated data.