Last updated: 2021-06-02

This project demonstrates use cases for splatPop, an extension of the splat model implemented in Splatter, that allows for the simulation of population-scale single-cell RNA-sequencing data.

The splatPop functions are part of the splatter package (v1.14.1+), available from Bioconductor.

Key features:

Data pre-processing

The empirical data used as reference data in this study were downloaded in the processed form provided by the original authors.


The results presented in the paper were produced with the following reproducible analyses. They were generated by rendering the R Markdown documents into web pages available at the links below. Analyses are organized by the empirical dataset used as reference. Details about where to download the empirical data are in the preprocessing page for each dataset.

  1. 10x Differentiating iPSCs (floor plate progenitors and dopaminergic neurons)
  2. SmartSeq2 iPSCs from HipSci (multiple batches)
  3. 10x Fibroblasts (with and without bleomycin to induce fibrosis)
  4. Example use cases

eQTL mapping followed the best practices described in Cuomo, Alvari, Azodi, et al. Snakemake was used to automate that workflow.

Reference Data

Reference VCF and GFF files for chromosome 22 were downloaded and processed as follows.

Reference VCF

# Download

mv ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz chr22.vcf.gz

# Filter
conda activate plink
plink --vcf chr22.vcf.gz --biallelic-only --snps-only --geno 0 --maf 0.05 --hwe 0.00001 --indep-pairwise 1600 5 0.75 --keep keep_samples.txt --out chr22.filt --recode

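# Keep only the LD-pruned variants listed in chr22.filt.prune.in (written by the --indep-pairwise step above)
plink --vcf chr22.vcf.gz --extract chr22.filt.prune.in --keep keep_samples.txt --recode vcf --out chr22.filtered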

# Output: chr22.filtered.vcf
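
A quick sanity check on the filtered VCF can confirm that the filters behaved as intended. The sketch below is illustrative only and not part of the original workflow: `check_vcf` is a hypothetical helper, and the three-record VCF it runs on is made up for the demonstration. It counts samples and variants and flags any record whose REF or ALT is not a single base.

```shell
# Hypothetical helper: summarise a VCF and flag records that are not biallelic SNPs.
check_vcf() {
    awk -F'\t' '
        /^##/     { next }                      # skip meta-information lines
        /^#CHROM/ { samples = NF - 9; next }    # 9 fixed columns precede the sample columns
        {
            variants++
            # a biallelic SNP has a single-base REF and a single-base ALT
            if ($4 !~ /^[ACGT]$/ || $5 !~ /^[ACGT]$/) bad++
        }
        END { printf "samples=%d variants=%d non_biallelic_snps=%d\n", samples, variants, bad }
    ' "$1"
}

# Made-up VCF with two samples and three records, for demonstration only.
toy_vcf=$(mktemp)
{
    printf '##fileformat=VCFv4.2\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n'
    printf '22\t100\trs1\tA\tG\t.\tPASS\t.\tGT\t0/1\t1/1\n'
    printf '22\t200\trs2\tC\tT\t.\tPASS\t.\tGT\t0/0\t0/1\n'
    printf '22\t300\trs3\tG\tA,T\t.\tPASS\t.\tGT\t0/1\t0/2\n'
} > "$toy_vcf"

summary=$(check_vcf "$toy_vcf")
echo "$summary"
# prints: samples=2 variants=3 non_biallelic_snps=1
```

On the real output the helper would be called as `check_vcf chr22.filtered.vcf`. After the filters above, the last count would ideally be 0, and a nonzero value would be worth investigating.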

Reference GFF

# Download

# Filter
gunzip Homo_sapiens.GRCh38.99.chromosome.22.gff3.gz
awk '$3 == "gene"' Homo_sapiens.GRCh38.99.chromosome.22.gff3 > Homo_sapiens.GRCh38.99.chromosome.22.genes.gff3
mv Homo_sapiens.GRCh38.99.chromosome.22.genes.gff3 chr22.genes.gff3

# Output: chr22.genes.gff3
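
To see what the awk filter keeps, the same test can be run on a handful of made-up GFF3 records. This is a sketch under stated assumptions: the file `toy.gff3` and its records are invented for the example, and only column 3 (the feature type) matters to the filter.

```shell
# Build a tiny, made-up GFF3 file (tab-separated, 9 columns); not real annotation.
tmpdir=$(mktemp -d)
{
    printf '##gff-version 3\n'
    printf '22\ttoy\tgene\t100\t900\t.\t+\t.\tID=gene:G1\n'
    printf '22\ttoy\tmRNA\t100\t900\t.\t+\t.\tID=transcript:T1;Parent=gene:G1\n'
    printf '22\ttoy\texon\t100\t300\t.\t+\t.\tParent=transcript:T1\n'
    printf '22\ttoy\tgene\t2000\t2600\t.\t-\t.\tID=gene:G2\n'
    printf '22\ttoy\tncRNA_gene\t3000\t3200\t.\t+\t.\tID=gene:G3\n'
} > "$tmpdir/toy.gff3"

# Same filter as in the listing above: keep rows whose third column is exactly "gene".
awk '$3 == "gene"' "$tmpdir/toy.gff3" > "$tmpdir/toy.genes.gff3"

n_genes=$(wc -l < "$tmpdir/toy.genes.gff3" | tr -d ' ')
echo "genes kept: $n_genes"
# prints: genes kept: 2
```

Because the comparison is an exact match, the header line, the transcript and exon rows, and the `ncRNA_gene` row are all dropped. If the real annotation uses feature types other than plain `gene` for some loci, those loci would likewise be excluded from chr22.genes.gff3.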

With the data downloaded and organized as above, you will be able to reproduce the analyses presented in the R Markdown files linked above and, if desired, run the whole analysis pipeline from raw reads to results following these instructions.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

─ Session info ───────────────────────────────────────────────────────────────
 setting  value                       
 version  R version 4.0.4 (2021-02-15)
 os       Red Hat Enterprise Linux    
 system   x86_64, linux-gnu           
 ui       X11                         
 language (EN)                        
 collate  en_US.UTF-8                 
 ctype    en_US.UTF-8                 
 tz       Australia/Melbourne         
 date     2021-06-02                  

─ Packages ───────────────────────────────────────────────────────────────────
 package     * version date       lib source        
 bslib         0.2.4   2021-01-25 [1] CRAN (R 4.0.3)
 cachem        1.0.4   2021-02-13 [1] CRAN (R 4.0.3)
 callr         3.7.0   2021-04-20 [1] CRAN (R 4.0.4)
 cli           2.5.0   2021-04-26 [1] CRAN (R 4.0.4)
 crayon        1.4.1   2021-02-08 [1] CRAN (R 4.0.4)
 desc          1.3.0   2021-03-05 [1] CRAN (R 4.0.4)
 devtools      2.3.2   2020-09-18 [1] CRAN (R 4.0.2)
 digest        0.6.27  2020-10-24 [1] CRAN (R 4.0.2)
 ellipsis      0.3.1   2020-05-15 [1] CRAN (R 4.0.2)
 evaluate      0.14    2019-05-28 [1] CRAN (R 4.0.2)
 fansi         0.4.2   2021-01-15 [1] CRAN (R 4.0.4)
 fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.0.3)
 fs            1.5.0   2020-07-31 [1] CRAN (R 4.0.2)
 git2r         0.28.0  2021-01-10 [1] CRAN (R 4.0.4)
 glue          1.4.2   2020-08-27 [1] CRAN (R 4.0.2)
 htmltools             2021-01-22 [1] CRAN (R 4.0.3)
 httpuv        1.5.5   2021-01-13 [1] CRAN (R 4.0.4)
 jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.0.4)
 jsonlite      1.7.2   2020-12-09 [1] CRAN (R 4.0.4)
 knitr         1.32    2021-04-14 [1] CRAN (R 4.0.4)
 later                 2020-06-05 [1] CRAN (R 4.0.2)
 lifecycle     1.0.0   2021-02-15 [1] CRAN (R 4.0.4)
 magrittr      2.0.1   2020-11-17 [1] CRAN (R 4.0.3)
 memoise       2.0.0   2021-01-26 [1] CRAN (R 4.0.4)
 pillar        1.6.0   2021-04-13 [1] CRAN (R 4.0.4)
 pkgbuild      1.2.0   2020-12-15 [1] CRAN (R 4.0.4)
 pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.0.2)
 pkgload       1.2.1   2021-04-06 [1] CRAN (R 4.0.4)
 prettyunits   1.1.1   2020-01-24 [1] CRAN (R 4.0.2)
 processx      3.5.2   2021-04-30 [1] CRAN (R 4.0.4)
 promises              2021-02-11 [1] CRAN (R 4.0.4)
 ps            1.6.0   2021-02-28 [1] CRAN (R 4.0.4)
 R6            2.5.0   2020-10-28 [1] CRAN (R 4.0.2)
 Rcpp          1.0.6   2021-01-15 [1] CRAN (R 4.0.4)
 remotes       2.3.0   2021-04-01 [1] CRAN (R 4.0.4)
 rlang         0.4.10  2020-12-30 [1] CRAN (R 4.0.4)
 rmarkdown     2.7     2021-02-19 [1] CRAN (R 4.0.4)
 rprojroot     2.0.2   2020-11-15 [1] CRAN (R 4.0.3)
 sass          0.3.1   2021-01-24 [1] CRAN (R 4.0.3)
 sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.0.2)
 stringi       1.5.3   2020-09-09 [1] CRAN (R 4.0.2)
 stringr       1.4.0   2019-02-10 [1] CRAN (R 4.0.2)
 testthat      3.0.0   2020-10-31 [1] CRAN (R 4.0.2)
 tibble        3.1.1   2021-04-18 [1] CRAN (R 4.0.4)
 usethis       1.6.3   2020-09-17 [1] CRAN (R 4.0.2)
 utf8          1.2.1   2021-03-12 [1] CRAN (R 4.0.4)
 vctrs         0.3.7   2021-03-29 [1] CRAN (R 4.0.4)
 whisker       0.4     2019-08-28 [1] CRAN (R 4.0.2)
 withr         2.4.2   2021-04-18 [1] CRAN (R 4.0.4)
 workflowr     1.6.2   2020-04-30 [1] CRAN (R 4.0.2)
 xfun          0.22    2021-03-11 [1] CRAN (R 4.0.4)
 yaml          2.2.1   2020-02-01 [1] CRAN (R 4.0.2)

[1] /mnt/mcfiles/cazodi/R/x86_64-pc-linux-gnu-library/4.0
[2] /opt/R/4.0.4/lib/R/library