vignettes/signac-Seurat_AMP_RA.Rmd
In Figure 3 of the pre-print, we validated Signac with flow cytometry and compared it to SingleR. This vignette reproduces that analysis using Seurat and provides interactive access to the data here. We start with raw counts.
Read the CEL-seq2 data.
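# Read a CEL-seq2 count table (one row per gene, one column per cell) as a sparse matrix.
# meta.file is accepted to match the call below but is not used inside the function.
ReadCelseq <- function(counts.file, meta.file) {
E = suppressWarnings(readr::read_tsv(counts.file))
gns <- E$gene
E = E[, -1]  # drop the gene-name column so that only counts remain
E = Matrix::Matrix(as.matrix(E), sparse = TRUE)
rownames(E) <- gns
E
}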
counts.file = "./fls/celseq_matrix_ru10_molecules.tsv.gz"
meta.file = "./fls/celseq_meta.immport.723957.tsv"
E = ReadCelseq(counts.file = counts.file, meta.file = meta.file)
M = suppressWarnings(readr::read_tsv(meta.file))
# filter data based on depth and number of genes detected
kmu = Matrix::colSums(E != 0)
kmu2 = Matrix::colSums(E)
E = E[, kmu > 200 & kmu2 > 500]
# filter by mitochondrial percentage
logik = grepl("^MT-", rownames(E))
MitoFrac = Matrix::colSums(E[logik, ])/Matrix::colSums(E) * 100
E = E[, MitoFrac < 20]
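The two filters above can be tried on a toy matrix before running them on the full data set. What follows is a minimal sketch, not part of the original analysis: the matrix `toy`, its gene and cell names, the lowered thresholds, and the metadata column `cell_name` are all made up for illustration. It also shows one possible way to keep a metadata table in register with the filtered counts, assuming the metadata carries one row per cell.

```r
library(Matrix)

# Hypothetical toy counts: 4 genes (rows) x 3 cells (columns), filled column by column.
toy <- Matrix(
  c(1, 0, 5, 4,   # cell1: 3 genes detected, 10 molecules, 10% mitochondrial
    0, 0, 1, 0,   # cell2: 1 gene detected, 1 molecule
    6, 1, 2, 1),  # cell3: 4 genes detected, 10 molecules, 60% mitochondrial
  nrow = 4, ncol = 3, sparse = TRUE,
  dimnames = list(c("MT-CO1", "CD3E", "CD19", "LYZ"),
                  c("cell1", "cell2", "cell3"))
)

# Depth and complexity filter (thresholds lowered to suit the toy example).
n.genes <- Matrix::colSums(toy != 0)
n.molecules <- Matrix::colSums(toy)
toy <- toy[, n.genes > 2 & n.molecules > 5, drop = FALSE]

# Mitochondrial filter; drop = FALSE keeps a matrix even when one row or column remains.
is.mito <- grepl("^MT-", rownames(toy))
mito.pct <- Matrix::colSums(toy[is.mito, , drop = FALSE]) / Matrix::colSums(toy) * 100
kept <- toy[, mito.pct < 20, drop = FALSE]

# Hypothetical metadata table; the column name `cell_name` is an assumption.
meta <- data.frame(cell_name = c("cell1", "cell2", "cell3"),
                   type = c("B cell", "T cell", "Monocyte"))
meta.kept <- meta[match(colnames(kept), meta$cell_name), , drop = FALSE]
```

In this toy case `cell2` fails the depth filter and `cell3` fails the mitochondrial filter, so only `cell1` and its metadata row remain.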
Start with the standard pre-processing steps for a Seurat object.
Create a Seurat object, and then perform SCTransform normalization.
# load Seurat and create the object from the filtered counts
library(Seurat)
synovium <- CreateSeuratObject(counts = E, project = "FACs")
# run sctransform
synovium <- SCTransform(synovium, verbose = FALSE)
Perform dimensionality reduction by PCA and UMAP embedding.
# These are now standard steps in the Seurat workflow for visualization and clustering
synovium <- RunPCA(synovium, verbose = FALSE)
synovium <- RunUMAP(synovium, dims = 1:30, verbose = FALSE)
synovium <- FindNeighbors(synovium, dims = 1:30, verbose = FALSE)
Generate Signac labels for the Seurat object.
# Run Signac (provided by the SignacX package)
library(SignacX)
labels <- Signac(synovium, num.cores = 4)
celltypes = GenerateLabels(labels, E = synovium)
SignacX (rows are FACS labels, columns are SignacX labels)
| | B | F | M | NonImmune | T | Unclassified |
|---|---|---|---|---|---|---|
| B | 945 | 0 | 2 | 0 | 0 | 19 |
| F | 0 | 2218 | 10 | 223 | 0 | 58 |
| M | 1 | 28 | 891 | 18 | 0 | 96 |
| T | 4 | 0 | 0 | 0 | 1768 | 21 |
SingleR (rows are FACS labels, columns are SingleR labels)
| | B | Chondr. | F | M | NK | NonImmune | T |
|---|---|---|---|---|---|---|---|
| B | 958 | 1 | 0 | 6 | 1 | 0 | 0 |
| F | 2 | 1468 | 36 | 19 | 23 | 960 | 1 |
| M | 4 | 39 | 0 | 964 | 6 | 21 | 0 |
| T | 9 | 0 | 0 | 2 | 368 | 0 | 1414 |
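Cross-tabulations of this kind can be built with base R's `table()`. The sketch below uses made-up label vectors (`facs` and `predicted` are hypothetical stand-ins for the FACS gate and the classifier label of each cell) and is only meant to illustrate the row and column orientation used in the tables above.

```r
# Hypothetical labels for six cells; real input would hold one entry per cell.
facs <- c("B", "B", "T", "T", "M", "F")
predicted <- c("B", "Unclassified", "T", "T", "M", "NonImmune")

# Rows are the reference (FACS) labels, columns are the predicted labels.
confusion <- table(FACS = facs, Predicted = predicted)
print(confusion)
```

Both vectors must be in the same cell order, otherwise the counts pair the wrong labels.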
Signac accuracy
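# xy: SignacX labels; True_labels: FACS labels (neither is defined in the code shown here)
# score only the cells that SignacX classified
logik = xy != "Unclassified"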
Signac_Accuracy = round(sum(xy[logik] == True_labels[logik])/sum(logik) * 100, 2)
Signac_Accuracy
## [1] 95.32
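The same figure can be recovered from the SignacX table alone, which is a useful sanity check when the per-cell vectors are not at hand. This is a sketch under one assumption: that accuracy here means the share of classified cells whose SignacX label equals the FACS label. The object names (`signac.tab`, `classified`, `shared`) are illustrative and not from the vignette.

```r
# Counts copied from the SignacX table (rows: FACS labels, columns: SignacX labels).
signac.tab <- matrix(
  c(945,    0,   2,   0,    0, 19,
      0, 2218,  10, 223,    0, 58,
      1,   28, 891,  18,    0, 96,
      4,    0,   0,   0, 1768, 21),
  nrow = 4, byrow = TRUE,
  dimnames = list(FACS = c("B", "F", "M", "T"),
                  SignacX = c("B", "F", "M", "NonImmune", "T", "Unclassified"))
)

# Drop the Unclassified column, then count the cells on which both labels agree.
classified <- signac.tab[, colnames(signac.tab) != "Unclassified"]
shared <- intersect(rownames(classified), colnames(classified))
n.correct <- sum(sapply(shared, function(ct) classified[ct, ct]))
accuracy <- round(n.correct / sum(classified) * 100, 2)
accuracy
## [1] 95.32
```

Here 5822 of the 6108 classified cells agree with their FACS label; the 194 unclassified cells are left out of the denominator, as in the `logik` filter above.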
SingleR accuracy
## [1] 55.21
Save results
saveRDS(synovium, file = "synovium_signac.rds")
saveRDS(celltypes, file = "synovium_signac_celltypes.rds")
## R version 4.0.0 (2020-04-24)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: CentOS Linux 7 (Core)
##
## Matrix products: default
## BLAS/LAPACK: /usr/lib64/libopenblasp-r0.3.3.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## loaded via a namespace (and not attached):
## [1] knitr_1.31 magrittr_2.0.1 R6_2.5.0 ragg_1.1.1
## [5] rlang_0.4.10 fastmap_1.1.0 highr_0.8 stringr_1.4.0
## [9] tools_4.0.0 xfun_0.21 jquerylib_0.1.3 htmltools_0.5.1.1
## [13] systemfonts_1.0.1 yaml_2.2.1 assertthat_0.2.1 digest_0.6.27
## [17] rprojroot_2.0.2 pkgdown_1.6.1 crayon_1.4.1 textshaping_0.3.1
## [21] formatR_1.7 sass_0.3.1 fs_1.5.0 memoise_2.0.0
## [25] cachem_1.0.3 evaluate_0.14 rmarkdown_2.7 stringi_1.5.3
## [29] compiler_4.0.0 bslib_0.2.4 desc_1.2.0 jsonlite_1.7.2