Signac trains and then uses an ensemble of neural networks to classify cellular phenotypes using an expression matrix or Seurat object. The neural networks are trained with the HPCA training data using only features that are present in both the single cell and HPCA training data set. Signac returns annotations at each level of the classification hierarchy, which are then converted into cell type labels using GenerateLabels. For a faster alternative, try SignacFast, which uses pre-computed neural network models.
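The feature restriction described above can be pictured with a small base R sketch. The matrices `sc` and `ref` and their values are made up for illustration, and this is not Signac's internal code; it only shows what keeping the genes shared by the single cell data and the reference means.

```r
# Toy single cell matrix: genes (rows) by cells (columns); values are invented
sc <- matrix(c(5, 0, 2, 1,
               0, 3, 0, 4,
               1, 1, 0, 0),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("CD3E", "CD19", "GNLY"), paste0("cell", 1:4)))

# Toy reference matrix with a partly different gene set
# (a hypothetical stand-in for the HPCA training data)
ref <- matrix(c(9, 1,
                0, 8,
                2, 2),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("CD19", "CD3E", "MS4A1"), c("sampleA", "sampleB")))

# Keep only genes present in both data sets, in the same row order
shared     <- intersect(rownames(sc), rownames(ref))
sc_shared  <- sc[shared, , drop = FALSE]
ref_shared <- ref[shared, , drop = FALSE]

print(shared)
```

After this step both matrices describe the same genes in the same order, so a model trained on the reference features can be applied to the single cell data.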
Signac( E, R = "default", spring.dir = NULL, N = 100, num.cores = 1, threshold = 0, smooth = TRUE, impute = TRUE, verbose = TRUE, do.normalize = TRUE, return.probability = FALSE, hidden = 1, set.seed = TRUE, seed = "42" )
Argument | Description |
---|---|
E | a sparse gene (rows) by cell (column) matrix, or a Seurat object. Rows are HUGO symbols. |
R | Reference data. If 'default', R is set to GetTrainingData_HPCA(). |
spring.dir | If using SPRING, directory to categorical_coloring_data.json. Default is NULL. |
N | Number of machine learning models to train (for nn and svm). Default is 100. |
num.cores | Number of cores to use. Default is 1. |
threshold | Probability threshold for assigning cells to "Unclassified." Default is 0. |
smooth | if TRUE, smooths the cell type classifications. Default is TRUE. |
impute | if TRUE, gene expression values are imputed prior to cell type classification (see |
verbose | if TRUE, progress messages are printed. Default is TRUE. |
do.normalize | if TRUE, cells are normalized to the mean library size. Default is TRUE. |
return.probability | if TRUE, returns the probability associated with each cell type label. Default is FALSE. |
hidden | Number of hidden layers in the neural network. Default is 1. |
set.seed | if TRUE, the seed is set to ensure reproducibility of the results. Default is TRUE. |
seed | Seed value used when set.seed is TRUE. Default is "42". |
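Two of the arguments above, do.normalize and threshold, can be illustrated with base R. This is a sketch under assumptions: the counts and probabilities are invented, the helper names `normalize_to_mean_libsize` and `apply_threshold` are hypothetical, and Signac's internal implementation may differ in detail.

```r
# Hypothetical helper: scale each cell (column) so that its total count
# equals the mean library size across all cells
normalize_to_mean_libsize <- function(counts) {
  libsize <- colSums(counts)
  sweep(counts, 2, libsize / mean(libsize), "/")
}

# Hypothetical helper: pick the most probable label per cell (row of `prob`),
# then relabel cells whose best probability is below `threshold`
apply_threshold <- function(prob, threshold = 0) {
  best <- colnames(prob)[max.col(prob, ties.method = "first")]
  best[apply(prob, 1, max) < threshold] <- "Unclassified"
  best
}

# Invented counts: genes (rows) by cells (columns)
counts <- matrix(c(10,  0, 30,
                   10, 20, 50),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("geneA", "geneB"), c("c1", "c2", "c3")))
norm <- normalize_to_mean_libsize(counts)
print(colSums(norm))

# Invented class probabilities: cells (rows) by cell types (columns)
prob <- matrix(c(0.90, 0.10,
                 0.55, 0.45,
                 0.20, 0.80),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("c1", "c2", "c3"), c("T", "B")))
labels <- apply_threshold(prob, threshold = 0.6)
print(labels)
```

With the default threshold of 0, no probability can fall below the cutoff, so in this sketch no cell is marked "Unclassified".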
A list of character vectors: cell type annotations (L1, L2, ...) at each level of the hierarchy as well as 'clusters' for the Louvain clustering results.
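Because the return value is a list of character vectors, it can be inspected with ordinary list tools. The object below is a hand-made stand-in with invented labels, not real Signac output; only the element names L1, L2 and clusters follow the description above.

```r
# Hand-made stand-in for the returned list (labels are invented for illustration)
labels <- list(
  L1       = c("Immune", "Immune", "NonImmune", "Immune"),
  L2       = c("Lymphocytes", "Myeloid", "NonImmune", "Lymphocytes"),
  clusters = c("1", "2", "3", "1")
)

# Number of cells assigned to each label at every level of the list
counts_per_level <- lapply(labels, table)
print(counts_per_level$L2)

# Cross-tabulate the second-level labels against the cluster assignments
print(table(labels$L2, labels$clusters))
```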
SignacFast, a faster alternative that only differs from Signac in nuanced T cell phenotypes.
if (FALSE) {
# download single cell data for classification
file.dir = "https://cf.10xgenomics.com/samples/cell-exp/3.0.0/pbmc_1k_v3/"
file = "pbmc_1k_v3_filtered_feature_bc_matrix.h5"
download.file(paste0(file.dir, file), "Ex.h5")

# load data, process with Seurat
library(Seurat)
E = Read10X_h5(filename = "Ex.h5")
pbmc <- CreateSeuratObject(counts = E, project = "pbmc")

# run Seurat pipeline
pbmc <- SCTransform(pbmc, verbose = FALSE)
pbmc <- RunPCA(pbmc, verbose = FALSE)
pbmc <- RunUMAP(pbmc, dims = 1:30, verbose = FALSE)
pbmc <- FindNeighbors(pbmc, dims = 1:30, verbose = FALSE)

# classify cells
labels = Signac(E = pbmc)
celltypes = GenerateLabels(labels, E = pbmc)

# add labels to Seurat object, visualize
pbmc <- Seurat::AddMetaData(pbmc, metadata=celltypes$CellTypes_novel, col.name = "immmune")
pbmc <- Seurat::SetIdent(pbmc, value='immmune')
DimPlot(pbmc)

# save results
saveRDS(pbmc, "example_pbmcs.rds")
}