How are variable genes handled across layers in multi-sample SCT assays? #9869
araboapresyan started this conversation in General
Hi all,
I’m working with single-cell data from the same tissue collected at three timepoints. After merging the samples into a single Seurat object, I ran SCTransform(), which fits a separate model for each layer.
In the single-layer SCT workflow, the top 3000 variable features are selected and stored in VariableFeatures(), and their Pearson residuals are placed in the scale.data slot.
However, with the multi-layer assay structure, I’ve noticed that the variable features of the combined assay (the top 3000 selected across layers) do not match the genes present in the scale.data slot of the combined SCT assay. Some of the top variable features are missing entirely from scale.data, likely because they didn’t pass filtering in one or more individual layers.
These missing genes often mark rare cell types that are present in only one or two samples. This raises a concern: such biologically relevant, sample-specific features may be excluded from downstream steps like PCA or clustering.
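To make the mismatch concrete, here is a minimal base-R sketch of the comparison using toy gene names rather than real data. The per-layer gene sets and the rule that the combined scale.data keeps only genes modelled in every layer are assumptions for illustration, not confirmed Seurat behavior. With a real object, the two vectors would presumably come from VariableFeatures() and the row names of the scale.data matrix, as noted in the comments.

```r
# Toy stand-ins. With a real Seurat object these would come from, e.g.:
#   var_features <- VariableFeatures(obj, assay = "SCT")
#   scaled_genes <- rownames(GetAssayData(obj, assay = "SCT", layer = "scale.data"))
# (the `layer =` argument is assumed for Seurat v5; older versions use `slot =`)

# Hypothetical gene sets that passed filtering in each timepoint's SCT model
layer_genes <- list(
  t1 = c("GeneA", "GeneB", "GeneC", "GeneD"),
  t2 = c("GeneA", "GeneB", "GeneC"),
  t3 = c("GeneA", "GeneB", "GeneE")
)

# Hypothetical variable feature list for the combined assay
var_features <- c("GeneA", "GeneB", "GeneC", "GeneD", "GeneE")

# ASSUMPTION for illustration only: the combined scale.data keeps
# just the genes that were modelled in every layer
scaled_genes <- Reduce(intersect, layer_genes)

# Variable features that never reach scale.data
missing_from_scaled <- setdiff(var_features, scaled_genes)

# For each missing gene, list the layers in which it was modelled
layers_with_gene <- lapply(
  setNames(missing_from_scaled, missing_from_scaled),
  function(g) {
    names(layer_genes)[vapply(layer_genes, function(x) g %in% x, logical(1))]
  }
)

print(missing_from_scaled)
print(layers_with_gene)
```

In this toy case, the genes that drop out are exactly those modelled in only one or two of the three layers, which is the pattern I’m seeing for rare, sample-specific populations.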
Question:
Is there a way to ensure that all top variable features (even if filtered out in some layers) are retained in the scale.data slot of the SCT assay?
I’d really appreciate any clarification on how SCT handles variable features across layers, and if there’s a recommended approach to retain rare or sample-specific genes.
Thanks in advance!