Cell-type annotation

Cell-type annotation is fundamental for data interpretation and a prerequisite for almost any downstream analysis. Manual annotation based on clustering is a widely used and straightforward approach: the expression of previously known marker genes is checked across clusters, and cell-type labels are assigned accordingly. Alternatively, as a less biased approach, differential gene expression analysis can be performed between clusters, and cell types assigned based on cluster-specific differentially expressed (DE) genes. Panels of marker genes can be obtained from publications or from cell-type databases such as CellMarker [103] or PanglaoDB [104].

1. Load the required libraries

import pandas as pd
import scanpy as sc

2. Load input data

adata = sc.read_h5ad("../../data/input_data_zenodo/atlas-integrated-annotated.h5ad")

markers = pd.read_csv("../../data/input_data_zenodo/cell_type_markers_lung.tsv", sep="\t")

markers

	cell_type	gene_symbol
0	Alevolar cell type 1	AGER
1	Alevolar cell type 1	CLDN18
2	Alevolar cell type 2	SFTPC
3	Alevolar cell type 2	SFTPB
4	Alevolar cell type 2	SFTPA1
...	...	...
84	Vessel	PECAM1
85	Endothelial cell lymphatic	VWF
86	Endothelial cell lymphatic	CCL21
87	Neutrophils	CSF3R
88	Neutrophils	CXCR2

89 rows × 2 columns
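For checking marker expression per cluster, the long-format table above is often easier to use as a mapping from cell type to gene list. The following is a minimal sketch on a toy table with the same two columns; `measured_genes` and the `NOT_MEASURED` entry are hypothetical stand-ins (with the real data, the measured genes would come from `adata.var_names`).

```python
import pandas as pd

# Toy stand-in for the marker table loaded above (same two columns).
markers = pd.DataFrame(
    {
        "cell_type": ["Neutrophils", "Neutrophils", "Vessel", "Vessel"],
        "gene_symbol": ["CSF3R", "CXCR2", "PECAM1", "NOT_MEASURED"],
    }
)

# Hypothetical list of genes measured in the dataset; with the real data
# this would be adata.var_names.
measured_genes = pd.Index(["CSF3R", "CXCR2", "PECAM1", "VWF"])

# Keep only markers that are present in the dataset, then collect the
# gene symbols of each cell type into a list.
present = markers[markers["gene_symbol"].isin(measured_genes)]
marker_dict = present.groupby("cell_type")["gene_symbol"].apply(list).to_dict()

print(marker_dict)
```

Filtering first avoids lookup errors for markers that are absent from the expression matrix. A dictionary of this shape can be passed as `var_names` to Scanpy plotting functions such as `sc.pl.dotplot`, which then groups the genes by cell type.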

3. Perform unsupervised clustering using the Leiden algorithm

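# Cluster the cells; this needs a neighborhood graph (sc.pp.neighbors),
# which is assumed to be stored in the integrated atlas already.
sc.tl.leiden(adata, resolution=1)

# Show the resulting clusters on the UMAP embedding.
sc.pl.umap(adata, color="leiden")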

[Figure: UMAP embedding of the atlas, colored by Leiden cluster]
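Once clusters exist, the two annotation routes described in the introduction can be illustrated on toy data. The sketch below is not the tutorial's own procedure: the expression values and cluster labels are made up, and a plain one-vs-rest log2 fold change with a pseudocount stands in for a proper differential expression test (in Scanpy, `sc.tl.rank_genes_groups` performs that step with statistical testing).

```python
import numpy as np
import pandas as pd

# Toy data (made-up values): 6 cells x 4 genes, and one cluster label per cell.
expr = pd.DataFrame(
    [
        [6.0, 4.0, 0.0, 0.0],
        [7.0, 5.0, 1.0, 0.0],
        [5.0, 6.0, 0.0, 1.0],
        [0.0, 1.0, 6.0, 4.0],
        [1.0, 0.0, 7.0, 5.0],
        [0.0, 0.0, 5.0, 6.0],
    ],
    columns=["VWF", "CCL21", "CSF3R", "CXCR2"],
)
clusters = pd.Series(["0", "0", "0", "1", "1", "1"], name="leiden")

# Known markers, taken from the marker table shown earlier.
marker_dict = {
    "Endothelial cell lymphatic": ["VWF", "CCL21"],
    "Neutrophils": ["CSF3R", "CXCR2"],
}

# Route 1: known markers. Average each gene within a cluster, then average
# over the markers of each cell type and pick the best-scoring cell type.
cluster_means = expr.groupby(clusters).mean()
scores = pd.DataFrame(
    {ct: cluster_means[genes].mean(axis=1) for ct, genes in marker_dict.items()}
)
labels = scores.idxmax(axis=1).to_dict()

# Route 2: cluster-specific genes. Rank genes by one-vs-rest log2 fold
# change (a crude stand-in for a DE test) and keep the top two per cluster.
pseudocount = 1.0
top_genes = {}
for cluster in cluster_means.index:
    in_mean = expr[clusters == cluster].mean()
    out_mean = expr[clusters != cluster].mean()
    log2fc = np.log2((in_mean + pseudocount) / (out_mean + pseudocount))
    top_genes[cluster] = log2fc.sort_values(ascending=False).index[:2].tolist()

print(labels)
print(top_genes)
```

In this toy case both routes agree: the top-ranked genes of each cluster are exactly the markers of the cell type assigned to it. On real data the highest score only suggests a label, so it is worth inspecting the marker expression (for example in a dot plot) before assigning it.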
