atlas_protocol_scripts.tl.CpdbAnalysis
- class atlas_protocol_scripts.tl.CpdbAnalysis(cpdb, adata, *, pseudobulk_group_by, cell_type_column, min_obs=10)
Class that handles comparative cellphonedb analysis.
- Parameters
cpdb – pandas data frame with cellphonedb interactions. Required columns: source_genesymbols, target_genesymbol. You can get this from omnipathdb: https://omnipathdb.org/interactions/?fields=sources,references&genesymbols=1&databases=CellPhoneDB
adata – AnnData object with the target cells, used to derive the mean fraction of expressed cells. Should contain counts in X.
pseudobulk_group_by (list[str]) – Columns by which cells are grouped into pseudobulk samples. Pseudobulk is used to compute the mean fraction of expressed cells by patient.
cell_type_column (str) – Column in anndata that contains the cell-type annotation.
min_obs (default: 10) – Only consider samples with at least min_obs cells for pseudobulk analysis.
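The idea behind the pseudobulk parameters can be illustrated with a minimal, standard-library sketch. It is not the class's actual implementation; it assumes that each pseudobulk sample is one (patient, cell type) combination, that a cell "expresses" a gene when its count is greater than zero, and that per-sample fractions are averaged with equal weight. The function name and the toy data are hypothetical.

```python
from collections import defaultdict


def mean_fraction_expressed(cells, min_obs=10):
    """Mean fraction of expressing cells per cell type for one gene.

    cells: iterable of (patient, cell_type, count) tuples (hypothetical layout).
    Samples, i.e. (patient, cell_type) groups, with fewer than min_obs cells
    are ignored, mirroring the min_obs parameter described above.
    """
    n_cells = defaultdict(int)
    n_expressing = defaultdict(int)
    for patient, cell_type, count in cells:
        key = (patient, cell_type)
        n_cells[key] += 1
        n_expressing[key] += count > 0  # assumption: expressed means count > 0

    fractions = defaultdict(list)
    for (patient, cell_type), n in n_cells.items():
        if n >= min_obs:
            fractions[cell_type].append(n_expressing[(patient, cell_type)] / n)

    # Average the per-sample fractions so every patient has equal weight.
    return {ct: sum(fr) / len(fr) for ct, fr in fractions.items()}


cells = (
    [("P1", "T cell", 3)] * 2 + [("P1", "T cell", 0)] * 2    # 2 of 4 cells express
    + [("P2", "T cell", 1)] * 1 + [("P2", "T cell", 0)] * 3  # 1 of 4 cells express
    + [("P3", "T cell", 5)] * 2                               # only 2 cells: dropped
)
result = mean_fraction_expressed(cells, min_obs=4)
```

Averaging per sample rather than pooling all cells keeps a patient with many cells from dominating the estimate.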
Methods table
| plot_result | Plot cpdb results as heatmap. |
| significant_interactions | Generates a data frame of differential cellphonedb interactions. |
Methods
- CpdbAnalysis.plot_result(cpdb_res, *, pvalue_col='fdr', fc_col='log2FoldChange', group_col='group', title='CPDB analysis', aggregate=True, clip_fc_at=(-5, 5), label_limit=100, cluster='dotplot', de_genes_mode='ligand')
Plot cpdb results as heatmap.
- Parameters
cpdb_res – result of significant_interactions. May be further filtered or modified.
pvalue_col (default: 'fdr') – column in cpdb_res that contains the p-value of ligands (or receptors) used for the upper panel of the plot
fc_col (default: 'log2FoldChange') – column in cpdb_res that contains the log fold change of ligands (or receptors) used for the upper panel of the plot
group_col (default: 'group') – column to be used for the y axis of the heatmap
title (default: 'CPDB analysis') – main title of the plot
aggregate (default: True) – whether to merge multiple targets of the same ligand into a single column
clip_fc_at (default: (-5, 5)) – Clip log fold changes to this (lower, upper) range
label_limit (default: 100) – Maximum length before a gene symbol gets truncated (plays a role when using aggregate=True)
cluster (Literal['heatmap', 'dotplot', None] (default: 'dotplot')) – whether to cluster the heatmap, the dotplot, or neither
de_genes_mode (Literal['ligand', 'receptor'] (default: 'ligand')) – Whether the DE genes provided are ligands (default) or receptors. If receptor, the dotplot is shown at the top (sources are the expressed ligands) and the DE heatmap at the bottom (targets are the DE receptors). Otherwise the other way round.
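The effect of aggregate, clip_fc_at and label_limit on the plotted values can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names, the "|" separator used to merge targets, and the "..." truncation suffix are hypothetical and not taken from the library.

```python
def clip_fold_change(value, clip_fc_at=(-5, 5)):
    """Limit a log fold change to the (lower, upper) range."""
    low, high = clip_fc_at
    return max(low, min(high, value))


def truncate_label(label, label_limit=100, suffix="..."):
    """Shorten labels longer than label_limit (suffix is an assumption)."""
    if len(label) <= label_limit:
        return label
    return label[:label_limit] + suffix


def aggregate_targets(pairs, sep="|"):
    """Merge all targets of the same ligand into one label (sep is an assumption)."""
    merged = {}
    for ligand, target in pairs:
        merged.setdefault(ligand, []).append(target)
    return {ligand: sep.join(targets) for ligand, targets in merged.items()}


# Toy ligand/target pairs with placeholder gene names.
pairs = [("LIG1", "REC1"), ("LIG1", "REC2"), ("LIG2", "REC3")]
labels = aggregate_targets(pairs)
short = {lig: truncate_label(lab, label_limit=6) for lig, lab in labels.items()}
clipped = [clip_fold_change(fc) for fc in (-7.2, 0.8, 6.0)]
```

Merged labels grow with the number of targets per ligand, which is why label_limit mainly matters when aggregate=True.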
- CpdbAnalysis.significant_interactions(de_res, *, pvalue_col='pvalue', fc_col='log2FoldChange', gene_symbol_col='gene_id', max_pvalue=0.1, min_abs_fc=1, adjust_fdr=True, min_frac_expressed=0.1, de_genes_mode='ligand', complex_policy='explode')
Generates a data frame of differential cellphonedb interactions.
This function extracts all known ligands (or receptors, respectively) from a list of differentially expressed genes and finds, for every cell-type, all receptors (or ligands, respectively) that are expressed above a certain cutoff.
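The matching step described above can be sketched as follows for the ligand case. This is a simplified illustration, not the method's actual code: the function name and the input layouts (a set of DE ligand symbols, a list of ligand/receptor pairs, and a dictionary of expression fractions) are hypothetical, and "above the cutoff" is assumed to include the cutoff itself.

```python
def candidate_interactions(de_ligands, interactions, frac_expressed,
                           min_frac_expressed=0.1):
    """Pair DE ligands with cell types that express a matching receptor.

    de_ligands:     set of differentially expressed ligand symbols
    interactions:   list of (ligand, receptor) pairs from the database
    frac_expressed: {(cell_type, gene): fraction of cells expressing the gene}
    """
    hits = []
    for ligand, receptor in interactions:
        if ligand not in de_ligands:
            continue  # only ligands present in the DE list are considered
        for (cell_type, gene), frac in frac_expressed.items():
            if gene == receptor and frac >= min_frac_expressed:
                hits.append((ligand, receptor, cell_type, frac))
    return sorted(hits)


# Toy data for illustration only.
interactions = [("CXCL9", "CXCR3"), ("CCL5", "CCR5"), ("TGFB1", "TGFBR1")]
de_ligands = {"CXCL9", "TGFB1"}
frac_expressed = {
    ("T cell", "CXCR3"): 0.4,
    ("B cell", "CXCR3"): 0.05,  # below the cutoff, not reported
    ("T cell", "CCR5"): 0.6,    # CCL5 is not in the DE list, not reported
    ("Fibroblast", "TGFBR1"): 0.3,
}
hits = candidate_interactions(de_ligands, interactions, frac_expressed)
```

With de_genes_mode='receptor' the roles would be swapped: the DE list holds receptors and the expression check applies to ligands.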
- Parameters
de_res (DataFrame) – List of differentially expressed genes
pvalue_col (default: 'pvalue') – column in de_res that contains the p-value or false discovery rate
fc_col (default: 'log2FoldChange') – column in de_res that contains the log fold change
gene_symbol_col (default: 'gene_id') – column in de_res that contains the gene symbol
max_pvalue (default: 0.1) – Only consider genes in de_res with a p-value lower than max_pvalue (after FDR adjustment)
min_abs_fc (default: 1) – Only consider genes in de_res with at least this absolute log fold change
adjust_fdr (default: True) – Adjust the p-values in pvalue_col for the false discovery rate. The FDR adjustment happens after the input table is filtered for genes that are in the ligand/receptor database.
min_frac_expressed (default: 0.1) – Minimum fraction of cells that need to express the receptor (or ligand) to be considered a potential interaction
de_genes_mode (Literal['ligand', 'receptor'] (default: 'ligand')) – Whether the DE genes provided are ligands (default) or receptors. In case of ligand, cell-types that express corresponding receptors above the threshold will be identified. In case of receptor, cell-types that express corresponding ligands above the threshold will be identified.
complex_policy (Literal['ignore', 'explode'] (default: 'explode')) – How to handle protein:protein complexes. Currently implemented options are
ignore: Do nothing, i.e. treat complexes as if they were single genes. This usually means that they will be removed from the result, because there is no corresponding gene symbol (e.g. ITGA8_ITGB1) in the DE genes list or in the anndata object used to compute fractions/gene expression.
explode: Split complexes into individual genes, essentially discarding the information that the genes form a complex.
Future options could be aggregate, i.e. aggregate metrics of a complex to a single value (e.g. by min as performed in the original CellPhoneDB publication).
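The interplay of complex_policy='explode', adjust_fdr, max_pvalue and min_abs_fc can be sketched with the standard library. This is an illustration under assumptions, not the library's code: complexes are assumed to be joined by "_" (as in ITGA8_ITGB1), the FDR adjustment is assumed to be Benjamini-Hochberg, and all function names and gene symbols other than ITGA8 and ITGB1 are hypothetical placeholders.

```python
def explode_complexes(interactions, sep="_"):
    """Split complexes such as 'ITGA8_ITGB1' into one pair per member gene."""
    pairs = set()
    for source, target in interactions:
        for s in source.split(sep):
            for t in target.split(sep):
                pairs.add((s, t))
    return sorted(pairs)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down to keep adjusted values monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def filter_de_genes(de_res, known_genes, max_pvalue=0.1, min_abs_fc=1):
    """Restrict to database genes first, then adjust and filter."""
    in_db = [row for row in de_res if row["gene_id"] in known_genes]
    fdr = benjamini_hochberg([row["pvalue"] for row in in_db])
    return [
        {**row, "fdr": q}
        for row, q in zip(in_db, fdr)
        if q < max_pvalue and abs(row["log2FoldChange"]) >= min_abs_fc
    ]


# Toy database and DE table with placeholder gene names.
interactions = [("LIG1", "ITGA8_ITGB1"), ("LIG2", "REC2")]
pairs = explode_complexes(interactions)
ligands = {source for source, _ in pairs}

de_res = [
    {"gene_id": "LIG1", "pvalue": 0.01, "log2FoldChange": 2.0},
    {"gene_id": "LIG2", "pvalue": 0.04, "log2FoldChange": 0.5},
    {"gene_id": "OTHER", "pvalue": 0.001, "log2FoldChange": 3.0},
]
kept = filter_de_genes(de_res, ligands)
```

Because the table is restricted to database genes before the adjustment, the correction is made over two tests here instead of three, which is the practical consequence of the ordering described for adjust_fdr.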
- Return type