Splitting an AnnData object into multiple datasets and merging multiple single-cell datasets with anndata and scanpy (Python, Linux)

Concatenation — anndata 0.9.0.dev38+g3c5f63d documentation

Concatenation

With concat(), AnnData objects can be combined via a composition of two operations: concatenation and merging.

  • Concatenation is when we keep all sub elements of each object, and stack these elements in an ordered way.

  • Merging is combining a set of collections into one resulting collection which contains elements from the objects.

Note

This function borrows from similar functions in pandas and xarray. Arguments that control concatenation are modeled after pandas.concat(), while strategies for merging are inspired by xarray.merge()’s compat argument.

Concatenation

Let’s start off with an example:

>>> import scanpy as sc, anndata as ad, numpy as np, pandas as pd
>>> from scipy import sparse
>>> from anndata import AnnData
>>> pbmc = sc.datasets.pbmc68k_reduced()
>>> pbmc
AnnData object with n_obs × n_vars = 700 × 765
    obs: 'bulk_labels', 'n_genes', 'percent_mito', 'n_counts', 'S_score', 'G2M_score', 'phase', 'louvain'
    var: 'n_counts', 'means', 'dispersions', 'dispersions_norm', 'highly_variable'
    uns: 'bulk_labels_colors', 'louvain', 'louvain_colors', 'neighbors', 'pca', 'rank_genes_groups'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'
    obsp: 'distances', 'connectivities'

If we split this object up by clusters of observations, then stack those subsets we’ll obtain the same values – just ordered differently.

>>> groups = pbmc.obs.groupby("louvain").indices
>>> pbmc_concat = ad.concat([pbmc[inds] for inds in groups.values()], merge="same")
>>> assert np.array_equal(pbmc.X, pbmc_concat[pbmc.obs_names].X)
>>> pbmc_concat
AnnData object with n_obs × n_vars = 700 × 765
    obs: 'bulk_labels', 'n_genes', 'percent_mito', 'n_counts', 'S_score', 'G2M_score', 'phase', 'louvain'
    var: 'n_counts', 'means', 'dispersions', 'dispersions_norm', 'highly_variable'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'

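To split the object into several datasets instead of stacking them again, keep the per-cluster subsets in a list:

>>> out = [pbmc[inds] for inds in groups.values()]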


 

Note that we concatenated along the observations by default, and that most elements aligned to the observations were concatenated as well. A notable exception is obsp, which is dropped by default; concatenating it can be enabled with the pairwise keyword argument. This is because it’s not obvious that combining graphs or distance matrices padded with 0s is particularly useful, and the result may be unintuitive.

 

 
