daf.storage.anndata

This stores the data inside an AnnData object.

Since AnnData is not powerful enough to satisfy our needs (this was the main motivation for creating daf), this storage is mainly used to interface with other systems.

When accessing an existing AnnData object, you can either wrap it with an AnnDataWriter to allow modifying it through daf, or with an AnnDataReader to provide read-only access. In both cases, you will see the hard-wired obs, var and X names. As a convenience, you can call anndata_as_storage instead, which creates an AnnDataReader and wraps it with a StorageView to rename these to more meaningful names.

To create a new AnnData object, call storage_as_anndata, giving it a StorageReader that exposes only the data you wish to place in the result; this is typically done using a StorageView, which also renames some meaningful axes and data names to obs, var and X.

Representation

We use the following scheme to map between daf data and AnnData fields:

  • 0D data is simply stored in the uns field of AnnData.

  • Axes other than obs and var require us to store their entries, which we do by using an uns entry with the name axis#.

  • 1D data other than per-obs and per-var data is stored in an uns entry named axis#property.

  • 2D data for obs and var axes is stored in the X, layers, obsp or varp AnnData fields, as appropriate.

  • 2D data for either obs or var and another axis is stored as a set of 1D annotations in the obs or var AnnData fields, one for each axis entry, named other_axis=entry#property. It is debatable whether this makes it easier or harder to access this data in systems that directly use AnnData, but it is at least “technically correct”.

  • 2D data where neither axis is obs or var is stored in an uns entry named row_axis,column_axis#property.
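The scheme above can be sketched as a small pure-Python helper that computes where a piece of daf data would land in an AnnData object. This only illustrates the naming rules and is not daf's implementation: the function names (location_of_1d, locations_of_2d) and the example axis and property names are hypothetical. Two points are assumptions not stated above: that per-obs and per-var 1D data is stored under its plain property name, and that an obs-and-var matrix goes to X or layers regardless of which axis is the rows.

```python
from typing import List, Sequence, Tuple

# The two axes that AnnData supports natively.
NATIVE_AXES = ("obs", "var")


def location_of_1d(axis: str, prop: str) -> Tuple[str, str]:
    """Return (AnnData field, key) for 1D data of some axis."""
    if axis in NATIVE_AXES:
        # Assumption: per-obs/per-var data uses the plain property name.
        return (axis, prop)
    return ("uns", f"{axis}#{prop}")


def locations_of_2d(
    rows: str, columns: str, prop: str, other_entries: Sequence[str] = ()
) -> List[Tuple[str, str]]:
    """Return the (AnnData field, key) pairs used for 2D data."""
    rows_native = rows in NATIVE_AXES
    columns_native = columns in NATIVE_AXES

    if rows_native and columns_native:
        if rows == columns:
            return [(rows + "p", prop)]  # obsp or varp
        # Assumption: the same rule applies to obs,var and var,obs.
        # X has no key, so an empty string stands in for it here.
        return [("X", "")] if prop == "X" else [("layers", prop)]

    if rows_native or columns_native:
        native, other = (rows, columns) if rows_native else (columns, rows)
        # One 1D annotation per entry of the other axis.
        return [(native, f"{other}={entry}#{prop}") for entry in other_entries]

    return [("uns", f"{rows},{columns}#{prop}")]


print(location_of_1d("cluster", "color"))
print(locations_of_2d("obs", "cluster", "fraction", ["c1", "c2"]))
print(locations_of_2d("cluster", "batch", "count"))
```

The mixed case (one native axis, one other axis) is the only one that needs the entries of the other axis, since it expands into one annotation per entry.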

Classes:

AnnDataReader(adata, *[, name])

Implement the StorageReader interface for AnnData.

AnnDataWriter(adata, *[, name, copy, overwrite])

Implement the StorageWriter interface for AnnData.

Functions:

storage_as_anndata(storage)

Convert some storage into an AnnData object.

anndata_as_storage(adata, obs, var, X[, name])

View some adata (AnnData object) as a StorageReader to allow accessing it using daf.

class daf.storage.anndata.AnnDataReader(adata: AnnData, *, name: str = 'anndata#')

Bases: StorageReader

Implement the StorageReader interface for AnnData.

If the name ends with #, we append the object id to it to make it unique.
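The naming rule can be sketched as follows. This is a minimal illustration, assuming the "object id" is the value of Python's built-in id() formatted as a decimal number; the helper name unique_name is hypothetical and the exact format daf uses may differ.

```python
def unique_name(name: str, obj: object) -> str:
    """Append the id of obj to name if name ends with '#'."""
    if name.endswith("#"):
        # Assumption: the "object id" is the built-in id() of the object.
        return name + str(id(obj))
    return name


wrapped = object()  # stands in for the wrapped AnnData object
print(unique_name("anndata#", wrapped))  # "anndata#" followed by digits
print(unique_name("cells", wrapped))     # the name is left unchanged
```

Two readers wrapping different objects with the default name therefore get different names, while an explicit name without a trailing # is used as given.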

Note

Do not modify the wrapped AnnData after creating a reader. Modifications may or may not be visible in the reader, causing subtle problems.

Attributes:

adata

The wrapped AnnData object.

class daf.storage.anndata.AnnDataWriter(adata: AnnData, *, name: str = 'anndata#', copy: Optional[StorageReader] = None, overwrite: bool = False)

Bases: AnnDataReader, StorageWriter

Implement the StorageWriter interface for AnnData.

If copy is specified, its data is copied into the AnnData, using the overwrite flag. This requires the copy to have obs and var axes. Typically you would be better off calling storage_as_anndata.

Note

Do not modify the wrapped AnnData after creating a writer (other than through the writer object). Modifications may or may not be visible in the writer, causing subtle problems.

Setting large 2D data for axes other than obs and var will be inefficient.

daf.storage.anndata.storage_as_anndata(storage: StorageReader) → AnnData

Convert some storage into an AnnData object.

The storage needs to include an obs and a var axis, and an obs,var#X matrix. In general it should contain only the data one wishes to store in the AnnData object. Typically this is achieved by creating a StorageView with hide_implicit=True for the full data; this view is also used to rename the meaningful axes and data names to obs, var and X.

Note

It is often necessary to create several such views to extract several AnnData objects from the same storage, e.g. when the data contains both cells and cell-clusters/types, it is best to split it into two AnnData objects, one for the cells data and another for the clusters/types data. Other data (e.g. per gene data) can be replicated in both AnnData objects or stored in only one of them, as needed.

daf.storage.anndata.anndata_as_storage(adata: AnnData, obs: str, var: str, X: str, name: str = 'anndata#') → StorageReader

View some adata (AnnData object) as a StorageReader to allow accessing it using daf.

This creates a StorageView which maps the hard-wired obs and var axes names and the hard-wired X data to hopefully more meaningful names.
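The effect of this renaming on data names can be sketched in plain Python. This illustrates the idea only and does not use the StorageView API: the helper rename_data_name is hypothetical, as are the example names cell, gene and UMIs, and it assumes data names follow the axes#property form used in the representation scheme.

```python
from typing import Dict


def rename_data_name(name: str, axes: Dict[str, str], data: Dict[str, str]) -> str:
    """Rename the axes (and optionally the property) of an 'axes#property' name."""
    axes_part, _, prop = name.partition("#")
    # Rename each axis that has a mapping; keep the others as they are.
    renamed_axes = ",".join(axes.get(axis, axis) for axis in axes_part.split(","))
    # A property is renamed only if its full original name is listed.
    return f"{renamed_axes}#{data.get(name, prop)}"


axes = {"obs": "cell", "var": "gene"}  # hypothetical meaningful axis names
data = {"obs,var#X": "UMIs"}           # hypothetical meaningful name for X

print(rename_data_name("obs,var#X", axes, data))  # cell,gene#UMIs
print(rename_data_name("obs#age", axes, data))    # cell#age
```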

If the name ends with #, we append the object id to it to make it unique.

Note

It is often necessary to split the same data set into multiple AnnData objects, e.g. when the data contains both cells and cell-clusters/types, it is best to split it into two AnnData objects, one for the cells data and another for the clusters/types data. To merge these back into a single storage, create a StorageChain by writing something like StorageChain([anndata_as_storage(...), anndata_as_storage(...), ...]).