daf.typing.freezing

daf assumes that stored data is not modified in-place, as this would break the caching mechanisms which are needed for efficiency. Regardless of caching, modifying stored data is a bad idea, as it causes subtle bugs when analysis code is re-run.

At the same time, Python doesn’t really have a notion of immutable data when it comes to complex data structures. However, numpy does have a concept of read-only data, so we make use of it here, and extend it to deal with pandas and scipy.sparse data as well (as they use numpy data under the hood).
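As a rough illustration of the numpy read-only mechanism mentioned above, the following sketch uses only plain numpy (the `values` and `modified` names are made up for the example, and this is not daf's own code). It assumes the array owns its memory, which is the case for a freshly created array.

```python
import numpy as np

values = np.arange(5)

# Mark the array as read-only using numpy's own flag.
values.flags.writeable = False

# Any in-place assignment now raises a ValueError.
try:
    values[0] = 10
    modified = True
except ValueError:
    modified = False

# Clearing the flag makes the same array writable again.
values.flags.writeable = True
values[0] = 10
```

The flag protects only against writes made through this particular array object; it is a guard against accidents rather than a hard immutability guarantee.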

In general, daf always freezes data when it is stored, and fetching data will return frozen data, to protect against accidental in-place modification of the stored data.

The code in this module lets you manually freeze, unfreeze, or test whether data is_frozen, using the numpy capabilities. In addition, in cases where you really know what you are doing, it allows you to temporarily modify frozen data.

Data:

ProperT

A TypeVar bound to Proper.

Functions:

freeze(data)

Ensure that some 1D/2D data is protected against modification.

unfreeze(data)

Ensure that some 1D/2D data is not protected against modification.

is_frozen(data)

Test whether some 1D/2D data is (known to be) protected against modification.

unfrozen(data)

Execute some in-place modification, temporarily unfreezing the 1D/2D data.

daf.typing.freezing.ProperT

A TypeVar bound to Proper.

alias of TypeVar('ProperT', bound=Union[_vectors.Vector, Series, _dense.DenseInRows, SparseInRows, FrameInRows, _dense.DenseInColumns, SparseInColumns, FrameInColumns])

daf.typing.freezing.freeze(data: ProperT) -> ProperT

Ensure that some 1D/2D data is protected against modification.

This tries to freeze the data in place, but because pandas has strange behavior, we are forced to return a new frozen object (this is only a wrapper, the data itself is not copied). Hence the safe idiom is data = freeze(data). Sigh.
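To make the "new wrapper, same data" point concrete, here is a minimal sketch using plain numpy. The `freeze_sketch` function is a hypothetical stand-in, not daf's implementation: it deliberately skips the in-place attempt and only models the case where a new wrapper object is returned, which is the case that makes rebinding necessary.

```python
import numpy as np


def freeze_sketch(array: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in: return a read-only wrapper over the same memory."""
    wrapper = array.view()  # A new array object; the underlying buffer is not copied.
    wrapper.flags.writeable = False
    return wrapper


data = np.zeros(3)
original = data  # Keep the old reference around for comparison only.

# The safe idiom: rebind the name to whatever the function returns.
data = freeze_sketch(data)

shares_memory = np.shares_memory(data, original)  # The data itself was not copied.
old_reference_writable = original.flags.writeable  # Only the returned object is protected.
```

In this sketch, code that ignored the return value would keep holding an unprotected reference, which is why the result should always be assigned back.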

daf.typing.freezing.unfreeze(data: ProperT) -> ProperT

Ensure that some 1D/2D data is not protected against modification.

This tries to unfreeze the data in place, but because pandas has strange behavior, we are forced to return a new unfrozen object (this is only a wrapper, the data itself is not copied). Hence the safe idiom is data = unfreeze(data). Sigh.

daf.typing.freezing.is_frozen(data: Union[ndarray, _fake_sparse.spmatrix, Series, DataFrame]) -> bool

Test whether some 1D/2D data is (known to be) protected against modification.

Note

This will fail for any scipy.sparse.spmatrix other than scipy.sparse.csr_matrix or scipy.sparse.csc_matrix.

daf.typing.freezing.unfrozen(data: ProperT) -> Generator[ProperT, None, None]

Execute some in-place modification, temporarily unfreezing the 1D/2D data.

Expected usage is:

assert is_frozen(data)  # The ``data`` is immutable here.

with unfrozen(data) as melted:
    # ``melted`` data is writable here.
    # Do **not** leak the reference to the ``melted`` data to outside the block.
    # In particular, do **not** use the anti-pattern ``with unfrozen(data) as data: ...``.
    assert not is_frozen(melted)

assert is_frozen(data)  # The ``data`` is immutable here.
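For readers curious how such a context manager might work, the following is a minimal sketch for plain numpy arrays only. The `unfrozen_sketch` name is hypothetical and this is not daf's actual code; it assumes the array owns its memory and ignores pandas and scipy.sparse data entirely.

```python
from contextlib import contextmanager
from typing import Iterator

import numpy as np


@contextmanager
def unfrozen_sketch(array: np.ndarray) -> Iterator[np.ndarray]:
    """Hypothetical numpy-only sketch: make ``array`` writable inside the block."""
    was_frozen = not array.flags.writeable
    array.flags.writeable = True
    try:
        yield array
    finally:
        # Restore the protection even if the block raised an exception.
        if was_frozen:
            array.flags.writeable = False


counts = np.array([1, 2, 3])
counts.flags.writeable = False  # Start from frozen data.

with unfrozen_sketch(counts) as melted:
    melted[0] = 100  # In-place modification is allowed only inside the block.

frozen_again = not counts.flags.writeable
```

Restoring the flag in a `finally` clause is what keeps the data protected when the block fails part-way through.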