daf.typing.freezing
In general, daf assumes that stored data is not modified in place, as this would break the caching mechanisms that are needed for efficiency. Modifying stored data is a bad idea regardless of caching, as it would cause subtle bugs when analysis code is reproduced.
At the same time, Python doesn’t really have a notion of immutable data when it comes to complex data structures. However, numpy does have a concept of read-only data, so we make use of it here, and extend it to deal with pandas and scipy.sparse data as well (as they use numpy data under the hood).
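As a minimal illustration of the numpy read-only concept mentioned above, the following sketch uses only plain numpy calls (`setflags` and `flags.writeable`). It does not use daf, and it is not a description of how daf implements freezing.

```python
import numpy as np

values = np.zeros((2, 3))

# Mark the array as read-only; this flips a flag and copies nothing.
values.setflags(write=False)
assert not values.flags.writeable

# An in-place write now raises instead of silently changing the data.
try:
    values[0, 0] = 42.0
    write_failed = False
except ValueError:
    write_failed = True

# Reading is unaffected, and the flag can be turned back on when needed.
total = float(values.sum())
values.setflags(write=True)
values[0, 0] = 42.0
```

The flag protects only against writes made through this particular array object, which is why it guards against accidental modification but is not a true immutability guarantee.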
In general, daf always freezes data when it is stored, and fetching data will return frozen data, to protect against accidental in-place modification of the stored data.
The code in this module allows you to manually freeze or unfreeze data, or to test whether data is_frozen, using the numpy capabilities. In addition, for cases where you really know what you are doing, it allows you to temporarily modify unfrozen data.
Data:
ProperT | A TypeVar bound to Proper.
Functions:
freeze(data) | Ensure that some 1D/2D data is protected against modification.
unfreeze(data) | Ensure that some 1D/2D data is not protected against modification.
is_frozen(data) | Test whether some 1D/2D data is (known to be) protected against modification.
unfrozen(data) | Execute some in-place modification, temporarily unfreezing the 1D/2D data.
- daf.typing.freezing.ProperT
A TypeVar bound to Proper.
alias of TypeVar('ProperT', bound=Union[_vectors.Vector, Series, _dense.DenseInRows, SparseInRows, FrameInRows, _dense.DenseInColumns, SparseInColumns, FrameInColumns])
- daf.typing.freezing.freeze(data: ProperT) → ProperT
Ensure that some 1D/2D data is protected against modification.
This tries to freeze the data in place, but because pandas has strange behavior, we are forced to return a new frozen object (this is only a wrapper; the data itself is not copied). Hence the safe idiom is data = freeze(data). Sigh.
- daf.typing.freezing.unfreeze(data: ProperT) → ProperT
Ensure that some 1D/2D data is not protected against modification.
This tries to unfreeze the data in place, but because pandas has strange behavior, we are forced to return a new unfrozen object (this is only a wrapper; the data itself is not copied). Hence the safe idiom is data = unfreeze(data). Sigh.
- daf.typing.freezing.is_frozen(data: Union[ndarray, _fake_sparse.spmatrix, Series, DataFrame]) → bool
Test whether some 1D/2D data is (known to be) protected against modification.
Note
This will fail for any scipy.sparse.spmatrix other than scipy.sparse.csr_matrix or scipy.sparse.csc_matrix.
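For intuition about what a frozen compressed sparse matrix could mean, here is a hedged sketch using only numpy and scipy.sparse. The helper `sparse_is_read_only` is hypothetical and is not part of daf. It assumes that a CSR matrix counts as read-only when all three of its underlying numpy arrays (`data`, `indices`, `indptr`) are read-only; the source does not confirm that this is how is_frozen decides.

```python
import numpy as np
from scipy import sparse

matrix = sparse.csr_matrix(np.array([[0.0, 1.0], [2.0, 0.0]]))


def sparse_is_read_only(compressed):
    # Hypothetical helper (an assumption, not daf's is_frozen): treat the
    # matrix as read-only only if none of its three arrays is writable.
    parts = (compressed.data, compressed.indices, compressed.indptr)
    return not any(part.flags.writeable for part in parts)


read_only_before = sparse_is_read_only(matrix)

# Protect the underlying numpy arrays; nothing is copied.
for part in (matrix.data, matrix.indices, matrix.indptr):
    part.setflags(write=False)

read_only_after = sparse_is_read_only(matrix)

# Writing to the stored values now raises.
try:
    matrix.data[0] = 5.0
    write_failed = False
except ValueError:
    write_failed = True
```

Other sparse formats store their contents in different attributes, so a check written this way would not apply to them directly.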
- daf.typing.freezing.unfrozen(data: ProperT) → Generator[ProperT, None, None]
Execute some in-place modification, temporarily unfreezing the 1D/2D data.
Expected usage is:
    assert is_frozen(data)  # The ``data`` is immutable here.

    with unfrozen(data) as melted:
        # ``melted`` data is writable here.
        # Do **not** leak the reference to the ``melted`` data to outside the block.
        # In particular, do **not** use the anti-pattern ``with unfrozen(data) as data: ...``.
        assert not is_frozen(melted)

    assert is_frozen(data)  # The ``data`` is immutable here.
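The same temporarily-writable pattern can be sketched for a plain numpy array. The context manager `temporarily_writable` below is a hypothetical stand-in written for illustration; it is not daf's unfrozen, and it handles only numpy arrays that own their data.

```python
from contextlib import contextmanager

import numpy as np


@contextmanager
def temporarily_writable(array):
    # Hypothetical stand-in for illustration only (not daf's ``unfrozen``).
    was_writeable = array.flags.writeable
    array.setflags(write=True)
    try:
        yield array
    finally:
        # Restore the original protection even if the block raises.
        array.setflags(write=was_writeable)


counts = np.zeros(3)
counts.setflags(write=False)

with temporarily_writable(counts) as melted:
    melted[1] = 7.0  # Allowed only inside the block.

# Outside the block the protection is back in place.
try:
    counts[1] = 0.0
    write_failed = False
except ValueError:
    write_failed = True
```

Restoring the flag in a `finally` clause ensures the data is protected again even when the modification fails partway through.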