zarr.abc.codec ============== .. py:module:: zarr.abc.codec Attributes ---------- .. autoapisummary:: zarr.abc.codec.CodecInput zarr.abc.codec.CodecOutput Classes ------- .. autoapisummary:: zarr.abc.codec.ArrayArrayCodec zarr.abc.codec.ArrayBytesCodec zarr.abc.codec.ArrayBytesCodecPartialDecodeMixin zarr.abc.codec.ArrayBytesCodecPartialEncodeMixin zarr.abc.codec.BaseCodec zarr.abc.codec.BytesBytesCodec zarr.abc.codec.CodecPipeline Module Contents --------------- .. py:class:: ArrayArrayCodec Bases: :py:obj:`BaseCodec`\ [\ :py:obj:`zarr.core.buffer.NDBuffer`\ , :py:obj:`zarr.core.buffer.NDBuffer`\ ] Base class for array-to-array codecs. .. !! processed by numpydoc !! .. py:method:: compute_encoded_size(input_byte_length: int, chunk_spec: zarr.core.array_spec.ArraySpec) -> int :abstractmethod: Given an input byte length, this method returns the output byte length. Raises a NotImplementedError for codecs with variable-sized outputs (e.g. compressors). :Parameters: **input_byte_length** : int .. **chunk_spec** : ArraySpec .. :Returns: int .. .. !! processed by numpydoc !! .. py:method:: decode(chunks_and_specs: collections.abc.Iterable[tuple[CodecOutput | None, zarr.core.array_spec.ArraySpec]]) -> collections.abc.Iterable[CodecInput | None] :async: Decodes a batch of chunks. Chunks can be None in which case they are ignored by the codec. :Parameters: **chunks_and_specs** : Iterable[tuple[CodecOutput | None, ArraySpec]] Ordered set of encoded chunks with their accompanying chunk spec. :Returns: Iterable[CodecInput | None] .. .. !! processed by numpydoc !! .. py:method:: encode(chunks_and_specs: collections.abc.Iterable[tuple[CodecInput | None, zarr.core.array_spec.ArraySpec]]) -> collections.abc.Iterable[CodecOutput | None] :async: Encodes a batch of chunks. Chunks can be None in which case they are ignored by the codec. 
:Parameters: **chunks_and_specs** : Iterable[tuple[CodecInput | None, ArraySpec]] Ordered set of to-be-encoded chunks with their accompanying chunk spec. :Returns: Iterable[CodecOutput | None] .. .. !! processed by numpydoc !! .. py:method:: evolve_from_array_spec(array_spec: zarr.core.array_spec.ArraySpec) -> Self Fills in codec configuration parameters that can be automatically inferred from the array metadata. :Parameters: **array_spec** : ArraySpec .. :Returns: Self .. .. !! processed by numpydoc !! .. py:method:: from_dict(data: dict[str, zarr.core.common.JSON]) -> Self :classmethod: Create an instance of the model from a dictionary .. !! processed by numpydoc !! .. py:method:: resolve_metadata(chunk_spec: zarr.core.array_spec.ArraySpec) -> zarr.core.array_spec.ArraySpec Computes the spec of the chunk after it has been encoded by the codec. This is important for codecs that change the shape, data type or fill value of a chunk. The spec will then be used for subsequent codecs in the pipeline. :Parameters: **chunk_spec** : ArraySpec .. :Returns: ArraySpec .. .. !! processed by numpydoc !! .. py:method:: to_dict() -> dict[str, zarr.core.common.JSON] Recursively serialize this model to a dictionary. This method inspects the fields of self and calls `x.to_dict()` for any fields that are instances of `Metadata`. Sequences of `Metadata` are similarly recursed into, and the output of that recursion is collected in a list. .. !! processed by numpydoc !! .. py:method:: validate(*, shape: zarr.core.common.ChunkCoords, dtype: zarr.core.dtype.wrapper.ZDType[zarr.core.dtype.wrapper.TBaseDType, zarr.core.dtype.wrapper.TBaseScalar], chunk_grid: zarr.core.chunk_grids.ChunkGrid) -> None Validates that the codec configuration is compatible with the array metadata. Raises errors when the codec configuration is not compatible. :Parameters: **shape** : ChunkCoords The array shape **dtype** : ZDType[TBaseDType, TBaseScalar] The array data type **chunk_grid** : ChunkGrid The array chunk grid .. !!
processed by numpydoc !! .. py:attribute:: is_fixed_size :type: bool .. py:class:: ArrayBytesCodec Bases: :py:obj:`BaseCodec`\ [\ :py:obj:`zarr.core.buffer.NDBuffer`\ , :py:obj:`zarr.core.buffer.Buffer`\ ] Base class for array-to-bytes codecs. .. !! processed by numpydoc !! .. py:method:: compute_encoded_size(input_byte_length: int, chunk_spec: zarr.core.array_spec.ArraySpec) -> int :abstractmethod: Given an input byte length, this method returns the output byte length. Raises a NotImplementedError for codecs with variable-sized outputs (e.g. compressors). :Parameters: **input_byte_length** : int .. **chunk_spec** : ArraySpec .. :Returns: int .. .. !! processed by numpydoc !! .. py:method:: decode(chunks_and_specs: collections.abc.Iterable[tuple[CodecOutput | None, zarr.core.array_spec.ArraySpec]]) -> collections.abc.Iterable[CodecInput | None] :async: Decodes a batch of chunks. Chunks can be None in which case they are ignored by the codec. :Parameters: **chunks_and_specs** : Iterable[tuple[CodecOutput | None, ArraySpec]] Ordered set of encoded chunks with their accompanying chunk spec. :Returns: Iterable[CodecInput | None] .. .. !! processed by numpydoc !! .. py:method:: encode(chunks_and_specs: collections.abc.Iterable[tuple[CodecInput | None, zarr.core.array_spec.ArraySpec]]) -> collections.abc.Iterable[CodecOutput | None] :async: Encodes a batch of chunks. Chunks can be None in which case they are ignored by the codec. :Parameters: **chunks_and_specs** : Iterable[tuple[CodecInput | None, ArraySpec]] Ordered set of to-be-encoded chunks with their accompanying chunk spec. :Returns: Iterable[CodecOutput | None] .. .. !! processed by numpydoc !! .. py:method:: evolve_from_array_spec(array_spec: zarr.core.array_spec.ArraySpec) -> Self Fills in codec configuration parameters that can be automatically inferred from the array metadata. :Parameters: **array_spec** : ArraySpec .. :Returns: Self .. .. !! processed by numpydoc !! .. 
py:method:: from_dict(data: dict[str, zarr.core.common.JSON]) -> Self :classmethod: Create an instance of the model from a dictionary .. !! processed by numpydoc !! .. py:method:: resolve_metadata(chunk_spec: zarr.core.array_spec.ArraySpec) -> zarr.core.array_spec.ArraySpec Computes the spec of the chunk after it has been encoded by the codec. This is important for codecs that change the shape, data type or fill value of a chunk. The spec will then be used for subsequent codecs in the pipeline. :Parameters: **chunk_spec** : ArraySpec .. :Returns: ArraySpec .. .. !! processed by numpydoc !! .. py:method:: to_dict() -> dict[str, zarr.core.common.JSON] Recursively serialize this model to a dictionary. This method inspects the fields of self and calls `x.to_dict()` for any fields that are instances of `Metadata`. Sequences of `Metadata` are similarly recursed into, and the output of that recursion is collected in a list. .. !! processed by numpydoc !! .. py:method:: validate(*, shape: zarr.core.common.ChunkCoords, dtype: zarr.core.dtype.wrapper.ZDType[zarr.core.dtype.wrapper.TBaseDType, zarr.core.dtype.wrapper.TBaseScalar], chunk_grid: zarr.core.chunk_grids.ChunkGrid) -> None Validates that the codec configuration is compatible with the array metadata. Raises errors when the codec configuration is not compatible. :Parameters: **shape** : ChunkCoords The array shape **dtype** : ZDType[TBaseDType, TBaseScalar] The array data type **chunk_grid** : ChunkGrid The array chunk grid .. !! processed by numpydoc !! .. py:attribute:: is_fixed_size :type: bool .. py:class:: ArrayBytesCodecPartialDecodeMixin Mixin for array-to-bytes codecs that implement partial decoding. .. !! processed by numpydoc !! .. py:method:: decode_partial(batch_info: collections.abc.Iterable[tuple[zarr.abc.store.ByteGetter, zarr.core.indexing.SelectorTuple, zarr.core.array_spec.ArraySpec]]) -> collections.abc.Iterable[zarr.core.buffer.NDBuffer | None] :async: Partially decodes a batch of chunks.
This method determines parts of a chunk from the slice selection, fetches these parts from the store (via ByteGetter) and decodes them. :Parameters: **batch_info** : Iterable[tuple[ByteGetter, SelectorTuple, ArraySpec]] Ordered set of information about slices of encoded chunks. The slice selection determines which parts of the chunk will be fetched. The ByteGetter is used to fetch the necessary bytes. The chunk spec contains information about the construction of an array from the bytes. :Returns: Iterable[NDBuffer | None] .. .. !! processed by numpydoc !! .. py:class:: ArrayBytesCodecPartialEncodeMixin Mixin for array-to-bytes codecs that implement partial encoding. .. !! processed by numpydoc !! .. py:method:: encode_partial(batch_info: collections.abc.Iterable[tuple[zarr.abc.store.ByteSetter, zarr.core.buffer.NDBuffer, zarr.core.indexing.SelectorTuple, zarr.core.array_spec.ArraySpec]]) -> None :async: Partially encodes a batch of chunks. This method determines parts of a chunk from the slice selection, encodes them and writes these parts to the store (via ByteSetter). If merging with existing chunk data in the store is necessary, this method will read from the store first and perform the merge. :Parameters: **batch_info** : Iterable[tuple[ByteSetter, NDBuffer, SelectorTuple, ArraySpec]] Ordered set of information about slices of to-be-encoded chunks. The slice selection determines which parts of the chunk will be encoded. The ByteSetter is used to write the necessary bytes and fetch bytes for existing chunk data. The chunk spec contains information about the chunk. .. !! processed by numpydoc !! .. py:class:: BaseCodec Bases: :py:obj:`zarr.abc.metadata.Metadata`, :py:obj:`Generic`\ [\ :py:obj:`CodecInput`\ , :py:obj:`CodecOutput`\ ] Generic base class for codecs. Codecs can be registered via zarr.codecs.registry. .. warning:: This class is not intended to be used directly; subclass ArrayArrayCodec, ArrayBytesCodec or BytesBytesCodec instead. .. !!
processed by numpydoc !! .. py:method:: compute_encoded_size(input_byte_length: int, chunk_spec: zarr.core.array_spec.ArraySpec) -> int :abstractmethod: Given an input byte length, this method returns the output byte length. Raises a NotImplementedError for codecs with variable-sized outputs (e.g. compressors). :Parameters: **input_byte_length** : int .. **chunk_spec** : ArraySpec .. :Returns: int .. .. !! processed by numpydoc !! .. py:method:: decode(chunks_and_specs: collections.abc.Iterable[tuple[CodecOutput | None, zarr.core.array_spec.ArraySpec]]) -> collections.abc.Iterable[CodecInput | None] :async: Decodes a batch of chunks. Chunks can be None in which case they are ignored by the codec. :Parameters: **chunks_and_specs** : Iterable[tuple[CodecOutput | None, ArraySpec]] Ordered set of encoded chunks with their accompanying chunk spec. :Returns: Iterable[CodecInput | None] .. .. !! processed by numpydoc !! .. py:method:: encode(chunks_and_specs: collections.abc.Iterable[tuple[CodecInput | None, zarr.core.array_spec.ArraySpec]]) -> collections.abc.Iterable[CodecOutput | None] :async: Encodes a batch of chunks. Chunks can be None in which case they are ignored by the codec. :Parameters: **chunks_and_specs** : Iterable[tuple[CodecInput | None, ArraySpec]] Ordered set of to-be-encoded chunks with their accompanying chunk spec. :Returns: Iterable[CodecOutput | None] .. .. !! processed by numpydoc !! .. py:method:: evolve_from_array_spec(array_spec: zarr.core.array_spec.ArraySpec) -> Self Fills in codec configuration parameters that can be automatically inferred from the array metadata. :Parameters: **array_spec** : ArraySpec .. :Returns: Self .. .. !! processed by numpydoc !! .. py:method:: from_dict(data: dict[str, zarr.core.common.JSON]) -> Self :classmethod: Create an instance of the model from a dictionary .. !! processed by numpydoc !! .. 
py:method:: resolve_metadata(chunk_spec: zarr.core.array_spec.ArraySpec) -> zarr.core.array_spec.ArraySpec Computes the spec of the chunk after it has been encoded by the codec. This is important for codecs that change the shape, data type or fill value of a chunk. The spec will then be used for subsequent codecs in the pipeline. :Parameters: **chunk_spec** : ArraySpec .. :Returns: ArraySpec .. .. !! processed by numpydoc !! .. py:method:: to_dict() -> dict[str, zarr.core.common.JSON] Recursively serialize this model to a dictionary. This method inspects the fields of self and calls `x.to_dict()` for any fields that are instances of `Metadata`. Sequences of `Metadata` are similarly recursed into, and the output of that recursion is collected in a list. .. !! processed by numpydoc !! .. py:method:: validate(*, shape: zarr.core.common.ChunkCoords, dtype: zarr.core.dtype.wrapper.ZDType[zarr.core.dtype.wrapper.TBaseDType, zarr.core.dtype.wrapper.TBaseScalar], chunk_grid: zarr.core.chunk_grids.ChunkGrid) -> None Validates that the codec configuration is compatible with the array metadata. Raises errors when the codec configuration is not compatible. :Parameters: **shape** : ChunkCoords The array shape **dtype** : ZDType[TBaseDType, TBaseScalar] The array data type **chunk_grid** : ChunkGrid The array chunk grid .. !! processed by numpydoc !! .. py:attribute:: is_fixed_size :type: bool .. py:class:: BytesBytesCodec Bases: :py:obj:`BaseCodec`\ [\ :py:obj:`zarr.core.buffer.Buffer`\ , :py:obj:`zarr.core.buffer.Buffer`\ ] Base class for bytes-to-bytes codecs. .. !! processed by numpydoc !! .. py:method:: compute_encoded_size(input_byte_length: int, chunk_spec: zarr.core.array_spec.ArraySpec) -> int :abstractmethod: Given an input byte length, this method returns the output byte length. Raises a NotImplementedError for codecs with variable-sized outputs (e.g. compressors). :Parameters: **input_byte_length** : int .. **chunk_spec** : ArraySpec .. :Returns: int .. .. !! processed by numpydoc !!
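The ``compute_encoded_size`` contract described above can be illustrated with a minimal, self-contained sketch. The two classes below are hypothetical stand-ins: they do not import zarr or subclass ``BytesBytesCodec``, and they accept ``chunk_spec`` only as an unused placeholder. They mirror the documented behaviour: a codec whose output length depends only on the input length returns that length, while a compressor raises ``NotImplementedError``.

```python
import zlib


class ToyChecksumCodec:
    """Hypothetical fixed-size codec: appends a 4-byte CRC32 to the payload."""

    is_fixed_size = True

    def encode(self, data: bytes) -> bytes:
        checksum = zlib.crc32(data).to_bytes(4, "little")
        return data + checksum

    def compute_encoded_size(self, input_byte_length: int, chunk_spec=None) -> int:
        # The output length depends only on the input length,
        # so it can be computed without looking at the data.
        return input_byte_length + 4


class ToyCompressorCodec:
    """Hypothetical variable-size codec: zlib output length depends on content."""

    is_fixed_size = False

    def encode(self, data: bytes) -> bytes:
        return zlib.compress(data)

    def compute_encoded_size(self, input_byte_length: int, chunk_spec=None) -> int:
        # The compressed size is not known until the data has been compressed.
        raise NotImplementedError("compressed size is not known in advance")


payload = bytes(range(16))
fixed = ToyChecksumCodec()
encoded_len = len(fixed.encode(payload))
predicted_len = fixed.compute_encoded_size(len(payload))
```

In this sketch ``predicted_len`` matches ``encoded_len`` for the checksum codec, which is the property a caller would rely on when a codec reports a fixed size.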
.. py:method:: decode(chunks_and_specs: collections.abc.Iterable[tuple[CodecOutput | None, zarr.core.array_spec.ArraySpec]]) -> collections.abc.Iterable[CodecInput | None] :async: Decodes a batch of chunks. Chunks can be None in which case they are ignored by the codec. :Parameters: **chunks_and_specs** : Iterable[tuple[CodecOutput | None, ArraySpec]] Ordered set of encoded chunks with their accompanying chunk spec. :Returns: Iterable[CodecInput | None] .. .. !! processed by numpydoc !! .. py:method:: encode(chunks_and_specs: collections.abc.Iterable[tuple[CodecInput | None, zarr.core.array_spec.ArraySpec]]) -> collections.abc.Iterable[CodecOutput | None] :async: Encodes a batch of chunks. Chunks can be None in which case they are ignored by the codec. :Parameters: **chunks_and_specs** : Iterable[tuple[CodecInput | None, ArraySpec]] Ordered set of to-be-encoded chunks with their accompanying chunk spec. :Returns: Iterable[CodecOutput | None] .. .. !! processed by numpydoc !! .. py:method:: evolve_from_array_spec(array_spec: zarr.core.array_spec.ArraySpec) -> Self Fills in codec configuration parameters that can be automatically inferred from the array metadata. :Parameters: **array_spec** : ArraySpec .. :Returns: Self .. .. !! processed by numpydoc !! .. py:method:: from_dict(data: dict[str, zarr.core.common.JSON]) -> Self :classmethod: Create an instance of the model from a dictionary .. !! processed by numpydoc !! .. py:method:: resolve_metadata(chunk_spec: zarr.core.array_spec.ArraySpec) -> zarr.core.array_spec.ArraySpec Computes the spec of the chunk after it has been encoded by the codec. This is important for codecs that change the shape, data type or fill value of a chunk. The spec will then be used for subsequent codecs in the pipeline. :Parameters: **chunk_spec** : ArraySpec .. :Returns: ArraySpec .. .. !! processed by numpydoc !! .. py:method:: to_dict() -> dict[str, zarr.core.common.JSON] Recursively serialize this model to a dictionary.
This method inspects the fields of self and calls `x.to_dict()` for any fields that are instances of `Metadata`. Sequences of `Metadata` are similarly recursed into, and the output of that recursion is collected in a list. .. !! processed by numpydoc !! .. py:method:: validate(*, shape: zarr.core.common.ChunkCoords, dtype: zarr.core.dtype.wrapper.ZDType[zarr.core.dtype.wrapper.TBaseDType, zarr.core.dtype.wrapper.TBaseScalar], chunk_grid: zarr.core.chunk_grids.ChunkGrid) -> None Validates that the codec configuration is compatible with the array metadata. Raises errors when the codec configuration is not compatible. :Parameters: **shape** : ChunkCoords The array shape **dtype** : ZDType[TBaseDType, TBaseScalar] The array data type **chunk_grid** : ChunkGrid The array chunk grid .. !! processed by numpydoc !! .. py:attribute:: is_fixed_size :type: bool .. py:class:: CodecPipeline Base class for implementing CodecPipeline. A CodecPipeline implements the read and write paths for chunk data. On the read path, it is responsible for fetching chunks from a store (via ByteGetter), decoding them and assembling an output array. On the write path, it encodes the chunks and writes them to a store (via ByteSetter). .. !! processed by numpydoc !! .. py:method:: compute_encoded_size(byte_length: int, array_spec: zarr.core.array_spec.ArraySpec) -> int :abstractmethod: Given an input byte length, this method returns the output byte length. Raises a NotImplementedError for codecs with variable-sized outputs (e.g. compressors). :Parameters: **byte_length** : int .. **array_spec** : ArraySpec .. :Returns: int .. .. !! processed by numpydoc !! .. py:method:: decode(chunk_bytes_and_specs: collections.abc.Iterable[tuple[zarr.core.buffer.Buffer | None, zarr.core.array_spec.ArraySpec]]) -> collections.abc.Iterable[zarr.core.buffer.NDBuffer | None] :abstractmethod: :async: Decodes a batch of chunks. Chunks can be None in which case they are ignored by the codec.
:Parameters: **chunk_bytes_and_specs** : Iterable[tuple[Buffer | None, ArraySpec]] Ordered set of encoded chunks with their accompanying chunk spec. :Returns: Iterable[NDBuffer | None] .. .. !! processed by numpydoc !! .. py:method:: encode(chunk_arrays_and_specs: collections.abc.Iterable[tuple[zarr.core.buffer.NDBuffer | None, zarr.core.array_spec.ArraySpec]]) -> collections.abc.Iterable[zarr.core.buffer.Buffer | None] :abstractmethod: :async: Encodes a batch of chunks. Chunks can be None in which case they are ignored by the codec. :Parameters: **chunk_arrays_and_specs** : Iterable[tuple[NDBuffer | None, ArraySpec]] Ordered set of to-be-encoded chunks with their accompanying chunk spec. :Returns: Iterable[Buffer | None] .. .. !! processed by numpydoc !! .. py:method:: evolve_from_array_spec(array_spec: zarr.core.array_spec.ArraySpec) -> Self :abstractmethod: Fills in codec configuration parameters that can be automatically inferred from the array metadata. :Parameters: **array_spec** : ArraySpec .. :Returns: Self .. .. !! processed by numpydoc !! .. py:method:: from_array_metadata_and_store(array_metadata: zarr.core.metadata.ArrayMetadata, store: zarr.abc.store.Store) -> Self :classmethod: :abstractmethod: Creates a codec pipeline from array metadata and a store path. Raises NotImplementedError by default, indicating the CodecPipeline must be created with from_codecs instead. :Parameters: **array_metadata** : ArrayMetadata .. **store** : Store .. :Returns: Self .. .. !! processed by numpydoc !! .. py:method:: from_codecs(codecs: collections.abc.Iterable[Codec]) -> Self :classmethod: :abstractmethod: Creates a codec pipeline from an iterable of codecs. :Parameters: **codecs** : Iterable[Codec] .. :Returns: Self .. .. !! processed by numpydoc !! .. 
py:method:: read(batch_info: collections.abc.Iterable[tuple[zarr.abc.store.ByteGetter, zarr.core.array_spec.ArraySpec, zarr.core.indexing.SelectorTuple, zarr.core.indexing.SelectorTuple, bool]], out: zarr.core.buffer.NDBuffer, drop_axes: tuple[int, Ellipsis] = ()) -> None :abstractmethod: :async: Reads chunk data from the store, decodes it and writes it into an output array. Partial decoding may be used if the codecs and stores support it. :Parameters: **batch_info** : Iterable[tuple[ByteGetter, ArraySpec, SelectorTuple, SelectorTuple, bool]] Ordered set of information about the chunks. The first slice selection determines which parts of the chunk will be fetched. The second slice selection determines where in the output array the chunk data will be written. The ByteGetter is used to fetch the necessary bytes. The chunk spec contains information about the construction of an array from the bytes. **out** : NDBuffer .. .. !! processed by numpydoc !! .. py:method:: validate(*, shape: zarr.core.common.ChunkCoords, dtype: zarr.core.dtype.wrapper.ZDType[zarr.core.dtype.wrapper.TBaseDType, zarr.core.dtype.wrapper.TBaseScalar], chunk_grid: zarr.core.chunk_grids.ChunkGrid) -> None :abstractmethod: Validates that all codec configurations are compatible with the array metadata. Raises errors when a codec configuration is not compatible. :Parameters: **shape** : ChunkCoords The array shape **dtype** : ZDType[TBaseDType, TBaseScalar] The array data type **chunk_grid** : ChunkGrid The array chunk grid .. !! processed by numpydoc !! .. py:method:: write(batch_info: collections.abc.Iterable[tuple[zarr.abc.store.ByteSetter, zarr.core.array_spec.ArraySpec, zarr.core.indexing.SelectorTuple, zarr.core.indexing.SelectorTuple, bool]], value: zarr.core.buffer.NDBuffer, drop_axes: tuple[int, Ellipsis] = ()) -> None :abstractmethod: :async: Encodes chunk data and writes it to the store. Merges with existing chunk data by reading first, if necessary.
Partial encoding may be used if the codecs and stores support it. :Parameters: **batch_info** : Iterable[tuple[ByteSetter, ArraySpec, SelectorTuple, SelectorTuple, bool]] Ordered set of information about the chunks. The first slice selection determines which parts of the chunk will be encoded. The second slice selection determines where in the value array the chunk data is located. The ByteSetter is used to fetch and write the necessary bytes. The chunk spec contains information about the chunk. **value** : NDBuffer .. .. !! processed by numpydoc !! .. py:property:: supports_partial_decode :type: bool :abstractmethod: .. py:property:: supports_partial_encode :type: bool :abstractmethod: .. py:data:: CodecInput .. py:data:: CodecOutput
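The batch ``encode`` and ``decode`` behaviour documented for ``CodecPipeline`` can be sketched with a toy model. This is an illustration only, under stated assumptions: it does not import zarr, the helper functions ``transpose``, ``to_bytes`` and ``from_bytes`` are hypothetical, chunk specs are omitted, and the chaining order (array-to-array, then array-to-bytes, then bytes-to-bytes) is inferred from the input and output types of the three codec base classes rather than stated in this reference. It shows batches being processed in order, ``None`` chunks passing through untouched, and decoding undoing the encoding steps in reverse.

```python
import asyncio
import struct
import zlib


def transpose(rows):
    """Array-to-array step: swap the axes of a 2-D list of ints."""
    return [list(col) for col in zip(*rows)]


def to_bytes(rows):
    """Array-to-bytes step: shape header followed by little-endian int32 values."""
    n_rows, n_cols = len(rows), len(rows[0])
    flat = [value for row in rows for value in row]
    return struct.pack(f"<2I{len(flat)}i", n_rows, n_cols, *flat)


def from_bytes(raw):
    """Inverse of to_bytes: read the shape header, then rebuild the rows."""
    n_rows, n_cols = struct.unpack_from("<2I", raw)
    flat = struct.unpack_from(f"<{n_rows * n_cols}i", raw, 8)
    return [list(flat[r * n_cols:(r + 1) * n_cols]) for r in range(n_rows)]


async def encode(chunks):
    """Encode a batch; None entries (missing chunks) pass through untouched."""
    return [
        None if chunk is None else zlib.compress(to_bytes(transpose(chunk)))
        for chunk in chunks
    ]


async def decode(blobs):
    """Decode a batch by undoing the encode steps in reverse order."""
    return [
        None if blob is None else transpose(from_bytes(zlib.decompress(blob)))
        for blob in blobs
    ]


batch = [[[1, 2, 3], [4, 5, 6]], None]
encoded = asyncio.run(encode(batch))
decoded = asyncio.run(decode(encoded))
```

Here ``decoded`` equals ``batch``, and the ``None`` entry stays ``None`` in both directions. A real pipeline would additionally carry an ``ArraySpec`` alongside each chunk and operate on ``NDBuffer`` and ``Buffer`` objects instead of lists and ``bytes``.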