shnitsel.data.dataset_containers.shared
=======================================

.. py:module:: shnitsel.data.dataset_containers.shared


Classes
-------

.. autoapisummary::

   shnitsel.data.dataset_containers.shared.ShnitselDataset
   shnitsel.data.dataset_containers.shared.ShnitselDerivedDataset


Module Contents
---------------

.. py:class:: ShnitselDataset(ds)

   Bases: :py:obj:`shnitsel.data.xr_io_compatibility.SupportsFromXrConversion`, :py:obj:`shnitsel.data.xr_io_compatibility.SupportsToXrConversion`


   Definition of the protocol to support instantiation from
   xarray dataset structs.


   .. py:attribute:: _raw_dataset
      :type: xarray.Dataset


   .. py:property:: dataset
      :type: xarray.Dataset


   .. py:property:: leading_dimension
      :type: str


   .. py:property:: state_ids


   .. py:property:: state_names


   .. py:property:: state_types


   .. py:property:: state_magnetic_number


   .. py:property:: state_degeneracy_group


   .. py:property:: state_charges


   .. py:property:: active_state


   .. py:property:: state_diagonal


   .. py:property:: atom_names


   .. py:property:: atom_numbers


   .. py:property:: charge
      :type: float


      The charge of the molecule if set on the trajectory data.
      Loaded from `charge` attribute (or variable) or `state_charges` coordinate
      if provided.

      If no information is found, 0 is returned.


   .. py:method:: set_charge(value)

      Set the charge on the dataset, clear any conflicting charge information
      stored elsewhere on the dataset, and return a new instance of the wrapped dataset.


      :param value: Either a single value (optionally already wrapped in a DataArray) giving
                    the charge of the full molecule in all states (set as the coordinate `charge`), or a DataArray
                    of state-dependent charges (set as `state_charges`).
      :type value: float | xr.DataArray

      :returns: The updated object as a copy.
      :rtype: Self

      :raises ValueError: If an unsupported `value` was provided.
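
      The dispatch described above can be sketched roughly as follows. This is
      a minimal illustration on a plain ``xarray.Dataset``, not the actual
      implementation: the helper name ``set_charge_sketch``, the dimension name
      ``state`` and the exact clean-up rules are assumptions made for the example.

      .. code-block:: python

         import numbers

         import xarray as xr


         def set_charge_sketch(ds: xr.Dataset, value) -> xr.Dataset:
             """Hypothetical stand-in for the behaviour described for ``set_charge``."""
             is_array = isinstance(value, xr.DataArray)
             is_scalar = isinstance(value, numbers.Real) or (is_array and value.ndim == 0)
             is_per_state = is_array and value.ndim == 1
             if not (is_scalar or is_per_state):
                 raise ValueError(f"Unsupported charge value: {value!r}")

             # drop_vars returns a new Dataset, so the input stays untouched.
             cleaned = ds.drop_vars(["charge", "state_charges"], errors="ignore")
             # Assumption: a stale ``charge`` attribute counts as conflicting info.
             cleaned.attrs = {k: v for k, v in ds.attrs.items() if k != "charge"}

             if is_scalar:
                 # One charge for the whole molecule in all states.
                 return cleaned.assign_coords(charge=float(value))
             # State-dependent charges along the (assumed) ``state`` dimension.
             return cleaned.assign_coords(state_charges=("state", value.values))


         ds = xr.Dataset(
             {"energy": ("state", [-1.0, -0.5])},
             coords={"state": [1, 2]},
             attrs={"charge": 0},
         )
         cation = set_charge_sketch(ds, 1)
         per_state = set_charge_sketch(ds, xr.DataArray([0.0, 1.0], dims="state"))

      In this sketch the input dataset is never modified, which mirrors the
      documented "return the updated object as a copy" behaviour.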


   .. py:property:: dims


   .. py:property:: coords


   .. py:property:: sizes


   .. py:property:: data_vars


   .. py:method:: has_variable(name)


   .. py:method:: has_dimension(name)


   .. py:method:: has_coordinate(name)


   .. py:method:: has_data(name)


   .. py:method:: has(name)


   .. py:property:: mol
      :type: rdkit.Chem.Mol


      Helper method to get a representative molecule object for the geometry within this dataset.

      :returns: Either a copy of a cached mol object (for partial substructures) or a newly constructed default object
      :rtype: rdkit.Chem.Mol


   .. py:method:: sel(indexers = None, method = None, tolerance = None, drop = False, **indexers_kwargs)

      Returns a new dataset with each data array indexed by tick labels
      along the specified dimension(s).

      In contrast to `.isel`, indexers for this method should use
      labels (i.e. explicit values in that dimension) instead of integers.

      Under the hood, this method is powered by using pandas's powerful Index
      objects. This makes label based indexing essentially just as fast as
      using integer indexing.

      It also means this method uses pandas's (well documented) logic for
      indexing. This means you can use string shortcuts for datetime indexes
      (e.g., '2000-01' to select all values in January 2000). It also means
      that slices are treated as inclusive of both the start and stop values,
      unlike normal Python indexing.

      :param indexers: A dict with keys matching dimensions and values given
                       by scalars, slices or arrays of tick labels. For dimensions with
                       multi-index, the indexer may also be a dict-like object with keys
                       matching index level names.
                       If DataArrays are passed as indexers, xarray-style indexing will be
                       carried out. See :ref:`indexing` for the details.
                       One of indexers or indexers_kwargs must be provided.
      :type indexers: dict, optional
      :param method: Method to use for inexact matches:

                     * None (default): only exact matches
                     * pad / ffill: propagate last valid index value forward
                     * backfill / bfill: propagate next valid index value backward
                     * nearest: use nearest valid index value
      :type method: {None, "nearest", "pad", "ffill", "backfill", "bfill"}, optional
      :param tolerance: Maximum distance between original and new labels for inexact
                        matches. The values of the index at the matching locations must
                        satisfy the equation ``abs(index[indexer] - target) <= tolerance``.
      :type tolerance: optional
      :param drop: If ``drop=True``, drop coordinates variables in `indexers` instead
                   of making them scalar.
      :type drop: bool, optional
      :param \*\*indexers_kwargs: The keyword arguments form of ``indexers``.
                                  One of indexers or indexers_kwargs must be provided.
      :type \*\*indexers_kwargs: {dim: indexer, ...}, optional

      :returns: **dataset** -- A new Dataset with the same contents as this dataset, except each
                variable and dimension is indexed by the appropriate indexers.
                If indexer DataArrays have coordinates that do not conflict with
                this object, then these coordinates will be attached.
                In general, each array's data will be a view of the array's data
                in this dataset, unless vectorized indexing was triggered by using
                an array indexer, in which case the data will be a copy.
      :rtype: Self

      .. seealso::

         :func:`ShnitselDataset.isel <ShnitselDataset.isel>`
         :func:`Dataset.sel <Dataset.sel>`
         :func:`Dataset.isel <Dataset.isel>`
         :func:`DataArray.sel <DataArray.sel>`

         :doc:`xarray-tutorial:intermediate/indexing/indexing`
             Tutorial material on indexing with Xarray objects

         :doc:`xarray-tutorial:fundamentals/02.1_indexing_Basic`
             Tutorial material on basics of indexing
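
      The difference between label-based and position-based selection can be
      illustrated on a plain ``xarray.Dataset``. The sketch below assumes that
      ``ShnitselDataset.sel`` follows the xarray semantics described in this
      docstring; the variable and coordinate names are made up for the example.

      .. code-block:: python

         import xarray as xr

         ds = xr.Dataset(
             {"energy": ("time", [-238.2, -238.4, -237.9])},
             coords={"time": [0.0, 0.5, 1.0]},
         )

         # Label-based slice: both end points (0.0 and 0.5) are included.
         by_label = ds.sel(time=slice(0.0, 0.5))

         # Position-based slice: the stop index is excluded, as in plain Python.
         by_position = ds.isel(time=slice(0, 1))

         # Inexact match: 0.4 is not a label, so the nearest one (0.5) is used.
         nearest = ds.sel(time=0.4, method="nearest")

         # With a tolerance that is too small, the inexact lookup is rejected.
         try:
             ds.sel(time=0.4, method="nearest", tolerance=0.05)
             rejected = False
         except KeyError:
             rejected = True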


   .. py:method:: isel(indexers = None, drop = False, missing_dims = 'raise', **indexers_kwargs)

      Returns a new dataset with each array indexed along the specified
      dimension(s).

      This method selects values from each array using its `__getitem__`
      method, except this method does not require knowing the order of
      each array's dimensions.

      :param indexers: A dict with keys matching dimensions and values given
                       by integers, slice objects or arrays.
                       An indexer can be an integer, slice, array-like or DataArray.
                       If DataArrays are passed as indexers, xarray-style indexing will be
                       carried out. See :ref:`indexing` for the details.
                       One of indexers or indexers_kwargs must be provided.
      :type indexers: dict, optional
      :param drop: If ``drop=True``, drop coordinates variables indexed by integers
                   instead of making them scalar.
      :type drop: bool, default: False
      :param missing_dims: What to do if dimensions that should be selected from are not present in the
                           Dataset:

                           * "raise": raise an exception
                           * "warn": raise a warning, and ignore the missing dimensions
                           * "ignore": ignore the missing dimensions
      :type missing_dims: {"raise", "warn", "ignore"}, default: "raise"
      :param \*\*indexers_kwargs: The keyword arguments form of ``indexers``.
                                  One of indexers or indexers_kwargs must be provided.
      :type \*\*indexers_kwargs: {dim: indexer, ...}, optional

      :returns: **obj** -- A new Dataset with the same contents as this dataset, except each
                array and dimension is indexed by the appropriate indexers.
                If indexer DataArrays have coordinates that do not conflict with
                this object, then these coordinates will be attached.
                In general, each array's data will be a view of the array's data
                in this dataset, unless vectorized indexing was triggered by using
                an array indexer, in which case the data will be a copy.
      :rtype: Dataset

      .. rubric:: Examples

      # A specific element from the dataset is selected

      >>> dataset.isel(atom=1, time=0)
      <xarray.Dataset> Size:
      Dimensions:         (direction: 3)
      Coordinates:
          atom        int16 2B 1
          time        float64 8B 0.0
          direction   (direction) <U1 3B 'x' 'y' 'z'
      Data variables:
          energy  float64 8B -238.2
          forces  (direction) float64 24B 1.2 -0.2 0.1

      # Indexing with a slice using isel

      >>> slice_of_data = dataset.isel(atom=slice(0, 2), time=slice(0, 2))
      >>> slice_of_data
      <xarray.Dataset> Size:
      Dimensions:         (atom: 2, time: 2, direction: 3)
      Coordinates:
          * atom         (atom) int16 2B 1
          * time         (time) float64 16B 0.0 0.5
          * direction    <U1 3B 'x' 'y' 'z'
      Data variables:
          energy      (time) float64 24B -238.2
          forces      (time, atom, direction) float64 96B -0.5 -0.4 0.4 ...

      >>> index_array = xr.DataArray([0, 2], dims="atom")
      >>> indexed_data = dataset.isel(atom=index_array)
      >>> indexed_data
      <xarray.Dataset> Size:
      Dimensions:         (atom: 2, time: 3, direction: 3)
      Coordinates:
        * atom            (atom) int16 4B 1 3
        * time            (time) float64 16B 0.0 0.5 1.0
        * direction       <U1 3B 'x' 'y' 'z'
      Data variables:
          energy      (time) float64 24B -238.2 -238.4 -237.9
          forces      (time, atom, direction) float64 96B -0.5 -0.4 0.4 ...

      .. seealso::

         :func:`ShnitselDataset.sel <ShnitselDataset.sel>`
         :func:`Dataset.sel <Dataset.sel>`
         :func:`Dataset.isel <Dataset.isel>`
         :func:`DataArray.isel <DataArray.isel>`

         :doc:`xarray-tutorial:intermediate/indexing/indexing`
             Tutorial material on indexing with Xarray objects

         :doc:`xarray-tutorial:fundamentals/02.1_indexing_Basic`
             Tutorial material on basics of indexing


   .. py:property:: _attr_sources
      :type: Iterable[Mapping[Hashable, Any]]


      Places to look up items for attribute-style access.


   .. py:property:: _item_sources
      :type: Iterable[Mapping[Hashable, Any]]


      Places to look up items for key completion.


   .. py:method:: __getattr__(name)


   .. py:method:: __contains__(a)


   .. py:method:: _repr_html_()


   .. py:method:: __getitem__(key)


   .. py:method:: __dir__()

      Provide method name lookup and completion. Only provide 'public'
      methods.


   .. py:method:: _ipython_key_completions_()

      Provide key autocompletion in IPython.
      See https://ipython.readthedocs.io/en/stable/config/integrating.html#tab-completion
      for the details.
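
      How ``_attr_sources``, ``_item_sources``, ``__getattr__``, ``__dir__`` and
      this method could fit together is sketched below. This is a standalone
      illustration using plain dictionaries, not the actual implementation; the
      class ``WrapperSketch`` and its constructor arguments are hypothetical.

      .. code-block:: python

         from collections.abc import Hashable, Iterator, Mapping
         from typing import Any


         class WrapperSketch:
             """Hypothetical wrapper exposing mapping entries as attributes."""

             def __init__(self, variables: Mapping, attrs: Mapping) -> None:
                 self._variables = dict(variables)
                 self._attrs = dict(attrs)

             @property
             def _attr_sources(self) -> Iterator[Mapping[Hashable, Any]]:
                 # Places to look up items for attribute-style access.
                 yield self._variables
                 yield self._attrs

             @property
             def _item_sources(self) -> Iterator[Mapping[Hashable, Any]]:
                 # Places to look up items for key completion.
                 yield self._variables

             def __getattr__(self, name: str) -> Any:
                 # Only called after normal attribute lookup has failed.
                 # Private names are never forwarded, which avoids recursion.
                 if not name.startswith("_"):
                     for source in self._attr_sources:
                         if name in source:
                             return source[name]
                 raise AttributeError(name)

             def __getitem__(self, key: Hashable) -> Any:
                 return self._variables[key]

             def __dir__(self) -> list[str]:
                 # Public regular attributes plus public keys from the sources.
                 names = {n for n in super().__dir__() if not n.startswith("_")}
                 for source in self._attr_sources:
                     names.update(
                         k for k in source
                         if isinstance(k, str) and not k.startswith("_")
                     )
                 return sorted(names)

             def _ipython_key_completions_(self) -> list[str]:
                 # IPython calls this hook to complete ``obj["<TAB>``.
                 keys = {k for src in self._item_sources for k in src if isinstance(k, str)}
                 return sorted(keys)


         wrapper = WrapperSketch({"energy": [-238.2]}, {"charge": 0})
         energy = wrapper.energy          # attribute-style access to a variable
         completions = wrapper._ipython_key_completions_()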


   .. py:method:: convert(varname = None, unit = None)

      Convert an entry in this dataset to a specific unit.

      Returns a copy of the dataset with the entry updated.

      :param varname: Optionally the name of a single variable. If not provided, will apply to all variables.
      :type varname: str, optional
      :param unit: The target unit to convert to.
                   If not set, will convert to default shnitsel units.
      :type unit: str | None

      :returns: The updated dataset with converted units.
      :rtype: Self


   .. py:method:: as_xr_dataset()

      Base function to be implemented by classes supporting this protocol
      to allow for standardized conversion to a dataset.

      :returns: A tuple of the `io_type_tag` under which the deserializer is registered
                with the Shnitsel Tools framework (or `None` if no
                deserialization is desired/supported),
                then the `xr.Dataset` that is the result of the conversion,
                and lastly a dict of metadata that might help with deserialization later on.
      :rtype: tuple[str, xr.Dataset, MetaData]

      :raises ValueError: If the conversion failed for some reason.


   .. py:method:: get_type_marker()
      :classmethod:


   .. py:method:: from_xr_dataset(dataset, metadata)
      :classmethod:


      Class method to support standardized deserialization of arbitrary classes.
      Implemented as a class method to avoid the need to construct an instance for
      deserialization.

      :param cls: The class executing the deserialization.
      :type cls: type[ResType]
      :param dataset: The dataset to be deserialized into the output type.
      :type dataset: xr.Dataset
      :param metadata: Metadata from the serialization process.
      :type metadata: MetaData

      :returns: The deserialized instance of the target class.
      :rtype: instance of cls

      :raises TypeError: If deserialization of the object was not possible
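
      A round trip through ``as_xr_dataset`` and ``from_xr_dataset`` might look
      like the sketch below. It is a standalone illustration under stated
      assumptions: the ``REGISTRY`` dictionary, the ``EnergyContainer`` class,
      the tag string and the ``MetaData`` alias are hypothetical and do not
      reflect how the Shnitsel Tools framework actually registers deserializers.

      .. code-block:: python

         from typing import Any

         import xarray as xr

         MetaData = dict[str, Any]        # assumed shape of the metadata dict
         REGISTRY: dict[str, type] = {}   # hypothetical tag -> class lookup


         class EnergyContainer:
             """Hypothetical container implementing both conversion directions."""

             def __init__(self, ds: xr.Dataset) -> None:
                 self._raw_dataset = ds

             @classmethod
             def get_type_marker(cls) -> str:
                 return "sketch::EnergyContainer"

             def as_xr_dataset(self) -> tuple[str, xr.Dataset, MetaData]:
                 # Tag first, then the dataset, then metadata for later use.
                 return self.get_type_marker(), self._raw_dataset, {"version": 1}

             @classmethod
             def from_xr_dataset(cls, dataset: xr.Dataset, metadata: MetaData):
                 # No instance is needed to deserialize, hence a class method.
                 if not isinstance(dataset, xr.Dataset):
                     raise TypeError("Expected an xarray.Dataset")
                 return cls(dataset)


         REGISTRY[EnergyContainer.get_type_marker()] = EnergyContainer

         original = EnergyContainer(xr.Dataset({"energy": ("time", [-238.2, -238.4])}))
         tag, dataset, metadata = original.as_xr_dataset()
         restored = REGISTRY[tag].from_xr_dataset(dataset, metadata)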


.. py:class:: ShnitselDerivedDataset(base_ds, derived_ds)

   Bases: :py:obj:`ShnitselDataset`, :py:obj:`shnitsel.data.xr_io_compatibility.SupportsFromXrConversion`, :py:obj:`shnitsel.data.xr_io_compatibility.SupportsToXrConversion`


   Definition of the protocol to support instantiation from
   xarray dataset structs.


   .. py:attribute:: _base_dataset
      :type: xarray.Dataset | None


   .. py:property:: base
      :type: xarray.Dataset | None


   .. py:property:: _item_sources
      :type: Iterable[Mapping[Hashable, Any]]


      Places to look up items for key completion.


   .. py:method:: as_xr_dataset()
      :abstractmethod:


      Base function to be implemented by classes supporting this protocol
      to allow for standardized conversion to a dataset.

      :returns: A tuple of the `io_type_tag` under which the deserializer is registered
                with the Shnitsel Tools framework (or `None` if no
                deserialization is desired/supported),
                then the `xr.Dataset` that is the result of the conversion,
                and lastly a dict of metadata that might help with deserialization later on.
      :rtype: tuple[str, xr.Dataset, MetaData]

      :raises ValueError: If the conversion failed for some reason.


   .. py:method:: get_type_marker()
      :classmethod:
      :abstractmethod:


   .. py:method:: from_xr_dataset(dataset, metadata)
      :classmethod:
      :abstractmethod:


      Class method to support standardized deserialization of arbitrary classes.
      Implemented as a class method to avoid the need to construct an instance for
      deserialization.

      :param cls: The class executing the deserialization.
      :type cls: type[ResType]
      :param dataset: The dataset to be deserialized into the output type.
      :type dataset: xr.Dataset
      :param metadata: Metadata from the serialization process.
      :type metadata: MetaData

      :returns: The deserialized instance of the target class.
      :rtype: instance of cls

      :raises TypeError: If deserialization of the object was not possible