shnitsel.data.dataset_containers.shared
=======================================

.. py:module:: shnitsel.data.dataset_containers.shared


Classes
-------

.. autoapisummary::

   shnitsel.data.dataset_containers.shared.ShnitselDataset
   shnitsel.data.dataset_containers.shared.ShnitselDerivedDataset


Module Contents
---------------

.. py:class:: ShnitselDataset(ds)

   Bases: :py:obj:`shnitsel.data.xr_io_compatibility.SupportsFromXrConversion`, :py:obj:`shnitsel.data.xr_io_compatibility.SupportsToXrConversion`


   Definition of the protocol to support instantiation from xarray dataset structs.


   .. py:attribute:: _raw_dataset
      :type: xarray.Dataset


   .. py:property:: dataset
      :type: xarray.Dataset


   .. py:property:: leading_dimension
      :type: str


   .. py:property:: state_ids


   .. py:property:: state_names


   .. py:property:: state_types


   .. py:property:: state_magnetic_number


   .. py:property:: state_degeneracy_group


   .. py:property:: state_charges


   .. py:property:: active_state


   .. py:property:: state_diagonal


   .. py:property:: atom_names


   .. py:property:: atom_numbers


   .. py:property:: charge
      :type: float

      The charge of the molecule if set on the trajectory data.

      Loaded from `charge` attribute (or variable) or `state_charges` coordinate if provided.
      If no information is found, 0 is returned.


   .. py:method:: set_charge(value)

      Set the charge on a dataset, clear conflicting charge information stored elsewhere
      on the dataset, and return a new instance of the wrapped dataset.

      :param value: Either a single value (optionally wrapped in a DataArray already) to indicate the charge of the full
                    molecule in all states (will be set to coordinate `charge`)
                    or a DataArray that represents state-dependent charges (which will be set to `state_charges`).
      :type value: float | xr.DataArray

      :returns: The updated object as a copy.
      :rtype: Self

      :raises ValueError: If an unsupported `value` was provided.


   .. py:property:: dims


   .. py:property:: coords


   .. py:property:: sizes


   .. py:property:: data_vars


   .. py:method:: has_variable(name)


   .. py:method:: has_dimension(name)


   .. py:method:: has_coordinate(name)


   .. py:method:: has_data(name)


   .. py:method:: has(name)


   .. py:property:: mol
      :type: rdkit.Chem.Mol

      Helper method to get a representative molecule object for the geometry within this dataset.

      :returns: Either a copy of a cached mol object (for partial substructures) or a newly constructed default object.
      :rtype: rdkit.Chem.Mol


   .. py:method:: sel(indexers = None, method = None, tolerance = None, drop = False, **indexers_kwargs)

      Returns a new dataset with each data array indexed by tick labels
      along the specified dimension(s).

      In contrast to `.isel`, indexers for this method should use
      labels (i.e. explicit values in that dimension) instead of integers.

      Under the hood, this method is powered by using pandas's powerful Index
      objects. This makes label-based indexing essentially just as fast as
      using integer indexing.

      It also means this method uses pandas's (well documented) logic for
      indexing. This means you can use string shortcuts for datetime indexes
      (e.g., '2000-01' to select all values in January 2000). It also means
      that slices are treated as inclusive of both the start and stop values,
      unlike normal Python indexing.

      :param indexers: A dict with keys matching dimensions and values given
                       by scalars, slices or arrays of tick labels. For dimensions with
                       multi-index, the indexer may also be a dict-like object with keys
                       matching index level names.
                       If DataArrays are passed as indexers, xarray-style indexing will be
                       carried out. See :ref:`indexing` for the details.
                       One of indexers or indexers_kwargs must be provided.
      :type indexers: dict, optional
      :param method: Method to use for inexact matches:

                     * None (default): only exact matches
                     * pad / ffill: propagate last valid index value forward
                     * backfill / bfill: propagate next valid index value backward
                     * nearest: use nearest valid index value
      :type method: {None, "nearest", "pad", "ffill", "backfill", "bfill"}, optional
      :param tolerance: Maximum distance between original and new labels for inexact
                        matches. The values of the index at the matching locations must
                        satisfy the equation ``abs(index[indexer] - target) <= tolerance``.
      :type tolerance: optional
      :param drop: If ``drop=True``, drop coordinates variables in `indexers` instead
                   of making them scalar.
      :type drop: bool, optional
      :param \*\*indexers_kwargs: The keyword arguments form of ``indexers``.
                                  One of indexers or indexers_kwargs must be provided.
      :type \*\*indexers_kwargs: {dim: indexer, ...}, optional

      :returns: **dataset** -- A new Dataset with the same contents as this dataset, except each
                variable and dimension is indexed by the appropriate indexers.
                If indexer DataArrays have coordinates that do not conflict with
                this object, then these coordinates will be attached.
                In general, each array's data will be a view of the array's data
                in this dataset, unless vectorized indexing was triggered by using
                an array indexer, in which case the data will be a copy.
      :rtype: Self

      .. seealso::

         :func:`ShnitselDataset.isel`
         :func:`Dataset.sel`
         :func:`Dataset.isel`
         :func:`DataArray.sel`

         :doc:`xarray-tutorial:intermediate/indexing/indexing`
             Tutorial material on indexing with Xarray objects

         :doc:`xarray-tutorial:fundamentals/02.1_indexing_Basic`
             Tutorial material on basics of indexing


   .. py:method:: isel(indexers = None, drop = False, missing_dims = 'raise', **indexers_kwargs)

      Returns a new dataset with each array indexed along the specified
      dimension(s).

      This method selects values from each array using its `__getitem__`
      method, except this method does not require knowing the order of
      each array's dimensions.

      :param indexers: A dict with keys matching dimensions and values given
                       by integers, slice objects or arrays.
                       An indexer can be an integer, slice, array-like or DataArray.
                       If DataArrays are passed as indexers, xarray-style indexing will be
                       carried out. See :ref:`indexing` for the details.
                       One of indexers or indexers_kwargs must be provided.
      :type indexers: dict, optional
      :param drop: If ``drop=True``, drop coordinates variables indexed by integers
                   instead of making them scalar.
      :type drop: bool, default: False
      :param missing_dims: What to do if dimensions that should be selected from are not present in the
                           Dataset:

                           - "raise": raise an exception
                           - "warn": raise a warning, and ignore the missing dimensions
                           - "ignore": ignore the missing dimensions
      :type missing_dims: {"raise", "warn", "ignore"}, default: "raise"
      :param \*\*indexers_kwargs: The keyword arguments form of ``indexers``.
                                  One of indexers or indexers_kwargs must be provided.
      :type \*\*indexers_kwargs: {dim: indexer, ...}, optional

      :returns: **obj** -- A new Dataset with the same contents as this dataset, except each
                array and dimension is indexed by the appropriate indexers.
                If indexer DataArrays have coordinates that do not conflict with
                this object, then these coordinates will be attached.
                In general, each array's data will be a view of the array's data
                in this dataset, unless vectorized indexing was triggered by using
                an array indexer, in which case the data will be a copy.
      :rtype: Dataset

      .. rubric:: Examples

      # A specific element from the dataset is selected

      >>> dataset.isel(atom=1, time=0)
      Size:
      Dimensions:    (direction: 3)
      Coordinates:
          atom       int16 2B 1
          time       float64 8B 0.0
          direction  (direction) ...

      >>> slice_of_data = dataset.isel(atom=slice(0, 2), time=slice(0, 2))
      >>> slice_of_data
      Size:
      Dimensions:    (atom: 2, time: 2, direction: 3)
      Coordinates:
        * atom       (atom) int16 2B 1
        * time       (time) float64 16B 0.0 0.5
        * direction  ...

      >>> index_array = xr.DataArray([0, 2], dims="atom")
      >>> indexed_data = dataset.isel(atom=index_array)
      >>> indexed_data
      Size:
      Dimensions:    (atom: 2, time: 3, direction: 3)
      Coordinates:
        * atom       (atom) int16 4B 1 3
        * time       (time) float64 16B 0.0 0.5 1.0
        * direction  ...

      .. seealso::

         :func:`Dataset.sel`
         :func:`Dataset.isel`
         :func:`DataArray.isel`

         :doc:`xarray-tutorial:intermediate/indexing/indexing`
             Tutorial material on indexing with Xarray objects

         :doc:`xarray-tutorial:fundamentals/02.1_indexing_Basic`
             Tutorial material on basics of indexing


   .. py:property:: _attr_sources
      :type: Iterable[Mapping[Hashable, Any]]

      Places to look up items for attribute-style access.


   .. py:property:: _item_sources
      :type: Iterable[Mapping[Hashable, Any]]

      Places to look up items for key completion.


   .. py:method:: __getattr__(name)


   .. py:method:: __contains__(a)


   .. py:method:: _repr_html_()


   .. py:method:: __getitem__(key)


   .. py:method:: __dir__()

      Provide method name lookup and completion. Only provide 'public'
      methods.


   .. py:method:: _ipython_key_completions_()

      Provide method for the key-autocompletions in IPython.
      See https://ipython.readthedocs.io/en/stable/config/integrating.html#tab-completion
      for the details.


   .. py:method:: convert(varname = None, unit = None)

      Convert an entry in this dataset to a specific unit.
      Returns a copy of the dataset with the entry updated.

      :param varname: Optionally the name of a single variable. If not provided, will apply to all variables.
      :type varname: str, optional
      :param unit: The target unit to convert to.
                   If not set, will convert to default shnitsel units.
      :type unit: str | None

      :returns: The updated dataset with converted units.
      :rtype: Self


   .. py:method:: as_xr_dataset()

      Base function to be implemented by classes supporting this protocol
      to allow for standardized conversion to a dataset.

      :returns: A tuple of the `io_type_tag` under which the
                deserializer is registered with the Shnitsel Tools framework
                (or `None` if no deserialization is desired/supported),
                then the `xr.Dataset` that is the result of the conversion,
                and lastly a dict of metadata that might help with deserialization later on.
      :rtype: tuple[str, xr.Dataset, MetaData]

      :raises ValueError: If the conversion failed for some reason.


   .. py:method:: get_type_marker()
      :classmethod:


   .. py:method:: from_xr_dataset(dataset, metadata)
      :classmethod:

      Class method to support standardized deserialization of arbitrary classes.
      Implemented as a class method to avoid the need to construct an instance for deserialization.

      :param cls: The class executing the deserialization.
      :type cls: type[ResType]
      :param dataset: The dataset to be deserialized into the output type.
      :type dataset: xr.Dataset
      :param metadata: Metadata from the serialization process.
      :type metadata: MetaData

      :returns: The deserialized instance of the target class.
      :rtype: instance of cls

      :raises TypeError: If deserialization of the object was not possible.


.. py:class:: ShnitselDerivedDataset(base_ds, derived_ds)

   Bases: :py:obj:`ShnitselDataset`, :py:obj:`shnitsel.data.xr_io_compatibility.SupportsFromXrConversion`, :py:obj:`shnitsel.data.xr_io_compatibility.SupportsToXrConversion`


   Definition of the protocol to support instantiation from xarray dataset structs.


   .. py:attribute:: _base_dataset
      :type: xarray.Dataset | None


   .. py:property:: base
      :type: xarray.Dataset | None


   .. py:property:: _item_sources
      :type: Iterable[Mapping[Hashable, Any]]

      Places to look up items for key completion.


   .. py:method:: as_xr_dataset()
      :abstractmethod:

      Base function to be implemented by classes supporting this protocol
      to allow for standardized conversion to a dataset.

      :returns: A tuple of the `io_type_tag` under which the
                deserializer is registered with the Shnitsel Tools framework
                (or `None` if no deserialization is desired/supported),
                then the `xr.Dataset` that is the result of the conversion,
                and lastly a dict of metadata that might help with deserialization later on.
      :rtype: tuple[str, xr.Dataset, MetaData]

      :raises ValueError: If the conversion failed for some reason.


   .. py:method:: get_type_marker()
      :classmethod:
      :abstractmethod:


   .. py:method:: from_xr_dataset(dataset, metadata)
      :classmethod:
      :abstractmethod:

      Class method to support standardized deserialization of arbitrary classes.
      Implemented as a class method to avoid the need to construct an instance for deserialization.

      :param cls: The class executing the deserialization.
      :type cls: type[ResType]
      :param dataset: The dataset to be deserialized into the output type.
      :type dataset: xr.Dataset
      :param metadata: Metadata from the serialization process.
      :type metadata: MetaData

      :returns: The deserialized instance of the target class.
      :rtype: instance of cls

      :raises TypeError: If deserialization of the object was not possible.
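
The ``as_xr_dataset`` / ``from_xr_dataset`` pair documented above describes a tagged round trip: serialization yields a type tag, a dataset and a metadata dict, and deserialization rebuilds an object from the dataset and metadata. The following is a minimal, standard-library-only sketch of that pattern, not shnitsel's implementation. ``ToyContainer``, ``DESERIALIZERS`` and ``load`` are hypothetical names introduced for illustration, and a plain ``dict`` stands in for both ``xr.Dataset`` and ``MetaData``.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical stand-ins: plain dicts play the roles of xr.Dataset and MetaData.
Dataset = Dict[str, Any]
MetaData = Dict[str, Any]

# Hypothetical registry mapping a type tag to the callable that rebuilds the object.
DESERIALIZERS: Dict[str, Callable[[Dataset, MetaData], Any]] = {}


class ToyContainer:
    """Toy wrapper that follows the tag / dataset / metadata round trip."""

    def __init__(self, ds: Dataset, charge: float = 0.0) -> None:
        self._raw_dataset = ds
        self.charge = charge

    @classmethod
    def get_type_marker(cls) -> str:
        # Tag under which the deserializer is registered (made up for this sketch).
        return "toy::ToyContainer"

    def as_xr_dataset(self) -> Tuple[str, Dataset, MetaData]:
        # Return the tag, a shallow copy of the wrapped data, and extra metadata.
        return self.get_type_marker(), dict(self._raw_dataset), {"charge": self.charge}

    @classmethod
    def from_xr_dataset(cls, dataset: Dataset, metadata: MetaData) -> "ToyContainer":
        # No instance is needed to deserialize, hence a class method.
        if not isinstance(dataset, dict):
            raise TypeError("dataset must be a dict in this sketch")
        return cls(dataset, charge=metadata.get("charge", 0.0))


# Register the deserializer under the class's own tag.
DESERIALIZERS[ToyContainer.get_type_marker()] = ToyContainer.from_xr_dataset


def load(tag: str, dataset: Dataset, metadata: MetaData) -> Any:
    """Look up the deserializer by tag and rebuild the object."""
    return DESERIALIZERS[tag](dataset, metadata)


original = ToyContainer({"energy": [0.1, 0.2]}, charge=1.0)
restored = load(*original.as_xr_dataset())
```

Keeping the tag separate from the dataset lets a loader choose the target class without inspecting the data itself, which is one plausible reason for the three-part return value described above.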