shnitsel.data.dataset_containers.shared#
Classes#
ShnitselDataset | Definition of the protocol to support instantiation from xarray dataset structs.
ShnitselDerivedDataset | Definition of the protocol to support instantiation from xarray dataset structs.
Module Contents#
- class ShnitselDataset(ds)#
Bases:
shnitsel.data.xr_io_compatibility.SupportsFromXrConversion, shnitsel.data.xr_io_compatibility.SupportsToXrConversion
Definition of the protocol to support instantiation from xarray dataset structs.
- Parameters:
ds (xarray.Dataset)
- _raw_dataset: xarray.Dataset#
- property dataset: xarray.Dataset#
- Return type:
xarray.Dataset
- property state_ids#
- property state_names#
- property state_types#
- property state_magnetic_number#
- property state_degeneracy_group#
- property state_charges#
- property active_state#
- property state_diagonal#
- property atom_names#
- property atom_numbers#
- property charge: float#
The charge of the molecule, if set on the trajectory data. It is loaded from the charge attribute (or variable) or from the state_charges coordinate, if provided.
If no charge information is found, 0 is returned.
- Return type:
float
- set_charge(value)#
Set the charge on a dataset, clear any conflicting charge information stored elsewhere on the dataset, and return a new instance of the wrapped dataset.
- Parameters:
value (float | xr.DataArray) – Either a single value (optionally already wrapped in a DataArray) giving the charge of the full molecule in all states (set as the charge coordinate), or a DataArray of state-dependent charges (set as state_charges).
- Returns:
The updated object as a copy.
- Return type:
Self
- Raises:
ValueError – If an unsupported value was provided.
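The dispatch described for set_charge can be sketched without xarray. The class below is a hypothetical stand-in (ChargeHolder is not part of shnitsel) and a plain sequence stands in for a state-dependent DataArray. It only illustrates the three documented behaviours: a scalar sets the molecule-wide charge, a per-state value sets state_charges, and the other location is cleared while the original object stays untouched.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from numbers import Real
from typing import Optional, Sequence, Tuple, Union


def _is_number(value: object) -> bool:
    # bool is a subclass of int, so it is rejected explicitly.
    return isinstance(value, Real) and not isinstance(value, bool)


@dataclass(frozen=True)
class ChargeHolder:
    """Hypothetical stand-in for the wrapped dataset (not a shnitsel class)."""

    charge: Optional[float] = None
    state_charges: Optional[Tuple[float, ...]] = None

    def set_charge(self, value: Union[float, Sequence[float]]) -> "ChargeHolder":
        if _is_number(value):
            # Scalar: charge of the full molecule; drop per-state information.
            return replace(self, charge=float(value), state_charges=None)
        if isinstance(value, (list, tuple)) and value and all(map(_is_number, value)):
            # Per-state values: drop the molecule-wide charge instead.
            return replace(
                self, charge=None, state_charges=tuple(float(v) for v in value)
            )
        raise ValueError(f"Unsupported charge value: {value!r}")


original = ChargeHolder(state_charges=(0.0, 1.0))
updated = original.set_charge(1)  # returns a copy; `original` is unchanged
```

Returning a copy keeps the wrapper safe to share between analysis steps, at the cost of one extra object per update.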
- property dims#
- property coords#
- property sizes#
- property data_vars#
- property mol: rdkit.Chem.Mol#
Helper method to get a representative molecule object for the geometry within this dataset.
- Returns:
Either a copy of a cached mol object (for partial substructures) or a newly constructed default object
- Return type:
rdkit.Chem.Mol
- sel(indexers=None, method=None, tolerance=None, drop=False, **indexers_kwargs)#
Returns a new dataset with each data array indexed by tick labels along the specified dimension(s).
In contrast to .isel, indexers for this method should use labels (i.e. explicit values in that dimension) instead of integers.
Under the hood, this method is powered by using pandas’s powerful Index objects. This makes label based indexing essentially just as fast as using integer indexing.
It also means this method uses pandas’s (well documented) logic for indexing. This means you can use string shortcuts for datetime indexes (e.g., ‘2000-01’ to select all values in January 2000). It also means that slices are treated as inclusive of both the start and stop values, unlike normal Python indexing.
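The difference between inclusive label slices and ordinary positional slices can be illustrated with the standard library alone. This is a minimal sketch on a sorted list of labels; label_slice is a hypothetical helper, not a function of shnitsel, xarray, or pandas.

```python
from bisect import bisect_left, bisect_right
from typing import List, Sequence


def label_slice(labels: Sequence[float], start: float, stop: float) -> List[float]:
    """Select all labels in [start, stop]; `labels` must be sorted ascending."""
    lo = bisect_left(labels, start)   # first position with label >= start
    hi = bisect_right(labels, stop)   # one past the last position with label <= stop
    return list(labels[lo:hi])


time = [0.0, 0.5, 1.0, 1.5]
by_label = label_slice(time, 0.5, 1.0)  # label-based: the stop label 1.0 is included
by_position = time[1:2]                 # positional: the stop position 2 is excluded
```

Here by_label holds two entries while by_position holds one, which is the behaviour to keep in mind when switching between sel and isel.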
- Parameters:
indexers (dict, optional) – A dict with keys matching dimensions and values given by scalars, slices or arrays of tick labels. For dimensions with multi-index, the indexer may also be a dict-like object with keys matching index level names. If DataArrays are passed as indexers, xarray-style indexing will be carried out. See Indexing and selecting data for the details. One of indexers or indexers_kwargs must be provided.
method ({None, "nearest", "pad", "ffill", "backfill", "bfill"}, optional) –
Method to use for inexact matches:
None (default): only exact matches
pad / ffill: propagate last valid index value forward
backfill / bfill: propagate next valid index value backward
nearest: use nearest valid index value
tolerance (optional) – Maximum distance between original and new labels for inexact matches. The values of the index at the matching locations must satisfy the equation abs(index[indexer] - target) <= tolerance.
drop (bool, optional) – If drop=True, drop coordinates variables in indexers instead of making them scalar.
**indexers_kwargs ({dim: indexer, ...}, optional) – The keyword arguments form of indexers. One of indexers or indexers_kwargs must be provided.
- Returns:
dataset – A new Dataset with the same contents as this dataset, except each variable and dimension is indexed by the appropriate indexers. If indexer DataArrays have coordinates that do not conflict with this object, then these coordinates will be attached. In general, each array’s data will be a view of the array’s data in this dataset, unless vectorized indexing was triggered by using an array indexer, in which case the data will be a copy.
- Return type:
Self
See also
ShnitselDataset.isel, Dataset.sel, Dataset.isel, DataArray.sel
- xarray-tutorial:intermediate/indexing/indexing
Tutorial material on indexing with Xarray objects
- xarray-tutorial:fundamentals/02.1_indexing_Basic
Tutorial material on basics of indexing
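The method and tolerance parameters of sel can be pictured with a small pure-Python lookup on a sorted index. This is a sketch of the documented semantics only; find_label is a hypothetical helper, the real lookup is delegated to pandas, and ties for "nearest" are broken arbitrarily here.

```python
from bisect import bisect_left, bisect_right
from typing import Optional, Sequence


def find_label(
    index: Sequence[float],
    target: float,
    method: Optional[str] = None,
    tolerance: Optional[float] = None,
) -> int:
    """Return the position in a sorted `index` that matches `target`."""
    if len(index) == 0:
        raise KeyError(target)
    if method is None:
        pos = bisect_left(index, target)
        if pos == len(index) or index[pos] != target:
            raise KeyError(target)  # only exact matches are accepted
    elif method in ("pad", "ffill"):
        pos = bisect_right(index, target) - 1  # last label <= target
    elif method in ("backfill", "bfill"):
        pos = bisect_left(index, target)  # first label >= target
    elif method == "nearest":
        right = bisect_left(index, target)
        candidates = [p for p in (right - 1, right) if 0 <= p < len(index)]
        pos = min(candidates, key=lambda p: abs(index[p] - target))
    else:
        raise ValueError(f"Unknown method: {method!r}")
    if not 0 <= pos < len(index):
        raise KeyError(target)  # nothing to propagate from on that side
    if tolerance is not None and abs(index[pos] - target) > tolerance:
        raise KeyError(target)  # violates abs(index[indexer] - target) <= tolerance
    return pos


time = [0.0, 0.5, 1.0]
padded = find_label(time, 0.7, method="pad")        # position of label 0.5
filled = find_label(time, 0.7, method="bfill")      # position of label 1.0
nearest = find_label(time, 0.7, method="nearest")   # position of label 0.5
```

Combining method="nearest" with a tolerance is a common way to guard against silently matching a label that is far from the requested value.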
- isel(indexers=None, drop=False, missing_dims='raise', **indexers_kwargs)#
Returns a new dataset with each array indexed along the specified dimension(s).
This method selects values from each array using its __getitem__ method, except this method does not require knowing the order of each array’s dimensions.
- Parameters:
indexers (dict, optional) – A dict with keys matching dimensions and values given by integers, slice objects or arrays. An indexer can be an integer, slice, array-like or DataArray. If DataArrays are passed as indexers, xarray-style indexing will be carried out. See Indexing and selecting data for the details. One of indexers or indexers_kwargs must be provided.
drop (bool, default: False) – If drop=True, drop coordinates variables indexed by integers instead of making them scalar.
missing_dims ({"raise", "warn", "ignore"}, default: "raise") – What to do if dimensions that should be selected from are not present in the Dataset:
"raise": raise an exception
"warn": raise a warning, and ignore the missing dimensions
"ignore": ignore the missing dimensions
**indexers_kwargs ({dim: indexer, ...}, optional) – The keyword arguments form of indexers. One of indexers or indexers_kwargs must be provided.
- Returns:
obj – A new Dataset with the same contents as this dataset, except each array and dimension is indexed by the appropriate indexers. If indexer DataArrays have coordinates that do not conflict with this object, then these coordinates will be attached. In general, each array’s data will be a view of the array’s data in this dataset, unless vectorized indexing was triggered by using an array indexer, in which case the data will be a copy.
- Return type:
Dataset
Examples
# A specific element from the dataset is selected
>>> dataset.isel(atom=1, time=0)
<xarray.Dataset> Size:
Dimensions:    (direction: 3)
Coordinates:
    atom       int16 2B 1
    time       float64 8B 0.0
    direction  (direction) <U1 3B 'x' 'y' 'z'
Data variables:
    energy     float64 8B -238.2
    forces     (direction) float64 24B 1.2 -0.2 0.1
# Indexing with a slice using isel
>>> slice_of_data = dataset.isel(atom=slice(0, 2), time=slice(0, 2))
>>> slice_of_data
<xarray.Dataset> Size:
Dimensions:    (atom: 2, time: 2, direction: 3)
Coordinates:
  * atom       (atom) int16 2B 1
  * time       (time) float64 16B 0.0 0.5
  * direction  <U1 3B 'x' 'y' 'z'
Data variables:
    energy     (time) float64 24B -238.2
    forces     (time, atom, direction) float64 96B -0.5 -0.4 0.4 ...
>>> index_array = xr.DataArray([0, 2], dims="atom")
>>> indexed_data = dataset.isel(atom=index_array)
>>> indexed_data
<xarray.Dataset> Size:
Dimensions:    (atom: 2, time: 3, direction: 3)
Coordinates:
  * atom       (atom) int16 4B 1 3
  * time       (time) float64 16B 0.0 0.5 1.0
  * direction  <U1 3B 'x' 'y' 'z'
Data variables:
    energy     (time) float64 24B -238.2 -238.4 -237.9
    forces     (time, atom, direction) float64 96B -0.5 -0.4 0.4 ...
See also
ShnitselDataset.sel, Dataset.sel, Dataset.isel, DataArray.isel
- xarray-tutorial:intermediate/indexing/indexing
Tutorial material on indexing with Xarray objects
- xarray-tutorial:fundamentals/02.1_indexing_Basic
Tutorial material on basics of indexing
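The three missing_dims policies of isel can be summarised in a few lines. The helper below is a hypothetical sketch (drop_missing_dims is not an xarray or shnitsel function) that filters an indexer mapping against the dimensions that actually exist; the exception type is an assumption of this sketch.

```python
import warnings
from typing import Any, Dict, Iterable, Mapping


def drop_missing_dims(
    indexers: Mapping[str, Any],
    dims: Iterable[str],
    missing_dims: str = "raise",
) -> Dict[str, Any]:
    """Return only the indexers whose dimension exists, following `missing_dims`."""
    if missing_dims not in ("raise", "warn", "ignore"):
        raise ValueError(f"Unrecognised option: {missing_dims!r}")
    known = set(dims)
    missing = [dim for dim in indexers if dim not in known]
    if missing:
        message = f"Dimensions {missing} do not exist; expected some of {sorted(known)}"
        if missing_dims == "raise":
            raise ValueError(message)
        if missing_dims == "warn":
            warnings.warn(message)
        # "ignore" (and "warn" after warning) silently drops the unknown entries.
    return {dim: idx for dim, idx in indexers.items() if dim in known}


dims = ["atom", "time", "direction"]
kept = drop_missing_dims({"atom": 0, "frame": 3}, dims, missing_dims="ignore")
```

With missing_dims="ignore" the unknown "frame" indexer is dropped and only the "atom" indexer is applied; the default "raise" is the safer choice when a misspelled dimension name should not pass unnoticed.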
- property _attr_sources: Iterable[Mapping[Hashable, Any]]#
Places to look up items for attribute-style access
- Return type:
Iterable[Mapping[Hashable, Any]]
- property _item_sources: Iterable[Mapping[Hashable, Any]]#
Places to look up items for key-completion
- Return type:
Iterable[Mapping[Hashable, Any]]
- __contains__(a)#
- _repr_html_()#
- Return type:
Any
- __getitem__(key)#
- __dir__()#
Provide method name lookup and completion. Only provide ‘public’ methods.
- _ipython_key_completions_()#
Provide method for the key-autocompletions in IPython. See https://ipython.readthedocs.io/en/stable/config/integrating.html#tab-completion for details.
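How _attr_sources, _item_sources, __dir__ and _ipython_key_completions_ can fit together is sketched below. AttrAccessSketch is a hypothetical class written for illustration, not the shnitsel implementation; in particular, which mappings the real class exposes is an assumption here.

```python
from typing import Any, Dict, Hashable, Iterable, List, Mapping


class AttrAccessSketch:
    """Hypothetical wrapper showing attribute-style access and completion."""

    def __init__(self, variables: Dict[str, Any], attrs: Dict[str, Any]) -> None:
        self._variables = variables
        self._attrs = attrs

    @property
    def _attr_sources(self) -> Iterable[Mapping[Hashable, Any]]:
        # Mappings searched for `obj.name` access, in priority order.
        yield self._variables
        yield self._attrs

    @property
    def _item_sources(self) -> Iterable[Mapping[Hashable, Any]]:
        # Mappings searched for `obj["name"]` key completion.
        yield self._variables

    def __getattr__(self, name: str) -> Any:
        # Only reached when normal attribute lookup has already failed.
        if not name.startswith("_"):
            for source in self._attr_sources:
                if name in source:
                    return source[name]
        raise AttributeError(name)

    def __dir__(self) -> List[str]:
        names = set(super().__dir__())
        for source in self._attr_sources:
            names.update(key for key in source if isinstance(key, str))
        # Only 'public' names are offered for completion.
        return sorted(name for name in names if not name.startswith("_"))

    def _ipython_key_completions_(self) -> List[str]:
        return sorted(
            key for source in self._item_sources for key in source if isinstance(key, str)
        )


wrapped = AttrAccessSketch({"energy": [-238.2]}, {"charge": 0})
energy = wrapped.energy  # resolved through _attr_sources
```

Refusing underscore-prefixed names in __getattr__ avoids infinite recursion if the instance is inspected before __init__ has run, for example during copying.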
- convert(varname=None, unit=None)#
Convert an entry in this dataset to a specific unit.
Returns a copy of the dataset with the entry updated.
- as_xr_dataset()#
Base function to be implemented by classes supporting this protocol, allowing standardized conversion to a dataset.
- Returns:
A tuple of three items: the io_type_tag under which the deserializer is registered with the Shnitsel Tools framework (or None if no deserialization is desired/supported), the xr.Dataset that is the result of the conversion, and a dict of metadata that might help with deserialization later on.
- Return type:
- Raises:
ValueError – If the conversion failed for some reason.
- classmethod from_xr_dataset(dataset, metadata)#
Class method to support standardized deserialization of arbitrary classes. Implemented as a class method to avoid the need to construct an instance for deserialization.
- Parameters:
cls (type[ResType]) – The class executing the deserialization.
dataset (xr.Dataset) – The dataset to be deserialized into the output type.
metadata (MetaData) – Metadata from the serialization process.
- Returns:
The deserialized instance of the target class.
- Return type:
instance of cls
- Raises:
TypeError – If deserialization of the object was not possible
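The round trip between as_xr_dataset and from_xr_dataset can be sketched as follows. Everything here is a stand-in chosen for illustration: a plain dict replaces xr.Dataset, and the registry _DESERIALIZERS, the tag string, and the Wrapper class are hypothetical rather than part of the Shnitsel Tools framework.

```python
from typing import Any, Callable, Dict, Optional, Tuple

FakeDataset = Dict[str, Any]  # stand-in for xr.Dataset
MetaData = Dict[str, Any]     # stand-in for the framework's metadata type

# Hypothetical registry mapping io_type_tag -> deserializer.
_DESERIALIZERS: Dict[str, Callable[[FakeDataset, MetaData], Any]] = {}


class Wrapper:
    io_type_tag = "sketch::Wrapper"  # hypothetical tag

    def __init__(self, ds: FakeDataset) -> None:
        self.ds = ds

    def as_xr_dataset(self) -> Tuple[Optional[str], FakeDataset, MetaData]:
        # (tag, converted dataset, metadata that helps deserialization later)
        return self.io_type_tag, dict(self.ds), {"version": 1}

    @classmethod
    def from_xr_dataset(cls, dataset: FakeDataset, metadata: MetaData) -> "Wrapper":
        if not isinstance(dataset, dict):
            raise TypeError(f"Cannot deserialize {type(dataset).__name__}")
        return cls(dataset)


_DESERIALIZERS[Wrapper.io_type_tag] = Wrapper.from_xr_dataset


def round_trip(obj: Any) -> Any:
    tag, dataset, metadata = obj.as_xr_dataset()
    if tag is None:
        raise TypeError("Object does not support deserialization")
    return _DESERIALIZERS[tag](dataset, metadata)


restored = round_trip(Wrapper({"energy": [-238.2]}))
```

Because from_xr_dataset is a class method, the reader only needs the tag and the registry to rebuild the object; no instance has to exist beforehand.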
- class ShnitselDerivedDataset(base_ds, derived_ds)#
Bases:
ShnitselDataset, shnitsel.data.xr_io_compatibility.SupportsFromXrConversion, shnitsel.data.xr_io_compatibility.SupportsToXrConversion
Definition of the protocol to support instantiation from xarray dataset structs.
- Parameters:
base_ds (xarray.Dataset | None)
derived_ds (xarray.Dataset)
- _base_dataset: xarray.Dataset | None#
- property base: xarray.Dataset | None#
- Return type:
xarray.Dataset | None
- property _item_sources: Iterable[Mapping[Hashable, Any]]#
Places to look up items for key-completion
- Return type:
Iterable[Mapping[Hashable, Any]]
- abstractmethod as_xr_dataset()#
Base function to be implemented by classes supporting this protocol, allowing standardized conversion to a dataset.
- Returns:
A tuple of three items: the io_type_tag under which the deserializer is registered with the Shnitsel Tools framework (or None if no deserialization is desired/supported), the xr.Dataset that is the result of the conversion, and a dict of metadata that might help with deserialization later on.
- Return type:
- Raises:
ValueError – If the conversion failed for some reason.
- classmethod from_xr_dataset(dataset, metadata)#
- Abstractmethod:
- Parameters:
dataset (xarray.Dataset)
metadata (shnitsel.data.xr_io_compatibility.MetaData)
- Return type:
shnitsel.data.xr_io_compatibility.ResType
Class method to support standardized deserialization of arbitrary classes. Implemented as a class method to avoid the need to construct an instance for deserialization.
- Parameters:
cls (type[ResType]) – The class executing the deserialization.
dataset (xr.Dataset) – The dataset to be deserialized into the output type.
metadata (MetaData) – Metadata from the serialization process.
- Returns:
The deserialized instance of the target class.
- Return type:
instance of cls
- Raises:
TypeError – If deserialization of the object was not possible