shnitsel.data.traj_combiner_methods#

Attributes#

Exceptions#

InconsistentAttributeError

Inappropriate argument value (of correct type).

MultipleCompoundsError

Inappropriate argument value (of correct type).

Classes#

MissingValue

Sentinel value for tree_to_frames.

Functions#

_check_matching_dimensions(datasets[, ...])

Function to check whether all/certain dimensions are equally sized.

_compare_dicts_of_values(curr_root_a, curr_root_b[, ...])

Compare two dicts and return the lists of matching and non-matching recursive keys.

_check_matching_var_meta(datasets)

Function to check if all of the variables have matching metadata.

_merge_traj_metadata(datasets)

Function to gather metadata from a set of trajectories.

concat_trajs(…)

Function to concatenate multiple trajectories along their time dimension.

db_from_data(datasets[, dtype])

Function to merge multiple trajectories of the same molecule into a single ShnitselDB instance.

layer_trajs(datasets[, dtype])

Function to combine trajectories into one Dataset by creating a new dimension 'trajectory' and indexing the different trajectories along that.

Module Contents#

_coordinate_meta_keys = ['trajid', 'delta_t', 'max_ts', 't_max', 'completed', 'nsteps']#
exception InconsistentAttributeError#

Bases: ValueError

Inappropriate argument value (of correct type).

exception MultipleCompoundsError#

Bases: ValueError

Inappropriate argument value (of correct type).

class MissingValue#

Sentinel value for tree_to_frames.

_check_matching_dimensions(datasets, excluded_dimensions=set(), limited_dimensions=None)#

Function to check whether all/certain dimensions are equally sized.

Excluded dimensions can be provided as a set of strings.

Parameters:
  • datasets (Iterable[xr.Dataset]) – The series of datasets to be checked for equal dimensions

  • excluded_dimensions (set[str], optional) – The set of dimension names to be excluded from the comparison. Defaults to set().

  • limited_dimensions (set[str], optional) – Optional set of dimension names to which the comparison should be limited. Defaults to None, meaning no limitation.

Returns:

True if all non-excluded (possibly limited) dimensions match in size. False otherwise.

Return type:

bool
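The check described above can be sketched without xarray by treating each dataset as a mapping from dimension name to size (similar to what `Dataset.sizes` provides). The function `dims_match` and its parameter names are hypothetical stand-ins, not the library's implementation. In particular, this sketch only compares dimensions that are actually present and says nothing about how the real function treats dimensions missing from some datasets.

```python
from collections.abc import Iterable, Mapping
from typing import Optional


def dims_match(
    sizes_per_dataset: Iterable[Mapping[str, int]],
    excluded: frozenset[str] = frozenset(),
    limited: Optional[set[str]] = None,
) -> bool:
    """Hypothetical stand-in: True if every compared dimension has one size."""
    reference: dict[str, int] = {}
    for sizes in sizes_per_dataset:
        for name, size in sizes.items():
            if name in excluded:
                continue  # explicitly ignored dimension
            if limited is not None and name not in limited:
                continue  # outside the requested subset
            # setdefault records the first size seen for this dimension
            if reference.setdefault(name, size) != size:
                return False
    return True


a = {"time": 100, "atom": 6, "state": 3}
b = {"time": 250, "atom": 6, "state": 3}

print(dims_match([a, b]))                                # False: time differs
print(dims_match([a, b], excluded=frozenset({"time"})))  # True
print(dims_match([a, b], limited={"atom"}))              # True
```

The sketch uses an immutable `frozenset()` default instead of `set()` to avoid sharing a mutable default between calls.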

_compare_dicts_of_values(curr_root_a, curr_root_b, base_key=[])#

Compare two dicts and return the lists of matching and non-matching recursive keys.

Parameters:
  • curr_root_a (Any) – Root of the first tree

  • curr_root_b (Any) – Root of the second tree

  • base_key (list[str]) – The current key associated with the root. Starts with [] for the initial call.

Returns:

A tuple of two lists. The first entry lists the key chains of all matching sub-trees; the second entry lists the key chains of all differing sub-trees. If a matching key chain points to a sub-tree, the entire sub-tree is identical.

Return type:

tuple[list[list[str]] | None, list[list[str]] | None]
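A minimal sketch of such a recursive comparison is given below. The name `compare_trees` is hypothetical, and the sketch always returns two lists, whereas the documented return type also allows None. How the library handles keys present on only one side is not stated here; this sketch reports them as differing.

```python
from typing import Any


def compare_trees(
    a: Any, b: Any, base_key: tuple[str, ...] = ()
) -> tuple[list[list[str]], list[list[str]]]:
    """Hypothetical sketch: return (matching key chains, differing key chains)."""
    if a == b:
        # Identical value or identical whole sub-tree: report it once.
        return [list(base_key)], []
    if isinstance(a, dict) and isinstance(b, dict):
        matching: list[list[str]] = []
        distinct: list[list[str]] = []
        for key in sorted(set(a) | set(b), key=str):
            chain = base_key + (str(key),)
            if key in a and key in b:
                sub_match, sub_distinct = compare_trees(a[key], b[key], chain)
                matching += sub_match
                distinct += sub_distinct
            else:
                # Assumption of this sketch: one-sided keys count as differing.
                distinct.append(list(chain))
        return matching, distinct
    return [], [list(base_key)]


first = {"units": {"energy": "hartree", "length": "bohr"}, "program": "demo", "version": 3}
second = {"units": {"energy": "hartree", "length": "angstrom"}, "program": "demo"}

matching, distinct = compare_trees(first, second)
print(matching)  # [['program'], ['units', 'energy']]
print(distinct)  # [['units', 'length'], ['version']]
```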

_check_matching_var_meta(datasets)#

Function to check if all of the variables have matching metadata.

We do not want to merge trajectories with different metadata on variables.

TODO: Allow for variables being denoted that we do not care for.

Parameters:

datasets (Sequence[xr.Dataset | Trajectory | Frames]) – The trajectories to compare the variable metadata for.

Returns:

True if the metadata matches on all trajectories, False otherwise

Return type:

bool

_merge_traj_metadata(datasets)#

Function to gather metadata from a set of trajectories.

Used to combine trajectories into one aggregate Dataset.

Parameters:

datasets (Sequence[xr.Dataset | Trajectory | Frames]) – The sequence of trajectories for which metadata should be collected.

Returns:

The resulting meta information shared across all trajectories (first), and then the distinct meta information (second) in a key -> Array_of_values fashion.

Return type:

tuple[dict[str, Any], dict[str, np.ndarray]]
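The split into shared and distinct metadata can be illustrated with plain dictionaries. The helper `split_metadata` is hypothetical; it uses lists where the documented return type uses NumPy arrays, and it fills keys missing from a trajectory with None, which is an assumption of this sketch rather than documented behaviour.

```python
from typing import Any


def split_metadata(
    attrs_per_traj: list[dict[str, Any]],
) -> tuple[dict[str, Any], dict[str, list[Any]]]:
    """Hypothetical sketch: separate shared attributes from per-trajectory ones."""
    shared: dict[str, Any] = {}
    distinct: dict[str, list[Any]] = {}
    all_keys = set().union(*(attrs.keys() for attrs in attrs_per_traj))
    for key in sorted(all_keys):
        values = [attrs.get(key) for attrs in attrs_per_traj]
        present_everywhere = all(key in attrs for attrs in attrs_per_traj)
        if present_everywhere and all(v == values[0] for v in values):
            shared[key] = values[0]   # one value valid for every trajectory
        else:
            distinct[key] = values    # one entry per trajectory, in input order
    return shared, distinct


trajs = [
    {"trajid": 1, "delta_t": 0.5, "nsteps": 200, "completed": True},
    {"trajid": 2, "delta_t": 0.5, "nsteps": 180, "completed": False},
]

shared, distinct = split_metadata(trajs)
print(shared)    # {'delta_t': 0.5}
print(distinct)  # {'completed': [True, False], 'nsteps': [200, 180], 'trajid': [1, 2]}
```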

DataType#
concat_trajs(datasets: Sequence[xarray.DataArray], dtype: type[DataType] | types.UnionType | None = None) → xarray.DataArray#
concat_trajs(datasets: Sequence[shnitsel.data.dataset_containers.Trajectory | shnitsel.data.dataset_containers.Frames | xarray.Dataset], dtype: type[DataType] | types.UnionType | None = None) → xarray.Dataset

Function to concatenate multiple trajectories along their time dimension.

Creates one continuous time dimension, like an extended trajectory. The concatenated dimension is renamed frame and consists of a time component and an atrajectory component, where the latter denotes the active trajectory.

Additionally, a trajectory dimension is introduced. It carries the trajectory ids as metadata and indexes the remaining collected per-trajectory metadata.

For a sequence of data arrays, the function only attempts to concatenate the arrays.

Parameters:
  • datasets (Iterable[Trajectory | Frames | xr.Dataset] | Sequence[xr.DataArray]) – Datasets representing the individual trajectories or a sequence of arrays to concatenate.

  • dtype (type[DataType] | UnionType | None) – Type hint for the data to be included in the resulting container type.

Raises:
  • ValueError – Raised if the input dimensions conflict.

  • ValueError – Raised if the input variable metadata conflicts.

  • ValueError – Raised if global input attributes relevant to the merging process conflict.

  • ValueError – Raised if no trajectories are provided to this function.

Returns:

The combined and extended trajectory with a new leading frame dimension

Return type:

xr.Dataset (xr.DataArray when a sequence of data arrays is passed)
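To make the frame dimension more concrete, the sketch below builds the two components of such a concatenated index from per-trajectory time axes using plain lists. The helper `build_frame_coords` is hypothetical and does not call shnitsel or xarray; the ordering of trajectories and the use of trajectory ids as atrajectory values are assumptions of this sketch.

```python
def build_frame_coords(times_per_traj: dict[int, list[float]]) -> dict[str, list]:
    """Hypothetical sketch: one entry per frame in both coordinate lists."""
    atrajectory: list[int] = []
    time: list[float] = []
    for trajid, times in times_per_traj.items():
        # Every time step of this trajectory becomes one frame.
        atrajectory.extend([trajid] * len(times))
        time.extend(times)
    return {"atrajectory": atrajectory, "time": time}


coords = build_frame_coords({1: [0.0, 0.5, 1.0], 2: [0.0, 0.5]})
print(coords["atrajectory"])  # [1, 1, 1, 2, 2]
print(coords["time"])         # [0.0, 0.5, 1.0, 0.0, 0.5]
```

Each frame is then identified by the pair of its atrajectory and time values, so trajectories of different lengths fit into one flat dimension.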

db_from_data(datasets, dtype=None)#

Function to merge multiple trajectories of the same molecule into a single ShnitselDB instance.

Parameters:
  • datasets (Sequence[DataType] | DataType) – The individual loaded data points (e.g. trajectories), or a single data point/trajectory, to turn into a tree.

  • dtype (type[DataType] | UnionType | None) – Type hint for the data to be included in the resulting tree.

Returns:

The resulting ShnitselDB structure with ShnitselDBRoot, CompoundGroup and DataGroup layers.

Return type:

ShnitselDB[DataType]

layer_trajs(datasets, dtype=None)#

Function to combine trajectories into one Dataset by creating a new dimension ‘trajectory’ and indexing the different trajectories along that.

Will create one new trajectory dimension.

Parameters:
  • datasets (Sequence[xr.Dataset | Trajectory]) – Datasets representing the individual trajectories

  • dtype (type[DataType] | UnionType | None) – Type hint for the data to be included in the resulting container type.

Raises:

  • ValueError – Raised if there is conflicting input meta data.

  • ValueError – Raised if there are no trajectories provided to this function or if there are non-trajectories provided to this function.

Returns:

The combined and extended trajectory with a new leading trajectory dimension to differentiate the trajectory data.

Return type:

xr.Dataset
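In contrast to concatenation along frame, layering keeps time as its own axis and adds trajectory in front of it. The sketch below shows that layout with nested lists. The helper `layer_series` is hypothetical, and its refusal to handle trajectories of different lengths is a simplification of this sketch, not a statement about how layer_trajs treats them.

```python
def layer_series(series_per_traj: dict[int, list[float]]) -> dict:
    """Hypothetical sketch: stack equal-length series along a new leading axis."""
    if not series_per_traj:
        raise ValueError("no trajectories provided")
    lengths = {len(values) for values in series_per_traj.values()}
    if len(lengths) != 1:
        # Simplification: this sketch does not pad or align time axes.
        raise ValueError("trajectories differ in length")
    trajids = list(series_per_traj)
    return {
        "dims": ("trajectory", "time"),
        "trajid": trajids,
        "data": [list(series_per_traj[t]) for t in trajids],
    }


layered = layer_series({1: [0.1, 0.2, 0.3], 2: [0.4, 0.5, 0.6]})
print(layered["dims"])                                # ('trajectory', 'time')
print(len(layered["data"]), len(layered["data"][0]))  # 2 3
```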