shnitsel.data.shnitsel_db.combiner_methods

Attributes

_coordinate_meta_keys

Functions

_check_matching_dimensions(datasets[, ...])

Function to check whether all/certain dimensions are equally sized.

_compare_dicts_of_values(curr_root_a, curr_root_b[, ...])

Compare two dicts and return the lists of matching and non-matching recursive keys.

_check_matching_var_meta(datasets)

Function to check if all of the variables have matching metadata.

_merge_traj_metadata(datasets)

Function to gather metadata from a set of trajectories.

concat_trajs(datasets)

Function to concatenate multiple trajectories along their time dimension.

db_from_trajs(datasets)

Function to merge multiple trajectories of the same molecule into a single ShnitselDB instance.

layer_trajs(datasets)

Function to combine trajectories into one Dataset by creating a new dimension 'trajid' and indexing the different trajectories along that.

Module Contents

_coordinate_meta_keys = ['trajid', 'delta_t', 'max_ts', 't_max', 'completed', 'nsteps']
_check_matching_dimensions(datasets, excluded_dimensions=set(), limited_dimensions=None)

Function to check whether all/certain dimensions are equally sized.

Excluded dimensions can be provided as a set of strings.

Parameters:
  • datasets (Iterable[Trajectory]) – The series of datasets to be checked for equal dimensions

  • excluded_dimensions (Set[str], optional) – The set of dimension names to be excluded from the comparison. Defaults to set().

  • limited_dimensions (Set[str], optional) – An optional set of dimension names to which the comparison should be limited. Defaults to None, in which case all non-excluded dimensions are compared.

Returns:

True if all non-excluded (possibly limited) dimensions match in size. False otherwise.

Return type:

bool
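The check described above can be illustrated with a minimal sketch. It is not the library's implementation: `dims_match` is a hypothetical helper, and plain dicts mapping dimension names to sizes stand in for the dimensions of each trajectory. As a further assumption, a dimension that appears in only one input is treated as matching.

```python
from typing import Dict, Iterable, Optional, Set


def dims_match(
    dim_sizes: Iterable[Dict[str, int]],
    excluded: Optional[Set[str]] = None,
    limited: Optional[Set[str]] = None,
) -> bool:
    """Hypothetical helper: True if every compared dimension has one size."""
    excluded = excluded or set()
    reference: Dict[str, int] = {}
    for sizes in dim_sizes:
        for name, size in sizes.items():
            if name in excluded:
                continue
            if limited is not None and name not in limited:
                continue
            # setdefault stores the first size seen and returns it afterwards.
            if reference.setdefault(name, size) != size:
                return False
    return True


a = {"time": 100, "atom": 6, "state": 3}
b = {"time": 250, "atom": 6, "state": 3}

print(dims_match([a, b]))                     # False: "time" differs
print(dims_match([a, b], excluded={"time"}))  # True
print(dims_match([a, b], limited={"atom"}))   # True
```

Excluding a dimension is useful when the inputs are expected to differ along it, for example trajectories of different lengths that are about to be joined along time.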

_compare_dicts_of_values(curr_root_a, curr_root_b, base_key=[])

Compare two dicts and return the lists of matching and non-matching recursive keys.

Parameters:
  • curr_root_a (Any) – Root of the first tree

  • curr_root_b (Any) – Root of the second tree

  • base_key (List[str]) – The current key associated with the root. Starts with [] for the initial call.

Returns:

A tuple whose first entry is the list of key chains of all matching sub-trees and whose second entry is the corresponding list for all differing sub-trees. If a matching key chain points to a sub-tree, the entire sub-tree is identical.

Return type:

Tuple[List[List[str]]|None, List[List[str]]|None]
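A recursive comparison of this kind could look roughly like the sketch below. `compare_trees` is a hypothetical stand-in, not the library function: it always returns lists (the documented return type also allows None), it uses `None` instead of a mutable `[]` default for the base key, and it reports a key present on only one side as differing. All of these are assumptions made for illustration.

```python
from typing import Any, List, Optional, Tuple

KeyChains = List[List[str]]


def compare_trees(
    a: Any, b: Any, base_key: Optional[List[str]] = None
) -> Tuple[KeyChains, KeyChains]:
    """Hypothetical helper: return (matching, differing) key chains."""
    base_key = [] if base_key is None else base_key
    if a == b:
        # Identical values or identical whole sub-trees: report the root only.
        return [base_key], []
    if not (isinstance(a, dict) and isinstance(b, dict)):
        # Leaves (or a leaf versus a dict) that are not equal.
        return [], [base_key]
    matching: KeyChains = []
    differing: KeyChains = []
    for key in sorted(set(a) | set(b), key=str):
        chain = base_key + [str(key)]
        if key in a and key in b:
            sub_match, sub_diff = compare_trees(a[key], b[key], chain)
            matching += sub_match
            differing += sub_diff
        else:
            # Assumption: a key missing on one side counts as differing.
            differing.append(chain)
    return matching, differing


left = {"units": {"energy": "eV", "time": "fs"}, "source": "example"}
right = {"units": {"energy": "eV", "time": "au"}, "source": "example"}

matching, differing = compare_trees(left, right)
print(matching)   # [['source'], ['units', 'energy']]
print(differing)  # [['units', 'time']]
```

Because an identical sub-tree is reported once by its root chain, the matching list stays short even for deeply nested metadata.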

_check_matching_var_meta(datasets)

Function to check if all of the variables have matching metadata.

We do not want to merge trajectories with different metadata on variables.

TODO: Allow for variables being denoted that we do not care for.

Parameters:

datasets (List[Trajectory]) – The trajectories to compare the variable metadata for.

Returns:

True if the metadata matches on all trajectories, False otherwise

Return type:

bool

_merge_traj_metadata(datasets)

Function to gather metadata from a set of trajectories.

Used to combine trajectories into one aggregate Dataset.

Parameters:

datasets (Iterable[Trajectory]) – The sequence of trajectories for which metadata should be collected

Returns:

A tuple whose first entry holds the metadata shared across all trajectories and whose second entry holds the distinct metadata as a mapping from each key to an array with one value per trajectory.

Return type:

Tuple[Dict[str,Any],Dict[str,np.ndarray]]
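The split into shared and distinct metadata can be sketched as follows. `split_metadata` is a hypothetical helper operating on plain attribute dicts, and plain lists stand in for the `np.ndarray` values of the documented return type. Treating a key that is missing from some trajectory as distinct (with `None` as its placeholder value) is an assumption of this sketch.

```python
from typing import Any, Dict, List, Tuple


def split_metadata(
    attrs_list: List[Dict[str, Any]],
) -> Tuple[Dict[str, Any], Dict[str, List[Any]]]:
    """Hypothetical helper: separate shared from per-trajectory metadata."""
    if not attrs_list:
        raise ValueError("No metadata provided.")
    # Collect keys in first-seen order so the output is deterministic.
    all_keys: List[str] = []
    for attrs in attrs_list:
        for key in attrs:
            if key not in all_keys:
                all_keys.append(key)
    shared: Dict[str, Any] = {}
    distinct: Dict[str, List[Any]] = {}
    for key in all_keys:
        values = [attrs.get(key) for attrs in attrs_list]
        present_everywhere = all(key in attrs for attrs in attrs_list)
        if present_everywhere and all(v == values[0] for v in values):
            shared[key] = values[0]
        else:
            # One entry per trajectory, in input order.
            distinct[key] = values
    return shared, distinct


metas = [
    {"trajid": 1, "delta_t": 0.5, "completed": True},
    {"trajid": 2, "delta_t": 0.5, "completed": False},
]
shared, distinct = split_metadata(metas)
print(shared)    # {'delta_t': 0.5}
print(distinct)  # {'trajid': [1, 2], 'completed': [True, False]}
```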

concat_trajs(datasets)

Function to concatenate multiple trajectories along their time dimension.

Will create one continuous time dimension like an extended trajectory. The concatenated dimension will be renamed to frame.

Parameters:

datasets (Iterable[Trajectory]) – Datasets representing the individual trajectories

Raises:
  • ValueError – Raised if there are conflicting input dimensions.

  • ValueError – Raised if there is conflicting input variable metadata.

  • ValueError – Raised if there are conflicting global input attributes that are relevant to the merging process.

  • ValueError – Raised if there are no trajectories provided to this function.

Returns:

The combined and extended trajectory with a new leading frame dimension

Return type:

Trajectory

db_from_trajs(datasets)

Function to merge multiple trajectories of the same molecule into a single ShnitselDB instance.

Parameters:

datasets (Iterable[Trajectory]) – The individual loaded trajectories.

Returns:

The resulting ShnitselDB structure with ShnitselDBRoot, CompoundGroup and TrajectoryData layers.

Return type:

ShnitselDB

layer_trajs(datasets)

Function to combine trajectories into one Dataset by creating a new dimension ‘trajid’ and indexing the different trajectories along that.

Will create one new trajid dimension.

Parameters:

datasets (Iterable[xr.Dataset]) – Datasets representing the individual trajectories

Raises:
  • ValueError – Raised if there is conflicting input metadata.

  • ValueError – Raised if there are no trajectories provided to this function.

Returns:

The combined and extended trajectory with a new leading trajid dimension

Return type:

xr.Dataset
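To contrast the layout produced by layer_trajs with the one produced by concat_trajs, the following pure-Python sketch mimics both on toy data. It does not use xarray or the library itself: `concat_layout` and `layer_layout` are hypothetical helpers, each trajectory is a plain dict with a single `energy` series, and padding shorter trajectories with `None` in the layered layout is an assumption made only so that all rows have equal length.

```python
from typing import Any, Dict, List

trajs = [
    {"trajid": 1, "energy": [0.1, 0.2, 0.3]},
    {"trajid": 2, "energy": [0.4, 0.5]},
]


def concat_layout(trajs: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Hypothetical: one flat 'frame' axis, trajectories placed end to end."""
    if not trajs:
        raise ValueError("No trajectories provided.")
    frames = []
    for traj in trajs:
        for step, value in enumerate(traj["energy"]):
            # Each frame remembers which trajectory and time step it came from.
            frames.append({"trajid": traj["trajid"], "time": step, "energy": value})
    return frames


def layer_layout(trajs: List[Dict[str, Any]]) -> Dict[int, List[Any]]:
    """Hypothetical: one row per trajid, all rows sharing one time axis."""
    if not trajs:
        raise ValueError("No trajectories provided.")
    n_steps = max(len(t["energy"]) for t in trajs)
    # Assumption: shorter trajectories are padded with None.
    return {
        t["trajid"]: t["energy"] + [None] * (n_steps - len(t["energy"]))
        for t in trajs
    }


frames = concat_layout(trajs)
layered = layer_layout(trajs)
print(len(frames))  # 5
print(frames[3])    # {'trajid': 2, 'time': 0, 'energy': 0.4}
print(layered)      # {1: [0.1, 0.2, 0.3], 2: [0.4, 0.5, None]}
```

The flat layout keeps every frame without padding, while the layered layout makes it straightforward to compare trajectories at the same time step.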