shnitsel.clean¶
Submodules¶
Functions¶
energy_filtranda | Derive energetic filtration targets from an xr.Dataset
sanity_check | Filter trajectories according to energy to exclude unphysical (insane) behaviour
bond_length_filtranda | Derive bond length filtration targets from an xr.Dataset
filter_by_length | Filter trajectories according to bond length
Package Contents¶
- energy_filtranda(frames, *, etot_drift=None, etot_step=None, epot_step=None, ekin_step=None, hop_epot_step=None, units='eV')¶
Derive energetic filtration targets from an xr.Dataset
- Parameters:
frames – An xr.Dataset with astate, energy, and ideally e_kin variables
etot_drift (float | None, optional) – Threshold for drift of total energy over an entire trajectory, by default 0.2 eV
etot_step (float | None, optional) – Threshold for difference in total energy from one frame to the next, ignoring hops, by default 0.1 eV
epot_step (float | None, optional) – Threshold for difference in potential energy from one frame to the next, ignoring hops, by default 0.7 eV
ekin_step (float | None, optional) – Threshold for difference in kinetic energy from one frame to the next, ignoring hops, by default 0.7 eV
hop_epot_step (float | None, optional) – Threshold for difference in potential energy across hops, by default 1.0 eV
units (optional) – Units in which custom thresholds are given, and to which defaults and data will be converted, by default ‘eV’
- Returns:
An xr.DataArray of filtration targets stacked along the criterion dimension;
criteria comprise epot_step and hop_epot_step, as well as
etot_drift, etot_step and ekin_step if the input contains an e_kin variable
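To make the criteria above concrete, the following is a minimal sketch of how such per-frame quantities could be computed for a single trajectory without hops. It is not the library's implementation: the helper name energy_filtranda_sketch, the use of plain lists instead of an xr.Dataset, and the definition of drift as the deviation from the first frame are all assumptions made for illustration, and hop_epot_step is left out because the example contains no hops.

```python
def energy_filtranda_sketch(e_pot, e_kin=None):
    """Illustrative only: per-frame quantities for one hop-free trajectory.

    e_pot and e_kin are lists of energies in eV, one entry per frame.
    """
    def steps(values):
        # Absolute change from the previous frame; the first frame has no predecessor.
        return [0.0] + [abs(b - a) for a, b in zip(values, values[1:])]

    out = {"epot_step": steps(e_pot)}
    if e_kin is not None:
        # Kinetic and total-energy criteria are only available with kinetic energy.
        e_tot = [p + k for p, k in zip(e_pot, e_kin)]
        out["ekin_step"] = steps(e_kin)
        out["etot_step"] = steps(e_tot)
        # Assumed definition of drift: deviation from the total energy of the first frame.
        out["etot_drift"] = [abs(e - e_tot[0]) for e in e_tot]
    return out


# Hypothetical data: total energy is conserved, but E_pot jumps between frames 1 and 2.
e_pot = [0.0, 0.1, 0.9, 1.0]
e_kin = [1.0, 0.9, 0.1, 0.0]
filtranda = energy_filtranda_sketch(e_pot, e_kin)

# Thresholds taken from the documented defaults (eV).
thresholds = {"epot_step": 0.7, "ekin_step": 0.7, "etot_step": 0.1, "etot_drift": 0.2}
ok = {name: [v <= thresholds[name] for v in vals] for name, vals in filtranda.items()}
```

In this sketch the third frame fails both epot_step and ekin_step while the total-energy criteria are satisfied throughout, which shows why the criteria are tracked separately.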
- sanity_check(frames, cut='truncate', *, units='eV', etot_drift=np.nan, etot_step=np.nan, epot_step=np.nan, ekin_step=np.nan, hop_epot_step=np.nan, plot_thresholds=False, plot_populations=False)¶
Filter trajectories according to energy to exclude unphysical (insane) behaviour
- Parameters:
frames – An xr.Dataset with astate, energy, and ideally e_kin variables
cut (Literal['truncate', 'omit', False] | numbers.Number, optional) – Specifies the manner in which to remove data:
  - if ‘omit’, drop trajectories unless all frames meet criteria (shnitsel.clean.omit())
  - if ‘truncate’, cut each trajectory off just before the first frame that doesn’t meet criteria (shnitsel.clean.truncate())
  - if a number, interpret this number as a time, and cut all trajectories off at this time, discarding those which violate criteria before reaching the given limit (shnitsel.clean.transect())
  - if False, merely annotate the data
  See shnitsel.clean.dispatch_cut().
units (optional) – Units in which custom thresholds are given, and to which defaults and data will be converted, by default ‘eV’
etot_drift (float, optional) – Threshold for drift of total energy over an entire trajectory, by default 0.2 eV
etot_step (float, optional) – Threshold for difference in total energy from one frame to the next, ignoring hops, by default 0.1 eV
epot_step (float, optional) – Threshold for difference in potential energy from one frame to the next, ignoring hops, by default 0.7 eV
ekin_step (float, optional) – Threshold for difference in kinetic energy from one frame to the next, ignoring hops, by default 0.7 eV
hop_epot_step (float, optional) – Threshold for difference in potential energy across hops, by default 1.0 eV
plot_thresholds (bool | Sequence[float]) – See shnitsel.vis.plot.filtration.check_thresholds().
  - If True, will plot using check_thresholds with default quantiles
  - If a Sequence, will plot using check_thresholds with specified quantiles
  - If False, will not plot thresholds
plot_populations (bool | Literal['independent', 'intersections']) – See shnitsel.vis.plot.filtration.validity_populations().
  - If True or 'intersections', will plot populations of trajectories satisfying intersecting conditions
  - If 'independent', will plot populations of trajectories satisfying conditions taken independently
  - If False, will not plot populations
- Returns:
The sanitized xr.Dataset
Notes
The resulting object has a filtranda data_var, representing the values by which the data were filtered. If the input has a filtranda data_var, it is overwritten.
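The modes of the cut parameter can be illustrated with a small sketch. It uses plain Python containers rather than an xr.Dataset, and the helper names first_violation and cut_sketch are hypothetical. It only mirrors the behaviour described in the parameter list above and is not the code behind shnitsel.clean.dispatch_cut(). In particular, the treatment of trajectories that end before the requested time without violating any criterion is an assumption here (they are kept).

```python
import numbers


def first_violation(valid):
    """Index of the first frame failing the criteria, or len(valid) if none fails."""
    return next((i for i, ok in enumerate(valid) if not ok), len(valid))


def cut_sketch(trajs, cut="truncate"):
    """trajs maps a trajectory id to a list of (time, is_valid) frames."""
    if cut is False:
        # Annotate only: nothing is removed. Checked first because bool is a Number.
        return trajs
    result = {}
    for tid, frames in trajs.items():
        n_good = first_violation([ok for _, ok in frames])
        if cut == "omit":
            # Keep a trajectory only if every frame meets the criteria.
            if n_good == len(frames):
                result[tid] = frames
        elif cut == "truncate":
            # Keep the frames before the first violation.
            result[tid] = frames[:n_good]
        elif isinstance(cut, numbers.Number):
            # Interpret the number as a time: keep frames up to it, and discard
            # trajectories that violate the criteria within that window.
            window = [frame for frame in frames if frame[0] <= cut]
            if all(ok for _, ok in window):
                result[tid] = window
        else:
            raise ValueError(f"unsupported cut value: {cut!r}")
    return result


# Hypothetical data: three trajectories with three frames each.
trajs = {
    "a": [(0.0, True), (0.5, True), (1.0, True)],
    "b": [(0.0, True), (0.5, False), (1.0, True)],
    "c": [(0.0, True), (0.5, True), (1.0, False)],
}
omitted = cut_sketch(trajs, "omit")
truncated = cut_sketch(trajs, "truncate")
transected = cut_sketch(trajs, 0.5)
```

With this data, 'omit' keeps only trajectory "a", 'truncate' keeps all three at different lengths, and a cutoff time of 0.5 keeps "a" and "c" with two frames each.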
- bond_length_filtranda(frames, search_dict=None, units='angstrom', mol=None)¶
Derive bond length filtration targets from an xr.Dataset
- Parameters:
frames – An xr.Dataset with an atXYZ variable
search_dict (dict[str, numbers.Number] | None, optional) – A mapping from SMARTS strings to length thresholds. The SMARTS strings describe bonds which are searched for in an RDKit Mol object obtained via shnitsel.bridges.default_mol(). The thresholds describe maximal tolerable bond lengths; if there are multiple matches for a given search, the longest bond length will be considered for each frame
units (optional) – Units in which custom thresholds are given, and to which defaults and data will be converted, by default ‘angstrom’
mol (rdkit.Chem.Mol | None)
- Returns:
An xr.DataArray of filtration targets stacked along the criterion dimension;
one criterion per search_dict entry.
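The rule that the longest matching bond is considered for each frame can be sketched as follows. The SMARTS matching itself is not reproduced: the atom-index pairs in bonds are assumed to be the matches of a single search_dict entry, and the coordinates, the threshold and the helper name max_bond_lengths are hypothetical.

```python
import math


def max_bond_lengths(frames_xyz, bonds):
    """For each frame, return the longest distance among the given atom-index pairs."""
    return [max(math.dist(xyz[i], xyz[j]) for i, j in bonds) for xyz in frames_xyz]


# Hypothetical matches for one SMARTS pattern: atom 0 bonded to atoms 1 and 2.
bonds = [(0, 1), (0, 2)]

# Two frames of Cartesian coordinates in angstrom; one bond stretches in the second frame.
frames_xyz = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
]

threshold = 2.0  # hypothetical maximal tolerable bond length in angstrom
longest = max_bond_lengths(frames_xyz, bonds)
within = [d <= threshold for d in longest]
```

Reducing over all matches with max means a single overstretched bond is enough for a frame to fail that criterion.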
- filter_by_length(frames, cut='truncate', search_dict=None, units='angstrom', plot_thresholds=False, plot_populations=False, mol=None)¶
Filter trajectories according to bond length
- Parameters:
frames – An xr.Dataset with an atXYZ variable (NB. this function takes an xr.Dataset as opposed to an xr.DataArray for consistency with shnitsel.clean.sanity_check())
cut (Literal['truncate', 'omit', False] | numbers.Number) – Specifies the manner in which to remove data:
  - if ‘omit’, drop trajectories unless all frames meet criteria (shnitsel.clean.omit())
  - if ‘truncate’, cut each trajectory off just before the first frame that doesn’t meet criteria (shnitsel.clean.truncate())
  - if a number, interpret this number as a time, and cut all trajectories off at this time, discarding those which violate criteria before reaching the given limit (shnitsel.clean.transect())
  - if False, merely annotate the data
  See shnitsel.clean.dispatch_cut().
search_dict (dict[str, numbers.Number] | None) – A mapping from SMARTS strings to length thresholds. The SMARTS strings describe bonds which are searched for in an RDKit Mol object obtained via shnitsel.bridges.default_mol(). The thresholds describe maximal tolerable bond lengths; if there are multiple matches for a given search, the longest bond length will be considered for each frame
plot_thresholds (bool | Sequence[float]) – See shnitsel.vis.plot.filtration.check_thresholds().
  - If True, will plot using check_thresholds with default quantiles
  - If a Sequence, will plot using check_thresholds with specified quantiles
  - If False, will not plot thresholds
plot_populations (bool | Literal['independent', 'intersections']) – See shnitsel.vis.plot.filtration.validity_populations().
  - If True or 'intersections', will plot populations of trajectories satisfying intersecting conditions
  - If 'independent', will plot populations of trajectories satisfying conditions taken independently
  - If False, will not plot populations
mol (rdkit.Chem.Mol | None) – An RDKit Mol object; if not provided, it will be generated from the XYZ coordinates in the data
units (str) – Units in which custom thresholds are given, and to which defaults and data will be converted, by default ‘angstrom’
- Returns:
The filtered Dataset
Notes
The resulting object has a filtranda data_var, representing the values by which the data were filtered. If the input has a filtranda data_var, it is overwritten.
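The difference between the 'independent' and 'intersections' options of plot_populations can be pictured with a small counting sketch. This is only one possible reading of the parameter description, not the logic of shnitsel.vis.plot.filtration.validity_populations(): the helper name population_counts, the SMARTS-like criterion labels and the per-trajectory flags are hypothetical.

```python
def population_counts(valid):
    """valid maps a trajectory id to {criterion: True if met in every frame}."""
    criteria = sorted({c for flags in valid.values() for c in flags})
    # Independent: count trajectories per criterion, ignoring the other criteria.
    independent = {c: sum(flags[c] for flags in valid.values()) for c in criteria}
    # Intersection: count trajectories that satisfy all criteria at once.
    intersection = sum(all(flags[c] for c in criteria) for flags in valid.values())
    return independent, intersection


# Hypothetical outcome of two bond-length criteria for three trajectories.
valid = {
    "a": {"[#6]-[#1]": True, "[#6]-[#6]": True},
    "b": {"[#6]-[#1]": True, "[#6]-[#6]": False},
    "c": {"[#6]-[#1]": False, "[#6]-[#6]": True},
}
independent, intersection = population_counts(valid)
```

Here each criterion on its own is met by two trajectories, but only one trajectory meets both, so the intersection view can be considerably stricter than any single criterion suggests.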
- omit(ds)¶
- Parameters:
ds (shnitsel.data.trajectory_format.Trajectory)
- truncate(ds)¶
- Parameters:
ds (shnitsel.data.trajectory_format.Trajectory)
- transect(ds, cutoff)¶
- Parameters:
ds (shnitsel.data.trajectory_format.Trajectory | shnitsel.core.typedefs.Frames)
cutoff (float)
- cum_max_quantiles(obj, quantiles=None)¶
- true_upto(mask, dim)¶
- cum_mask_from_dataset(ds)¶
- Parameters:
ds (shnitsel.core.typedefs.Stacked | shnitsel.core.typedefs.Unstacked)
- Return type:
shnitsel.core.typedefs.Unstacked
- cum_mask_from_filtranda(filtranda)¶
- Parameters:
filtranda (shnitsel.core.typedefs.Stacked | shnitsel.core.typedefs.Unstacked)
- Return type:
shnitsel.core.typedefs.Unstacked