shnitsel.clean
Submodules
Functions
filter_by_energy | Filter trajectories according to energy to exclude unphysical (insane) behaviour
filter_by_length | Filter trajectories according to bond length
sanity_check | Filter trajectories according to energy and bond length to exclude unphysical (insane) behaviour
Package Contents
- filter_by_energy(frames_or_trajectory, filter_method='truncate', *, energy_thresholds=None, plot_thresholds=False, plot_populations=False)
Filter trajectories according to energy to exclude unphysical (insane) behaviour
- Parameters:
frames_or_trajectory (TrajectoryOrFrames) – A Frames or Trajectory object with astate, energy, and ideally e_kin variables. If astate is not set, no filtering will be performed and no filtranda assigned.
filter_method (Literal['truncate', 'omit', 'annotate'] | float, optional) –
Specifies the manner in which to remove data:
- if 'omit', drop trajectories unless all frames meet criteria (shnitsel.clean.omit())
- if 'truncate', cut each trajectory off just before the first frame that doesn't meet criteria (shnitsel.clean.truncate())
- if 'annotate', merely annotate the data
- if a float number, interpret this number as a time, and cut all trajectories off at this time, discarding those which violate criteria before reaching the given limit (shnitsel.clean.transect())
See shnitsel.clean.dispatch_filter().
energy_thresholds (dict[str, float] | EnergyFiltrationThresholds | None, optional) – Threshold for total, potential and kinetic energy of the system. Can specify thresholds for overall drift and individual time step changes. Can also specify thresholds for energy steps at hops. Unit should be specified as a member variable. If not provided, will default to some reasonable default values as seen in the EnergyThresholds definition.
plot_thresholds (bool | Sequence[float]) –
See shnitsel.vis.plot.filtration.check_thresholds().
- If True, will plot using check_thresholds with default quantiles
- If a Sequence, will plot using check_thresholds with the specified quantiles
- If False (the default), will not produce a thresholds plot
plot_populations (Literal['independent', 'intersections', False]) –
See shnitsel.vis.plot.filtration.validity_populations().
- If 'intersections', will plot populations of trajectories satisfying intersecting conditions
- If 'independent', will plot populations of trajectories satisfying conditions taken independently
- If False (the default), will not produce a populations plot
- Return type:
The sanitized xr.Dataset
Notes
The resulting object has a filtranda data_var, representing the values by which the data were filtered. If the input has a filtranda data_var, it is overwritten.
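The filter_method options described above can be illustrated with a minimal sketch in plain Python. It is not the shnitsel implementation: the helper names mirror shnitsel.clean.truncate(), omit() and transect(), but their signatures here are hypothetical, a trajectory is modelled as a plain list of frame labels, and `valid` is an assumed per-frame flag stating whether the frame meets all criteria. Whether a frame lying exactly at the cut-off time is kept is also an assumption.

```python
from typing import List, Optional, Sequence


def first_violation(valid: Sequence[bool]) -> Optional[int]:
    """Return the index of the first frame failing the criteria, or None."""
    for index, ok in enumerate(valid):
        if not ok:
            return index
    return None


def truncate(frames: List[str], valid: Sequence[bool]) -> List[str]:
    """Cut the trajectory off just before the first invalid frame."""
    cut = first_violation(valid)
    return list(frames) if cut is None else list(frames[:cut])


def omit(frames: List[str], valid: Sequence[bool]) -> Optional[List[str]]:
    """Keep the trajectory only if every frame is valid, else drop it."""
    return list(frames) if all(valid) else None


def transect(
    frames: List[str],
    times: Sequence[float],
    valid: Sequence[bool],
    t_cut: float,
) -> Optional[List[str]]:
    """Cut at t_cut; drop the trajectory if it violates criteria before then.

    Assumption: frames with time <= t_cut are kept.
    """
    kept = [(f, ok) for f, t, ok in zip(frames, times, valid) if t <= t_cut]
    if not all(ok for _, ok in kept):
        return None
    return [f for f, _ in kept]


# Illustrative data: the frame at t = 1.5 fails the criteria.
frames = ["f0", "f1", "f2", "f3", "f4"]
times = [0.0, 0.5, 1.0, 1.5, 2.0]
valid = [True, True, True, False, True]

print(truncate(frames, valid))              # ['f0', 'f1', 'f2']
print(omit(frames, valid))                  # None
print(transect(frames, times, valid, 1.0))  # ['f0', 'f1', 'f2']
print(transect(frames, times, valid, 2.0))  # None
```

With 'annotate', none of these cuts would be applied; the data would only carry the values by which it could be filtered.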
- filter_by_length(frames_or_trajectory, filter_method='truncate', *, geometry_thresholds=None, mol=None, plot_thresholds=False, plot_populations=False)
Filter trajectories according to bond length
- Parameters:
frames_or_trajectory (Trajectory | Frames | xr.Dataset) – A Trajectory or Frames Dataset with an atXYZ variable (NB. this function takes an xr.Dataset as opposed to an xr.DataArray for consistency with shnitsel.clean.filter_by_energy())
filter_method (Literal["truncate", "omit", "annotate"] | float, optional) –
Specifies the manner in which to remove data:
- if 'omit', drop trajectories unless all frames meet criteria (shnitsel.clean.omit())
- if 'truncate', cut each trajectory off just before the first frame that doesn't meet criteria (shnitsel.clean.truncate())
- if 'annotate', merely annotate the data
- if a float number, interpret this number as a time, and cut all trajectories off at this time, discarding those which violate criteria before reaching the given limit (shnitsel.clean.transect())
See shnitsel.clean.dispatch_filter().
geometry_thresholds (GeometryFiltrationThresholds, optional) –
A mapping from SMARTS-strings to length-thresholds.
- The SMARTS-strings describe bonds which are searched for in an RDKit Mol object obtained via shnitsel.bridges.default_mol()
- The thresholds describe maximal tolerable bond-lengths; if there are multiple matches for a given search, the longest bond-length will be considered for each frame
The unit for the maximum length is provided in the member variable length_unit, which defaults to angstrom.
If not provided, will be initialized with thresholds for H-(C/N) bonds and one for all bonds.
mol (Mol, optional) – An RDKit Mol object; if not provided, it will be generated from the XYZ coordinates in the data.
plot_thresholds –
See shnitsel.vis.plot.filtration.check_thresholds().
- If True, will plot using check_thresholds with default quantiles
- If a Sequence, will plot using check_thresholds with the specified quantiles
- If False, will not produce a thresholds plot
plot_populations (Literal["independent", "intersections", False], optional) –
See shnitsel.vis.plot.filtration.validity_populations().
- If 'intersections', will plot populations of trajectories satisfying intersecting conditions
- If 'independent', will plot populations of trajectories satisfying conditions taken independently
- If False, will not produce a populations plot
- Return type:
The filtered Dataset or None if the filter method results in the trajectory being rejected.
Notes
The resulting object has a filtranda data_var, representing the values by which the data were filtered. If the input has a filtranda data_var, it is overwritten. An existing 'criterion' dimension will be dropped from the frames_or_trajectory parameter along with all variables and coordinates tied to it.
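The bond-length criterion described for geometry_thresholds can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the SMARTS matching via RDKit is replaced by a hand-written list of atom-index pairs, the SMARTS key "[#6]-[#1]" and the 2.0 angstrom threshold are illustrative values rather than shnitsel defaults, and treating a length equal to the threshold as valid is an assumption.

```python
import math
from typing import Dict, List, Sequence, Tuple

Frame = Sequence[Tuple[float, float, float]]  # one (x, y, z) per atom
BondList = Sequence[Tuple[int, int]]          # pairs of atom indices


def longest_bond(frame: Frame, bonds: BondList) -> float:
    """Longest distance among all matched bonds in a single frame."""
    return max(math.dist(frame[i], frame[j]) for i, j in bonds)


def length_filtranda(
    frames: Sequence[Frame], matches: Dict[str, BondList]
) -> Dict[str, List[float]]:
    """Per criterion, the longest matched bond length in each frame."""
    return {
        name: [longest_bond(frame, bonds) for frame in frames]
        for name, bonds in matches.items()
    }


def valid_frames(
    filtranda: Dict[str, List[float]], thresholds: Dict[str, float]
) -> List[bool]:
    """A frame is valid if no criterion exceeds its maximum length."""
    n_frames = len(next(iter(filtranda.values())))
    return [
        all(filtranda[name][k] <= limit for name, limit in thresholds.items())
        for k in range(n_frames)
    ]


# Illustrative system: atom 0 is a carbon, atoms 1 and 2 are hydrogens.
frames = [
    [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.0, 2.5, 0.0)],  # one C-H stretched
]
matches = {"[#6]-[#1]": [(0, 1), (0, 2)]}  # stand-in for SMARTS matches
thresholds = {"[#6]-[#1]": 2.0}            # hypothetical limit in angstrom

filtranda = length_filtranda(frames, matches)
valid = valid_frames(filtranda, thresholds)
print(valid)  # [True, False]
```

The resulting per-frame flags are what a filter method such as 'truncate' or 'omit' would then act on.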
- sanity_check(trajectory_or_frames, filter_method='truncate', *, energy_thresholds=None, geometry_thresholds=None, plot_thresholds=False, plot_populations=False, mol=None, drop_empty_trajectories=False)
Filter trajectories according to energy and bond length to exclude unphysical (insane) behaviour
- Parameters:
trajectory_or_frames (Trajectory | Frames | TreeNode[Any, Trajectory|Frames]) – A Trajectory or Frames object (or a ShnitselDB structure holding such objects) with an atXYZ variable as well as astate, energy, and ideally e_kin variables
filter_method (Literal["truncate", "omit", "annotate"] | float, optional) –
Specifies the manner in which to remove data:
- if 'omit', drop trajectories unless all frames meet criteria (shnitsel.clean.omit())
- if 'truncate', cut each trajectory off just before the first frame that doesn't meet criteria (shnitsel.clean.truncate())
- if 'annotate', merely annotate the data
- if a float number, interpret this number as a time, and cut all trajectories off at this time, discarding those which violate criteria before reaching the given limit (shnitsel.clean.transect())
See shnitsel.clean.dispatch_filter().
energy_thresholds (EnergyFiltrationThresholds, optional) – Threshold for total, potential and kinetic energy of the system. Can specify thresholds for overall drift and individual time step changes. Can also specify thresholds for energy steps at hops. Unit should be specified as a member variable. If not provided, will default to some reasonable default values as seen in the EnergyThresholds definition.
geometry_thresholds (GeometryFiltrationThresholds, optional) –
A mapping from SMARTS-strings to length-thresholds.
- The SMARTS-strings describe bonds which are searched for in an RDKit Mol object obtained via shnitsel.bridges.default_mol()
- The thresholds describe maximal tolerable bond-lengths; if there are multiple matches for a given search, the longest bond-length will be considered for each frame
The unit for the maximum length is provided in the member variable length_unit, which defaults to angstrom.
If not provided, will be initialized with thresholds for H-(C/N) bonds and one for all bonds.
plot_thresholds (bool, optional) –
See shnitsel.vis.plot.filtration.check_thresholds().
- If True, will plot using check_thresholds with default quantiles
- If a Sequence, will plot using check_thresholds with the specified quantiles
- If False, will not produce a thresholds plot
plot_populations (Literal['intersections', 'independent', False], optional) –
See shnitsel.vis.plot.filtration.validity_populations().
- If 'intersections', will plot populations of trajectories satisfying intersecting conditions
- If 'independent', will plot populations of trajectories satisfying conditions taken independently
- If False, will not produce a populations plot
mol (rdkit.Chem.Mol, optional) – Optional parameter to provide a mol object to base structure analysis on, by default generated from the first frame in the trajectory or frameset.
drop_empty_trajectories (bool, optional) – Flag to not include trajectories for which the sanity check result was empty in the final result tree, by default False. Only used for tree-structure inputs.
- Returns:
The sanitized trajectory, frames or tree.
A tree is sanitized by applying the sanitization function to all individual data points in the tree.
- Return type:
shnitsel.data.tree.node.TreeNode[Any, TrajectoryOrFrames] | TrajectoryOrFrames | None
Notes
The resulting object has an energy_filtranda and a geometry_filtranda data_var, representing the values by which the data were filtered. If the input has a filtranda data_var, it is overwritten. If the input has a criterion dimension, it will be dropped.
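How a tree is sanitized leaf by leaf, and what drop_empty_trajectories changes, can be sketched with a toy model. Everything here is a stand-in: nested dicts replace the ShnitselDB tree, each leaf is a list of (frame label, is_valid) pairs, and sanitize_tree and truncate_leaf are hypothetical helpers rather than shnitsel functions. How the library represents an empty result that is kept in the tree is not specified above, so the empty list used here is an assumption.

```python
from typing import Any, Callable, Dict, List, Tuple

Leaf = List[Tuple[str, bool]]  # (frame label, frame meets all criteria)


def truncate_leaf(leaf: Leaf) -> List[str]:
    """Keep the frame labels before the first invalid frame."""
    kept: List[str] = []
    for label, ok in leaf:
        if not ok:
            break
        kept.append(label)
    return kept


def sanitize_tree(
    tree: Dict[str, Any],
    sanitize: Callable[[Leaf], List[str]],
    drop_empty: bool = False,
) -> Dict[str, Any]:
    """Apply `sanitize` to every leaf; optionally drop empty results."""
    result: Dict[str, Any] = {}
    for name, child in tree.items():
        if isinstance(child, dict):
            # Inner node: recurse into the subtree.
            result[name] = sanitize_tree(child, sanitize, drop_empty)
            continue
        cleaned = sanitize(child)
        if drop_empty and not cleaned:
            continue  # leave this trajectory out of the result tree
        result[name] = cleaned
    return result


tree = {
    "compound_a": {
        "traj_1": [("f0", True), ("f1", True)],
        "traj_2": [("f0", False), ("f1", True)],  # invalid from the start
    }
}

kept_all = sanitize_tree(tree, truncate_leaf)
dropped = sanitize_tree(tree, truncate_leaf, drop_empty=True)
print(kept_all)  # {'compound_a': {'traj_1': ['f0', 'f1'], 'traj_2': []}}
print(dropped)   # {'compound_a': {'traj_1': ['f0', 'f1']}}
```

For a single Trajectory or Frames input there is no tree to prune, which matches the note that drop_empty_trajectories is only used for tree-structure inputs.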