shnitsel.geo.analogs

Functions

_find_atom_pairs(mol, atoms)

_substruct_match_to_submol(mol, substruct_match)

list_analogs(ensembles[, smarts, vis])

Extract a common moiety from a selection of ensembles

_combine_compounds_unstacked(compounds[, names, ...])

_combine_compounds_stacked(compounds[, names, concat_kws])

combine_analogs(ensembles[, smarts, names, vis, ...])

Combine ensembles for different compounds by finding the moieties they have in common

Module Contents

_find_atom_pairs(mol, atoms)
_substruct_match_to_submol(mol, substruct_match)
list_analogs(ensembles, smarts='', vis=False)

Extract a common moiety from a selection of ensembles

Parameters:
  • ensembles (Iterable[xarray.DataArray]) – An Iterable of xr.DataArray objects, each containing the geometries of an ensemble of trajectories for a different compound

  • smarts (str) – A SMARTS string indicating the moiety to cut out of each compound; in each case, the match returned by rdkit.Chem.Mol.GetSubstructMatch() (not necessarily the only possible match) will be used; if no SMARTS is provided, a common submol will be extracted using rdFMCS.FindMCS (a maximum-common-substructure search)

  • vis (bool) – Whether to display a visual indication of the match, by default False

Return type:

An Iterable of xr.DataArray objects
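
The two matching modes described for the smarts parameter can be tried directly in RDKit. The following is a minimal sketch, not the implementation of list_analogs: the SMILES strings and variable names are illustrative assumptions, and only the RDKit calls named in the parameter description (plus GetSubstructMatches, for comparison) are used.

```python
from rdkit import Chem
from rdkit.Chem import rdFMCS

# Three analogs sharing a C=C-C=O moiety (illustrative SMILES, assumed for this sketch).
smiles = {
    "acrolein": "C=CC=O",
    "crotonaldehyde": "CC=CC=O",
    "methyl vinyl ketone": "C=CC(=O)C",
}
mols = {name: Chem.MolFromSmiles(smi) for name, smi in smiles.items()}

# Case 1: an explicit SMARTS pattern is given.
# GetSubstructMatch returns a single tuple of atom indices (empty if no match).
pattern = Chem.MolFromSmarts("C=CC=O")
matches = {name: mol.GetSubstructMatch(pattern) for name, mol in mols.items()}

# A pattern may match in several places; GetSubstructMatch reports only one of them.
butadiene = Chem.MolFromSmiles("C=CC=C")
double_bond = Chem.MolFromSmarts("C=C")
one_match = butadiene.GetSubstructMatch(double_bond)
all_matches = butadiene.GetSubstructMatches(double_bond)

# Case 2: no SMARTS is given, so a common substructure is searched for instead.
mcs = rdFMCS.FindMCS(list(mols.values()))
mcs_pattern = Chem.MolFromSmarts(mcs.smartsString)
```

When a compound contains the moiety more than once, an explicit and sufficiently specific SMARTS string is the simplest way to control which atoms are selected.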

_combine_compounds_unstacked(compounds, names=None, concat_kws=None)
_combine_compounds_stacked(compounds, names=None, concat_kws=None)
combine_analogs(ensembles, smarts='', names=None, vis=False, *, concat_kws=None)

Combine ensembles for different compounds by finding the moieties they have in common

Parameters:
  • ensembles (Iterable[xarray.DataArray]) –

    An Iterable of xr.DataArray objects, each containing the geometries of an ensemble of trajectories for a different compound; these trajectories should all be in the same format, i.e.:

    • all stacked (with ‘frames’ dimension indexed by ‘trajid’ and ‘time’ MultiIndex levels)

    • all unstacked (with independent ‘trajid’ and ‘time’ dimensions)

  • smarts (str) – A SMARTS string indicating the moiety to cut out of each compound; in each case, the match returned by rdkit.Chem.Mol.GetSubstructMatch() (not necessarily the only possible match) will be used; if no SMARTS is provided, a common submol will be extracted using rdFMCS.FindMCS (a maximum-common-substructure search)

  • names (Iterable[str] | None) – An Iterable of Hashable values identifying the compounds; these values will end up in the compound coordinate, by default None

  • vis (bool) – Whether to display a visual indication of the match, by default False

  • concat_kws (dict[str, Any]) – Keyword arguments for internal calls to xr.concat
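
The stacked and unstacked layouts, and the way names can end up in a compound coordinate, can be illustrated with plain xarray. This is a sketch under assumptions: the helper make_ensemble, the ‘atom’ and ‘direction’ dimension names, the array shapes and the example concat_kws value are hypothetical, and the concatenation shown is not necessarily how combine_analogs works internally.

```python
import numpy as np
import pandas as pd
import xarray as xr


def make_ensemble(n_traj, n_time, n_atoms):
    """Hypothetical helper: an unstacked ensemble filled with zeros."""
    return xr.DataArray(
        np.zeros((n_traj, n_time, n_atoms, 3)),
        dims=("trajid", "time", "atom", "direction"),
        coords={"trajid": np.arange(n_traj), "time": 0.5 * np.arange(n_time)},
    )


# Unstacked: independent 'trajid' and 'time' dimensions.
unstacked = make_ensemble(n_traj=2, n_time=3, n_atoms=4)

# Stacked: one 'frames' dimension with 'trajid' and 'time' MultiIndex levels.
stacked = unstacked.stack(frames=("trajid", "time"))

# Concatenating along a new, named dimension puts the names in its coordinate.
# Both arrays must describe the same atoms, as they would after the common
# moiety has been cut out of each compound.
names = ["A", "B"]
concat_kws = {"join": "outer"}  # example of extra keyword arguments for xr.concat
combined = xr.concat(
    [make_ensemble(2, 3, 4), make_ensemble(2, 3, 4)],
    dim=pd.Index(names, name="compound"),
    **concat_kws,
)
```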

Returns:

An xr.Dataset of trajectories, with a MultiIndex level identifying each trajectory by its compound name (or index, if no names were provided) and trajid

Raises:

ValueError – If the ensembles provided are in a mixture of formats (i.e. some have trajectories stacked, others unstacked)

Return type:

xarray.DataArray
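
The ValueError described above can be anticipated before calling combine_analogs by checking that all ensembles share one layout. The sketch below assumes that a stacked ensemble can be recognised by the presence of a ‘frames’ dimension; is_stacked and common_format are hypothetical helpers, not part of this module, and the ‘atom’ and ‘direction’ dimension names are likewise assumed.

```python
import numpy as np
import xarray as xr


def is_stacked(da):
    """Assumed criterion: stacked ensembles carry a 'frames' dimension."""
    return "frames" in da.dims


def common_format(ensembles):
    """Return 'stacked' or 'unstacked'; raise ValueError for mixed input."""
    ensembles = list(ensembles)
    if not ensembles:
        raise ValueError("no ensembles provided")
    flags = {is_stacked(e) for e in ensembles}
    if len(flags) != 1:
        raise ValueError("ensembles are in a mixture of stacked and unstacked formats")
    return "stacked" if flags.pop() else "unstacked"


unstacked = xr.DataArray(
    np.zeros((2, 3, 4, 3)),
    dims=("trajid", "time", "atom", "direction"),
    coords={"trajid": [0, 1], "time": [0.0, 0.5, 1.0]},
)
stacked = unstacked.stack(frames=("trajid", "time"))

# Mixing the two layouts triggers the error.
try:
    common_format([stacked, unstacked])
except ValueError as err:
    message = str(err)
```

A mixed input can be brought into one layout beforehand, for example by calling .stack(frames=("trajid", "time")) on the unstacked arrays as shown above.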