shnitsel.geo.geocalc

This module will contain two types of closely-related functionality: - generic geometry functions - functions that use RDKit to identify internal coordinates, and then the above to calculate the values Currently, the first category of function is found under postprocess

Functions

dnorm(a)

Calculate the norm along the direction dimension. All other dimensions are maintaned

dcross(a, b)

Generalized cross vector product in the dimension of direction.

ddot(a, b)

Dot product in the dimension of direction.

angle_(a, b)

Helper function to calculate the angle between the entries in a and b based on their coordinates in the direction dimension.

angle_cos_sin_(a, b)

returns the cosine and sine of the angle between two vectors

normal(a, b, c)

Calculate normal vectors on the planes through corresponding points in a, b and c.

dihedral_(a, b, c, d)

Function to calculate the limited (up to +-pi/2 radian) dihedral angle between the points in arrays a,b,c and d.

full_dihedral_(a, b, c, d)

Function to calculate the signed/full dihedral angle (up to +-pi radian) between the points in arrays a,b,c and d.

dihedral(atXYZ, i, j, k, l, *[, deg, full])

Calculate all dihedral angles between the atoms specified.

angle(atXYZ, i, j, k, *[, deg])

Method to calculate the angle between atoms i,j, and k in the positions DataArray throughout time.

distance(atXYZ, i, j)

Method to calculate the various distances between atoms i and j throughout time

_ndarray_of_tuples(da, dim)

_check_matches(matches_or_mol, atXYZ[, fn])

_positions(atXYZ, atom_idxs)

_assign_descriptor_coords(obj, atom_idxs, bond_idxs, ...)

_empty_results(atXYZ)

get_bond_lengths(atXYZ[, matches_or_mol])

Identify bonds (using RDKit) and find the length of each bond in each

get_bond_angles(atXYZ[, matches_or_mol, mol, ang])

Identify triples of bonded atoms (using RDKit) and calculate every bond angle for each

get_bond_torsions(atXYZ[, matches_or_mol, signed, ang])

Identify quadruples of bonded atoms (using RDKit) and calculate the corresponding proper bond torsion for each

get_bats(atXYZ[, matches_or_mol, signed, ang, pyr])

Get bond lengths, angles and torsions.

get_bla_chromophor(atXYZ[, matches_or_mol, mol, ang])

Identify triples of bonded atoms (using RDKit) and calculate every bond angle for each

identify_pyramids(mol)

Identify atoms with three bonds (using RDKit), specifically chain-internal atoms with

pyramid_(a, b, c, x)

Method to calculate the pyramidalization angle of a quadruple of atoms.

get_pyramids(atXYZ[, pyramid_idxs, mol, deg, signed])

Identify atoms with three bonds (using RDKit) and calculate the corresponding pyramidalization angles

center_geoms(atXYZ[, by_mass])

Helper function to set the center of the geometry (i.e. mean along the atom axis) to zero.

rotational_procrustes_(A, B[, weight])

Rotationally align the geometrie(s) in A to the single geometry in B.

rotational_procrustes(A, B[, dim0, dim1, weight])

Rotationally align the geometrie(s) in A to the single geometry in B.

kabsch(atXYZ[, reference_or_indexers])

Rotationally align the molecular geometries in atXYZ to a single molecular geometry

Module Contents

dnorm(a)

Calculate the norm along the direction dimension. All other dimensions are maintaned

Parameters:

a (xr.DataArray | xr.Variable) – The data array to perform the norming on

Returns:

Resulting dataarray after calculation of the norm in the dimension `direction.

Return type:

xr.DataArray | xr.Variable

dcross(a, b)

Generalized cross vector product in the dimension of direction.

Parameters:
  • a (xr.DataArray | xr.Variable) – The first array to use for the binary operation

  • b (xr.DataArray | xr.Variable) – The second array to use for the binary operation

Returns:

The resulting array of the cross-product

Return type:

xr.DataArray | xr.Variable

ddot(a, b)

Dot product in the dimension of direction.

Parameters:
  • a (xr.DataArray | xr.Variable) – The first array to use for the binary operation

  • b (xr.DataArray | xr.Variable) – The second array to use for the binary operation

Returns:

The resulting array of the dot-product still retaining all other dimensions except direction.

Return type:

xr.DataArray | xr.Variable

angle_(a, b)

Helper function to calculate the angle between the entries in a and b based on their coordinates in the direction dimension.

Parameters:
  • a (xr.DataArray | xr.Variable) – The first array to use for the binary operation

  • b (xr.DataArray | xr.Variable) – The second array to use for the binary operation

Returns:

The resulting array of the angle calculation still retaining all other dimensions except direction.

Return type:

xr.DataArray | xr.Variable

angle_cos_sin_(a, b)

returns the cosine and sine of the angle between two vectors

normal(a, b, c)

Calculate normal vectors on the planes through corresponding points in a, b and c.

The normal vector will be calculated based on the position in direction dimension.

Parameters:
  • a (xr.DataArray | xr.Variable) – The first array to use for the ternary operation

  • b (xr.DataArray | xr.Variable) – The second array to use for the ternary operation

  • c (xr.DataArray | xr.Variable) – The third array to use for the ternary operation

Returns:

An array with all dimensions equal to those of a, b, and c but holding normal vectors along the direction dimension.

Return type:

xr.DataArray | xr.Variable

dihedral_(a, b, c, d)

Function to calculate the limited (up to +-pi/2 radian) dihedral angle between the points in arrays a,b,c and d.

Parameters:
  • a (xr.DataArray | xr.Variable) – The first array to use for the operation

  • b (xr.DataArray | xr.Variable) – The second array to use for the operation

  • c (xr.DataArray | xr.Variable) – The third array to use for the operation

  • d (xr.DataArray | xr.Variable) – The fourth array to use for the operation

Returns:

The array of dihedral angels between the four input arrays.

Return type:

xr.DataArray | xr.Variable

full_dihedral_(a, b, c, d)

Function to calculate the signed/full dihedral angle (up to +-pi radian) between the points in arrays a,b,c and d.

Parameters:
  • a (xr.DataArray | xr.Variable) – The first array to use for the operation

  • b (xr.DataArray | xr.Variable) – The second array to use for the operation

  • c (xr.DataArray | xr.Variable) – The third array to use for the operation

  • d (xr.DataArray | xr.Variable) – The fourth array to use for the operation

Returns:

The array of full signed dihedral angels between the four input arrays.

Return type:

xr.DataArray | xr.Variable

dihedral(atXYZ, i, j, k, l, *, deg=False, full=False)

Calculate all dihedral angles between the atoms specified. The atoms specified needed be bonded.

Parameters:
  • atXYZ (shnitsel.core.typedefs.AtXYZ) – A DataArray of coordinates, with atom and direction dimensions

  • i (int) – The four atom indices

  • j (int) – The four atom indices

  • k (int) – The four atom indices

  • l (int) – The four atom indices

  • deg (bool) – Whether to return angles in degrees (True) or radians (False), by default False

  • full (bool) – Whether to return signed angles between -180° and 180° (True) or unsigned angles between 0 and 180° (False), by default False

Return type:

A DataArray containing dihedral angles

angle(atXYZ, i, j, k, *, deg=False)

Method to calculate the angle between atoms i,j, and k in the positions DataArray throughout time.

Can return results in radian (default) and degrees (if deg=True)

Parameters:
  • atXYZ (AtXYZ) – DataArray with positions

  • i (int) – Index of first atom

  • j (int) – Index of second atom

  • k (int) – Index of third atom

  • deg (bool, optional) – Flag whether the results should be in degrees instead of radian. Defaults to False.

Returns:

The resulting angles between the denoted atoms.

Return type:

xr.DataArray

distance(atXYZ, i, j)

Method to calculate the various distances between atoms i and j throughout time

Parameters:
  • atXYZ (AtXYZ) – Array with atom positions

  • i (int) – Index of the first atom

  • j (int) – Index of the second atom

Returns:

The resulting array holding the pairwise distance between i and j.

Return type:

xr.DataArray

_ndarray_of_tuples(da, dim)
_check_matches(matches_or_mol, atXYZ, fn=flag_bats)
_positions(atXYZ, atom_idxs)
_assign_descriptor_coords(obj, atom_idxs, bond_idxs, bond_types, fragment_objs, tex_pattern, per_atom_coords=True)
Parameters:
Return type:

xarray.DataArray

_empty_results(atXYZ)
get_bond_lengths(atXYZ, matches_or_mol=None)

Identify bonds (using RDKit) and find the length of each bond in each frame.

Parameters:
Return type:

An xarray.DataArray of bond lengths with dimensions frame and bond.

Raises:

UserWarning – If both matches and mol are specified.

get_bond_angles(atXYZ, matches_or_mol=None, mol=None, ang=False)

Identify triples of bonded atoms (using RDKit) and calculate every bond angle for each frame.

Parameters:
  • atXYZ (xarray.DataArray) – An xarray.DataArray of molecular coordinates, with dimensions frame, atom and direction

  • matches_or_mol (dict | rdkit.Chem.Mol | None) – A list containing information for each internal coordinate to be calculated. It may be convenient to use shnitsel.geo.geomatch.flag_angles() to create a dictionary in the correct format, and then customize it. Alternatively, you may supply an RDKit Mol object, which is passed to shnitsel.geo.geomatch.flag_bats(). If this argument is omitted altogether, an RDKit Mol object is generated using shnitsel.bridges.default_mol() and used as above.

  • optional – A list containing information for each internal coordinate to be calculated. It may be convenient to use shnitsel.geo.geomatch.flag_angles() to create a dictionary in the correct format, and then customize it. Alternatively, you may supply an RDKit Mol object, which is passed to shnitsel.geo.geomatch.flag_bats(). If this argument is omitted altogether, an RDKit Mol object is generated using shnitsel.bridges.default_mol() and used as above.

  • mol (rdkit.Chem.Mol | None) – An RDKit Mol object, which is generated from atXYZ if this argument is omitted.

  • optional – An RDKit Mol object, which is generated from atXYZ if this argument is omitted.

  • ang (Literal[False, 'deg', 'rad']) – If False (the default), returns sines and cosines; if set to ‘deg’, returns angles in degrees if set to ‘rad’, returns angles in radians

  • optional – If False (the default), returns sines and cosines; if set to ‘deg’, returns angles in degrees if set to ‘rad’, returns angles in radians

Return type:

An xarray.DataArray of bond angles with dimensions frame and angle.

Raises:

UserWarning – If both matches and mol are specified.

get_bond_torsions(atXYZ, matches_or_mol=None, signed=None, ang=False)

Identify quadruples of bonded atoms (using RDKit) and calculate the corresponding proper bond torsion for each frame.

Parameters:
  • atXYZ (xarray.DataArray) – An xarray.DataArray of molecular coordinates, with dimensions frame, atom and direction

  • matches_or_mol (dict | None) – A list containing information for each internal coordinate to be calculated. It may be convenient to use shnitsel.geo.geomatch.flag_dihedrals() to create a dictionary in the correct format, and then customize it. Alternatively, you may supply an RDKit Mol object, which is passed to shnitsel.geo.geomatch.flag_bats(). If this argument is omitted altogether, an RDKit Mol object is generated using shnitsel.bridges.default_mol() and used as above.

  • optional – A list containing information for each internal coordinate to be calculated. It may be convenient to use shnitsel.geo.geomatch.flag_dihedrals() to create a dictionary in the correct format, and then customize it. Alternatively, you may supply an RDKit Mol object, which is passed to shnitsel.geo.geomatch.flag_bats(). If this argument is omitted altogether, an RDKit Mol object is generated using shnitsel.bridges.default_mol() and used as above.

  • signed (bool | None) – Whether to distinguish between clockwise and anticlockwise rotation, when returning angles; by default, do not distinguish.

  • optional – Whether to distinguish between clockwise and anticlockwise rotation, when returning angles; by default, do not distinguish.

  • ang (Literal[False, 'deg', 'rad']) – If False (the default), returns sines and cosines; if set to ‘deg’, returns angles in degrees if set to ‘rad’, returns angles in radians

  • optional – If False (the default), returns sines and cosines; if set to ‘deg’, returns angles in degrees if set to ‘rad’, returns angles in radians

Return type:

An xarray.DataArray of bond torsions with dimensions frame and torsion.

get_bats(atXYZ, matches_or_mol=None, signed=None, ang=False, pyr=False)

Get bond lengths, angles and torsions.

Parameters:
  • atXYZ (xarray.DataArray) – The coordinates to use.

  • matches_or_mol (dict | rdkit.Chem.Mol | None) – An rdkit Mol object used to determine connectivity; by default this is determined automatically based on the first frame of atXYZ. Alternatively, a dictionary containing match information, as produced by one of the flag_* functions.

  • optional – An rdkit Mol object used to determine connectivity; by default this is determined automatically based on the first frame of atXYZ. Alternatively, a dictionary containing match information, as produced by one of the flag_* functions.

  • signed (bool | None) – Whether to distinguish between clockwise and anticlockwise rotation, when returning angles as opposed to cosine & sine values; by default, do not distinguish. NB. This applies only to the dihedrals, not to the three-center angles. The latter are always unsigned.

  • optional – Whether to distinguish between clockwise and anticlockwise rotation, when returning angles as opposed to cosine & sine values; by default, do not distinguish. NB. This applies only to the dihedrals, not to the three-center angles. The latter are always unsigned.

  • ang (Literal[False, 'deg', 'rad']) – If False (the default), returns sines and cosines; if set to ‘deg’, returns angles in degrees if set to ‘rad’, returns angles in radians

  • optional – If False (the default), returns sines and cosines; if set to ‘deg’, returns angles in degrees if set to ‘rad’, returns angles in radians

  • pyr – Whether to include pyramidalizations from shnitsel.core.geom.get_pyramids()

Return type:

An xarray.DataArray containing bond lengths, angles and tensions.

Examples

>>> import shnitsel as st
>>> from shnitsel.core import geom
>>> frames = st.open_frames('/tmp/A03_filtered.nc')
>>> geom.get_bats(frames['atXYZ'])
get_bla_chromophor(atXYZ, matches_or_mol=None, mol=None, ang=False)

Identify triples of bonded atoms (using RDKit) and calculate every bond angle for each frame.

Parameters:
Return type:

An xarray.DataArray of bond angles with dimensions frame and angle.

Raises:

UserWarning – If both matches and mol are specified.

identify_pyramids(mol)

Identify atoms with three bonds (using RDKit), specifically chain-internal atoms with a single carbon, and chain-terminal atoms with two carbons.

Each ‘pyramid’ consists of four atoms. Three of these (the “plane” atoms) are consecutive, and the fourth (the “bending” atom) is bonded to the middle atom of the plane atoms. The pyramidalization is the the angle between the plane of the plane atoms and the bond from the middle plane atom to the bending atom.

Two sorts of pyramids are currently handled: terminal and chain-internal.

  • Terminal pyramids are those where the central atom is bonded to two hydrogens and a single non-hydrogen; for these, the central atom and the hydrogens constitute the plane and the non-hydrogen becomes the bending atom.

  • Chain-internal pyramids are those where the central atom is bonded to non-hydrogens and a single hydrogen; for these, the central atom and the non-hydrogens constitute the plane and the hydrogen becomes the bending atom.

Parameters:

mol (rdkit.Chem.Mol) – An RDKit Mol object

Returns:

  • dict

    {

    x1: (a1, b1, c1), x2: (a2, b2, c2), …

    }

  • where

    • a, b, c are indices of contiguous atoms used to determine the plane

    • x is the index of an atom bonded to atom b, considered to be bending

    above and beneath the plane

Return type:

dict[int, list[int]]

pyramid_(a, b, c, x)

Method to calculate the pyramidalization angle of a quadruple of atoms.

The result will be pi/2 minus the angle between the normal of ABC and the vector BX. (I.e.: the pyramidalization at atom b)

Parameters:
  • a (xr.DataArray) – The first array to use for the operation

  • b (xr.DataArray) – The second array to use for the operation

  • c (xr.DataArray) – The third array to use for the operation

  • x (xr.DataArray) – The fourth array to use for the operation

Returns:

The array of pyramidalization angles

Return type:

xr.DataArray

get_pyramids(atXYZ, pyramid_idxs=None, mol=None, deg=False, signed=True)

Identify atoms with three bonds (using RDKit) and calculate the corresponding pyramidalization angles for each frame.

Each ‘pyramid’ consists of four atoms. Three of these (the “plane” atoms) are consecutive, and the fourth (the “bending” atom) is bonded to the middle atom of the plane atoms. The pyramidalization is the the angle between the plane of the plane atoms and the bond from the middle plane atom to the bending atom.

Two sorts of pyramids are currently handled: terminal and chain-internal.

  • Terminal pyramids are those where the central atom is bonded to two hydrogens and a single non-hydrogen; for these, the central atom and the hydrogens constitute the plane and the non-hydrogen becomes the bending atom.

  • Chain-internal pyramids are those where the central atom is bonded to non-hydrogens and a single hydrogen; for these, the central atom and the non-hydrogens constitute the plane and the hydrogen becomes the bending atom.

Parameters:
  • atXYZ (xarray.DataArray) – An xarray.DataArray of molecular coordinates, with dimensions frame, atom and direction

  • pyramid_idxs (dict[int, list[int]] | None) – A dictionary containing types of bonds as keys, and lists of atom index pair as values. It may be convenient to use shnitsel.core.geom.identify_pyramids() to create a dictionary in the correct format, and then customize it. If omitted, relevant sets of atoms are identified automatically based on the mol argument.

  • mol (rdkit.Chem.Mol | None) – An RDKit Mol object, which is generated from atXYZ if this argument is omitted.

  • deg (bool) – Whether to return angles in degrees (as opposed to radians), by default False

Return type:

An xarray.DataArray of pyramidalizations with dimensions frame and atom.

Raises:

UserWarning – If both pyramid_idxs and mol are specified.

center_geoms(atXYZ, by_mass=False)

Helper function to set the center of the geometry (i.e. mean along the atom axis) to zero.

Parameters:
  • atXYZ (AtXYZ) – Array of positional data

  • by_mass (Literal[False], optional) – Flag whether the centering/average should be center of mass or just plain average of positions. Defaults to False.

Raises:

NotImplementedError – Centering the COM instead of the mean is currently not implemented.

Returns:

Resulting positions after centering.

Return type:

AtXYZ

rotational_procrustes_(A, B, weight=None)

Rotationally align the geometrie(s) in A to the single geometry in B.

Parameters:
  • A (numpy.ndarray) – The geometries to process with shape (n_geometries, n_points, n_coordinates)

  • B (numpy.ndarray) – The reference geometry with shape (n_points, n_coordinates)

  • weight (Iterable[float] | None) – How much importance should be given to the alignment of each point, by default equal importance

  • optional – How much importance should be given to the alignment of each point, by default equal importance

Return type:

An array with the same shape as A

rotational_procrustes(A, B, dim0='atom', dim1='direction', weight=None)

Rotationally align the geometrie(s) in A to the single geometry in B.

Parameters:
  • A (xarray.DataArray) – The geometries to process

  • B (xarray.DataArray) – The reference geometry

  • dim0 (str) – The name of the dimension over points to be rotated; must be present in A and ``B`; by default ‘atom’

  • optional – The name of the dimension over points to be rotated; must be present in A and ``B`; by default ‘atom’

  • dim1 (str) – The name of the dimension over the coordinates of the aforementioned points; must be present in A and ``B`; by default ‘direction’

  • optional – The name of the dimension over the coordinates of the aforementioned points; must be present in A and ``B`; by default ‘direction’

  • weight (Iterable[float] | None) – How much importance should be given to the alignment of each point (atom), by default equal importance

  • optional – How much importance should be given to the alignment of each point (atom), by default equal importance

Return type:

An xr.DataArray with the same shape as A

kabsch(atXYZ, reference_or_indexers=None, **indexers_kwargs)

Rotationally align the molecular geometries in atXYZ to a single molecular geometry

Parameters:
  • atXYZ (xarray.DataArray) – The geometries to process (with dims ‘atom’, ‘direction’)

  • reference_or_indexers (xarray.DataArray | dict | None) – Either a reference geometry (with dims ‘atom’, ‘direction’) or an indexer dictionary which will be passed to atXYZ.sel()

  • optional – Either a reference geometry (with dims ‘atom’, ‘direction’) or an indexer dictionary which will be passed to atXYZ.sel()

  • **indexer_kwargs – The keyword-argument form of the indexer to be passed to atXYZ.sel()

Return type:

The aligned geometries

Raises:

ValueError – If nothing is done to indicate a reference geometry, i.e. neither reference_or_indexers nor indexer_kwargs are passed