shnitsel.bridges#

This submodule contains functions used to interface with other packages and programs, especially RDKit.

Attributes#

Functions#

to_xyz(da[, comment, units])

Convert an xr.DataArray of molecular geometry to an XYZ string

traj_to_xyz(traj_atXYZ[, units])

Convert an entire trajectory's worth of geometries to an XYZ string

to_mol(atXYZ_frame[, charge, covFactor, to2D, ...])

Convert a single frame's geometry to an RDKit Mol object

numbered_smiles_to_mol(smiles)

Convert a numbered SMILES string to an analogously numbered Mol object

_most_stable_frame(atXYZ, obj)

Find the frame, out of all the initial conditions, with the lowest ground-state energy

construct_default_mol(obj[, to2D, charge, ...])

Try many ways to get a representative Mol object for an ensemble:

smiles_map(atXYZ_frame[, charge, covFactor])

Convert a geometry to a SMILES-string, retaining atom order

Module Contents#

to_xyz(da, comment='#', units='angstrom')#

Convert an xr.DataArray of molecular geometry to an XYZ string

Parameters:
  • da (shnitsel.core.typedefs.AtXYZ) – A molecular geometry – should have dimensions ‘atom’ and ‘direction’

  • comment – The comment line for the XYZ, by default ‘#’

  • units – The units to which to convert before creating the XYZ string

Return type:

The XYZ data as a string

Notes

The units of the outputs will be the same as the array; consider converting to angstrom first, as most tools will expect this.
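For orientation, an XYZ string consists of the atom count, one comment line, and then one line per atom holding the element symbol and its Cartesian coordinates. The following is a minimal sketch of that layout built from plain Python lists. The names `frame_to_xyz` and `BOHR_TO_ANGSTROM` are hypothetical and not part of shnitsel, and the sketch makes no claim about how to_xyz is implemented internally.

```python
# Illustrative sketch only; not shnitsel's implementation.
BOHR_TO_ANGSTROM = 0.529177210903  # CODATA 2018 value


def frame_to_xyz(symbols, coords, comment="#", scale=1.0):
    """Build an XYZ-format string from element symbols and coordinates.

    `scale` multiplies every coordinate, e.g. BOHR_TO_ANGSTROM to write
    a geometry given in bohr out in angstrom.
    """
    if len(symbols) != len(coords):
        raise ValueError("need exactly one coordinate triple per atom")
    lines = [str(len(symbols)), comment]  # atom count, then comment line
    for sym, (x, y, z) in zip(symbols, coords):
        lines.append(
            f"{sym:<2} {x * scale:12.6f} {y * scale:12.6f} {z * scale:12.6f}"
        )
    return "\n".join(lines) + "\n"


# Hydrogen molecule with a bond length of 1.4 bohr, written out in angstrom
xyz = frame_to_xyz(
    ["H", "H"],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)],
    comment="H2",
    scale=BOHR_TO_ANGSTROM,
)
print(xyz)
```

Most viewers and conversion tools read the coordinates as angstrom, which is why the example applies the conversion factor before formatting.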

traj_to_xyz(traj_atXYZ, units='angstrom')#

Convert an entire trajectory’s worth of geometries to an XYZ string

Parameters:
  • traj_atXYZ (shnitsel.core.typedefs.AtXYZ) – Molecular geometries – should have dimensions ‘atom’ and ‘direction’; should also be groupable by ‘time’ (i.e. either have a ‘time’ dimension or a ‘time’ coordinate)

  • units – The units to which to convert before creating the XYZ string

Return type:

The XYZ data as a string, with time indicated in the comment line of each frame

Notes

The units of the outputs will be the same as the array; consider converting to angstrom first, as most tools will expect this.
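A multi-frame XYZ string is simply a sequence of single-frame blocks, each with its own atom count and comment line. The sketch below shows that structure with the time written into each comment line. The function name `frames_to_xyz` and the comment format `# t=...` are assumptions made for illustration; the exact comment text produced by traj_to_xyz may differ.

```python
# Illustrative sketch only; not shnitsel's implementation.
def frames_to_xyz(symbols, frames, times):
    """Concatenate several geometries into one multi-frame XYZ string."""
    if len(frames) != len(times):
        raise ValueError("need exactly one time value per frame")
    blocks = []
    for t, coords in zip(times, frames):
        # Every frame repeats the atom count and carries its own comment line
        lines = [str(len(symbols)), f"# t={t}"]
        lines += [
            f"{s} {x:.6f} {y:.6f} {z:.6f}"
            for s, (x, y, z) in zip(symbols, coords)
        ]
        blocks.append("\n".join(lines))
    return "\n".join(blocks) + "\n"


# Two snapshots of a stretching H2 molecule (coordinates in angstrom)
traj_xyz = frames_to_xyz(
    ["H", "H"],
    frames=[
        [(0.0, 0.0, 0.0), (0.0, 0.0, 0.74)],
        [(0.0, 0.0, 0.0), (0.0, 0.0, 0.80)],
    ],
    times=[0.0, 0.5],
)
print(traj_xyz)
```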

to_mol(atXYZ_frame, charge=None, covFactor=1.2, to2D=True, molAtomMapNumber=None, atomNote=None, atomLabel=None)#

Convert a single frame’s geometry to an RDKit Mol object

Parameters:
  • atXYZ_frame (shnitsel.core.typedefs.AtXYZ | xarray.Dataset | shnitsel.data.dataset_containers.shared.ShnitselDataset) – The xr.DataArray object to be converted; must have ‘atom’ and ‘direction’ dims, must not have ‘frame’ dim.

  • charge (int | None) – Charge of the molecule, used by RDKit to determine bond orders; if None (the default), this function will try charge=0 and leave the bond orders undetermined if that causes an error; otherwise failure to determine bond order will raise an error.

  • covFactor (float) – Scales the distance at which atoms are considered bonded, by default 1.2

  • to2D (bool) – Discard 3D information and generate 2D conformer (useful for displaying), by default True

  • molAtomMapNumber (list | Literal[True] | None) – Set the molAtomMapNumber properties to values provided in a list, or (if True is passed) set the properties to the respective atom indices

  • atomNote (list | Literal[True] | None) – Behaves like the molAtomMapNumber parameter above, but for the atomNote properties

  • atomLabel (list | Literal[True] | None) – Behaves like the molAtomMapNumber parameter above, but for the atomLabel properties

Return type:

An RDKit Mol object

Raises:

ValueError – If charge is not None and bond order determination fails
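To give a feel for what covFactor controls, the sketch below applies a common distance criterion: two atoms count as bonded when their separation does not exceed covFactor times the sum of their covalent radii. This is an assumption about the general technique, not a description of the code path to_mol takes. The helper `guess_bonds` and the small radii table are hypothetical, and the radii are approximate values in angstrom.

```python
import math
from itertools import combinations

# Approximate covalent radii in angstrom (illustrative values only)
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def guess_bonds(symbols, coords, cov_factor=1.2):
    """Return index pairs (i, j) of atoms that are close enough to be bonded."""
    bonds = []
    for i, j in combinations(range(len(symbols)), 2):
        cutoff = cov_factor * (
            COVALENT_RADII[symbols[i]] + COVALENT_RADII[symbols[j]]
        )
        if math.dist(coords[i], coords[j]) <= cutoff:
            bonds.append((i, j))
    return bonds


# Acetylene laid out along the z axis (angstrom): H2-C0#C1-H3
symbols = ["C", "C", "H", "H"]
coords = [(0, 0, 0.0), (0, 0, 1.20), (0, 0, -1.06), (0, 0, 2.26)]

print(guess_bonds(symbols, coords))                  # [(0, 1), (0, 2), (1, 3)]
print(guess_bonds(symbols, coords, cov_factor=0.5))  # []
```

A larger factor tolerates stretched bonds, which can matter for distorted geometries, at the cost of occasionally connecting atoms that are merely close.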

numbered_smiles_to_mol(smiles)#

Convert a numbered SMILES string to an analogously numbered Mol object

Parameters:

smiles (str) – A SMILES string in which each atom is associated with a mapping index, e.g. ‘[H:3][C:1]#[C:0][H:2]’

Returns:

An rdkit.Chem.Mol object with atom indices numbered according to the indices from the SMILES-string

Return type:

rdkit.Chem.Mol
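The relationship between the mapping indices and the resulting atom order can be illustrated without RDKit. The sketch below reads the map numbers from a SMILES string with a regular expression and derives the permutation that puts the atoms in map-number order. The name `atom_map_order` is hypothetical, and the pattern only handles simple bracket atoms such as those in the example (no isotopes, for instance). A permutation of this form is what RDKit's `Chem.RenumberAtoms` accepts as its `newOrder` argument, though whether numbered_smiles_to_mol works this way internally is an assumption.

```python
import re

# Matches simple bracket atoms such as [H:3] or [CH3:1]; illustrative only
ATOM_WITH_MAP = re.compile(r"\[([A-Za-z][a-z]?)[^\]:]*:(\d+)\]")


def atom_map_order(smiles):
    """Return (new_order, symbols) for an atom-mapped SMILES string.

    new_order[k] is the position in the SMILES of the atom whose map
    number is k; symbols lists the elements in map-number order.
    """
    atoms = [(sym, int(num)) for sym, num in ATOM_WITH_MAP.findall(smiles)]
    map_numbers = [num for _, num in atoms]
    if sorted(map_numbers) != list(range(len(atoms))):
        raise ValueError("map numbers must be a permutation of 0..n-1")
    new_order = sorted(range(len(atoms)), key=lambda i: map_numbers[i])
    return new_order, [atoms[i][0] for i in new_order]


new_order, ordered_symbols = atom_map_order("[H:3][C:1]#[C:0][H:2]")
print(new_order)        # [2, 1, 3, 0]
print(ordered_symbols)  # ['C', 'C', 'H', 'H']
```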

_most_stable_frame(atXYZ, obj)#

Find the frame, out of all the initial conditions, with the lowest ground-state energy; failing that, return the first frame in atXYZ

Parameters:

obj (xarray.Dataset | shnitsel.data.dataset_containers.shared.ShnitselDataset)

Return type:

xarray.DataArray
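The selection rule described above amounts to an argmin with a fallback. A minimal sketch on a plain list of energies follows; `most_stable_index` is a hypothetical name, and the real function operates on xarray objects rather than lists.

```python
# Illustrative sketch only; not shnitsel's implementation.
def most_stable_index(ground_state_energies):
    """Index of the lowest energy, or 0 when no energies are available."""
    if not ground_state_energies:  # covers both None and an empty list
        return 0
    return min(
        range(len(ground_state_energies)),
        key=ground_state_energies.__getitem__,
    )


print(most_stable_index([-1.0, -1.5, -1.2]))  # 1
print(most_stable_index(None))                # 0
```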

construct_default_mol(obj, to2D=True, charge=None, molAtomMapNumber=None, atomNote=None, atomLabel=None, silent_mode=False)#

Try many ways to get a representative Mol object for an ensemble:

  1. Use the mol attr (of either obj or obj['atXYZ']) directly

  2. Feed the smiles_map attr (of either obj or obj['atXYZ']) to shnitsel.bridges.default_mol()

  3. Take the geometry from the first frame of the molecule and the charge specified in the charge attr (charge=0 assumed if not specified) and feed these to shnitsel.bridges.to_mol()

Parameters:
  • obj (xr.Dataset | xr.DataArray | Trajectory | Frames | rc.Mol) – An ‘atXYZ’ xr.DataArray with molecular geometries or an xr.Dataset containing the above as one of its variables or an rc.Mol object that will just be returned.

  • to2D (bool, optional) – Discard 3D information and generate 2D conformer (useful for displaying), by default True

  • charge (int, float or None, optional) – Optional parameter to set the charge of the molecule if not present within the molecule data. If provided as an int, will be interpreted as number of elemental charges. Float will be converted to int and interpreted the same way. If not provided, will attempt to extract charge info from the xarray or Mol object and default to 0 charge if none can be found.

  • molAtomMapNumber (list[str] | Literal[True], optional) – Set the molAtomMapNumber properties to values provided in a list, or (if True is passed) set the properties to the respective atom indices

  • atomNote (list[str] | Literal[True], optional) – Behaves like the molAtomMapNumber parameter above, but for the atomNote properties

  • atomLabel (list[str] | Literal[True], optional) – Behaves like the molAtomMapNumber parameter above, but for the atomLabel properties

  • silent_mode (bool, optional) – Flag to disable logging outputs. Used for internal constructions of molecular structures. By default, status of mol construction is logged (i.e. silent_mode=False).

Return type:

An rdkit.Chem.Mol object

Raises:

ValueError – If the final approach fails

Notes

If this function uses an existing Mol object, it returns a copy. One consequence is that the decoration parameters molAtomMapNumber, atomNote and atomLabel do not affect the existing Mol object.
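The order of attempts and the copy-on-return behaviour can be sketched with plain dictionaries standing in for the attrs and for the Mol object. Everything here is hypothetical: `pick_representative` and the two builder callables are placeholders for illustration, not shnitsel functions.

```python
import copy


def pick_representative(attrs, build_from_smiles, build_from_geometry):
    """Try the stored object, then the mapped SMILES, then the geometry."""
    if "mol" in attrs:
        # Return a copy so later decoration cannot alter the stored object
        return copy.deepcopy(attrs["mol"])
    if "smiles_map" in attrs:
        return build_from_smiles(attrs["smiles_map"])
    # Last resort: build from geometry, assuming charge 0 if none is stored
    return build_from_geometry(attrs.get("charge", 0))


def from_smiles(smiles):
    return ("from_smiles", smiles)


def from_geometry(charge):
    return ("from_geometry", charge)


stored = {"labels": ["a", "b"]}  # stand-in for an existing Mol object
result = pick_representative({"mol": stored}, from_smiles, from_geometry)
result["labels"].append("c")     # decorating the copy ...
print(stored["labels"])          # ... leaves the original as ['a', 'b']

print(pick_representative({"smiles_map": "[C:0]"}, from_smiles, from_geometry))
print(pick_representative({}, from_smiles, from_geometry))
```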

smiles_map(atXYZ_frame, charge=0, covFactor=1.5)#

Convert a geometry to a SMILES-string, retaining atom order

Parameters:
  • atXYZ_frame – An xr.DataArray of molecular geometry

  • charge (optional) – The charge of the molecule, by default 0

  • covFactor (optional) – Scales the distance at which atoms are considered bonded, by default 1.5

Returns:

A SMILES-string in which the mapping number indicates the order in which the atoms appeared in the input matrix (e.g. ‘[H:3][C:1]#[C:0][H:2]’)

Return type:

str

default_mol#