shnitsel.geo.geomatch¶
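Attributes¶
st_yellow |
Functions¶
__match_pattern(mol, smarts) | Find all substructure matches of a SMARTS pattern in a molecule.
__get_bond_info(mol, flagged_tuples) | Extend flagged tuples of bonds, angles or dihedrals by a tuple of bond indices and their bond types.
__level_to_geo(flag_level) |
__get_color_atoms(d_flag, flag_level) |
__get_color_bonds(d_flag, flag_level) |
__collect_tuples(d_flag) | Select the appropriate list of tuples based on hierarchy: dihedrals -> angles -> bonds.
__get_highlight_molimg(mol, d_flag[, highlight_color, width, height]) | Convert a list of flagged atom tuples into bonds and atoms for highlighting.
__get_all_atoms(mol) | Return all atoms in the molecule as a dictionary with flags.
__get_atoms_by_indices(match_indices_list, d_atoms) | Set flags for atoms based on whether they are contained within substructure matches.
flag_atoms(mol[, smarts, t_idxs, draw]) | Flag atoms in a molecule based on substructure patterns or atom indices.
__get_all_bonds(mol) | Return all bonds in the molecule as a dictionary with flags.
__get_bonds_by_indices(match_indices_list, d_bonds) | Set flags for bonds based on whether they are contained within substructure matches.
flag_bonds(mol[, smarts, t_idxs, draw]) | Flag bonds in a molecule based on substructure patterns or atom indices.
__get_all_angles(mol) | Return all angles in the molecule as a dictionary with flags.
__get_angles_by_indices(match_list, d_angles) | Flag angles as active (1) or inactive (0) based on substructure matches.
flag_angles(mol[, smarts, t_idxs, draw]) | Flag molecule angles based on SMARTS patterns and/or atom indices.
__get_all_dihedrals(mol) | Return all dihedrals in the molecule as a dictionary with flags.
__get_dihedrals_by_indices(match_list, d_dihedrals) | Flag dihedrals as active (1) if all four atoms (i, j, k, l) belong to the same SMARTS match.
flag_dihedrals(mol[, smarts, t_idxs, draw]) | Flag dihedrals in a molecule based on SMARTS and/or atom indices.
flag_bats(mol[, smarts, t_idxs, draw]) | Compute and flag bonds, angles, and dihedrals in a single call.
flag_bats_multiple(mol[, l_smarts, l_t_idxs, draw]) | Compute and flag bonds, angles, and dihedrals in a single call for multiple structural features.
__get_img_multiple_mols(mol, d_multi_flag, l_patterns, l_levels) | Generate a single SVG image containing multiple copies of a molecule, each highlighted according to supplied atom/bond pattern levels.
__build_conjugated_smarts(n_double[, elems]) | Build a SMARTS pattern for a linear conjugated system with n_double alternating double bonds.
__match_bla_chromophor(mol[, smarts, n_double, elems]) | Detect conjugated chromophores defined either by SMARTS, number of alternating double bonds, or automatically by maximum extension.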
Module Contents¶
- st_yellow¶
- __match_pattern(mol, smarts)¶
Find all substructure matches of a SMARTS pattern in a molecule.
- __get_bond_info(mol, flagged_tuples)¶
Extend each flagged tuple of bonds, angles or dihedrals by a tuple of bond indices and a tuple of the corresponding bond types as floats (1.0: SINGLE, 1.5: AROMATIC, 2.0: DOUBLE, 3.0: TRIPLE).
- Parameters:
mol (RDKit Mol) – Molecule object.
flagged_tuples (list of tuples) – Each entry: (flag, atom_tuple) Example: [(1, (0,1,2,3)), (0, (1,2,3,4))]
- Returns:
flagged_tuples_binfo – Each entry: (flag, atom_tuple, bond_tuple, bondtype_tuple, rdkit.Chem.rdchem.Mol of the submolecule/subgraph). Example: [(0, (5, 0, 1), (6, 0), (1.0, 2.0), rdkit.Chem.rdchem.Mol object)]
- Return type:
list of tuples
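To illustrate the tuple layout described above, here is a minimal sketch that extends flagged atom tuples with bond indices and bond types. It uses a plain list of bonds instead of an RDKit molecule, so `bonds` and `extend_with_bond_info` are hypothetical stand-ins rather than shnitsel functions, and the submolecule element of each entry is omitted.

```python
def extend_with_bond_info(bonds, flagged_tuples):
    """Append (bond_tuple, bondtype_tuple) to each (flag, atom_tuple) entry.

    bonds: list of (atom_a, atom_b, order); the list position is taken
    as the bond index (an assumption of this sketch).
    """
    lookup = {
        frozenset((a, b)): (idx, order)
        for idx, (a, b, order) in enumerate(bonds)
    }
    extended = []
    for flag, atoms in flagged_tuples:
        # Consecutive atoms in a bond/angle/dihedral tuple are bonded.
        pairs = [lookup[frozenset(p)] for p in zip(atoms, atoms[1:])]
        bond_idxs = tuple(idx for idx, _ in pairs)
        bond_types = tuple(order for _, order in pairs)
        extended.append((flag, atoms, bond_idxs, bond_types))
    return extended


# Toy chain 0=1-2=3 with bond orders as floats.
bonds = [(0, 1, 2.0), (1, 2, 1.0), (2, 3, 2.0)]
result = extend_with_bond_info(bonds, [(1, (0, 1, 2)), (0, (1, 2, 3))])
```

In the real function the bond indices and types would come from the RDKit molecule; the sketch only mirrors the shape of the returned entries.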
- __level_to_geo(flag_level)¶
- __get_color_atoms(d_flag, flag_level)¶
- __get_color_bonds(d_flag, flag_level)¶
- __collect_tuples(d_flag)¶
Select the appropriate list of tuples based on hierarchy: dihedrals -> angles -> bonds.
- Each value is a list of tuples of the form:
(flag, atom_tuple, bond_tuple, bondtype_tuple, Mol)
- __get_highlight_molimg(mol, d_flag, highlight_color=st_yellow, width=300, height=300)¶
Convert a list of flagged atom tuples into bonds and atoms for highlighting.
- Parameters:
mol (RDKit Mol) – Molecule object.
d_flag (dict) – Keys: ‘atoms’, ‘bonds’, ‘angles’ or ‘dihedrals’. Values: lists of tuples. Each entry: (flag, atom_tuple, bond_tuple, bondtype_tuple, Mol). Example: [(1, (0, 1), (0,), (1.0,), Mol), (0, (1, 2), (1,), (2.0,), Mol)]
highlight_color (tuple, optional) – RGB tuple for highlighting bonds/atoms.
- Returns:
img – RDKit molecule image with highlighted atoms and bonds.
- Return type:
PIL.Image
- __get_all_atoms(mol)¶
Return all atoms in the molecule as a dictionary with flags.
All atoms are initially flagged as active (1).
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit molecule object.
- Returns:
Dictionary with key ‘atoms’ mapping to a list of tuples. Each tuple has the form (flag, (atom_idx, atom_symbol)), where:
- flag (int)
1 indicates an active atom.
- (atom_idx, atom_symbol) (tuple)
The atom index in the molecule and its atomic symbol (e.g., (0, ‘C’), (1, ‘H’)).
- Return type:
dict
- __get_atoms_by_indices(match_indices_list, d_atoms)¶
Set flags for atoms based on whether they are contained within substructure matches.
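Atoms are flagged as active (1) if they are also nodes in a matched substructure. All other atoms are flagged as inactive (0).
- Parameters:
match_indices_list (list of tuples) – Atom index tuples of the substructure matches.
d_atoms (dict) – Atoms dictionary with key ‘atoms’, in the form returned by __get_all_atoms.
- Returns:
Updated atoms dictionary with flags updated based on the substructure matches.
- Return type:
dict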
- flag_atoms(mol, smarts=None, t_idxs=(), draw=False)¶
Flag atoms in a molecule based on substructure patterns or atom indices.
- Atoms can be flagged in four ways:
No SMARTS or target indices provided: all atoms are returned as active.
SMARTS provided: atoms belonging to the substructure matches are active; others are inactive.
Target atom indices provided: only atoms in the provided tuple are active; others are inactive.
Both SMARTS and target indices provided: only the intersection of SMARTS matches and t_idxs are considered active. Warnings are issued if there is no overlap or partial overlap.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit molecule object.
smarts (str, optional) – SMARTS pattern to filter atoms. Default is None.
t_idxs (tuple, optional) – Tuple of atom indices to filter atoms. Default is empty tuple.
draw (bool, optional) – If True, returns an image highlighting the active atoms. Default is False.
- Returns:
If draw=False: dictionary with key ‘atoms’ mapping to a list of tuples (flag, (atom_idx, atom_symbol)). If draw=True: tuple of (atom dictionary, highlighted molecule image).
- Return type:
dict or tuple
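The four flagging modes can be sketched without RDKit as follows. This is an illustration under stated assumptions, not the library implementation: `flag_atoms_sketch` is a hypothetical name, SMARTS matching is replaced by a precomputed list of match tuples, and "partial overlap" is taken to mean that the intersection is smaller than the union of both selections.

```python
import warnings


def flag_atoms_sketch(atoms, matches=None, t_idxs=()):
    """atoms: list of (idx, symbol); matches: list of index tuples or None."""
    matched = {i for m in matches for i in m} if matches is not None else None
    target = set(t_idxs) if t_idxs else None

    if matched is None and target is None:
        # Mode 1: nothing to filter by, every atom stays active.
        active = {i for i, _ in atoms}
    elif target is None:
        # Mode 2: only SMARTS matches given.
        active = matched
    elif matched is None:
        # Mode 3: only target indices given.
        active = target
    else:
        # Mode 4: intersection of both selections, with warnings.
        active = matched & target
        if not active:
            warnings.warn("SMARTS matches and t_idxs do not overlap")
        elif active != matched | target:
            warnings.warn("SMARTS matches and t_idxs overlap only partially")

    return {"atoms": [(int(i in active), (i, s)) for i, s in atoms]}


atoms = [(0, "C"), (1, "C"), (2, "O"), (3, "H")]
all_active = flag_atoms_sketch(atoms)
by_smarts = flag_atoms_sketch(atoms, matches=[(0, 1)])
by_index = flag_atoms_sketch(atoms, t_idxs=(1, 2))
```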
- __get_all_bonds(mol)¶
Return all bonds in the molecule as a dictionary with flags.
All bonds are initially flagged as active (1).
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit molecule object.
- Returns:
Dictionary with key ‘bonds’ mapping to a list of tuples. Each tuple has the form (flag, (atom_idx1, atom_idx2)), where:
- flag (int)
1 indicates an active bond.
- (atom_idx1, atom_idx2) (tuple)
Pair of atom indices defining the bond.
- Return type:
dict
- __get_bonds_by_indices(match_indices_list, d_bonds)¶
Set flags for bonds based on whether they are contained within substructure matches.
Bonds are active (1) if both atoms belong to any of the matched substructures. All other bonds are flagged as inactive (0).
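- Parameters:
match_indices_list (list of tuples) – Atom index tuples of the substructure matches.
d_bonds (dict) – Bonds dictionary with key ‘bonds’, in the form returned by __get_all_bonds.
- Returns:
Updated bond dictionary with the same structure as d_bonds, but flags updated based on the substructure matches.
- Return type:
dict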
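As a rough sketch of this flagging rule, the snippet below marks a bond as active when both of its atoms occur in the same match tuple. That reading of "belong to any of the matched substructures" is an assumption, and `flag_bonds_by_matches` is a hypothetical name, not the private shnitsel helper.

```python
def flag_bonds_by_matches(match_indices_list, d_bonds):
    """Return a new bonds dictionary with flags reset from the matches."""
    match_sets = [set(match) for match in match_indices_list]
    updated = []
    for _old_flag, (a, b) in d_bonds["bonds"]:
        # Active only if a single match contains both bond atoms.
        active = any(a in s and b in s for s in match_sets)
        updated.append((1 if active else 0, (a, b)))
    return {"bonds": updated}


# All bonds start out active, as in __get_all_bonds.
d_bonds = {"bonds": [(1, (0, 1)), (1, (1, 2)), (1, (2, 3))]}
flagged = flag_bonds_by_matches([(0, 1, 2)], d_bonds)
```

The input dictionary is left untouched, which keeps the "all active" baseline reusable for several patterns.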
- flag_bonds(mol, smarts=None, t_idxs=(), draw=True)¶
Flag bonds in a molecule based on substructure patterns or atom indices.
- Bonds can be flagged in four ways:
No SMARTS or target indices provided: all bonds are returned as active.
SMARTS provided: bonds belonging to the substructure matches are active; others are inactive.
Target atom indices provided: bonds entirely contained in the atom index tuple are active; others are inactive.
Both SMARTS and target indices provided: only the intersection of SMARTS matches and t_idxs are considered active. Warnings are issued if there is no overlap or partial overlap.
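- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit molecule object.
smarts (str, optional) – SMARTS pattern to filter bonds. Default is None.
t_idxs (tuple, optional) – Tuple of atom indices to filter bonds. Default is empty tuple.
draw (bool, optional) – Flag for drawing the flagged bonds. Default is True.
- Returns:
Dictionary with key ‘bonds’ mapping to a list of tuples (flag, (atom_idx1, atom_idx2)).
- Return type:
dict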
- __get_all_angles(mol)¶
Return all angles in the molecule as a dictionary with flags.
Angles are defined as atom triples (i, j, k) where i–j and j–k are both bonded.
All angles are initially flagged as active (1).
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit molecule object.
- Returns:
Dictionary with key ‘angles’ mapping to a list of tuples: (flag, (i, j, k))
- Return type:
dict
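The angle definition above can be sketched from a plain bond list: every pair of neighbours of a central atom j gives one angle (i, j, k). This is an illustration with the hypothetical helper `all_angles`; the ordering of the triples in the actual function may differ.

```python
from itertools import combinations


def all_angles(bonds):
    """bonds: list of (a, b) atom index pairs. Returns {'angles': [...]}."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    angles = []
    for j in sorted(neighbors):
        # Each unordered pair of neighbours of j defines one angle i-j-k.
        for i, k in combinations(sorted(neighbors[j]), 2):
            angles.append((1, (i, j, k)))  # 1 = active
    return {"angles": angles}


# Atom 1 is bonded to atoms 0, 2 and 3.
d_angles = all_angles([(0, 1), (1, 2), (1, 3)])
```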
- __get_angles_by_indices(match_list, d_angles)¶
Flag angles as active (1) or inactive (0) based on substructure matches.
An angle (i, j, k) is active if all three indices belong to the same match tuple.
- flag_angles(mol, smarts=None, t_idxs=(), draw=False)¶
Flag molecule angles based on SMARTS patterns and/or atom indices.
Modes of operation¶
No SMARTS and no t_idxs: return all angles as active.
SMARTS only: angles part of any SMARTS match are active.
t_idxs only: only angles fully inside t_idxs are active.
- SMARTS + t_idxs:
If SMARTS fails: warn, return angles from t_idxs only.
If SMARTS matches but no overlap: warn, return angles from t_idxs.
If partial overlap: warn, return only angles in the intersection.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit molecule object.
smarts (str, optional) – SMARTS pattern representing the angle substructure.
t_idxs (tuple, optional) – Atom indices to filter angles by.
- Returns:
Dictionary with key ‘angles’ mapping to list of (flag, (i, j, k)). Active = 1, inactive = 0.
- Return type:
dict
- __get_all_dihedrals(mol)¶
Return all dihedrals in the molecule as a dictionary with flags.
- Dihedral quadruples are of the form (i, j, k, l) where:
i–j, j–k, and k–l are all bonds.
All dihedrals are initially flagged as active (1).
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit molecule object.
- Returns:
Dictionary with key ‘dihedrals’ mapping to a list of: (flag, (i, j, k, l))
- Return type:
dict
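A minimal sketch of the dihedral enumeration, again from a plain bond list rather than an RDKit molecule: each bond j–k is treated as the central bond, and every neighbour i of j is combined with every neighbour l of k. The helper name `all_dihedrals` and the choice to list each dihedral in one direction only are assumptions of this sketch.

```python
def all_dihedrals(bonds):
    """bonds: list of (a, b) atom index pairs. Returns {'dihedrals': [...]}."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    dihedrals = []
    for j, k in bonds:  # j-k is the central bond
        for i in sorted(neighbors[j] - {k}):
            for l in sorted(neighbors[k] - {j}):
                if i != l:  # skip three-membered rings
                    dihedrals.append((1, (i, j, k, l)))  # 1 = active
    return {"dihedrals": dihedrals}


chain = all_dihedrals([(0, 1), (1, 2), (2, 3)])
branched = all_dihedrals([(0, 1), (1, 2), (2, 3), (2, 4)])
```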
- __get_dihedrals_by_indices(match_list, d_dihedrals)¶
Flag dihedrals as active (1) if all four atoms (i, j, k, l) belong to the same SMARTS match.
- flag_dihedrals(mol, smarts=None, t_idxs=(), draw=False)¶
Flag dihedrals in a molecule based on SMARTS and/or atom indices.
Modes¶
No SMARTS + no t_idxs: return all dihedrals active
SMARTS only: dihedrals part of SMARTS matches are active
t_idxs only: dihedrals fully inside t_idxs are active
- SMARTS + t_idxs (intersection behavior):
No SMARTS match: return t_idxs only.
No overlap: return t_idxs only.
Overlap: return only intersecting dihedrals.
- Parameters:
mol (RDKit Mol) – Molecule under study.
smarts (str, optional) – SMARTS pattern.
t_idxs (tuple, optional) – Atom index tuple for filtering.
- Returns:
{‘dihedrals’: [(flag, (i,j,k,l)), …]}
- Return type:
dict
- flag_bats(mol, smarts=None, t_idxs=(), draw=False)¶
Compute and flag bonds, angles, and dihedrals in a single call, automatically determining which interactions can be filtered based on the size of the SMARTS pattern and/or the number of atom indices supplied in t_idxs.
Rules¶
- If SMARTS has:
2 atoms: only bonds can be filtered
3 atoms: bonds + angles can be filtered
>=4 atoms: bonds + angles + dihedrals can be filtered
- If t_idxs has:
len=2: only bonds
len=3: bonds + angles
len>=4: bonds + angles + dihedrals
If both SMARTS and t_idxs are provided, the maximal allowed degree is the minimum of the two limits.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – Molecule under study.
smarts (str, optional) – SMARTS pattern for filtering interactions.
t_idxs (tuple[int], optional) – Atom indices for filtering interactions.
draw (bool, optional) – Flag for returning the flagged rdkit.Chem.rdchem.Mol object.
- Returns:
dict – {‘bonds’: […], ‘angles’: […], ‘dihedrals’: […]}. Interaction types that cannot be filtered due to SMARTS/t_idxs size are returned fully active.
If draw is set: rdkit.Chem.rdchem.Mol object of the filtered features.
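The size rules for flag_bats can be condensed into a small helper. The sketch below is illustrative only: `filterable_levels` is a hypothetical name, it takes atom counts rather than a SMARTS string, and the handling of fewer than two atoms (nothing can be filtered) is an assumption not stated in the rules.

```python
LEVELS = ("bonds", "angles", "dihedrals")


def filterable_levels(n_smarts_atoms=None, n_t_idxs=None):
    """Return the interaction types that can be filtered.

    n_smarts_atoms: number of atoms in the SMARTS pattern, or None.
    n_t_idxs: number of atom indices in t_idxs, or None.
    """
    sizes = [n for n in (n_smarts_atoms, n_t_idxs) if n]
    if not sizes:
        # Neither filter supplied: nothing is filtered, all stay active.
        return ()
    # 2 atoms -> degree 1, 3 atoms -> degree 2, >=4 atoms -> degree 3;
    # with both inputs, the smaller limit wins.
    degree = min(min(n - 1, 3) for n in sizes)
    return LEVELS[:max(degree, 0)]


only_bonds = filterable_levels(n_smarts_atoms=2)
up_to_angles = filterable_levels(n_t_idxs=3)
limited = filterable_levels(n_smarts_atoms=6, n_t_idxs=3)
```

Interaction types missing from the returned tuple correspond to the ones that flag_bats returns fully active.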
- flag_bats_multiple(mol, l_smarts=None, l_t_idxs=(), draw=False)¶
Compute and flag bonds, angles, and dihedrals in a single call for multiple structural features.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – Molecule under study.
l_smarts (list[str], optional) – SMARTS patterns for filtering interactions.
l_t_idxs (list[tuple[int]], optional) – List of atom index tuples for filtering interactions.
draw (bool, optional) – Flag for returning the flagged rdkit.Chem.rdchem.Mol object.
- Returns:
dict – {‘bonds’: […], ‘angles’: […], ‘dihedrals’: […]}. Interaction types that cannot be filtered due to SMARTS/t_idxs size are returned fully active.
If draw is set: rdkit.Chem.rdchem.Mol object of the filtered features.
- Return type:
dict
- __get_img_multiple_mols(mol, d_multi_flag, l_patterns, l_levels)¶
Generate a single SVG image containing multiple copies of a molecule, each highlighted according to supplied atom/bond pattern levels.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object. The same molecule is drawn once for each entry in l_patterns / l_levels.
d_multi_flag (dict) –
Dictionary with mapping information on bonds, angles and torsions. Each entry: {‘bonds’: [(0, (5, 0, 1), (6, 0), (1.0, 2.0), rdkit.Chem.rdchem.Mol object)]}
Each entry contains the atom indices (2nd element) and bond indices (3rd element) needed by the helper functions __get_color_atoms() and __get_color_bonds() to extract the atoms and bonds to be highlighted.
l_patterns (Iterable) – A list of patterns (smarts or tuples of atom indices) indexing into d_multi_flag.
l_levels (Iterable) – List of highlight “levels” corresponding to l_patterns. Each level is passed to the highlight extraction helpers to control the highlight color intensity or style.
- Returns:
A grid image produced by rdkit.Chem.Draw.MolsToGridImage, containing all molecule renderings arranged in a single row. Each copy of the molecule is highlighted with its own atom and bond sets determined by the input patterns (smarts or atom indices).
- Return type:
PIL.Image.Image or IPython.display.SVG
- __build_conjugated_smarts(n_double, elems='#6,#7,#8,#15,#16')¶
Build a SMARTS pattern for a linear conjugated system with n_double alternating double bonds.
Example (n_double=2, elems='#6,#7'): [#6,#7]=[#6,#7]-[#6,#7]=[#6,#7]
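A pattern of this shape can be produced with simple string joining. The following is a sketch of one way to do it, not necessarily how the private helper is written; `build_conjugated_smarts` is a stand-in name and the ValueError for n_double < 1 is an added assumption.

```python
def build_conjugated_smarts(n_double, elems="#6,#7,#8,#15,#16"):
    """Build a linear SMARTS with n_double double bonds joined by single bonds."""
    if n_double < 1:
        raise ValueError("n_double must be at least 1")
    atom = f"[{elems}]"
    unit = f"{atom}={atom}"  # one double bond between two allowed atoms
    # Consecutive double-bond units are linked by single bonds.
    return "-".join([unit] * n_double)


pattern = build_conjugated_smarts(2, elems="#6,#7")
# pattern == "[#6,#7]=[#6,#7]-[#6,#7]=[#6,#7]"
```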
- __match_bla_chromophor(mol, smarts=None, n_double=None, elems='#6,#7,#8,#15,#16')¶
Detect conjugated chromophores defined either by SMARTS, number of alternating double bonds, or automatically by maximum extension.
Decision logic¶
If smarts is given → use SMARTS
If both smarts and n_double are given → validate consistency
If only n_double is given → generate SMARTS
If neither is given → search for maximum conjugated chromophore
- Parameters:
mol (rdkit.Chem.Mol) – Molecule to search.
smarts (str, optional) – Explicit SMARTS pattern defining the chromophore.
n_double (int, optional) – Number of alternating double bonds.
elems (str) – Allowed atoms in conjugation.
- Returns:
{“smarts”: SMARTS used, “n_double”: number of double bonds, “matches”: substructure matches}
- Return type:
dict