{ "cells": [ { "cell_type": "markdown", "id": "f19022e7", "metadata": {}, "source": [ "# General IO with Shnitsel tools\n", "\n", "## Reading input data\n", "\n", "Shnitsel tools is available in the ```shnitsel``` package.\n", "Within the ```shnitsel.io``` module, we offer the handy ```read()``` function to read in a multitude of different formats. \n", "Currently, we support the following file types:\n", "- SHARC outputs, both ICOND and TRAJ formats which have been tested for version 2.0, 2.1 and 3.0 outputs.\n", "- PyrAI2md outputs, reading of NACs and SOCs is currently still limited, but the reading has been tested on version 2.4 outputs and is expected to work well on version 2.5 outputs as well.\n", "- NewtonX outputs. Testing has been performed up to version 2.2\n", "We plan to support ASE database files soon.\n", "\n", "During the call to `read()`, all input data will be converted into standard units as documented by the `shnitsel-tools` package, e.g. times are converted to `fs`, lengths to `Bohr`, forces to `Hartree/Bohr` and energies to `Hartree`. This allows for standardized and comparable processing independent of the input format, where different unit conventions are common.\n", "\n", "To use Shnitsel tools, we import the shnitsel package:" ] }, { "cell_type": "code", "execution_count": 1, "id": "e703d6d2-8033-40b5-a267-7686dda80c7a", "metadata": {}, "outputs": [], "source": [ "import shnitsel as st" ] }, { "cell_type": "markdown", "id": "f25db3f7", "metadata": {}, "source": [ "The ```st.io.read()``` function handles all of the details of input of the different formats. \n", "Its only essential requirement is a ```path``` to the input that is supposed to be read.\n", "Here, we have multiple options:\n", "- ```path``` can point to the directory of a single trajectory, file or initial condition. Then only this one trajectory will be read." ] }, { "cell_type": "code", "execution_count": 2, "id": "6d736810", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
Trajectory(_raw_dataset=<xarray.Dataset> Size: 2kB\n",
       "Dimensions:                 (time: 1, state: 3, atom: 12, direction: 3,\n",
       "                             statecomb: 3)\n",
       "Coordinates: (12/18)\n",
       "  * time                    (time) float64 8B 0.0\n",
       "  * state                   (state) int64 24B 1 2 3\n",
       "  * atom                    (atom) int64 96B 0 1 2 3 4 5 6 7 8 9 10 11\n",
       "  * direction               (direction) <U1 12B 'x' 'y' 'z'\n",
       "  * statecomb               (statecomb) object 24B MultiIndex\n",
       "    state_names             (state) <U8 96B 'S0' 'S1' 'S2'\n",
       "    ...                      ...\n",
       "  * from                    (statecomb) int64 24B 1 1 2\n",
       "  * to                      (statecomb) int64 24B 2 3 3\n",
       "    delta_t                 float64 8B 0.0\n",
       "    max_ts                  int64 8B 1\n",
       "    t_max                   float64 8B 0.0\n",
       "    charge                  float64 8B 0.0\n",
       "Data variables:\n",
       "    energy                  (time, state) float32 12B -156.1 -155.8 -155.6\n",
       "    forces                  (time, state, atom, direction) float32 432B 0.010...\n",
       "    atXYZ                   (time, atom, direction) float32 144B 0.0 ... 1.656\n",
       "    nacs                    (time, statecomb, atom, direction) float32 432B -...\n",
       "    dip_perm                (time, state, direction) float32 36B -0.1828 ... ...\n",
       "    dip_trans               (time, statecomb, direction) float32 36B -2.591 ....\n",
       "Attributes: (12/13)\n",
       "    input_format:                 sharc\n",
       "    _shnitsel_setup_for_cleanup:  True\n",
       "    completed:                    True\n",
       "    input_type:                   static\n",
       "    input_format_version:         unknown\n",
       "    num_singlets:                 3\n",
       "    ...                           ...\n",
       "    num_triplets:                 0\n",
       "    has_forces:                   True\n",
       "    misc_input_settings:          {'QM.in': {'num_atoms': 12, 'atXYZ': array(...\n",
       "    trajectory_input_path:        test_data/sharc/iconds_butene/ICOND_00000\n",
       "    trajectory_id:                0\n",
       "    __mol:                        <rdkit.Chem.rdchem.Mol object at 0x7cc44ad6..., _is_multi_trajectory=False)
" ], "text/plain": [ "Trajectory(_raw_dataset= Size: 2kB\n", "Dimensions: (time: 1, state: 3, atom: 12, direction: 3,\n", " statecomb: 3)\n", "Coordinates: (12/18)\n", " * time (time) float64 8B 0.0\n", " * state (state) int64 24B 1 2 3\n", " * atom (atom) int64 96B 0 1 2 3 4 5 6 7 8 9 10 11\n", " * direction (direction) \n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<class 'shnitsel.data.tree.tree.ShnitselDBRoot[Trajectory]'> [{'level': 'ShnitselDBRoot', 'children': '1: {\\'pyrazene\\': \\'ABCMeta(_name=\\\\\\'pyrazene\\\\\\', _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=None, _children={\\\\\\'1\\\\\\': ABCMeta(_name=1, _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=Trajectory(_raw_dataset=<xarray.Dataset> Size: 451kB\\\\nDimensions:                 (time: 201, state: 7, atom: 10, direction: 3,\\\\n                             statecomb: 21)\\\\nCoordinates: (12/19)\\\\n  * time                    (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n  * state                   (state) int64 56B 1 2 3 4 5 6 7\\\\n  * atom                    (atom) int64 80B 0 1 2 3 4 5 6 7 8 9\\\\n  * direction               (direction) <U1 12B \\\\\\'x\\\\\\' \\\\\\'y\\\\\\' \\\\\\'z\\\\\\'\\\\n  * statecomb               (statecomb) object 168B MultiIndex\\\\n    state_names             (state) <U2 56B \\\\\\'S0\\\\\\' \\\\\\'S1\\\\\\' \\\\\\'S2\\\\\\' \\\\\\'S3\\\\\\' \\\\\\'S4\\\\\\' \\\\\\'S5\\\\\\' \\\\\\'S6\\\\\\'\\\\n    ...                      ...\\\\n  * from                    (statecomb) int64 168B 1 1 1 1 1 1 2 ... 4 4 4 5 5 6\\\\n  * to                      (statecomb) int64 168B 2 3 4 5 6 7 3 ... 5 6 7 6 7 7\\\\n    delta_t                 float64 8B 0.5\\\\n    max_ts                  int64 8B 201\\\\n    t_max                   float64 8B 201.0\\\\n    charge                  float64 8B 0.0\\\\nData variables:\\\\n    energy                  (time, state) float64 11kB -263.9 -263.8 ... -263.6\\\\n    e_kin                   (time) float64 2kB 0.06349 0.06314 ... 0.06262\\\\n    velocities              (time, atom, direction) float64 48kB 1.56e-06 ......\\\\n    forces                  (time, state, atom, direction) float64 338kB 0.0 ...\\\\n    atXYZ                   (time, atom, direction) float64 48kB 0.01591 ... ...\\\\nAttributes: (12/13)\\\\n    input_format:                 newtonx\\\\n    _shnitsel_setup_for_cleanup:  True\\\\n    completed:                    True\\\\n    input_type:                   dynamic\\\\n    input_format_version:         2.5\\\\n    num_singlets:                 7\\\\n    ...                           ...\\\\n    num_triplets:                 0\\\\n    has_forces:                   active_only\\\\n    misc_input_settings:          {\\\\\\'control.dyn\\\\\\': {\\\\\\'nat\\\\\\': 10, \\\\\\'nstat\\\\\\': 7, \\\\\\'ns...\\\\n    trajectory_input_path:        test_data/newtonx/test_pyrazene_v2.6/TRAJ1\\\\n    trajectory_id:                1\\\\n    __mol:                        <rdkit.Chem.rdchem.Mol object at 0x7cc44adc..., _is_multi_trajectory=False), _children={}, _attrs={}, _parent=..., _level_name=\\\\\\'DataLeaf\\\\\\'), \\\\\\'2\\\\\\': ABCMeta(_name=2, _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=Trajectory(_raw_dataset=<xarray.Dataset> Size: 451kB\\\\nDimensions:                 (time: 201, state: 7, atom: 10, direction: 3,\\\\n                             statecomb: 21)\\\\nCoordinates: (12/19)\\\\n  * time                    (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n  * state                   (state) int64 56B 1 2 3 4 5 6 7\\\\n  * atom                    (atom) int64 80B 0 1 2 3 4 5 6 7 8 9\\\\n  * direction               (direction) <U1 12B \\\\\\'x\\\\\\' \\\\\\'y\\\\\\' \\\\\\'z\\\\\\'\\\\n  * statecomb               (statecomb) object 168B MultiIndex\\\\n    state_names             (state) <U2 56B \\\\\\'S0\\\\\\' \\\\\\'S1\\\\\\' \\\\\\'S2\\\\\\' \\\\\\'S3\\\\\\' \\\\\\'S4\\\\\\' \\\\\\'S5\\\\\\' \\\\\\'S6\\\\\\'\\\\n    ...                      ...\\\\n  * from                    (statecomb) int64 168B 1 1 1 1 1 1 2 ... 4 4 4 5 5 6\\\\n  * to                      (statecomb) int64 168B 2 3 4 5 6 7 3 ... 5 6 7 6 7 7\\\\n    delta_t                 float64 8B 0.5\\\\n    max_ts                  int64 8B 201\\\\n    t_max                   float64 8B 201.0\\\\n    charge                  float64 8B 0.0\\\\nData variables:\\\\n    energy                  (time, state) float64 11kB -263.9 -263.8 ... -263.6\\\\n    e_kin                   (time) float64 2kB 0.04829 0.04631 ... 0.06287\\\\n    velocities              (time, atom, direction) float64 48kB -5.529e-05 ....\\\\n    forces                  (time, state, atom, direction) float64 338kB 0.0 ...\\\\n    atXYZ                   (time, atom, direction) float64 48kB -0.05122 ......\\\\nAttributes: (12/13)\\\\n    input_format:                 newtonx\\\\n    _shnitsel_setup_for_cleanup:  True\\\\n    completed:                    True\\\\n    input_type:                   dynamic\\\\n    input_format_version:         2.5\\\\n    num_singlets:                 7\\\\n    ...                           ...\\\\n    num_triplets:                 0\\\\n    has_forces:                   active_only\\\\n    misc_input_settings:          {\\\\\\'control.dyn\\\\\\': {\\\\\\'nat\\\\\\': 10, \\\\\\'nstat\\\\\\': 7, \\\\\\'ns...\\\\n    trajectory_input_path:        test_data/newtonx/test_pyrazene_v2.6/TRAJ2\\\\n    trajectory_id:                2\\\\n    __mol:                        <rdkit.Chem.rdchem.Mol object at 0x7cc44ada..., _is_multi_trajectory=False), _children={}, _attrs={}, _parent=..., _level_name=\\\\\\'DataLeaf\\\\\\'), \\\\\\'3\\\\\\': ABCMeta(_name=3, _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=Trajectory(_raw_dataset=<xarray.Dataset> Size: 451kB\\\\nDimensions:                 (time: 201, state: 7, atom: 10, direction: 3,\\\\n                             statecomb: 21)\\\\nCoordinates: (12/19)\\\\n  * time                    (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n  * state                   (state) int64 56B 1 2 3 4 5 6 7\\\\n  * atom                    (atom) int64 80B 0 1 2 3 4 5 6 7 8 9\\\\n  * direction               (direction) <U1 12B \\\\\\'x\\\\\\' \\\\\\'y\\\\\\' \\\\\\'z\\\\\\'\\\\n  * statecomb               (statecomb) object 168B MultiIndex\\\\n    state_names             (state) <U2 56B \\\\\\'S0\\\\\\' \\\\\\'S1\\\\\\' \\\\\\'S2\\\\\\' \\\\\\'S3\\\\\\' \\\\\\'S4\\\\\\' \\\\\\'S5\\\\\\' \\\\\\'S6\\\\\\'\\\\n    ...                      ...\\\\n  * from                    (statecomb) int64 168B 1 1 1 1 1 1 2 ... 4 4 4 5 5 6\\\\n  * to                      (statecomb) int64 168B 2 3 4 5 6 7 3 ... 5 6 7 6 7 7\\\\n    delta_t                 float64 8B 0.5\\\\n    max_ts                  int64 8B 201\\\\n    t_max                   float64 8B 201.0\\\\n    charge                  float64 8B 0.0\\\\nData variables:\\\\n    energy                  (time, state) float64 11kB -263.9 -263.8 ... -263.7\\\\n    e_kin                   (time) float64 2kB 0.03085 0.02938 ... 0.04899\\\\n    velocities              (time, atom, direction) float64 48kB -2.549e-05 ....\\\\n    forces                  (time, state, atom, direction) float64 338kB 0.0 ...\\\\n    atXYZ                   (time, atom, direction) float64 48kB 0.02058 ... ...\\\\nAttributes: (12/13)\\\\n    input_format:                 newtonx\\\\n    _shnitsel_setup_for_cleanup:  True\\\\n    completed:                    True\\\\n    input_type:                   dynamic\\\\n    input_format_version:         2.5\\\\n    num_singlets:                 7\\\\n    ...                           ...\\\\n    num_triplets:                 0\\\\n    has_forces:                   active_only\\\\n    misc_input_settings:          {\\\\\\'control.dyn\\\\\\': {\\\\\\'nat\\\\\\': 10, \\\\\\'nstat\\\\\\': 7, \\\\\\'ns...\\\\n    trajectory_input_path:        test_data/newtonx/test_pyrazene_v2.6/TRAJ3\\\\n    trajectory_id:                3\\\\n    __mol:                        <rdkit.Chem.rdchem.Mol object at 0x7cc44adc..., _is_multi_trajectory=False), _children={}, _attrs={}, _parent=..., _level_name=\\\\\\'DataLeaf\\\\\\'), \\\\\\'14\\\\\\': ABCMeta(_name=14, _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=Trajectory(_raw_dataset=<xarray.Dataset> Size: 451kB\\\\nDimensions:                 (time: 201, state: 7, atom: 10, direction: 3,\\\\n                             statecomb: 21)\\\\nCoordinates: (12/19)\\\\n  * time                    (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n  * state                   (state) int64 56B 1 2 3 4 5 6 7\\\\n  * atom                    (atom) int64 80B 0 1 2 3 4 5 6 7 8 9\\\\n  * direction               (direction) <U1 12B \\\\\\'x\\\\\\' \\\\\\'y\\\\\\' \\\\\\'z\\\\\\'\\\\n  * statecomb               (statecomb) object 168B MultiIndex\\\\n    state_names             (state) <U2 56B \\\\\\'S0\\\\\\' \\\\\\'S1\\\\\\' \\\\\\'S2\\\\\\' \\\\\\'S3\\\\\\' \\\\\\'S4\\\\\\' \\\\\\'S5\\\\\\' \\\\\\'S6\\\\\\'\\\\n    ...                      ...\\\\n  * from                    (statecomb) int64 168B 1 1 1 1 1 1 2 ... 4 4 4 5 5 6\\\\n  * to                      (statecomb) int64 168B 2 3 4 5 6 7 3 ... 5 6 7 6 7 7\\\\n    delta_t                 float64 8B 0.5\\\\n    max_ts                  int64 8B 201\\\\n    t_max                   float64 8B 201.0\\\\n    charge                  float64 8B 0.0\\\\nData variables:\\\\n    energy                  (time, state) float64 11kB -263.9 -263.8 ... -263.7\\\\n    e_kin                   (time) float64 2kB 0.03367 0.03294 ... 0.03612\\\\n    velocities              (time, atom, direction) float64 48kB 5.496e-05 .....\\\\n    forces                  (time, state, atom, direction) float64 338kB 0.0 ...\\\\n    atXYZ                   (time, atom, direction) float64 48kB -0.00611 ......\\\\nAttributes: (12/13)\\\\n    input_format:                 newtonx\\\\n    _shnitsel_setup_for_cleanup:  True\\\\n    completed:                    True\\\\n    input_type:                   dynamic\\\\n    input_format_version:         2.5\\\\n    num_singlets:                 7\\\\n    ...                           ...\\\\n    num_triplets:                 0\\\\n    has_forces:                   active_only\\\\n    misc_input_settings:          {\\\\\\'control.dyn\\\\\\': {\\\\\\'nat\\\\\\': 10, \\\\\\'nstat\\\\\\': 7, \\\\\\'ns...\\\\n    trajectory_input_path:        test_data/newtonx/test_pyrazene_v2.6/TRAJ14\\\\n    trajectory_id:                14\\\\n    __mol:                        <rdkit.Chem.rdchem.Mol object at 0x7cc44adc..., _is_multi_trajectory=False), _children={}, _attrs={}, _parent=..., _level_name=\\\\\\'DataLeaf\\\\\\'), \\\\\\'99\\\\\\': ABCMeta(_name=99, _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=Trajectory(_raw_dataset=<xarray.Dataset> Size: 451kB\\\\nDimensions:                 (time: 201, state: 7, atom: 10, direction: 3,\\\\n                             statecomb: 21)\\\\nCoordinates: (12/19)\\\\n  * time                    (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n  * state                   (state) int64 56B 1 2 3 4 5 6 7\\\\n  * atom                    (atom) int64 80B 0 1 2 3 4 5 6 7 8 9\\\\n  * direction               (direction) <U1 12B \\\\\\'x\\\\\\' \\\\\\'y\\\\\\' \\\\\\'z\\\\\\'\\\\n  * statecomb               (statecomb) object 168B MultiIndex\\\\n    state_names             (state) <U2 56B \\\\\\'S0\\\\\\' \\\\\\'S1\\\\\\' \\\\\\'S2\\\\\\' \\\\\\'S3\\\\\\' \\\\\\'S4\\\\\\' \\\\\\'S5\\\\\\' \\\\\\'S6\\\\\\'\\\\n    ...                      ...\\\\n  * from                    (statecomb) int64 168B 1 1 1 1 1 1 2 ... 4 4 4 5 5 6\\\\n  * to                      (statecomb) int64 168B 2 3 4 5 6 7 3 ... 5 6 7 6 7 7\\\\n    delta_t                 float64 8B 0.5\\\\n    max_ts                  int64 8B 201\\\\n    t_max                   float64 8B 201.0\\\\n    charge                  float64 8B 0.0\\\\nData variables:\\\\n    energy                  (time, state) float64 11kB -264.0 -263.8 ... -263.7\\\\n    e_kin                   (time) float64 2kB 0.03544 0.03565 ... 0.03024\\\\n    velocities              (time, atom, direction) float64 48kB -0.0001967 ....\\\\n    forces                  (time, state, atom, direction) float64 338kB 0.0 ...\\\\n    atXYZ                   (time, atom, direction) float64 48kB -0.1052 ... ...\\\\nAttributes: (12/13)\\\\n    input_format:                 newtonx\\\\n    _shnitsel_setup_for_cleanup:  True\\\\n    completed:                    True\\\\n    input_type:                   dynamic\\\\n    input_format_version:         2.5\\\\n    num_singlets:                 7\\\\n    ...                           ...\\\\n    num_triplets:                 0\\\\n    has_forces:                   active_only\\\\n    misc_input_settings:          {\\\\\\'control.dyn\\\\\\': {\\\\\\'nat\\\\\\': 10, \\\\\\'nstat\\\\\\': 7, \\\\\\'ns...\\\\n    trajectory_input_path:        test_data/newtonx/test_pyrazene_v2.6/TRAJ99\\\\n    trajectory_id:                99\\\\n    __mol:                        <rdkit.Chem.rdchem.Mol object at 0x7cc44ada..., _is_multi_trajectory=False), _children={}, _attrs={}, _parent=..., _level_name=\\\\\\'DataLeaf\\\\\\')}, _attrs={}, _parent=<class \\\\\\'shnitsel.data.tree.tree.ShnitselDBRoot[Trajectory]\\\\\\'> [{\\\\\\'level\\\\\\': \\\\\\'ShnitselDBRoot\\\\\\', \\\\\\'children\\\\\\': "1: {\\\\\\'pyrazene\\\\\\': \\\\\\'...\\\\\\'}"}], _level_name=\\\\\\'CompoundGroup\\\\\\', _group_info=None)\\'}'}]
" ], "text/plain": [ " [{'level': 'ShnitselDBRoot', 'children': '1: {\\'pyrazene\\': \\'ABCMeta(_name=\\\\\\'pyrazene\\\\\\', _dtype=, _data=None, _children={\\\\\\'1\\\\\\': ABCMeta(_name=1, _dtype=, _data=Trajectory(_raw_dataset= Size: 451kB\\\\nDimensions: (time: 201, state: 7, atom: 10, direction: 3,\\\\n statecomb: 21)\\\\nCoordinates: (12/19)\\\\n * time (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n * state (state) int64 56B 1 2 3 4 5 6 7\\\\n * atom (atom) int64 80B 0 1 2 3 4 5 6 7 8 9\\\\n * direction (direction) , _data=Trajectory(_raw_dataset= Size: 451kB\\\\nDimensions: (time: 201, state: 7, atom: 10, direction: 3,\\\\n statecomb: 21)\\\\nCoordinates: (12/19)\\\\n * time (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n * state (state) int64 56B 1 2 3 4 5 6 7\\\\n * atom (atom) int64 80B 0 1 2 3 4 5 6 7 8 9\\\\n * direction (direction) , _data=Trajectory(_raw_dataset= Size: 451kB\\\\nDimensions: (time: 201, state: 7, atom: 10, direction: 3,\\\\n statecomb: 21)\\\\nCoordinates: (12/19)\\\\n * time (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n * state (state) int64 56B 1 2 3 4 5 6 7\\\\n * atom (atom) int64 80B 0 1 2 3 4 5 6 7 8 9\\\\n * direction (direction) , _data=Trajectory(_raw_dataset= Size: 451kB\\\\nDimensions: (time: 201, state: 7, atom: 10, direction: 3,\\\\n statecomb: 21)\\\\nCoordinates: (12/19)\\\\n * time (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n * state (state) int64 56B 1 2 3 4 5 6 7\\\\n * atom (atom) int64 80B 0 1 2 3 4 5 6 7 8 9\\\\n * direction (direction) , _data=Trajectory(_raw_dataset= Size: 451kB\\\\nDimensions: (time: 201, state: 7, atom: 10, direction: 3,\\\\n statecomb: 21)\\\\nCoordinates: (12/19)\\\\n * time (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n * state (state) int64 56B 1 2 3 4 5 6 7\\\\n * atom (atom) int64 80B 0 1 2 3 4 5 6 7 8 9\\\\n * direction (direction) [{\\\\\\'level\\\\\\': \\\\\\'ShnitselDBRoot\\\\\\', \\\\\\'children\\\\\\': \"1: {\\\\\\'pyrazene\\\\\\': \\\\\\'...\\\\\\'}\"}], _level_name=\\\\\\'CompoundGroup\\\\\\', _group_info=None)\\'}'}]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# This call loads all initial conditions within the `./test_data/sharc/iconds_butene` directory, because the directory itself is not a trajectory.\n", "iconds_butene = st.io.read(path='./test_data/sharc/iconds_butene')\n", "\n", "# This call loads all trajectories within `./test_data/sharc/traj_butene_v2.1` and combines them into a single object\n", "# Here, we also set the compound name\n", "trajectories_butene_sharc = st.io.read(path='./test_data/sharc/traj_butene_v2.1/').set_compound_info('butene')\n", "\n", "# We can similarly load a collection of newtonx trajectories\n", "# We again explicitly set the compound name\n", "trajectories_pyrazene_newtonx = st.io.read(path='./test_data/newtonx/test_pyrazene_v2.6').set_compound_info('pyrazene')\n", "trajectories_pyrazene_newtonx" ] }, { "cell_type": "markdown", "id": "21993b7e", "metadata": {}, "source": [ "When loading various data structures, from different formats, Shnitsel tools performs some sophisticated type-autodetection to apply the correct import logic. \n", "However, if the detection fails, because it detects multiple fitting formats, the ```read()``` function offers the parameter ```kind```, which can be set to ```sharc```,```newtonx```, or ```pyrai2md``` to make it pick the specified input format without performing type detection:" ] }, { "cell_type": "code", "execution_count": 4, "id": "d3206aa3", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "9d973c717fea4bc2b143b1289db90c9d", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/2 [00:00\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<class 'shnitsel.data.tree.tree.ShnitselDBRoot[Trajectory]'> [{'level': 'ShnitselDBRoot', 'children': '1: {\\'I01\\': \\'ABCMeta(_name=\\\\\\'I01\\\\\\', _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=None, _children={\\\\\\'2\\\\\\': ABCMeta(_name=2, _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=Trajectory(_raw_dataset=<xarray.Dataset> Size: 141kB\\\\nDimensions:                 (time: 201, state: 3, atom: 6, direction: 3,\\\\n                             statecomb: 3, full_statecomb: 6)\\\\nCoordinates: (12/23)\\\\n  * time                    (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n  * state                   (state) int64 24B 1 2 3\\\\n  * atom                    (atom) int64 48B 0 1 2 3 4 5\\\\n  * direction               (direction) <U1 12B \\\\\\'x\\\\\\' \\\\\\'y\\\\\\' \\\\\\'z\\\\\\'\\\\n  * statecomb               (statecomb) object 24B MultiIndex\\\\n  * full_statecomb          (full_statecomb) object 48B MultiIndex\\\\n    ...                      ...\\\\n  * full_statecomb_from     (full_statecomb) int64 48B 1 1 2 2 3 3\\\\n  * full_statecomb_to       (full_statecomb) int64 48B 2 3 1 3 1 2\\\\n    delta_t                 float64 8B 0.5\\\\n    max_ts                  int64 8B 201\\\\n    t_max                   float64 8B 100.0\\\\n    charge                  float64 8B 1.0\\\\nData variables:\\\\n    energy                  (time, state) float32 2kB -94.69 -94.38 ... -94.34\\\\n    forces                  (time, state, atom, direction) float32 43kB 0.091...\\\\n    atXYZ                   (time, atom, direction) float32 14kB 0.02999 ... ...\\\\n    nacs                    (time, statecomb, atom, direction) float32 43kB 0...\\\\n    dip_perm                (time, state, direction) float32 7kB -0.002852 .....\\\\n    dip_trans               (time, statecomb, direction) float32 7kB 0.001566...\\\\n    socs                    (time, full_statecomb) complex128 19kB 0j 0j ... 0j\\\\nAttributes: (12/13)\\\\n    input_format:                 sharc\\\\n    _shnitsel_setup_for_cleanup:  True\\\\n    completed:                    True\\\\n    input_type:                   dynamic\\\\n    input_format_version:         2.0\\\\n    num_singlets:                 3\\\\n    ...                           ...\\\\n    num_triplets:                 0\\\\n    has_forces:                   True\\\\n    misc_input_settings:          {\\\\\\'input\\\\\\': {\\\\\\'printlevel\\\\\\': \\\\\\'2\\\\\\', \\\\\\'geomfile\\\\\\': \\\\\\'...\\\\n    trajectory_input_path:        test_data/sharc/traj_I01_v2.0/TRAJ_00002\\\\n    trajectory_id:                2\\\\n    __mol:                        <rdkit.Chem.rdchem.Mol object at 0x7cc49b51..., _is_multi_trajectory=False), _children={}, _attrs={}, _parent=..., _level_name=\\\\\\'DataLeaf\\\\\\'), \\\\\\'6\\\\\\': ABCMeta(_name=6, _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=Trajectory(_raw_dataset=<xarray.Dataset> Size: 141kB\\\\nDimensions:                 (time: 201, state: 3, atom: 6, direction: 3,\\\\n                             statecomb: 3, full_statecomb: 6)\\\\nCoordinates: (12/23)\\\\n  * time                    (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n  * state                   (state) int64 24B 1 2 3\\\\n  * atom                    (atom) int64 48B 0 1 2 3 4 5\\\\n  * direction               (direction) <U1 12B \\\\\\'x\\\\\\' \\\\\\'y\\\\\\' \\\\\\'z\\\\\\'\\\\n  * statecomb               (statecomb) object 24B MultiIndex\\\\n  * full_statecomb          (full_statecomb) object 48B MultiIndex\\\\n    ...                      ...\\\\n  * full_statecomb_from     (full_statecomb) int64 48B 1 1 2 2 3 3\\\\n  * full_statecomb_to       (full_statecomb) int64 48B 2 3 1 3 1 2\\\\n    delta_t                 float64 8B 0.5\\\\n    max_ts                  int64 8B 201\\\\n    t_max                   float64 8B 100.0\\\\n    charge                  float64 8B 1.0\\\\nData variables:\\\\n    energy                  (time, state) float32 2kB -94.66 -94.37 ... -94.32\\\\n    forces                  (time, state, atom, direction) float32 43kB 0.007...\\\\n    atXYZ                   (time, atom, direction) float32 14kB 0.01689 ... ...\\\\n    nacs                    (time, statecomb, atom, direction) float32 43kB -...\\\\n    dip_perm                (time, state, direction) float32 7kB 0.4447 ... 2...\\\\n    dip_trans               (time, statecomb, direction) float32 7kB -0.05107...\\\\n    socs                    (time, full_statecomb) complex128 19kB 0j 0j ... 0j\\\\nAttributes: (12/13)\\\\n    input_format:                 sharc\\\\n    _shnitsel_setup_for_cleanup:  True\\\\n    completed:                    True\\\\n    input_type:                   dynamic\\\\n    input_format_version:         2.0\\\\n    num_singlets:                 3\\\\n    ...                           ...\\\\n    num_triplets:                 0\\\\n    has_forces:                   True\\\\n    misc_input_settings:          {\\\\\\'input\\\\\\': {\\\\\\'printlevel\\\\\\': \\\\\\'2\\\\\\', \\\\\\'geomfile\\\\\\': \\\\\\'...\\\\n    trajectory_input_path:        test_data/sharc/traj_I01_v2.0/TRAJ_00006\\\\n    trajectory_id:                6\\\\n    __mol:                        <rdkit.Chem.rdchem.Mol object at 0x7cc49b51..., _is_multi_trajectory=False), _children={}, _attrs={}, _parent=..., _level_name=\\\\\\'DataLeaf\\\\\\')}, _attrs={}, _parent=<class \\\\\\'shnitsel.data.tree.tree.ShnitselDBRoot[Trajectory]\\\\\\'> [{\\\\\\'level\\\\\\': \\\\\\'ShnitselDBRoot\\\\\\', \\\\\\'children\\\\\\': "1: {\\\\\\'I01\\\\\\': \\\\\\'...\\\\\\'}"}], _level_name=\\\\\\'CompoundGroup\\\\\\', _group_info=None)\\'}'}]
" ], "text/plain": [ " [{'level': 'ShnitselDBRoot', 'children': '1: {\\'I01\\': \\'ABCMeta(_name=\\\\\\'I01\\\\\\', _dtype=, _data=None, _children={\\\\\\'2\\\\\\': ABCMeta(_name=2, _dtype=, _data=Trajectory(_raw_dataset= Size: 141kB\\\\nDimensions: (time: 201, state: 3, atom: 6, direction: 3,\\\\n statecomb: 3, full_statecomb: 6)\\\\nCoordinates: (12/23)\\\\n * time (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n * state (state) int64 24B 1 2 3\\\\n * atom (atom) int64 48B 0 1 2 3 4 5\\\\n * direction (direction) , _data=Trajectory(_raw_dataset= Size: 141kB\\\\nDimensions: (time: 201, state: 3, atom: 6, direction: 3,\\\\n statecomb: 3, full_statecomb: 6)\\\\nCoordinates: (12/23)\\\\n * time (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n * state (state) int64 24B 1 2 3\\\\n * atom (atom) int64 48B 0 1 2 3 4 5\\\\n * direction (direction) [{\\\\\\'level\\\\\\': \\\\\\'ShnitselDBRoot\\\\\\', \\\\\\'children\\\\\\': \"1: {\\\\\\'I01\\\\\\': \\\\\\'...\\\\\\'}\"}], _level_name=\\\\\\'CompoundGroup\\\\\\', _group_info=None)\\'}'}]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from shnitsel.data.dataset_containers.trajectory import Trajectory\n", "from shnitsel.data.tree.tree import ShnitselDB\n", "\n", "\n", "# The `read()` call will automatically attempt to guess the charge of our molecule\n", "traj_I01:ShnitselDB[Trajectory] = st.io.read(\n", " path='./test_data/sharc/traj_I01_v2.0', kind='sharc',\n", " expect_dtype=Trajectory\n", ").set_compound_info('I01')\n", "\n", "# If the guess of the charge was not successful or does not meet our expectations, \n", "# you can set the charge with `set_charge()` explicitly.\n", "traj_I01 = traj_I01.set_charge(1.)\n", "traj_I01" ] }, { "cell_type": "markdown", "id": "94239330", "metadata": {}, "source": [ "### Advanced `read()` configuration\n", "\n", "While ```st.io.read()``` will generally try and configure the parsing routines in a way that should fit most userst, there are several settings that can be adapted:\n", "- ```sub_pattern```: This is a path pattern to match files or subdirectory names in the ```path``` location to. Let us assume, we have ```TRAJ_0,TRAJ_1,...TRAJ_99,``` and ```ICOND_0,...ICOND_4``` directories within the same folder, but we only want to load ```ICOND_*``` files. Then we can set ```sub_pattern='ICOND_*'``` and ```read()``` will only consider the matching entries in `path`.\n", "- `input_units`: This is an optional setting to determine the units in which different input variables are provided, e.g. `atXYZ` is the variable name for positional data, `forces` is the variable name for forces and `energy` is the variable name for absolute system energy. The units are provided in a `{variable_name:'unit_string'}` fashion, where the variable names can be taken from the declaration of the standard Shnitsel format and the units values for different unit types can be found in `shnitsel.units.definitions`, \n", "- `input_state_types` and `input_state_names`: For various formats, it is not entirely possible to derive the multiplicity/type of states and a reasonable readable state name label from the trajectory format. `read()` will make a best effort to apply reasonable state names and state multiplicities based on the input information, but if the resulting state types (or names) are not according to your expectations, you can set them by providing either a list of types or names to the `input_state_types` and `input_state_names` or by providing a function in either of those parameters that takes the already parsed `xarray.Dataset` and can set the `ds.state_types` or `ds.state_names` values based on additional information from the dataset. Note that shnitsel tools will always apply state types first before setting state names and it will use the values of the settings `input_state_types` and `input_state_names` in this order as well, if they are provided.\n", "- ```concat_method='db'```: When multiple trajectories are loaded in parallel, they need to be combined into a single result. The default behavior is to pack them into a ShnitselDB structure (based on `xarray.DataTree`) to preserve all meta-data of the individual trajectories, allowing for advanced filtering and storage according to the FAIR principles. \n", "Alternatives are \n", "- - `'concat'`, which attaches the trajctories along the `time` dimension, which is then referred to as `frame`, effectively building one long, concatenated trajectory. This will lose insights into different meta-data of the concatenated trajectories.\n", "- - `'layers'`, which packs all trajectories into a shared `xarray.Dataset` object that has a new `trajid` dimension (meaning: `trajectory id`), along which the different trajectories can be indexed. As in `'concat'`, we will lose individual differences in metadata on the combined object compared to the individual trajectories.\n", "- - `'list'`, which simply returnes the trajectories as a list of `xarray.Dataset` objects.\n", "- `input_trajectory_id_maps`: This option allows for controlled setting of trajectory ids for individual datasets. If a dict is provided here, the keys are meant to be absolute paths in posix-format (specifically of type `str`) with the associated values being the resulting ids. Alternatively, a function can be provided that receives the `pathlib.Path` object of the input path of the trajectory and is supposed to yield an integer id that is distinct from all other trajectories loaded with the same call to `read()`\n", " \n", "### More techincal `read()` configuration options\n", "\n", "- ```multiple=True```: This flag determines whether `read()` should automatically look for entries within the `path` folder to load as input datasets. If set to `False`, only `path` will be attempted to be loaded as a single dataset.\n", "- `parallel=True`: A flag that by default allows parallel processing of input in multiple child processes to accelerate multi-trajectory input.\n", "- `error_reporting='log'`: A setting to determine whether errors should be thrown as an exception or just logged within the read routine. If parallel reading is enabled, only `log` is supported. \n" ] }, { "cell_type": "markdown", "id": "4bb90d01", "metadata": {}, "source": [ "## Storing data on disk in Shnitsel/NetCDF format\n", "\n", "Within the `shnitsel.io` module, we provide the function `write_shnitsel_file(dataset, savepath)` to write Shnitsel-format files using the NetCDF4 format. \n", "Writing is performed using the `to_netcdf()` functionality provided by the `xarray` package for its datatypes, which can be controlled using the option `complevel=9`, which sets the compression level on a scale of `0` (not compressed) to `9` (maximum compression).\n", "Higher `complevel` values result in smaller output files but take longer time to compute. \n", "Lower compression levels are therefor preferable if writing latency is relevant to the application." ] }, { "cell_type": "code", "execution_count": 7, "id": "7826447a", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "b47e7423c59d44d0a4371f0757279d91", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/4 [00:00\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<class 'shnitsel.data.tree.tree.ShnitselDBRoot[Frames]'> [{'level': 'ShnitselDBRoot', 'children': '1: {\\'C-N-H-H-H-H\\': \\'ABCMeta(_name=\\\\\\'C-N-H-H-H-H\\\\\\', _dtype=<class \\\\\\'shnitsel.data.dataset_containers.frames.Frames\\\\\\'>, _data=None, _children={\\\\\\'0\\\\\\': ABCMeta(_name=\\\\\\'0\\\\\\', _dtype=<class \\\\\\'shnitsel.data.dataset_containers.frames.Frames\\\\\\'>, _data=None, _children={\\\\\\'0\\\\\\': ABCMeta(_name=\\\\\\'0\\\\\\', _dtype=<class \\\\\\'shnitsel.data.dataset_containers.frames.Frames\\\\\\'>, _data=Frames(_raw_dataset=<xarray.Dataset> Size: 166kB\\\\nDimensions:              (frame: 99, state: 3, full_statecomb: 6, atom: 6,\\\\n                          direction: 3, statecomb: 3)\\\\nCoordinates: (12/18)\\\\n  * frame                (frame) int64 792B 0 1 2 3 4 5 6 ... 93 94 95 96 97 98\\\\n  * state                (state) int64 24B 1 2 3\\\\n  * full_statecomb       (full_statecomb) object 48B MultiIndex\\\\n  * direction            (direction) <U1 12B \\\\\\'x\\\\\\' \\\\\\'y\\\\\\' \\\\\\'z\\\\\\'\\\\n  * statecomb            (statecomb) object 24B MultiIndex\\\\n    atNames              (atom) <U1 24B \\\\\\'C\\\\\\' \\\\\\'N\\\\\\' \\\\\\'H\\\\\\' \\\\\\'H\\\\\\' \\\\\\'H\\\\\\' \\\\\\'H\\\\\\'\\\\n    ...                   ...\\\\n    state_names          (state) <U8 96B \\\\\\'S\\\\\\' \\\\\\'S\\\\\\' \\\\\\'S\\\\\\'\\\\n  * from                 (statecomb) int64 24B 1 1 2\\\\n  * to                   (statecomb) int64 24B 2 3 3\\\\n  * full_statecomb_from  (full_statecomb) int64 48B 1 1 2 2 3 3\\\\n  * full_statecomb_to    (full_statecomb) int64 48B 2 3 1 3 1 2\\\\n    charge               float64 8B 1.0\\\\nDimensions without coordinates: atom\\\\nData variables:\\\\n    energy               (frame, state) float64 2kB -94.67 -94.37 ... -94.29\\\\n    socs                 (frame, full_statecomb) float64 5kB 0.0 0.0 ... 0.0\\\\n    forces               (frame, state, atom, direction) float64 43kB -0.1414...\\\\n    nacs                 (frame, statecomb, atom, direction) float64 43kB -0....\\\\n    smooth_nacs          (frame, statecomb, atom, direction) float64 43kB 0.0...\\\\n    atXYZ                (frame, atom, direction) float64 14kB 0.00093 ... -0...\\\\n    dip_perm             (frame, state, direction) float64 7kB 0.133 ... 1.088\\\\n    dip_trans            (frame, statecomb, direction) float64 7kB -0.004758 ...\\\\nAttributes:\\\\n    completed:              False\\\\n    input_format:           ase\\\\n    input_type:             unknown\\\\n    input_format_version:   unknown\\\\n    num_singlets:           3\\\\n    num_doublets:           0\\\\n    num_triplets:           0\\\\n    est_level:              MR-CISD(6,4)\\\\n    theory_basis_set:       aug-cc-pVDZ, program: COLUMBUS\\\\n    trajectory_input_path:  test_data/ase/spainn_ch2nh2+.db\\\\n    trajectory_id:          1375288398, _is_multi_trajectory=False), _children={}, _attrs={}, _parent=..., _level_name=\\\\\\'DataLeaf\\\\\\')}, _attrs={}, _parent=..., _level_name=\\\\\\'DataGroup\\\\\\', _group_info=None)}, _attrs={}, _parent=<class \\\\\\'shnitsel.data.tree.tree.ShnitselDBRoot[Frames]\\\\\\'> [{\\\\\\'level\\\\\\': \\\\\\'ShnitselDBRoot\\\\\\', \\\\\\'children\\\\\\': "1: {\\\\\\'C-N-H-H-H-H\\\\\\': \\\\\\'...\\\\\\'}"}], _level_name=\\\\\\'CompoundGroup\\\\\\', _group_info=None)\\'}'}]
" ], "text/plain": [ " [{'level': 'ShnitselDBRoot', 'children': '1: {\\'C-N-H-H-H-H\\': \\'ABCMeta(_name=\\\\\\'C-N-H-H-H-H\\\\\\', _dtype=, _data=None, _children={\\\\\\'0\\\\\\': ABCMeta(_name=\\\\\\'0\\\\\\', _dtype=, _data=None, _children={\\\\\\'0\\\\\\': ABCMeta(_name=\\\\\\'0\\\\\\', _dtype=, _data=Frames(_raw_dataset= Size: 166kB\\\\nDimensions: (frame: 99, state: 3, full_statecomb: 6, atom: 6,\\\\n direction: 3, statecomb: 3)\\\\nCoordinates: (12/18)\\\\n * frame (frame) int64 792B 0 1 2 3 4 5 6 ... 93 94 95 96 97 98\\\\n * state (state) int64 24B 1 2 3\\\\n * full_statecomb (full_statecomb) object 48B MultiIndex\\\\n * direction (direction) [{\\\\\\'level\\\\\\': \\\\\\'ShnitselDBRoot\\\\\\', \\\\\\'children\\\\\\': \"1: {\\\\\\'C-N-H-H-H-H\\\\\\': \\\\\\'...\\\\\\'}\"}], _level_name=\\\\\\'CompoundGroup\\\\\\', _group_info=None)\\'}'}]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import shnitsel as st\n", "from shnitsel.analyze.generic import keep_norming\n", "# Reading an ase-style sql database input:\n", "ase_input = st.io.read(\"./test_data/ase/spainn_ch2nh2+.db\").set_charge(1)\n", "ase_input" ] }, { "cell_type": "markdown", "id": "01c54c22", "metadata": {}, "source": [ "We support the output of shnitsel-style datasets to a `spainn` or `schnet` database in a `schnetpack` conforming format via the method `st.io.write_ase_db()`:" ] }, { "cell_type": "code", "execution_count": 12, "id": "47958a4c", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/tpadmin/git/shnitsel-tools-official/shnitsel/data/traj_combiner_methods.py:491: FutureWarning: In a future version of xarray the default value for compat will change from compat='equals' to compat='override'. This change will result in the following ValueError: Cannot specify both coords='different' and compat='override'. The recommendation is to set compat explicitly for this case.\n", " frames = xr.concat(\n" ] } ], "source": [ "# Writing in spainn format\n", "# We set the `force=True` option to overwrite an existing database.\n", "st.io.write_ase_db(ase_input, \"./test_data/ase/spainn_conversion.db\", db_format=\"spainn\", force=True)\n", "\n", "# Writing in schnet format\n", "# Here, we do not force the overwriting. This command may fail if you have executed it before. \n", "# You can add `force=True` here as well if you are sure that you want to drop the existing database.\n", "st.io.write_ase_db(ase_input, \"./test_data/ase/schnet_conversion.db\", db_format=\"schnet\")" ] }, { "cell_type": "markdown", "id": "8fe118a2", "metadata": {}, "source": [ "If we combine our import support for e.g. `SHARC` files with the output to `ASE` databases, we see, that certain structural data is not preserved through writing to and loading from `ASE` databases:" ] }, { "cell_type": "code", "execution_count": 13, "id": "9904f18c", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "WARNING:root:The path `test_data/sharc/traj_I01_v2.0/TRAJ_00009` does contain a `DONT_ANALYZE` file and will therefore be skipped. Please remove that file if you want the directory to be read.\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "15f5380f1b9f4bf1b5196f6a7a7f4dc1", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/2 [00:00\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<class 'shnitsel.data.tree.tree.ShnitselDBRoot[Trajectory]'> [{'level': 'ShnitselDBRoot', 'children': '1: {\\'I01\\': \\'ABCMeta(_name=\\\\\\'unknown\\\\\\', _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=None, _children={\\\\\\'special_group\\\\\\': ABCMeta(_name=\\\\\\'special_group\\\\\\', _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=None, _children={\\\\\\'2\\\\\\': ABCMeta(_name=2, _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=Trajectory(_raw_dataset=<xarray.Dataset> Size: 141kB\\\\nDimensions:                 (time: 201, state: 3, atom: 6, direction: 3,\\\\n                             statecomb: 3, full_statecomb: 6)\\\\nCoordinates: (12/23)\\\\n  * time                    (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n  * state                   (state) int64 24B 1 2 3\\\\n  * atom                    (atom) int64 48B 0 1 2 3 4 5\\\\n  * direction               (direction) <U1 12B \\\\\\'x\\\\\\' \\\\\\'y\\\\\\' \\\\\\'z\\\\\\'\\\\n  * statecomb               (statecomb) object 24B MultiIndex\\\\n  * full_statecomb          (full_statecomb) object 48B MultiIndex\\\\n    ...                      ...\\\\n  * full_statecomb_from     (full_statecomb) int64 48B 1 1 2 2 3 3\\\\n  * full_statecomb_to       (full_statecomb) int64 48B 2 3 1 3 1 2\\\\n    delta_t                 float64 8B 0.5\\\\n    max_ts                  int64 8B 201\\\\n    t_max                   float64 8B 100.0\\\\n    charge                  float64 8B 1.0\\\\nData variables:\\\\n    energy                  (time, state) float32 2kB -94.69 -94.38 ... -94.34\\\\n    forces                  (time, state, atom, direction) float32 43kB 0.091...\\\\n    atXYZ                   (time, atom, direction) float32 14kB 0.02999 ... ...\\\\n    nacs                    (time, statecomb, atom, direction) float32 43kB 0...\\\\n    dip_perm                (time, state, direction) float32 7kB -0.002852 .....\\\\n    dip_trans               (time, statecomb, direction) float32 7kB 0.001566...\\\\n    socs                    (time, full_statecomb) complex128 19kB 0j 0j ... 0j\\\\nAttributes: (12/13)\\\\n    input_format:                 sharc\\\\n    _shnitsel_setup_for_cleanup:  True\\\\n    completed:                    True\\\\n    input_type:                   dynamic\\\\n    input_format_version:         2.0\\\\n    num_singlets:                 3\\\\n    ...                           ...\\\\n    num_triplets:                 0\\\\n    has_forces:                   True\\\\n    misc_input_settings:          {\\\\\\'input\\\\\\': {\\\\\\'printlevel\\\\\\': \\\\\\'2\\\\\\', \\\\\\'geomfile\\\\\\': \\\\\\'...\\\\n    trajectory_input_path:        test_data/sharc/traj_I01_v2.0/TRAJ_00002\\\\n    trajectory_id:                2\\\\n    __mol:                        <rdkit.Chem.rdchem.Mol object at 0x7cc44acf..., _is_multi_trajectory=False), _children={}, _attrs={}, _parent=..., _level_name=\\\\\\'DataLeaf\\\\\\'), \\\\\\'6\\\\\\': ABCMeta(_name=6, _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=Trajectory(_raw_dataset=<xarray.Dataset> Size: 141kB\\\\nDimensions:                 (time: 201, state: 3, atom: 6, direction: 3,\\\\n                             statecomb: 3, full_statecomb: 6)\\\\nCoordinates: (12/23)\\\\n  * time                    (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n  * state                   (state) int64 24B 1 2 3\\\\n  * atom                    (atom) int64 48B 0 1 2 3 4 5\\\\n  * direction               (direction) <U1 12B \\\\\\'x\\\\\\' \\\\\\'y\\\\\\' \\\\\\'z\\\\\\'\\\\n  * statecomb               (statecomb) object 24B MultiIndex\\\\n  * full_statecomb          (full_statecomb) object 48B MultiIndex\\\\n    ...                      ...\\\\n  * full_statecomb_from     (full_statecomb) int64 48B 1 1 2 2 3 3\\\\n  * full_statecomb_to       (full_statecomb) int64 48B 2 3 1 3 1 2\\\\n    delta_t                 float64 8B 0.5\\\\n    max_ts                  int64 8B 201\\\\n    t_max                   float64 8B 100.0\\\\n    charge                  float64 8B 1.0\\\\nData variables:\\\\n    energy                  (time, state) float32 2kB -94.66 -94.37 ... -94.32\\\\n    forces                  (time, state, atom, direction) float32 43kB 0.007...\\\\n    atXYZ                   (time, atom, direction) float32 14kB 0.01689 ... ...\\\\n    nacs                    (time, statecomb, atom, direction) float32 43kB -...\\\\n    dip_perm                (time, state, direction) float32 7kB 0.4447 ... 2...\\\\n    dip_trans               (time, statecomb, direction) float32 7kB -0.05107...\\\\n    socs                    (time, full_statecomb) complex128 19kB 0j 0j ... 0j\\\\nAttributes: (12/13)\\\\n    input_format:                 sharc\\\\n    _shnitsel_setup_for_cleanup:  True\\\\n    completed:                    True\\\\n    input_type:                   dynamic\\\\n    input_format_version:         2.0\\\\n    num_singlets:                 3\\\\n    ...                           ...\\\\n    num_triplets:                 0\\\\n    has_forces:                   True\\\\n    misc_input_settings:          {\\\\\\'input\\\\\\': {\\\\\\'printlevel\\\\\\': \\\\\\'2\\\\\\', \\\\\\'geomfile\\\\\\': \\\\\\'...\\\\n    trajectory_input_path:        test_data/sharc/traj_I01_v2.0/TRAJ_00006\\\\n    trajectory_id:                6\\\\n    __mol:                        <rdkit.Chem.rdchem.Mol object at 0x7cc499fc..., _is_multi_trajectory=False), _children={}, _attrs={}, _parent=..., _level_name=\\\\\\'DataLeaf\\\\\\')}, _attrs={}, _parent=..., _level_name=\\\\\\'DataGroup\\\\\\', _group_info=GroupInfo(group_name=\\\\\\'special_group\\\\\\', group_attributes={\\\\\\'fancy\\\\\\': True}, grouped_properties=None))}, _attrs={}, _parent=<class \\\\\\'shnitsel.data.tree.tree.ShnitselDBRoot[Trajectory]\\\\\\'> [{\\\\\\'level\\\\\\': \\\\\\'ShnitselDBRoot\\\\\\', \\\\\\'children\\\\\\': "1: {\\\\\\'I01\\\\\\': \\\\\\'...\\\\\\'}"}], _level_name=\\\\\\'CompoundGroup\\\\\\', _group_info=None)\\'}'}]
" ], "text/plain": [ " [{'level': 'ShnitselDBRoot', 'children': '1: {\\'I01\\': \\'ABCMeta(_name=\\\\\\'unknown\\\\\\', _dtype=, _data=None, _children={\\\\\\'special_group\\\\\\': ABCMeta(_name=\\\\\\'special_group\\\\\\', _dtype=, _data=None, _children={\\\\\\'2\\\\\\': ABCMeta(_name=2, _dtype=, _data=Trajectory(_raw_dataset= Size: 141kB\\\\nDimensions: (time: 201, state: 3, atom: 6, direction: 3,\\\\n statecomb: 3, full_statecomb: 6)\\\\nCoordinates: (12/23)\\\\n * time (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n * state (state) int64 24B 1 2 3\\\\n * atom (atom) int64 48B 0 1 2 3 4 5\\\\n * direction (direction) , _data=Trajectory(_raw_dataset= Size: 141kB\\\\nDimensions: (time: 201, state: 3, atom: 6, direction: 3,\\\\n statecomb: 3, full_statecomb: 6)\\\\nCoordinates: (12/23)\\\\n * time (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n * state (state) int64 24B 1 2 3\\\\n * atom (atom) int64 48B 0 1 2 3 4 5\\\\n * direction (direction) [{\\\\\\'level\\\\\\': \\\\\\'ShnitselDBRoot\\\\\\', \\\\\\'children\\\\\\': \"1: {\\\\\\'I01\\\\\\': \\\\\\'...\\\\\\'}\"}], _level_name=\\\\\\'CompoundGroup\\\\\\', _group_info=None)\\'}'}]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from shnitsel.data.tree.data_group import GroupInfo\n", "\n", "\n", "sharc_input = st.io.read(\"./test_data/sharc/traj_I01_v2.0\", parallel=False)\n", "assert isinstance(sharc_input, ShnitselDB)\n", "sharc_input = sharc_input.set_compound_info(\"I01\",overwrite_all=True)\n", "sharc_input = sharc_input.add_data_group(group_info=GroupInfo(\"special_group\", group_attributes={\"fancy\":True}))\n", "sharc_input" ] }, { "cell_type": "markdown", "id": "a1d8459e", "metadata": {}, "source": [ "The output in `spainn` format loses structural information like the specific `compound` name and `group` levels of the hierarchy." ] }, { "cell_type": "code", "execution_count": 14, "id": "117241ef", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/tpadmin/git/shnitsel-tools-official/shnitsel/data/traj_combiner_methods.py:491: FutureWarning: In a future version of xarray the default value for compat will change from compat='equals' to compat='override'. This change will result in the following ValueError: Cannot specify both coords='different' and compat='override'. The recommendation is to set compat explicitly for this case.\n", " frames = xr.concat(\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<class 'shnitsel.data.tree.tree.ShnitselDBRoot[Trajectory]'> [{'level': 'ShnitselDBRoot', 'children': '1: {\\'C-N-H-H-H-H\\': \\'ABCMeta(_name=\\\\\\'C-N-H-H-H-H\\\\\\', _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=None, _children={\\\\\\'0\\\\\\': ABCMeta(_name=\\\\\\'0\\\\\\', _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=None, _children={\\\\\\'0\\\\\\': ABCMeta(_name=\\\\\\'0\\\\\\', _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=Trajectory(_raw_dataset=<xarray.Dataset> Size: 159kB\\\\nDimensions:                 (time: 201, statecomb: 3, atom: 6, direction: 3,\\\\n                             state: 3, full_statecomb: 6)\\\\nCoordinates: (12/24)\\\\n  * time                    (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n  * statecomb               (statecomb) object 24B MultiIndex\\\\n  * direction               (direction) <U1 12B \\\\\\'x\\\\\\' \\\\\\'y\\\\\\' \\\\\\'z\\\\\\'\\\\n  * state                   (state) int64 24B 1 2 3\\\\n  * full_statecomb          (full_statecomb) object 48B MultiIndex\\\\n    atNames                 (atom) <U1 24B \\\\\\'C\\\\\\' \\\\\\'N\\\\\\' \\\\\\'H\\\\\\' \\\\\\'H\\\\\\' \\\\\\'H\\\\\\' \\\\\\'H\\\\\\'\\\\n    ...                      ...\\\\n    from                    (statecomb) int64 24B 1 1 2\\\\n    to                      (statecomb) int64 24B 2 3 3\\\\n    full_statecomb_from     (full_statecomb) int64 48B 1 1 2 2 3 3\\\\n    full_statecomb_to       (full_statecomb) int64 48B 2 3 1 3 1 2\\\\n    frame                   (time) int64 2kB 0 1 2 3 4 5 ... 196 197 198 199 200\\\\n    trajectory              int64 8B 2\\\\nDimensions without coordinates: atom\\\\nData variables:\\\\n    nacs                    (time, statecomb, atom, direction) float32 43kB 0...\\\\n    forces                  (time, state, atom, direction) float32 43kB 0.091...\\\\n    energy                  (time, state) float32 2kB -94.69 -94.38 ... -94.34\\\\n    socs                    (time, full_statecomb) complex128 19kB 0j 0j ... 0j\\\\n    atXYZ                   (time, atom, direction) float64 29kB 0.02999 ... ...\\\\n    dip_perm                (time, state, direction) float32 7kB -0.00725 ......\\\\n    dip_trans               (time, statecomb, direction) float32 7kB 0.00398 ...\\\\nAttributes: (12/13)\\\\n    input_format:                 sharc\\\\n    input_type:                   dynamic\\\\n    input_format_version:         2.0\\\\n    completed:                    True\\\\n    num_singlets:                 3\\\\n    num_doublets:                 0\\\\n    ...                           ...\\\\n    _shnitsel_setup_for_cleanup:  True\\\\n    has_forces:                   True\\\\n    misc_input_settings:          {\\\\\\'input\\\\\\': {\\\\\\'printlevel\\\\\\': \\\\\\'2\\\\\\', \\\\\\'geomfile\\\\\\': \\\\\\'...\\\\n    trajectory_input_path:        test_data/sharc/traj_I01_v2.0/TRAJ_00002\\\\n    trajectory_id:                2125878255\\\\n    is_multi_trajectory:          True, _is_multi_trajectory=False), _children={}, _attrs={}, _parent=..., _level_name=\\\\\\'DataLeaf\\\\\\'), \\\\\\'1\\\\\\': ABCMeta(_name=\\\\\\'1\\\\\\', _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=Trajectory(_raw_dataset=<xarray.Dataset> Size: 159kB\\\\nDimensions:                 (time: 201, statecomb: 3, atom: 6, direction: 3,\\\\n                             state: 3, full_statecomb: 6)\\\\nCoordinates: (12/24)\\\\n  * time                    (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n  * statecomb               (statecomb) object 24B MultiIndex\\\\n  * direction               (direction) <U1 12B \\\\\\'x\\\\\\' \\\\\\'y\\\\\\' \\\\\\'z\\\\\\'\\\\n  * state                   (state) int64 24B 1 2 3\\\\n  * full_statecomb          (full_statecomb) object 48B MultiIndex\\\\n    atNames                 (atom) <U1 24B \\\\\\'C\\\\\\' \\\\\\'N\\\\\\' \\\\\\'H\\\\\\' \\\\\\'H\\\\\\' \\\\\\'H\\\\\\' \\\\\\'H\\\\\\'\\\\n    ...                      ...\\\\n    from                    (statecomb) int64 24B 1 1 2\\\\n    to                      (statecomb) int64 24B 2 3 3\\\\n    full_statecomb_from     (full_statecomb) int64 48B 1 1 2 2 3 3\\\\n    full_statecomb_to       (full_statecomb) int64 48B 2 3 1 3 1 2\\\\n    frame                   (time) int64 2kB 0 1 2 3 4 5 ... 196 197 198 199 200\\\\n    trajectory              int64 8B 2\\\\nDimensions without coordinates: atom\\\\nData variables:\\\\n    nacs                    (time, statecomb, atom, direction) float32 43kB -...\\\\n    forces                  (time, state, atom, direction) float32 43kB 0.007...\\\\n    energy                  (time, state) float32 2kB -94.66 -94.37 ... -94.32\\\\n    socs                    (time, full_statecomb) complex128 19kB 0j 0j ... 0j\\\\n    atXYZ                   (time, atom, direction) float64 29kB 0.01689 ... ...\\\\n    dip_perm                (time, state, direction) float32 7kB 1.13 ... 6.798\\\\n    dip_trans               (time, statecomb, direction) float32 7kB -0.1298 ...\\\\nAttributes: (12/13)\\\\n    input_format:                 sharc\\\\n    input_type:                   dynamic\\\\n    input_format_version:         2.0\\\\n    completed:                    True\\\\n    num_singlets:                 3\\\\n    num_doublets:                 0\\\\n    ...                           ...\\\\n    _shnitsel_setup_for_cleanup:  True\\\\n    has_forces:                   True\\\\n    misc_input_settings:          {\\\\\\'input\\\\\\': {\\\\\\'printlevel\\\\\\': \\\\\\'2\\\\\\', \\\\\\'geomfile\\\\\\': \\\\\\'...\\\\n    trajectory_input_path:        test_data/sharc/traj_I01_v2.0/TRAJ_00002\\\\n    trajectory_id:                349367038\\\\n    is_multi_trajectory:          True, _is_multi_trajectory=False), _children={}, _attrs={}, _parent=..., _level_name=\\\\\\'DataLeaf\\\\\\')}, _attrs={}, _parent=..., _level_name=\\\\\\'DataGroup\\\\\\', _group_info=None)}, _attrs={}, _parent=<class \\\\\\'shnitsel.data.tree.tree.ShnitselDBRoot[Trajectory]\\\\\\'> [{\\\\\\'level\\\\\\': \\\\\\'ShnitselDBRoot\\\\\\', \\\\\\'children\\\\\\': "1: {\\\\\\'C-N-H-H-H-H\\\\\\': \\\\\\'...\\\\\\'}"}], _level_name=\\\\\\'CompoundGroup\\\\\\', _group_info=None)\\'}'}]
" ], "text/plain": [ " [{'level': 'ShnitselDBRoot', 'children': '1: {\\'C-N-H-H-H-H\\': \\'ABCMeta(_name=\\\\\\'C-N-H-H-H-H\\\\\\', _dtype=, _data=None, _children={\\\\\\'0\\\\\\': ABCMeta(_name=\\\\\\'0\\\\\\', _dtype=, _data=None, _children={\\\\\\'0\\\\\\': ABCMeta(_name=\\\\\\'0\\\\\\', _dtype=, _data=Trajectory(_raw_dataset= Size: 159kB\\\\nDimensions: (time: 201, statecomb: 3, atom: 6, direction: 3,\\\\n state: 3, full_statecomb: 6)\\\\nCoordinates: (12/24)\\\\n * time (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n * statecomb (statecomb) object 24B MultiIndex\\\\n * direction (direction) , _data=Trajectory(_raw_dataset= Size: 159kB\\\\nDimensions: (time: 201, statecomb: 3, atom: 6, direction: 3,\\\\n state: 3, full_statecomb: 6)\\\\nCoordinates: (12/24)\\\\n * time (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n * statecomb (statecomb) object 24B MultiIndex\\\\n * direction (direction) [{\\\\\\'level\\\\\\': \\\\\\'ShnitselDBRoot\\\\\\', \\\\\\'children\\\\\\': \"1: {\\\\\\'C-N-H-H-H-H\\\\\\': \\\\\\'...\\\\\\'}\"}], _level_name=\\\\\\'CompoundGroup\\\\\\', _group_info=None)\\'}'}]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# print(sharc_input)\n", "st.io.write_ase_db(sharc_input,\"./test_data/ase/spainn_I01.db\",db_format=\"spainn\", force=True) \n", "sharc_input_spainn = st.io.read(\"./test_data/ase/spainn_I01.db\", parallel=False) \n", "sharc_input_spainn" ] }, { "cell_type": "markdown", "id": "32e73db3", "metadata": {}, "source": [ "The same information is also lost in the `schnet` format.\n", "In both formats, the structure can be somewhat reconstructed, but compounds may need to manually named using `set_compound_info()` on the import result and the group structure may be lost entirely." ] }, { "cell_type": "code", "execution_count": 15, "id": "15767f02", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/tpadmin/git/shnitsel-tools-official/shnitsel/data/traj_combiner_methods.py:491: FutureWarning: In a future version of xarray the default value for compat will change from compat='equals' to compat='override'. This change will result in the following ValueError: Cannot specify both coords='different' and compat='override'. The recommendation is to set compat explicitly for this case.\n", " frames = xr.concat(\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<class 'shnitsel.data.tree.tree.ShnitselDBRoot[Trajectory]'> [{'level': 'ShnitselDBRoot', 'children': '1: {\\'C-N-H-H-H-H\\': \\'ABCMeta(_name=\\\\\\'C-N-H-H-H-H\\\\\\', _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=None, _children={\\\\\\'0\\\\\\': ABCMeta(_name=\\\\\\'0\\\\\\', _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=None, _children={\\\\\\'0\\\\\\': ABCMeta(_name=\\\\\\'0\\\\\\', _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=Trajectory(_raw_dataset=<xarray.Dataset> Size: 159kB\\\\nDimensions:                 (time: 201, statecomb: 3, atom: 6, direction: 3,\\\\n                             state: 3, full_statecomb: 6)\\\\nCoordinates: (12/24)\\\\n  * time                    (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n  * statecomb               (statecomb) object 24B MultiIndex\\\\n  * direction               (direction) <U1 12B \\\\\\'x\\\\\\' \\\\\\'y\\\\\\' \\\\\\'z\\\\\\'\\\\n  * state                   (state) int64 24B 1 2 3\\\\n  * full_statecomb          (full_statecomb) object 48B MultiIndex\\\\n    atNames                 (atom) <U1 24B \\\\\\'C\\\\\\' \\\\\\'N\\\\\\' \\\\\\'H\\\\\\' \\\\\\'H\\\\\\' \\\\\\'H\\\\\\' \\\\\\'H\\\\\\'\\\\n    ...                      ...\\\\n    from                    (statecomb) int64 24B 1 1 2\\\\n    to                      (statecomb) int64 24B 2 3 3\\\\n    full_statecomb_from     (full_statecomb) int64 48B 1 1 2 2 3 3\\\\n    full_statecomb_to       (full_statecomb) int64 48B 2 3 1 3 1 2\\\\n    frame                   (time) int64 2kB 0 1 2 3 4 5 ... 196 197 198 199 200\\\\n    trajectory              int64 8B 2\\\\nDimensions without coordinates: atom\\\\nData variables:\\\\n    nacs                    (time, statecomb, atom, direction) float32 43kB 0...\\\\n    forces                  (time, state, atom, direction) float32 43kB 0.091...\\\\n    energy                  (time, state) float32 2kB -94.69 -94.38 ... -94.34\\\\n    socs                    (time, full_statecomb) complex128 19kB 0j 0j ... 0j\\\\n    atXYZ                   (time, atom, direction) float64 29kB 0.02999 ... ...\\\\n    dip_perm                (time, state, direction) float32 7kB -0.00725 ......\\\\n    dip_trans               (time, statecomb, direction) float32 7kB 0.00398 ...\\\\nAttributes: (12/13)\\\\n    input_format:                 sharc\\\\n    input_type:                   dynamic\\\\n    input_format_version:         2.0\\\\n    completed:                    True\\\\n    num_singlets:                 3\\\\n    num_doublets:                 0\\\\n    ...                           ...\\\\n    _shnitsel_setup_for_cleanup:  True\\\\n    has_forces:                   True\\\\n    misc_input_settings:          {\\\\\\'input\\\\\\': {\\\\\\'printlevel\\\\\\': \\\\\\'2\\\\\\', \\\\\\'geomfile\\\\\\': \\\\\\'...\\\\n    trajectory_input_path:        test_data/sharc/traj_I01_v2.0/TRAJ_00002\\\\n    trajectory_id:                635819650\\\\n    is_multi_trajectory:          True, _is_multi_trajectory=False), _children={}, _attrs={}, _parent=..., _level_name=\\\\\\'DataLeaf\\\\\\'), \\\\\\'1\\\\\\': ABCMeta(_name=\\\\\\'1\\\\\\', _dtype=<class \\\\\\'shnitsel.data.dataset_containers.trajectory.Trajectory\\\\\\'>, _data=Trajectory(_raw_dataset=<xarray.Dataset> Size: 159kB\\\\nDimensions:                 (time: 201, statecomb: 3, atom: 6, direction: 3,\\\\n                             state: 3, full_statecomb: 6)\\\\nCoordinates: (12/24)\\\\n  * time                    (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n  * statecomb               (statecomb) object 24B MultiIndex\\\\n  * direction               (direction) <U1 12B \\\\\\'x\\\\\\' \\\\\\'y\\\\\\' \\\\\\'z\\\\\\'\\\\n  * state                   (state) int64 24B 1 2 3\\\\n  * full_statecomb          (full_statecomb) object 48B MultiIndex\\\\n    atNames                 (atom) <U1 24B \\\\\\'C\\\\\\' \\\\\\'N\\\\\\' \\\\\\'H\\\\\\' \\\\\\'H\\\\\\' \\\\\\'H\\\\\\' \\\\\\'H\\\\\\'\\\\n    ...                      ...\\\\n    from                    (statecomb) int64 24B 1 1 2\\\\n    to                      (statecomb) int64 24B 2 3 3\\\\n    full_statecomb_from     (full_statecomb) int64 48B 1 1 2 2 3 3\\\\n    full_statecomb_to       (full_statecomb) int64 48B 2 3 1 3 1 2\\\\n    frame                   (time) int64 2kB 0 1 2 3 4 5 ... 196 197 198 199 200\\\\n    trajectory              int64 8B 2\\\\nDimensions without coordinates: atom\\\\nData variables:\\\\n    nacs                    (time, statecomb, atom, direction) float32 43kB -...\\\\n    forces                  (time, state, atom, direction) float32 43kB 0.007...\\\\n    energy                  (time, state) float32 2kB -94.66 -94.37 ... -94.32\\\\n    socs                    (time, full_statecomb) complex128 19kB 0j 0j ... 0j\\\\n    atXYZ                   (time, atom, direction) float64 29kB 0.01689 ... ...\\\\n    dip_perm                (time, state, direction) float32 7kB 1.13 ... 6.798\\\\n    dip_trans               (time, statecomb, direction) float32 7kB -0.1298 ...\\\\nAttributes: (12/13)\\\\n    input_format:                 sharc\\\\n    input_type:                   dynamic\\\\n    input_format_version:         2.0\\\\n    completed:                    True\\\\n    num_singlets:                 3\\\\n    num_doublets:                 0\\\\n    ...                           ...\\\\n    _shnitsel_setup_for_cleanup:  True\\\\n    has_forces:                   True\\\\n    misc_input_settings:          {\\\\\\'input\\\\\\': {\\\\\\'printlevel\\\\\\': \\\\\\'2\\\\\\', \\\\\\'geomfile\\\\\\': \\\\\\'...\\\\n    trajectory_input_path:        test_data/sharc/traj_I01_v2.0/TRAJ_00002\\\\n    trajectory_id:                795023807\\\\n    is_multi_trajectory:          True, _is_multi_trajectory=False), _children={}, _attrs={}, _parent=..., _level_name=\\\\\\'DataLeaf\\\\\\')}, _attrs={}, _parent=..., _level_name=\\\\\\'DataGroup\\\\\\', _group_info=None)}, _attrs={}, _parent=<class \\\\\\'shnitsel.data.tree.tree.ShnitselDBRoot[Trajectory]\\\\\\'> [{\\\\\\'level\\\\\\': \\\\\\'ShnitselDBRoot\\\\\\', \\\\\\'children\\\\\\': "1: {\\\\\\'C-N-H-H-H-H\\\\\\': \\\\\\'...\\\\\\'}"}], _level_name=\\\\\\'CompoundGroup\\\\\\', _group_info=None)\\'}'}]
" ], "text/plain": [ " [{'level': 'ShnitselDBRoot', 'children': '1: {\\'C-N-H-H-H-H\\': \\'ABCMeta(_name=\\\\\\'C-N-H-H-H-H\\\\\\', _dtype=, _data=None, _children={\\\\\\'0\\\\\\': ABCMeta(_name=\\\\\\'0\\\\\\', _dtype=, _data=None, _children={\\\\\\'0\\\\\\': ABCMeta(_name=\\\\\\'0\\\\\\', _dtype=, _data=Trajectory(_raw_dataset= Size: 159kB\\\\nDimensions: (time: 201, statecomb: 3, atom: 6, direction: 3,\\\\n state: 3, full_statecomb: 6)\\\\nCoordinates: (12/24)\\\\n * time (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n * statecomb (statecomb) object 24B MultiIndex\\\\n * direction (direction) , _data=Trajectory(_raw_dataset= Size: 159kB\\\\nDimensions: (time: 201, statecomb: 3, atom: 6, direction: 3,\\\\n state: 3, full_statecomb: 6)\\\\nCoordinates: (12/24)\\\\n * time (time) float64 2kB 0.0 0.5 1.0 ... 99.0 99.5 100.0\\\\n * statecomb (statecomb) object 24B MultiIndex\\\\n * direction (direction) [{\\\\\\'level\\\\\\': \\\\\\'ShnitselDBRoot\\\\\\', \\\\\\'children\\\\\\': \"1: {\\\\\\'C-N-H-H-H-H\\\\\\': \\\\\\'...\\\\\\'}\"}], _level_name=\\\\\\'CompoundGroup\\\\\\', _group_info=None)\\'}'}]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "st.io.write_ase_db(sharc_input,\"./test_data/ase/schnet_I01.db\",db_format=\"schnet\", force=True) \n", "sharc_input_schnet = st.io.read(\"./test_data/ase/schnet_I01.db\", parallel=False) \n", "sharc_input_schnet\n" ] } ], "metadata": { "kernelspec": { "display_name": "shnitsel-tools-official (3.12.3)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.3" } }, "nbformat": 4, "nbformat_minor": 5 }