shnitsel.io.format_reader_base#
Attributes#
Classes#
FormatInformation | Information to keep track of relevant information for
FormatReader | Abstract base class for all input formats to define a unified input reader interface.
Module Contents#
- class FormatInformation#
Information to keep track of relevant information for
- path: pathlib.Path | None = None#
- _default_trajid_pattern_regex#
- DataType#
- TrajType#
- class FormatReader#
Bases: abc.ABC
Abstract base class for all input formats to define a unified input reader interface.
Should be subclassed, with the functions check_path_for_format_info() and read_from_path() overridden in the subclass.
- abstractmethod find_candidates_in_directory(path)#
Returns all potential matches for the current file format within the directory provided at path.
- Returns:
list[PathOptionsType] – A list of paths that should be checked in detail for whether they represent the format of this FormatReader.
None – No potential candidate found
- Parameters:
path (shnitsel.io.shared.helpers.PathOptionsType)
- Return type:
list[pathlib.Path] | None
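As an illustration of the contract described above, here is a minimal sketch of a candidate search. It assumes a hypothetical format whose trajectories live in subdirectories containing a marker file named `output.dat`. Both the marker file name and the free function `find_candidates` are assumptions made for this example and are not part of shnitsel.

```python
import pathlib


def find_candidates(path, marker="output.dat"):
    """Return subdirectories of `path` that contain `marker`, or None.

    `marker` is a hypothetical file name chosen for illustration only.
    """
    base = pathlib.Path(path)
    if not base.is_dir():
        return None
    # Keep only direct subdirectories that hold the marker file.
    candidates = sorted(
        entry
        for entry in base.iterdir()
        if entry.is_dir() and (entry / marker).is_file()
    )
    # Mirror the documented contract: None when nothing matches.
    return candidates or None
```

The returned paths are only candidates. Each one would still need to be passed to check_path_for_format_info() for a detailed check.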
- abstractmethod check_path_for_format_info(path, hints_or_settings=None)#
Checks whether a path is of a given format and returns a structure containing all information relevant for reading the format at this location. Additionally checks whether the user settings provided in hints_or_settings are consistent with the file format.
Needs to be overridden by each format.
- Parameters:
path (os.PathLike) – The path at which to look for data of the respective format. Depending on the format, this must point to either a file or a directory containing the actual trajectory information.
hints_or_settings (dict | None, optional) – Hints or configuration options provided by the user as input to the reader, which are checked for conflicts with the requirements of the format (e.g. requesting a static initial condition from a dynamic trajectory in SHARC). Defaults to None.
- Raises:
FileNotFoundError – If required files were not found, i.e. if the path does not actually constitute input data of the denoted format
ValueError – If the hints/settings provided by the user conflict with the requirements of the format
- Returns:
A structure containing all of the information relevant to the interpretation or reading of the format. Can be used to differentiate different versions of the same format. Should be passed to the read_from_path() method of the same class.
- Return type:
FormatInformation
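The documented behavior (raise FileNotFoundError when required files are missing, raise ValueError when hints conflict with the format, otherwise return format information) could be sketched as follows. The `FormatInformation` dataclass here is a local stand-in exposing only the documented `path` attribute, and the required file name `output.dat` and the hint key `"kind"` are hypothetical.

```python
import pathlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class FormatInformation:
    """Local stand-in exposing only the documented `path` attribute."""

    path: Optional[pathlib.Path] = None


def check_path_for_format_info(path, hints_or_settings=None):
    path = pathlib.Path(path)
    # Hypothetical requirement: the format needs a file named "output.dat".
    if not (path / "output.dat").is_file():
        raise FileNotFoundError(f"{path} does not contain output.dat")
    hints = hints_or_settings or {}
    # Hypothetical hint key: this toy format only holds dynamic trajectories.
    if hints.get("kind", "dynamic") != "dynamic":
        raise ValueError("hints conflict with the format: only 'dynamic' is supported")
    return FormatInformation(path=path)
```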
- abstractmethod read_from_path(path, *, format_info, loading_parameters=None, expect_dtype=None)#
Method to read a path of the respective format (e.g. ‘shnitsel’ or ‘sharc’) into a shnitsel-format trajectory or hierarchical data type.
When trajectories are imported for the first time from one of the supported formats, the path is read into an xarray.Dataset, which is then wrapped into a Trajectory or Frames object to make type distinction simpler for users. For more complex input formats, hierarchical return types may be supported.
- Parameters:
path (pathlib.Path) – Path to either the input file or input folder to be read.
format_info (FormatInformation) – Format information previously constructed by check_path_for_format_info(). If None, will be constructed by calling self.check_path_for_format_info() first.
loading_parameters (LoadingParameters|None, optional) – Loading parameters to e.g. override default state names or units, or to configure the error reporting behavior.
expect_dtype (type[DataType] | UnionType, optional) – Optionally a datatype to constrain which types can be returned from the input call. If the types do not match, an error may be raised.
- Raises:
FileNotFoundError – If required files were not found, i.e. if the path does not actually constitute input data of the denoted format
ValueError – If the format_info provided by the user conflicts with the requirements of the format
ValueError – If neither path nor format_info are provided
TypeError – Formats can choose to raise a TypeError if the parsed input does not yield an object of type expect_dtype.
- Returns:
xr.Dataset | ShnitselDataset | SupportsFromXrConversion | TreeNode[Any, ShnitselDataset | SupportsFromXrConversion | xr.Dataset] | TreeNode[Any, DataType] | Sequence[xr.Dataset | ShnitselDataset | SupportsFromXrConversion] | DataType – Data resulting from reading data with hierarchical or arbitrary data contents.
None – If the reading of data failed for arbitrary reasons.
- Return type:
xarray.Dataset | xarray.DataArray | shnitsel.data.dataset_containers.shared.ShnitselDataset | shnitsel.data.xr_io_compatibility.SupportsFromXrConversion | shnitsel.data.tree.node.TreeNode[Any, shnitsel.data.dataset_containers.shared.ShnitselDataset | shnitsel.data.xr_io_compatibility.SupportsFromXrConversion | xarray.Dataset | xarray.DataArray] | shnitsel.data.tree.node.TreeNode[Any, DataType] | Sequence[xarray.Dataset | shnitsel.data.dataset_containers.shared.ShnitselDataset | shnitsel.data.xr_io_compatibility.SupportsFromXrConversion | xarray.DataArray] | DataType | None
- read_data(path, format_info=None, loading_parameters=None, expect_dtype=None)#
Wrapper function to perform some potential initialization and finalization on the read trajectory objects.
Uses the format-specific self.read_from_path() method to read the trajectory and then performs some standard post processing on it.
- Parameters:
path (PathOptionsType, optional) – Path to either the input file or input folder to be read.
format_info (FormatInformation, optional) – Format information previously constructed by check_path_for_format_info(). If None, will be constructed by calling self.check_path_for_format_info() first. Defaults to None.
loading_parameters (LoadingParameters|None, optional) – Loading parameters to e.g. override default state names or units, or to configure the error reporting behavior.
expect_dtype (type[DataType] | types.UnionType | None, optional) – The dtype expected as the result of this call to import data. Specifies either the direct output type expected from the read() call or the dtype in a hierarchical structure.
- Returns:
Trajectory | Frames | xr.Dataset – Returns a wrapped Trajectory/Frames/xr.Dataset object with standard units, only assigned variables remaining and all variables with appropriate attributes if new data was imported from one of the supported input formats.
ShnitselDB[Trajectory | Frames | InterState | PerState | SupportsFromXrConversion] | DataType | ShnitselDB[DataType] | CompoundGroup[DataType] | xr.DataArray – If a netcdf file was read as input, returns either one of the default xarray datatypes or a hierarchy of data stored in a shnitsel tools tree. Providing a type hint of the expected dtype in the invocation helps with identifying the expected type in the hierarchy and also allows for explicit control of the output type if desired. In principle, arbitrary types can result from netcdf inputs because of the routines invoked during deserialization.
None – If no result was obtained by the call to self.read_from_path(), it will return None.
- Raises:
ValueError – If the format_info provided by the user conflicts with the requirements of the format
ValueError – If neither path nor format_info are provided
FileNotFoundError – If required files were not found, i.e. if the path does not actually constitute input data of the denoted format
- Return type:
xarray.Dataset | xarray.DataArray | shnitsel.data.dataset_containers.shared.ShnitselDataset | shnitsel.data.xr_io_compatibility.SupportsFromXrConversion | shnitsel.data.tree.node.TreeNode[Any, shnitsel.data.dataset_containers.shared.ShnitselDataset | shnitsel.data.xr_io_compatibility.SupportsFromXrConversion | xarray.Dataset | xarray.DataArray] | shnitsel.data.tree.node.TreeNode[Any, DataType] | Sequence[xarray.Dataset | shnitsel.data.dataset_containers.shared.ShnitselDataset | shnitsel.data.xr_io_compatibility.SupportsFromXrConversion | xarray.DataArray] | DataType | None
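The control flow described for read_data() (build the format information if it is missing, delegate to read_from_path(), return None if nothing was read, otherwise post-process) can be sketched with a toy class. `ReaderSketch` is a stand-in written for this example, not the shnitsel class. Plain dictionaries replace the real data types, and the post-processing step is a placeholder for the actual unit and attribute handling.

```python
class ReaderSketch:
    """Toy stand-in for a FormatReader subclass; not the shnitsel class."""

    def check_path_for_format_info(self, path, hints_or_settings=None):
        # Placeholder for a real FormatInformation object.
        return {"path": path}

    def read_from_path(
        self, path, *, format_info, loading_parameters=None, expect_dtype=None
    ):
        # Pretend that only paths ending in ".ok" can be read.
        if str(path).endswith(".ok"):
            return {"source": str(path), "format_info": format_info}
        return None

    def read_data(
        self, path, format_info=None, loading_parameters=None, expect_dtype=None
    ):
        # Build the format information on demand if none was passed in.
        if format_info is None:
            format_info = self.check_path_for_format_info(path)
        result = self.read_from_path(
            path,
            format_info=format_info,
            loading_parameters=loading_parameters,
            expect_dtype=expect_dtype,
        )
        # No result from the format-specific reader means no result at all.
        if result is None:
            return None
        # Placeholder for the standard post-processing step.
        result["postprocessed"] = True
        return result
```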
- abstractmethod get_units_with_defaults(unit_overrides=None)#
Apply the overrides in unit_overrides to the default unit dictionary of the format.
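A minimal sketch of such a merge, assuming units are kept in a plain dictionary. The keys and unit names in `DEFAULT_UNITS` are hypothetical and do not reflect any particular format's defaults.

```python
# Hypothetical per-format defaults, for illustration only.
DEFAULT_UNITS = {"energy": "hartree", "time": "fs"}


def get_units_with_defaults(unit_overrides=None):
    # Copy first so the format's defaults are never mutated.
    units = dict(DEFAULT_UNITS)
    if unit_overrides:
        units.update(unit_overrides)
    return units
```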
- get_loading_parameters_with_defaults(base_loading_parameters)#
Populate loading parameters with default settings for this format.
Settings applied by the user may be coerced into a more fitting format. For example, a list of state names may be converted into a function that automatically assigns the names to the variables in a dataset. This simplifies later logic, which would otherwise have to support both callables and lists as parameters.
- Parameters:
base_loading_parameters (LoadingParameters | None) – User-provided parameter overrides
- Returns:
The default parameters modified by user overrides.
- Return type:
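The coercion of a list of state names into a callable could look roughly like the sketch below. `LoadingParametersSketch` and its `state_names` field are hypothetical stand-ins, and datasets are represented as plain dictionaries to keep the example self-contained.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional, Sequence, Union


@dataclass
class LoadingParametersSketch:
    """Hypothetical stand-in; the real LoadingParameters may differ."""

    state_names: Optional[Union[Sequence[str], Callable[[dict], dict]]] = None


def with_defaults(base=None):
    # Work on a copy so the user-provided object is left untouched.
    params = replace(base) if base is not None else LoadingParametersSketch()
    names = params.state_names
    if names is not None and not callable(names):
        fixed = list(names)

        def assign(dataset):
            # Return a new dataset-like dict with the names attached.
            return {**dataset, "state_names": fixed}

        params.state_names = assign
    return params
```

After this step, downstream code only has to handle the callable case (or None).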