shnitsel.core.xrhelpers

Functions

replace_total(da, to_replace, value)

Replaces each occurrence of to_replace in da with the corresponding element of value.

midx_combs(values[, name])

flatten_midx(obj, idx_name[, renamer])

flatten_levels(obj, idx_name, levels[, new_name, ...])

expand_midx(obj, midx_name, level_name, value)

assign_levels(obj[, levels])

Assign new values to levels of MultiIndexes in obj

open_frames(path)

Opens a NetCDF4 file saved by shnitsel-tools, specially interpreting certain attributes.

save_frames(frames, path[, complevel])

Save a Dataset, presumably (but not necessarily) consisting of frames of trajectories, to a file at path.

split_for_saving(frames[, bytes_per_chunk])

save_split(frames, path_template[, bytes_per_chunk, ...])

mgroupby(obj, levels)

Group a Dataset or DataArray by several levels of a MultiIndex it contains.

msel(obj, **kwargs)

sel_trajs(frames, trajids_or_mask[, invert])

Select trajectories using a list of trajectory IDs or a boolean mask

sel_trajids(frames, trajids[, invert])

Does not generally return trajectories in the order given

unstack_trajs(frames)

Unstack the frame MultiIndex so that trajid and time become separate dims.

stack_trajs(unstacked)

Stack the trajid and time dims of an unstacked Dataset into a MultiIndex along a new dimension called frame.

Module Contents

replace_total(da, to_replace, value)

Replaces each occurrence of to_replace in da with the corresponding element of value. Replacement must be total, i.e. every element of da must be in to_replace. This permits a change of dtype between to_replace and value. This function is based on the snippets at https://github.com/pydata/xarray/issues/6377

Parameters:
Return type:

An xr.DataArray with dtype matching value.
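The idea of a total replacement with a dtype change can be illustrated with plain NumPy and xarray. The following is a minimal sketch only: replace_total_sketch is a hypothetical stand-in written for this illustration, not the shnitsel implementation, and it assumes the elements of to_replace are unique and sortable.

```python
import numpy as np
import xarray as xr


def replace_total_sketch(da, to_replace, value):
    """Hypothetical stand-in for illustration; not the shnitsel implementation."""
    to_replace = np.asarray(to_replace)
    value = np.asarray(value)
    # Sort the lookup keys so np.searchsorted can locate each element of da.
    order = np.argsort(to_replace)
    sorted_keys = to_replace[order]
    positions = np.searchsorted(sorted_keys, da.values)
    # Clip so that out-of-range values fail the check below instead of
    # raising an IndexError.
    positions = np.clip(positions, 0, len(sorted_keys) - 1)
    # Replacement must be total: every element of da has to be a known key.
    if not np.array_equal(sorted_keys[positions], da.values):
        raise ValueError("some elements of da are missing from to_replace")
    # Indexing into value yields an array whose dtype matches value.
    return da.copy(data=value[order][positions])


da = xr.DataArray([1, 3, 2, 1], dims="x")
result = replace_total_sketch(da, [1, 2, 3], ["a", "b", "c"])
```

Here an integer array is mapped onto strings, so the result has a string dtype even though da was integer-valued.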

midx_combs(values, name=None)
Parameters:
  • values (pandas.core.indexes.base.Index | list)

  • name (str | None)

flatten_midx(obj, idx_name, renamer=None)
Parameters:
Return type:

xarray.Dataset | xarray.DataArray

flatten_levels(obj, idx_name, levels, new_name=None, position=0, renamer=None)
Parameters:
Return type:

xarray.Dataset | xarray.DataArray

expand_midx(obj, midx_name, level_name, value)
Parameters:

obj (xarray.Dataset | xarray.DataArray)

Return type:

xarray.Dataset | xarray.DataArray

assign_levels(obj, levels=None, **levels_kwargs)

Assign new values to levels of MultiIndexes in obj

Parameters:
Return type:

A new object with the new level values replacing the old level values.

Raises:

ValueError – If levels are provided in both keyword and dictionary form.

open_frames(path)

Opens a NetCDF4 file saved by shnitsel-tools, specially interpreting certain attributes.

Parameters:

path – The path of the file to open.

Return type:

An xarray.Dataset with any MultiIndex restored.

Raises:
  • FileNotFoundError – If there is nothing at path, or path is not a file.

  • ValueError (or other exception) – Raised by the underlying h5netcdf engine if the file is corrupted.

save_frames(frames, path, complevel=9)

Save a Dataset, presumably (but not necessarily) consisting of frames of trajectories, to a file at path.

Parameters:
  • frames (omit if using accessor) – The Dataset to save

  • path – The path at which to save it

  • complevel (optional) – The level of gzip compression which will be applied to all variables in the Dataset, by default 9

Notes

This function/accessor method wraps xarray.Dataset.to_netcdf() but not numpy.any().
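xarray cannot serialize a pandas MultiIndex to NetCDF directly, which is why a MultiIndex has to be restored on loading. One common workaround, shown here as a sketch that is not necessarily what shnitsel-tools does internally, is to turn the index levels into ordinary coordinates before saving and to rebuild the index afterwards. The toy Dataset and its energy variable are made up for the example, and no file is written.

```python
import numpy as np
import xarray as xr

# Toy data: 2 trajectories with 2 time steps each (illustrative values only).
unstacked = xr.Dataset(
    {"energy": (("trajid", "time"), np.arange(4.0).reshape(2, 2))},
    coords={"trajid": [1, 2], "time": [0.0, 0.5]},
)
frames = unstacked.stack(frame=["trajid", "time"])

# Before saving: turn the MultiIndex levels into ordinary coordinates.
flat = frames.reset_index("frame")

# After loading: rebuild the MultiIndex from those coordinates.
restored = flat.set_index(frame=["trajid", "time"])
```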

split_for_saving(frames, bytes_per_chunk=50000000.0)

save_split(frames, path_template, bytes_per_chunk=50000000.0, complevel=9, ignore_errors=False)

mgroupby(obj, levels)

Group a Dataset or DataArray by several levels of a MultiIndex it contains.

Parameters:
Returns:

The grouped object, which behaves as documented at xr.Dataset.groupby() and xr.DataArray.groupby(), with the caveat that the specified levels have been "flattened" into a single MultiIndex level of tuples.

Raises:

ValueError – If no MultiIndex is found, or if the named levels belong to different MultiIndexes.

Return type:

xarray.core.groupby.DataArrayGroupBy | xarray.core.groupby.DatasetGroupBy

Warning

The function does not currently check whether the levels specified are really levels of a MultiIndex, as opposed to names of non-MultiIndex indexes.

msel(obj, **kwargs)
Parameters:

obj (xarray.Dataset | xarray.DataArray)

Return type:

xarray.Dataset | xarray.DataArray

sel_trajs(frames, trajids_or_mask, invert=False)

Select trajectories using a list of trajectory IDs or a boolean mask

Parameters:
  • frames (xarray.Dataset | xarray.DataArray) – The xr.Dataset from which a selection is to be drawn

  • trajids_or_mask (Sequence[int] | Sequence[bool]) –

    Either
    • A sequence of integers representing trajectory IDs to be included, in which case the trajectories may not be returned in the order specified.

    • Or a sequence of booleans, each indicating whether the trajectory with an ID in the corresponding entry in the Dataset’s trajid_ coordinate should be included

  • invert (optional) – Whether to invert the selection, i.e. return those trajectories not specified, by default False

Return type:

A new xr.Dataset containing only the specified trajectories

Raises:
  • NotImplementedError – When an attempt is made to index an xr.Dataset without a trajid_ dimension/coordinate using a boolean mask

  • TypeError – If trajids_or_mask has a dtype other than integer or boolean
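Why a selection by ID need not preserve the requested order can be seen in a plain xarray sketch. This is not the shnitsel implementation: it assumes a membership test on the trajid level, and the toy Dataset and its energy variable are invented for the illustration.

```python
import numpy as np
import xarray as xr

# Toy data: 3 trajectories (IDs 10, 20, 30) with 2 time steps each.
unstacked = xr.Dataset(
    {"energy": (("trajid", "time"), np.arange(6.0).reshape(3, 2))},
    coords={"trajid": [10, 20, 30], "time": [0.0, 0.5]},
)
frames = unstacked.stack(frame=["trajid", "time"])

# Membership test on the trajid level: the mask follows the order of the
# frames in the Dataset, not the order of the requested IDs.
wanted = [30, 10]
mask = np.isin(frames["trajid"].values, wanted)
selected = frames.isel(frame=mask)
inverted = frames.isel(frame=~mask)  # analogous to invert=True

# Unique trajectory IDs in the order in which they appear in the selection.
kept_ids = list(dict.fromkeys(selected["trajid"].values.tolist()))
```

Although the IDs were requested as 30 then 10, kept_ids lists 10 before 30, because the mask filters frames in their stored order.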

sel_trajids(frames, trajids, invert=False)

Does not generally return trajectories in the order given

Parameters:
Return type:

xarray.Dataset

unstack_trajs(frames)

Unstack the frame MultiIndex so that trajid and time become separate dims. Wraps the xarray.Dataset.unstack() method.

Parameters:

frames (xarray.Dataset | xarray.DataArray) – An xarray.Dataset with a frame dimension associated with a MultiIndex coordinate with levels named trajid and time. The Dataset may also have a trajid_ dimension used for variables and coordinates that store information pertaining to each trajectory in aggregate; this will be aligned along the trajid dimension of the unstacked Dataset.

Returns:

Return type:

xarray.Dataset | xarray.DataArray
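The underlying xarray operation can be sketched without shnitsel. The example below uses only xarray.Dataset.stack() and xarray.Dataset.unstack() on made-up data; it does not cover the trajid_ handling described above. It also shows a general property of unstacking: when trajectories differ in length, the missing (trajid, time) combinations are filled with NaN.

```python
import numpy as np
import xarray as xr

# Toy stacked Dataset: 2 trajectories, 3 time steps, built via stack().
unstacked_in = xr.Dataset(
    {"energy": (("trajid", "time"), np.arange(6.0).reshape(2, 3))},
    coords={"trajid": [1, 2], "time": [0.0, 0.5, 1.0]},
)
frames = unstacked_in.stack(frame=["trajid", "time"])

# Drop the last frame so that trajectory 2 is shorter than trajectory 1.
frames = frames.isel(frame=slice(0, 5))

# trajid and time become separate dims again; the gap is filled with NaN.
unstacked = frames.unstack("frame")
```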

stack_trajs(unstacked)

Stack the trajid and time dims of an unstacked Dataset into a MultiIndex along a new dimension called frame. Wraps the xarray.Dataset.stack() method.

Parameters:
Returns:

An xarray.Dataset with a frame dimension associated with a MultiIndex coordinate with levels named trajid and time. Those variables and coordinates which only depended on one of trajid or time but not the other in the unstacked Dataset will be aligned along new dimensions named trajid_ and time_. The new dimensions trajid_ and time_ will be independent of the frame dimension and its trajid and time levels.

Return type:

xarray.Dataset | xarray.DataArray