shnitsel.data.multi_indices

Attributes

DatasetOrArray

Functions

midx_combs(values[, name])

Helper function to create a Multi-index based dimension coordinate for an xarray from all (unordered) pairwise combinations of entries in values

flatten_midx(obj, idx_name[, renamer])

Function to flatten a multi-index into a flat index.

flatten_levels(obj, idx_name, levels[, new_name, ...])

expand_midx(obj, midx_name, level_name, value)

assign_levels(obj[, levels])

Assign new values to levels of MultiIndexes in obj

mgroupby(obj, levels)

Group a Dataset or DataArray by several levels of a MultiIndex it contains.

msel(obj, **kwargs)

sel_trajs(frames, trajids_or_mask[, invert])

Select trajectories using a list of trajectory IDs or a boolean mask

sel_trajids(frames, trajids[, invert])

Will not generally return trajectories in order given

unstack_trajs(frames)

Unstack the frame MultiIndex so that trajid and time become separate dims.

stack_trajs(unstacked)

Stack the trajid and time dims of an unstacked Dataset into a MultiIndex along a new dimension called frame.

mdiff(da)

Take successive differences along the 'frame' dimension

Module Contents

DatasetOrArray
midx_combs(values, name=None)

Helper function to create a Multi-index based dimension coordinate for an xarray from all (unordered) pairwise combinations of entries in values

Parameters:
  • values (pd.core.indexes.base.Index | list) – The source values to generate pairwise combinations for

  • name (str | None, optional) – Optionally a name for the resulting combination dimension. Defaults to None.

Raises:

ValueError – If no name was provided and the name could not be extracted from the values parameter

Returns:

The resulting coordinates object.

Return type:

xr.Coordinates
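The pairwise-combination idea can be sketched with plain pandas and xarray. This is a minimal illustration, not the library's implementation: the function name pairwise_coords, the level names from and to, and the "comb" suffix on the dimension name are all assumptions made for the example.

```python
from itertools import combinations

import pandas as pd
import xarray as xr


def pairwise_coords(values, name=None):
    """Illustrative stand-in; not shnitsel's midx_combs."""
    if name is None:
        # A pandas Index may carry its own name; a plain list does not.
        name = getattr(values, "name", None)
    if name is None:
        raise ValueError("pass `name` when `values` has no name of its own")
    # All unordered pairs (a, b); each pair appears exactly once.
    pairs = list(combinations(values, 2))
    # Level names and the "comb" suffix are assumptions of this sketch.
    midx = pd.MultiIndex.from_tuples(pairs, names=["from", "to"])
    return xr.Coordinates.from_pandas_multiindex(midx, f"{name}comb")


coords = pairwise_coords([1, 2, 3], name="state")
```

Three input values give three pairs, (1, 2), (1, 3) and (2, 3), stored as two levels of one MultiIndex along a single dimension.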

flatten_midx(obj, idx_name, renamer=None)

Function to flatten a multi-index into a flat index.

Has the option to provide a custom renaming function

Parameters:
  • obj (xr.Dataset | xr.DataArray) – The object with the index intended to be flattened

  • idx_name (str) – The name of the index to flatten.

  • renamer (callable | None, optional) – An optional function to carry out the renaming of the combined entry from individual entries. Defaults to None.

Returns:

The refactored object without the original index coordinates but with a combined index instead

Return type:

xr.Dataset | xr.DataArray
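As a rough sketch of what flattening means here, the snippet below replaces a two-level MultiIndex with a single flat index of labels, using only pandas and xarray. The helper name flatten_index, the sample data and the default joining rule are hypothetical; the library's own defaults may differ.

```python
import pandas as pd
import xarray as xr

midx = pd.MultiIndex.from_tuples([(1, 2), (1, 3), (2, 3)], names=["from", "to"])
da = xr.DataArray(
    [0.1, 0.2, 0.3],
    dims="statecomb",
    coords=xr.Coordinates.from_pandas_multiindex(midx, "statecomb"),
)


def flatten_index(obj, idx_name, renamer=None):
    """Illustrative stand-in; not shnitsel's flatten_midx."""
    index = obj.indexes[idx_name]
    if renamer is None:
        # Hypothetical default: join the level entries with underscores.
        renamer = lambda *parts: "_".join(str(p) for p in parts)
    # One flat label per MultiIndex entry (each entry is a tuple).
    labels = [renamer(*entry) for entry in index]
    # Drop the MultiIndex coordinate together with all of its levels,
    # then attach the flat labels under the original name.
    stripped = obj.drop_vars([idx_name, *index.names])
    return stripped.assign_coords({idx_name: labels})


flat = flatten_index(da, "statecomb", renamer=lambda a, b: f"{a}->{b}")
```

The renamer receives the entries of one MultiIndex tuple as positional arguments and returns the combined label.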

flatten_levels(obj, idx_name, levels, new_name=None, position=0, renamer=None)
Parameters:
  • obj (DatasetOrArray)

  • idx_name (str)

  • levels (Sequence[str])

  • new_name (str | None)

  • position (int)

  • renamer (Callable | None)

Return type:

DatasetOrArray

expand_midx(obj, midx_name, level_name, value)
Parameters:
  • obj (DatasetOrArray)

  • midx_name (str)

  • level_name (str)

Return type:

DatasetOrArray

assign_levels(obj, levels=None, **levels_kwargs)

Assign new values to levels of MultiIndexes in obj

Parameters:
  • obj (DatasetOrArray) – An xarray object with at least one MultiIndex

  • levels (dict[str, numpy.typing.ArrayLike] | None, optional) – A mapping whose keys are the names of the levels and whose values are the levels to assign. The mapping will be passed to xarray.DataArray.assign_coords() (or the xarray.Dataset equivalent).

  • levels_kwargs (numpy.typing.ArrayLike) – The keyword form of levels. Cannot be combined with levels.

Returns:

A new object (of the same type as obj) with the new level values replacing the old level values.

Raises:

ValueError – If levels are provided in both keyword and dictionary form.

mgroupby(obj, levels)

Group a Dataset or DataArray by several levels of a MultiIndex it contains.

Returns:

The grouped object, which behaves as documented at xr.Dataset.groupby() and xr.DataArray.groupby(), with the caveat that the specified levels have been "flattened" into a single MultiIndex level of tuples.

Raises:

ValueError – If no MultiIndex is found, or if the named levels belong to different MultiIndexes.

Return type:

xarray.core.groupby.DataArrayGroupBy | xarray.core.groupby.DatasetGroupBy

Warning

The function does not currently check whether the levels specified are really levels of a MultiIndex, as opposed to names of non-MultiIndex indexes.
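To illustrate grouping by a subset of MultiIndex levels, the sketch below does the equivalent reduction through pandas rather than through mgroupby itself. The level names trajid, kind and time and the data are made up for the example, and the pandas round trip is a stand-in technique, not how the library groups.

```python
import pandas as pd
import xarray as xr

midx = pd.MultiIndex.from_arrays(
    [
        [1, 1, 1, 1, 2, 2],              # trajid
        ["a", "a", "b", "b", "a", "a"],  # kind (hypothetical extra level)
        [0, 1, 0, 1, 0, 1],              # time
    ],
    names=["trajid", "kind", "time"],
)
da = xr.DataArray(
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    dims="frame",
    coords=xr.Coordinates.from_pandas_multiindex(midx, "frame"),
)

# Group on two of the three levels; each group key is a (trajid, kind) tuple.
series = pd.Series(da.values, index=da.indexes["frame"])
sums = series.groupby(level=["trajid", "kind"]).sum()
```

Each group is keyed by a tuple of the selected level values, which matches the "single level of tuples" caveat in the return description.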

msel(obj, **kwargs)
Parameters:

obj (xarray.Dataset | xarray.DataArray)

Return type:

xarray.Dataset | xarray.DataArray

sel_trajs(frames, trajids_or_mask, invert=False)

Select trajectories using a list of trajectory IDs or a boolean mask

Parameters:
  • frames (xarray.Dataset | xarray.DataArray) – The xr.Dataset from which a selection is to be drawn

  • trajids_or_mask (Sequence[int] | Sequence[bool]) –

    Either
    • A sequence of integers representing the trajectory IDs to include, in which case the trajectories may not be returned in the order specified.

    • Or a sequence of booleans, each indicating whether the trajectory whose ID is the corresponding entry of the Dataset’s trajid_ coordinate should be included

  • invert (bool, optional) – Whether to invert the selection, i.e. return the trajectories not specified. Defaults to False.

Returns:

A new xr.Dataset containing only the specified trajectories

Raises:
  • NotImplementedError – When an attempt is made to index an xr.Dataset without a trajid_ dimension/coordinate using a boolean mask

  • TypeError – If trajids_or_mask has a dtype other than integer or boolean
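A minimal sketch of selection by trajectory ID, written with plain numpy and xarray, shows why the output order follows the Dataset rather than the ID list. The helper select_by_ids and the sample Dataset are hypothetical, and this is not the library's implementation.

```python
import numpy as np
import pandas as pd
import xarray as xr

midx = pd.MultiIndex.from_arrays(
    [[1, 1, 2, 2, 3], [0.0, 0.5, 0.0, 0.5, 0.0]], names=["trajid", "time"]
)
frames = xr.Dataset(
    {"energy": ("frame", np.arange(5.0))},
    coords=xr.Coordinates.from_pandas_multiindex(midx, "frame"),
)


def select_by_ids(ds, trajids, invert=False):
    """Illustrative stand-in; not shnitsel's sel_trajs."""
    # Mark every frame whose trajid level is among the requested IDs.
    keep = np.isin(ds["trajid"].values, list(trajids))
    if invert:
        keep = ~keep
    # Positional indexing keeps frames in their original order.
    return ds.isel(frame=np.flatnonzero(keep))


picked = select_by_ids(frames, [3, 1])
others = select_by_ids(frames, [3, 1], invert=True)
```

Because the mask is applied in frame order, asking for [3, 1] still yields trajectory 1 before trajectory 3.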

sel_trajids(frames, trajids, invert=False)

Will not generally return trajectories in order given

Return type:

xarray.Dataset

unstack_trajs(frames)

Unstack the frame MultiIndex so that trajid and time become separate dims. Wraps the xarray.Dataset.unstack() method.

Parameters:
  • frames (DatasetOrArray) – An xarray.Dataset with a frame dimension associated with a MultiIndex coordinate with levels named trajid and time. The Dataset may also have a trajid_ dimension used for variables and coordinates that store information pertaining to each trajectory in aggregate; this will be aligned along the trajid dimension of the unstacked Dataset.

Returns:

An xarray.Dataset with independent trajid and time dimensions. Same type as frames.

Return type:

DatasetOrArray

stack_trajs(unstacked)

Stack the trajid and time dims of an unstacked Dataset into a MultiIndex along a new dimension called frame. Wraps the xarray.Dataset.stack() method.

Returns:

An xarray.Dataset with a frame dimension associated with a MultiIndex coordinate with levels named trajid and time. Those variables and coordinates which only depended on one of trajid or time but not the other in the unstacked Dataset will be aligned along new dimensions named trajid_ and time_. The new dimensions trajid_ and time_ will be independent of the frame dimension and its trajid and time levels.

Return type:

xarray.Dataset | xarray.DataArray
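The underlying xarray calls that unstack_trajs and stack_trajs are described as wrapping can be tried directly. The sketch below uses only xarray.Dataset.unstack() and xarray.Dataset.stack() on made-up data; it does not reproduce the trajid_ and time_ handling described above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Trajectories 1 and 2 have two time steps each; trajectory 3 has one.
midx = pd.MultiIndex.from_arrays(
    [[1, 1, 2, 2, 3], [0.0, 0.5, 0.0, 0.5, 0.0]], names=["trajid", "time"]
)
frames = xr.Dataset(
    {"energy": ("frame", np.arange(5.0))},
    coords=xr.Coordinates.from_pandas_multiindex(midx, "frame"),
)

# frame -> separate trajid and time dimensions; missing pairs become NaN.
unstacked = frames.unstack("frame")

# trajid and time -> one frame dimension backed by a MultiIndex.
restacked = unstacked.stack(frame=["trajid", "time"])
```

With plain xarray, the restacked object contains the full trajid by time product, so the frame that was missing for trajectory 3 reappears as a NaN entry.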

mdiff(da)

Take successive differences along the ‘frame’ dimension

Parameters:

da (xarray.DataArray) – An xarray.DataArray with a ‘frame’ dimension corresponding to a pandas.MultiIndex of which the innermost level is ‘time’.

Returns:

An xarray.DataArray with the same shape, dimension names etc., but with the data of the (i)th frame replaced by the difference between the original (i+1)th and (i)th frames, with zeros filling in for both the initial frame and any frame for which time = 0, to avoid taking differences between the last and first frames of successive trajectories.

Return type:

xarray.DataArray
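The zero-filling rule can be sketched for a one-dimensional array with numpy. This is an illustration under stated assumptions rather than the library's code: it assumes backward differences (each frame minus the previous one), which is what a zero in the initial frame suggests, and it handles only a DataArray whose sole dimension is frame.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two trajectories: three frames for trajid 1, two frames for trajid 2.
midx = pd.MultiIndex.from_arrays(
    [[1, 1, 1, 2, 2], [0.0, 0.5, 1.0, 0.0, 0.5]], names=["trajid", "time"]
)
da = xr.DataArray(
    [1.0, 3.0, 6.0, 10.0, 15.0],
    dims="frame",
    coords=xr.Coordinates.from_pandas_multiindex(midx, "frame"),
)

values = da.values
# Prepending the first value makes the first difference zero
# and keeps the output the same length as the input.
diffs = np.diff(values, prepend=values[:1])
# Zero out the first frame of every trajectory so that no difference
# is taken across a trajectory boundary.
diffs[da["time"].values == 0] = 0.0

result = da.copy(data=diffs)
```

Without the reset at time = 0, the fourth entry would be 10.0 - 6.0 = 4.0, a difference between the last frame of trajectory 1 and the first frame of trajectory 2.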