shnitsel.analyze.pca

Attributes

`principal_component_analysis`
`PCA`

`pca_and_hops`(frames, mean)	Get PCA points and info on which of them represent hops
`pairwise_dists_pca`(atXYZ[, mean, return_pca_object])	PCA-reduced pairwise interatomic distances
`pca`(da, dim[, n_components, return_pca_object])	xarray-oriented wrapper around scikit-learn's PCA

pca_and_hops(frames, mean)

Get PCA points and info on which of them represent hops

Parameters:

frames (xarray.Dataset) – A Dataset containing ‘atXYZ’ and ‘astate’ variables
mean (bool) – If True, mean-center the data before PCA

Returns:

pca_res – The PCA-reduced pairwise interatomic distances
hops_pca_coords – pca_res filtered by hops, to facilitate marking hops when plotting

Return type:

tuple[xarray.DataArray, xarray.DataArray]
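
The relationship between the two return values can be illustrated with a minimal NumPy sketch. It assumes that a "hop" is a frame whose active state differs from that of the preceding frame in the same trajectory; this page does not state the library's actual criterion, so treat that as an assumption. The helper `hop_mask` and the sample arrays are hypothetical and are not part of shnitsel.

```python
import numpy as np


def hop_mask(astate):
    """Mark frames whose active state differs from the previous frame.

    Assumption: `astate` holds the active state of ONE trajectory,
    ordered in time. The first frame is never marked as a hop.
    """
    astate = np.asarray(astate)
    mask = np.zeros(astate.shape, dtype=bool)
    mask[1:] = astate[1:] != astate[:-1]
    return mask


# Hypothetical active state per frame for a single trajectory
astate = np.array([2, 2, 2, 1, 1, 0, 0])
# Hypothetical PCA coordinates, one (PC1, PC2) row per frame
pca_points = np.arange(14, dtype=float).reshape(7, 2)

mask = hop_mask(astate)
# Keep only the PCA points that coincide with a hop
hops_pca_coords = pca_points[mask]
```

With several trajectories stacked into one frame axis, the comparison would have to be done per trajectory so that the boundary between two trajectories is not mistaken for a hop.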

pairwise_dists_pca(atXYZ, mean=False, return_pca_object=False, **kwargs)

PCA-reduced pairwise interatomic distances

Parameters:

atXYZ (shnitsel.core.typedefs.AtXYZ) – A DataArray containing the atomic positions; must have a dimension called ‘atom’
mean (bool) – If True, mean-center the data before PCA (as for pca_and_hops), by default False

Returns:

Return type:

xarray.DataArray
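
As a rough illustration of what "PCA-reduced pairwise interatomic distances" means, the following NumPy-only sketch builds one feature vector per frame from all unique atom-pair distances and projects it onto the leading principal components via an SVD. It is not shnitsel's implementation: the function names are hypothetical, the data is random, the sketch always centers the features (as standard PCA does) rather than modelling the `mean` option, and it omits any scaling step.

```python
import numpy as np


def pairwise_distance_features(xyz):
    """Turn positions of shape (n_frames, n_atoms, 3) into distance features.

    Returns an array of shape (n_frames, n_atoms * (n_atoms - 1) // 2),
    one column per unique atom pair (i < j).
    """
    xyz = np.asarray(xyz, dtype=float)
    n_atoms = xyz.shape[1]
    i, j = np.triu_indices(n_atoms, k=1)  # indices of all pairs with i < j
    return np.linalg.norm(xyz[:, i, :] - xyz[:, j, :], axis=-1)


def pca_svd(features, n_components=2):
    """Project features onto their first principal components using an SVD."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Hypothetical data: 50 frames of a 4-atom molecule
rng = np.random.default_rng(0)
xyz = rng.normal(size=(50, 4, 3))

feats = pairwise_distance_features(xyz)  # shape (50, 6)
pcs = pca_svd(feats, n_components=2)     # shape (50, 2)
```

Pairwise distances are invariant under rotation and translation of the whole molecule, which is a common reason to use them as PCA input instead of raw Cartesian coordinates.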

pca(da, dim, n_components=2, return_pca_object=False)

xarray-oriented wrapper around scikit-learn’s PCA

Parameters:

da (xarray.DataArray) – A DataArray with at least a dimension with a name matching dim
dim (str) – The name of the array-dimension to reduce (i.e. the axis along which different features lie)
n_components (int, optional) – The number of principal components to return, by default 2
return_pca_object (bool, optional) – Whether to return the scikit-learn PCA object as well as the transformed data, by default False

Returns:

pca_res – A DataArray with the same dimensions as da, except that the dimension indicated by dim is replaced by a dimension PC of size n_components.
If DataArray accessors are active, the following members are added to the accessor of the result:
- pca_res.st.loadings: The PCA loadings as a DataArray
- pca_res.st.pca_object: The scikit-learn pipeline used for PCA, including the MinMaxScaler
- pca_res.st.use_to_transform(other_da: xr.DataArray): A function that transforms its argument (other data) using the pipeline fitted to the current data
(NB: The above assumes that the accessor name is st, the default.)
[pca_object] – The trained PCA object produced by scikit-learn, if return_pca_object=True

Examples:

>>> pca_results1 = data1.st.pca('features')
>>> pca_results1.st.loadings # See the loadings
>>> pca_results2 = pca_results1.st.use_to_transform(data2)
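
The fit-once, transform-again pattern behind `use_to_transform` can be sketched with plain scikit-learn, outside of xarray. The example below assumes a pipeline of a `MinMaxScaler` followed by `PCA`, matching the description of `pca_object` above; the exact pipeline configuration used by shnitsel is not shown on this page, and the arrays `data1` and `data2` are random stand-ins.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Hypothetical data: rows are samples, columns are features
rng = np.random.default_rng(42)
data1 = rng.normal(size=(100, 5))
data2 = rng.normal(size=(20, 5))

# Scale each feature to [0, 1], then reduce to two principal components
pipeline = make_pipeline(MinMaxScaler(), PCA(n_components=2))

# Fit the scaler and the PCA on data1 and transform it in one step
reduced1 = pipeline.fit_transform(data1)

# Reuse the fitted pipeline on other data, without refitting
reduced2 = pipeline.transform(data2)

# Principal axes in feature space, one row per component; whether shnitsel's
# `loadings` applies any further scaling to these is an open assumption
components = pipeline.named_steps["pca"].components_
```

Because `data2` is transformed with the scaler and components fitted to `data1`, both reduced arrays live in the same PC space and can be drawn on the same axes.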

Return type: