ikpls.jax_ikpls_alg_1

Contains the PLS Class which implements partial least-squares regression using Improved Kernel PLS Algorithm #1 by Dayal and MacGregor: https://doi.org/10.1002/(SICI)1099-128X(199701)11:1%3C73::AID-CEM435%3E3.0.CO;2-%23

The class is implemented using JAX for end-to-end differentiability. Additionally, JAX allows CPU, GPU, and TPU execution.

Author: Ole-Christian Galbo Engstrøm

E-mail: ocge@foss.dk

Classes

PLS(center_X, center_Y, scale_X, scale_Y, ...)

Implements partial least-squares regression using Improved Kernel PLS Algorithm #1 by Dayal and MacGregor: https://doi.org/10.1002/(SICI)1099-128X(199701)11:1%3C73::AID-CEM435%3E3.0.CO;2-%23

class ikpls.jax_ikpls_alg_1.PLS(center_X: bool = True, center_Y: bool = True, scale_X: bool = True, scale_Y: bool = True, ddof: int = 1, copy: bool = True, dtype: str | type[Any] | dtype | SupportsDType = <class 'jax.numpy.float64'>, differentiable: bool = False, verbose: bool = False)

Bases: PLSBase

Implements partial least-squares regression using Improved Kernel PLS Algorithm #1 by Dayal and MacGregor: https://doi.org/10.1002/(SICI)1099-128X(199701)11:1%3C73::AID-CEM435%3E3.0.CO;2-%23

Parameters:

center_X (bool, default=True) – Whether to center X before fitting by subtracting its row of column-wise means from each row.
center_Y (bool, default=True) – Whether to center Y before fitting by subtracting its row of column-wise means from each row.
scale_X (bool, default=True) – Whether to scale X before fitting by dividing each row by the row of X’s column-wise standard deviations.
scale_Y (bool, default=True) – Whether to scale Y before fitting by dividing each row by the row of Y’s column-wise standard deviations.
ddof (int, default=1) – The delta degrees of freedom to use when computing the sample standard deviation. A value of 0 corresponds to the biased estimate of the sample standard deviation, while a value of 1 corresponds to Bessel’s correction for the sample standard deviation.
copy (bool, default=True) – Whether to copy X and Y in fit before potentially applying centering and scaling. If True, the data is copied before fitting. If False, and dtype matches the type of X and Y, centering and scaling are done in place, modifying both arrays.
dtype (DTypeLike, default=jnp.float64) – The float datatype to use in computation of the PLS algorithm. Precision lower than float64 yields increasingly worse results as the number of components grows, because numerical errors propagate.
differentiable (bool, default=False) – Whether to make the implementation end-to-end differentiable. The differentiable version is slightly slower. The two versions produce identical results. If this is True, fit and stateless_fit will not issue a warning if the residual goes below machine epsilon, and max_stable_components will not be set.
verbose (bool, default=False) – If True, each sub-function prints a message when it is JIT compiled. This is useful for tracking whether recompilation is triggered by inputs with different shapes.

Notes

Any centering and scaling is undone before predictions are returned, so that predictions are on the original scale. If both centering and scaling are enabled, the data is first centered and then scaled.
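The order of operations in this note can be illustrated with plain NumPy. This is only a sketch under stated assumptions, not the library's JAX code: the helper names preprocess and undo_y are hypothetical, and only the unweighted case is covered.

```python
import numpy as np


def preprocess(A, center=True, scale=True, ddof=1):
    """Center first, then scale; return the result and the statistics used."""
    out = np.asarray(A, dtype=np.float64)
    # Row of column-wise means, shape (1, number of columns), or None.
    mean = out.mean(axis=0, keepdims=True) if center else None
    # Row of column-wise sample standard deviations, or None.
    std = out.std(axis=0, ddof=ddof, keepdims=True) if scale else None
    if center:
        out = out - mean
    if scale:
        out = out / std
    return out, mean, std


def undo_y(Y_pred, Y_mean, Y_std):
    """Map predictions from the preprocessed scale back to the original scale."""
    if Y_std is not None:
        Y_pred = Y_pred * Y_std
    if Y_mean is not None:
        Y_pred = Y_pred + Y_mean
    return Y_pred
```

Undoing the steps in reverse order (multiply by the standard deviation, then add the mean) recovers values on the original scale.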

fit(X, Y, A, weights=None)

Fits Improved Kernel PLS Algorithm #1 on X and Y using A components.

Parameters:

X (Array of shape (N, K)) – Predictor variables.
Y (Array of shape (N, M) or (N,)) – Response variables.
A (int) – Number of components in the PLS model.
weights (Array of shape (N,) or None, optional, default=None) – Weights for each observation. If None, then all observations are weighted equally.
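For orientation, the iteration over components can be sketched in NumPy. This is a simplified reading of Improved Kernel PLS Algorithm #1, not the library's JAX implementation. It assumes X and Y are already centered and scaled, ignores weights, and takes the dominant left singular vector of the deflated X^T Y where the algorithm is usually stated with an eigenvector computation. The function name ikpls_alg_1_sketch is hypothetical.

```python
import numpy as np


def ikpls_alg_1_sketch(X, Y, A):
    """Unweighted kernel PLS sketch; returns B, W, P, Q, R, T."""
    X = np.asarray(X, dtype=np.float64)
    Y = np.asarray(Y, dtype=np.float64)
    if Y.ndim == 1:
        Y = Y[:, None]  # treat a vector response as a single column
    N, K = X.shape
    M = Y.shape[1]
    W, P, R = np.zeros((K, A)), np.zeros((K, A)), np.zeros((K, A))
    Q, T = np.zeros((M, A)), np.zeros((N, A))
    B = np.zeros((A, K, M))
    XtY = X.T @ Y  # the only matrix that is deflated
    for a in range(A):
        # X weight: dominant left singular vector of the deflated X^T Y.
        U, _, _ = np.linalg.svd(XtY, full_matrices=False)
        w = U[:, 0]
        # r maps the original X directly to the score of this component.
        r = w.copy()
        for j in range(a):
            r -= (P[:, j] @ w) * R[:, j]
        t = X @ r             # X score
        tt = t @ t
        p = (X.T @ t) / tt    # X loading
        q = (XtY.T @ r) / tt  # Y loading
        XtY = XtY - tt * np.outer(p, q)
        W[:, a], P[:, a], R[:, a], Q[:, a], T[:, a] = w, p, r, q, t
        # Coefficients for a + 1 components accumulate r q^T.
        previous = B[a - 1] if a > 0 else 0.0
        B[a] = previous + np.outer(r, q)
    return B, W, P, Q, R, T
```

With as many components as X has columns, and X of full column rank, the last slice of B should agree with the ordinary least-squares solution, which is a convenient sanity check for a sketch like this.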

A

Number of components in the PLS model.

Type: int

max_stable_components

Maximum number of components that can be used without the residual going below machine epsilon. This is not set if differentiable is True.

Type: int

B

PLS regression coefficients tensor.

Type: Array of shape (A, K, M)
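The three-dimensional shape is easiest to read as one (K, M) coefficient matrix per number of components. The NumPy sketch below uses random stand-in values rather than a fitted model, and it assumes that slice B[a - 1] holds the coefficients for a model with a components.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, M, A = 5, 4, 2, 3
X = rng.normal(size=(N, K))     # stand-in for preprocessed predictors
B = rng.normal(size=(A, K, M))  # stand-in for a fitted coefficient tensor

# Assumption: B[a - 1] holds the coefficients of the model with a components.
pred_two_components = X @ B[1]  # shape (N, M)

# Broadcasting the matrix product evaluates every component count at once.
pred_all = X @ B                # shape (A, N, M)
```

Keeping all component counts in one tensor makes it cheap to compare models of different sizes without refitting.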

W

PLS weights matrix for X.

Type: Array of shape (K, A)

P

PLS loadings matrix for X.

Type: Array of shape (K, A)

Q

PLS loadings matrix for Y.

Type: Array of shape (M, A)

R

PLS weights matrix to compute scores T directly from original X.

Type: Array of shape (K, A)

T

PLS scores matrix of X. IMPORTANT: If weights are provided, these are NOT the scores of X but weighted scores. In that case, the scores can be computed using transform.

Type: Array of shape (N, A)

X_mean

Mean of X. If centering is not performed, this is None. If weights are used, then this is the weighted mean.

Type: Array of shape (1, K) or None

Y_mean

Mean of Y. If centering is not performed, this is None. If weights are used, then this is the weighted mean.

Type: Array of shape (1, M) or None

X_std

Sample standard deviation of X. If scaling is not performed, this is None. If weights are used, then this is the weighted standard deviation.

Type: Array of shape (1, K) or None

Y_std

Sample standard deviation of Y. If scaling is not performed, this is None. If weights are used, then this is the weighted standard deviation.

Type: Array of shape (1, M) or None
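The weighted mean mentioned for X_mean and Y_mean can be illustrated with a short NumPy sketch. The helper name weighted_column_mean is hypothetical, and the code is not taken from the library. The weighted standard deviation is left out on purpose, because its exact normalization with ddof is not specified here.

```python
import numpy as np


def weighted_column_mean(X, weights=None):
    """Return the row of column-wise (weighted) means, shape (1, K)."""
    X = np.asarray(X, dtype=np.float64)
    if weights is None:
        # Unweighted case: every observation counts equally.
        return X.mean(axis=0, keepdims=True)
    w = np.asarray(weights, dtype=np.float64)
    if np.any(w < 0):
        # Mirrors the documented ValueError for negative weights.
        raise ValueError("Weights must be non-negative.")
    # Each row contributes in proportion to its weight.
    return np.average(X, axis=0, weights=w).reshape(1, -1)
```

An observation with weight 0 has no influence on the result, and equal weights reproduce the ordinary mean.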

Returns: self – Fitted model.
Return type: PLS
Raises: ValueError – If weights are provided and any weight is negative.
Warns: UserWarning – If at any point during iteration over the number of components A, the residual goes below machine epsilon.
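The relation between this warning and max_stable_components can be sketched as follows. This is an illustration only: the function name is hypothetical, and the sketch assumes the residual is summarized as one non-negative norm per component, which may differ from what the library measures internally.

```python
import warnings

import numpy as np


def max_stable_components_sketch(residual_norms, dtype=np.float64):
    """Count the components fitted before a residual norm drops below eps."""
    eps = np.finfo(dtype).eps  # machine epsilon of the chosen float type
    for i, norm in enumerate(residual_norms):
        if norm < eps:
            warnings.warn(
                f"Residual fell below machine epsilon at component {i + 1}.",
                UserWarning,
            )
            return i  # components 1..i were still stable
    return len(residual_norms)
```

Because machine epsilon is much larger for float32 than for float64, the same sequence of residuals can trigger the warning earlier at lower precision.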