di_pls — Domain-Invariant PLS (di-PLS)

Group: Calibration transfer · Registry tolerance: 1e-06

Description

Domain-invariant PLS

From the pls4all.sklearn.DIPLSRegression docstring:

Domain-invariant PLS (Nikzad-Langerodi 2018).

Registry note — Python diPLSlib.models.DIPLS (B-Analytics; Nikzad-Langerodi 2018 authors). pls4all di_pls_fit defaults to the diPLSlib algorithm (centered NIPALS, convex-relaxation penalty, target-mean rescale) — bit-for-bit parity with DIPLS(centering=True, rescale='Target'). Set cfg.di_pls_legacy = 1 to fall back to the pre-0.97.4 SIMPLS direction projection.

Parameters

Name | Type | Default | Notes
n_components | int | 2 | Number of latent components extracted (k).
di_lambda | float | 1.0 | Domain-invariance penalty weight balancing covariance alignment vs response fit.

Explanations

Bibliographic source

Nikzad-Langerodi, R., Zellinger, W., Saminger-Platz, S. & Moser, B. A. (2018). Domain-invariant partial-least-squares regression. Analytical Chemistry 90(11), 6693–6701.

Mathematical principle

Calibration transfer methods reconcile spectra acquired on different instruments or under different environmental conditions. di-PLS does this by augmenting the PLS objective with a domain-discrepancy penalty: \(\mathcal{L}(\mathbf{w}) = -\operatorname{Cov}(\mathbf{X}_s\mathbf{w}, \mathbf{y}_s)^2 + \lambda \,\mathrm{MMD}^2(\mathbf{X}_s\mathbf{w}, \mathbf{X}_t\mathbf{w})\), where \((\mathbf{X}_s, \mathbf{y}_s)\) is a labelled source domain, \(\mathbf{X}_t\) is an unlabelled target domain and MMD is the maximum mean discrepancy.

Minimising \(\mathcal{L}\) produces latent directions \(\mathbf{w}\) that simultaneously predict \(y\) in the source and have matched distributions across domains. The model is therefore robust to drift between calibration and prediction sets without requiring labels on the target domain.
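As a rough illustration of the objective above, the following NumPy sketch evaluates \(\mathcal{L}(\mathbf{w})\) for two candidate directions. It assumes a linear kernel, under which the MMD of the projected samples reduces to the squared difference of their mean scores. The function name di_pls_objective and the synthetic data are hypothetical and not part of pls4all or diPLSlib.

```python
import numpy as np

def di_pls_objective(w, X_s, y_s, X_t, lam):
    """Evaluate -Cov(X_s w, y_s)^2 + lam * MMD^2(X_s w, X_t w).

    Assumption: linear-kernel MMD, i.e. the squared difference between
    the mean source score and the mean target score.
    """
    t_s = X_s @ w                      # source scores, shape (n_s,)
    t_t = X_t @ w                      # target scores, shape (n_t,)
    cov = np.cov(t_s, y_s)[0, 1]       # sample covariance (ddof=1)
    mmd2 = (t_s.mean() - t_t.mean()) ** 2
    return -cov ** 2 + lam * mmd2

rng = np.random.default_rng(0)
X_s = rng.normal(size=(200, 5))
y_s = X_s[:, 0] + 0.1 * rng.normal(size=200)
# Synthetic target domain: same spectra with a constant offset on feature 1.
X_t = X_s + np.array([0.0, 2.0, 0.0, 0.0, 0.0])

w_clean = np.array([1.0, 0.0, 0.0, 0.0, 0.0])               # ignores the drifting feature
w_drift = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2)  # loads on the drifting feature

loss_clean = di_pls_objective(w_clean, X_s, y_s, X_t, lam=1.0)
loss_drift = di_pls_objective(w_drift, X_s, y_s, X_t, lam=1.0)
```

The function never receives target labels, which matches the unlabelled-target setting described above. In this toy setup the direction that loads on the drifting feature should receive the larger loss, because it pays the discrepancy penalty while also fitting the response less well.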

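Computational cost is dominated by the MMD term, which needs \(O((n_s + n_t)^2)\) kernel evaluations in a naive implementation; pls4all uses a linear-kernel MMD, which reduces this to \(O((n_s + n_t) p)\), where \(n_s\) and \(n_t\) are the source and target sample counts and \(p\) is the number of predictor variables.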
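The cost reduction can be checked numerically. The sketch below is an illustration only, not the pls4all kernel: it assumes the biased (V-statistic) MMD estimator with the kernel \(k(u, v) = uv\) on the projected scores, and compares the pairwise form with the equivalent form computed from the two domain means. Both helper names are hypothetical.

```python
import numpy as np

def mmd2_linear_pairwise(a, b):
    """Biased MMD^2 with k(u, v) = u * v, averaged over all sample pairs."""
    k_aa = np.outer(a, a).mean()       # n_s^2 kernel evaluations
    k_bb = np.outer(b, b).mean()       # n_t^2 kernel evaluations
    k_ab = np.outer(a, b).mean()       # n_s * n_t kernel evaluations
    return k_aa + k_bb - 2.0 * k_ab

def mmd2_linear_means(X_s, X_t, w):
    """Same quantity from the two domain means: O((n_s + n_t) * p) work."""
    delta = X_s.mean(axis=0) - X_t.mean(axis=0)
    return float(delta @ w) ** 2

rng = np.random.default_rng(1)
X_s = rng.normal(size=(60, 8))
X_t = rng.normal(loc=0.3, size=(40, 8))
w = rng.normal(size=8)

slow = mmd2_linear_pairwise(X_s @ w, X_t @ w)
fast = mmd2_linear_means(X_s, X_t, w)
```

The two values agree up to floating-point rounding because the pairwise averages factor into products of the mean scores.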

\(\lambda\) controls the bias–transferability trade-off: \(\lambda = 0\) recovers ordinary PLS on the source, whereas a large \(\lambda\) pushes the solution toward a domain-aligned but potentially under-predictive model.
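The two limits can be seen on the first latent direction. Under the linear-kernel assumption the objective is the quadratic form \(\mathbf{w}^\top(\lambda\,\mathbf{d}\mathbf{d}^\top - \mathbf{c}\mathbf{c}^\top)\mathbf{w}\), with \(\mathbf{c}\) the source covariance vector between \(\mathbf{X}_s\) and \(\mathbf{y}_s\) and \(\mathbf{d}\) the difference of domain means, so its unit-norm minimiser is the eigenvector with the smallest eigenvalue. The sketch below is a simplification for illustration: it covers a single direction, ignores deflation, and is not the NIPALS procedure used by diPLSlib or pls4all. The name first_direction is hypothetical.

```python
import numpy as np

def first_direction(X_s, y_s, X_t, lam):
    """Unit-norm minimiser of -(c @ w)**2 + lam * (d @ w)**2.

    Assumption: linear-kernel MMD, so the penalty depends only on the
    difference of domain means d.
    """
    Xc = X_s - X_s.mean(axis=0)
    yc = y_s - y_s.mean()
    c = Xc.T @ yc / (len(y_s) - 1)             # sample covariance of each feature with y
    d = X_s.mean(axis=0) - X_t.mean(axis=0)    # domain mean shift
    M = lam * np.outer(d, d) - np.outer(c, c)
    _, vecs = np.linalg.eigh(M)                # eigenvalues in ascending order
    return vecs[:, 0], c, d

rng = np.random.default_rng(2)
X_s = rng.normal(size=(200, 6))
y_s = X_s[:, 0] + X_s[:, 1] + 0.1 * rng.normal(size=200)
# Synthetic target domain: constant offset on feature 1.
X_t = X_s + np.array([0.0, 1.5, 0.0, 0.0, 0.0, 0.0])

w0, c, d = first_direction(X_s, y_s, X_t, lam=0.0)
w_big, _, _ = first_direction(X_s, y_s, X_t, lam=1e4)

pls_alignment = abs(w0 @ c) / np.linalg.norm(c)   # equals 1 when w0 is parallel to c
drift_small_lam = abs(w0 @ d)                     # mean shift seen along w0
drift_large_lam = abs(w_big @ d)                  # mean shift seen along w_big
```

At \(\lambda = 0\) the direction is parallel to \(\mathbf{X}_s^\top \mathbf{y}_s\) (centred), which is the first PLS1 weight vector. At large \(\lambda\) the direction becomes nearly orthogonal to the mean shift, so it gives up whatever predictive information lies along the drifting feature.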

Implementation

n4m_di_pls_fit requires X_target at fit time. Reference: Python diPLSlib.models.DIPLS (Nikzad-Langerodi authors). The pls4all default matches diPLSlib’s DIPLS(centering=True, rescale='Target') configuration.

MATLAB header (bindings/matlab/+pls4all/DiPlsRegression.m):

pls4all.DiPlsRegression  Domain-Invariant PLS regression.

Usage

Every pls4all binding tab dispatches into the same C kernel; the external libraries listed at the bottom of the page are the parity references registered in benchmarks.parity_timing.registry. Switch tabs to read the same fit in your language. The R package now ships drop-in-compatible facades for the CRAN pls package (plsr, pcr, mvr) and for the mdatools::pls(x, y, ...) matrix idiom — those tabs appear only on the methods that have a meaningful equivalence.

pls4all bindings

C ABI (libn4m):

n4m_context_t* ctx = n4m_context_create();
n4m_config_t*  cfg = n4m_config_create();
n4m_method_result_t* res = NULL;
n4m_di_pls_fit(ctx, cfg, &x_view, &y_view, /* hyperparams */, &res);
/* … read coefficients / mask / scores via */
/* n4m_method_result_get_double_matrix / vector / scalar … */
n4m_method_result_destroy(res);
n4m_config_destroy(cfg);
n4m_context_destroy(ctx);

Python (pls4all._methods):

import pls4all
from pls4all._methods import di_pls_fit
with pls4all.Context() as ctx, pls4all.Config() as cfg:
    res = di_pls_fit(ctx, cfg, X, y, n_components=4, X_target=X_target)
# then: res.matrix("predictions"), res.matrix("coefficients"),
# res.vector("mask"), res.scalar("intercept"), …

Python (pls4all.sklearn):

from pls4all.sklearn import DIPLSRegression
mdl = DIPLSRegression(n_components=2, di_lambda=1.0)
mdl.fit(X, y, X_target=X_target)
y_hat = mdl.predict(X_test)

R:

library(pls4all)
# Unified low-level dispatcher (May 2026 R cleanup):
res <- pls4all_method("di_pls", X, y,
                      n_components = 4L, params = list(di_lambda = 1.0))
# res is a named list with MethodResult arrays/scalars.
# selected_indices / top_k_intervals are 1-based.

MATLAB:

res = pls4all.di_pls(X, y, 4);
% see header of bindings/matlab/+pls4all/di_pls.m for full
% parameter surface:
%   res = di_pls(X_source, Y_source, n_components, X_target, di_lambda)
yhat = predict(res, Xtest);

mdl  = pls4all.fit("di_pls", X, y, "NumComponents", 4);
yhat = predict(mdl, Xtest);

Registry parity references 📐

  • 📐 ref.python_diplslib (python · python) — diPLSlib 2.5.0 · strict (rmse_rel ≤ 1e-06) — Python diPLSlib.models.DIPLS (B-Analytics; Nikzad-Langerodi 2018 authors). Same di-PLS penalty applied during deflation; centering / target rescaling differ slightly, so tolerance is widened.

Benchmarks

Adaptive wall-clock per cell measured against full_matrix.csv. Only backends that implement this method are listed; libraries without the method are omitted.

Verdict  ·  ✓ ref / ≈ ref / ~ shape mark a reference-gate pass at strict / relaxed / qualitative tolerance  ·  ✓ bind = pls4all binding agrees with the C++ baseline  ·  ⇄ cross-check = documented by-design selector/RNG/model, noncanonical API/facade convention, or secondary oracle  ·  ✗ divergent  ·  ⚠ error  ·  — not run. The fastest backend per column is marked 🏆.

Reference gate: strict — numeric equivalence (rmse_rel_tol 1e-06).
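This page does not spell out how rmse_rel is computed. As one plausible reading, the sketch below takes the RMSE between a backend's predictions and the reference predictions and divides it by the RMS of the reference. The normalisation and both function names are assumptions made for illustration; only the 1e-06 threshold comes from the gate quoted above.

```python
import numpy as np

def rmse_rel(y_backend, y_reference):
    """RMSE of the difference, relative to the RMS of the reference.

    Assumption: this normalisation is a guess at the registry's metric.
    """
    y_backend = np.asarray(y_backend, dtype=float)
    y_reference = np.asarray(y_reference, dtype=float)
    rmse = np.sqrt(np.mean((y_backend - y_reference) ** 2))
    scale = np.sqrt(np.mean(y_reference ** 2))
    return rmse / scale

def passes_strict_gate(y_backend, y_reference, tol=1e-6):
    """True when the relative RMSE is within the strict tolerance."""
    return rmse_rel(y_backend, y_reference) <= tol

reference = np.linspace(1.0, 2.0, 50)
ok = passes_strict_gate(reference * (1.0 + 1e-12), reference)   # rounding-level gap
bad = passes_strict_gate(reference * 1.01, reference)           # 1 % systematic gap
```

Under this reading, the parity values of order 1e-15 reported in the tables below sit many orders of magnitude inside the strict band.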

Rows tagged with 📐 are the canonical parity references for this method (declared in parity_timing.registry). C++ and external rows show reference parity; pls4all language bindings show binding parity against the C++ backend. Hover the icon for role and tolerance band.

Backend | Parity | 200×50 (ms)
C++ native · libn4m
pls4all.cpp.blas+omp | ✓ ref 8e-15 | 2.58 ms 🏆
Python · pls4all
pls4all.python | ✓ bind | 2.74 ms
pls4all.sklearn | ✓ 4e-15 | 2.82 ms
R · pls4all
pls4all.R | ✓ 6e-15 | 9.69 ms
pls4all.R.formula | ✓ 6e-15 | 12.4 ms
pls4all.R.mdatools | ✓ 6e-15 | 12.8 ms
pls4all.R.pls | ✓ 6e-15 | 12.4 ms
Python · external
📐 ref.python_diplslib | source | 3.78 ms

Backend | Parity | 200×50 (ms)
C++ native · libn4m
pls4all.cpp.blas+omp | ✓ ref 8e-15 | 2.80 ms
Python · pls4all
pls4all.python | ✓ bind | 2.70 ms 🏆
pls4all.sklearn | ✓ 4e-15 | 2.98 ms
R · pls4all
pls4all.R | ✓ 6e-15 | 9.91 ms
pls4all.R.formula | ✓ 6e-15 | 12.2 ms
pls4all.R.mdatools | ✓ 6e-15 | 12.9 ms
pls4all.R.pls | ✓ 6e-15 | 11.2 ms
Python · external
📐 ref.python_diplslib | source | 3.79 ms

Backend | Parity | 200×50 (ms)
C++ native · libn4m
pls4all.cpp.blas+omp | ✓ ref 8e-15 | 2.74 ms 🏆
Python · pls4all
pls4all.python | ✓ bind | 2.82 ms
pls4all.sklearn | ✓ 4e-15 | 2.95 ms
R · pls4all
pls4all.R | ✓ 6e-15 | 11.1 ms
pls4all.R.formula | ✓ 6e-15 | 14.0 ms
pls4all.R.mdatools | ✓ 6e-15 | 13.0 ms
pls4all.R.pls | ✓ 6e-15 | 12.4 ms
Python · external
📐 ref.python_diplslib | source | 3.92 ms
