di_pls: Domain-Invariant PLS (di-PLS)
Group: Calibration transfer · Registry tolerance: 1e-06
Description
Domain-invariant PLS
From the pls4all.sklearn.DIPLSRegression docstring:
Domain-invariant PLS (Nikzad-Langerodi 2018).
Registry note — Python
diPLSlib.models.DIPLS (B-Analytics; Nikzad-Langerodi 2018 authors). pls4all di_pls_fit defaults to the diPLSlib algorithm (centered NIPALS, convex-relaxation penalty, target-mean rescale), in bit-for-bit parity with DIPLS(centering=True, rescale='Target'). Set cfg.di_pls_legacy = 1 to fall back to the pre-0.97.4 SIMPLS direction projection.
Parameters

| Name | Type | Default | Notes |
|---|---|---|---|
|  |  |  | Number of latent components extracted (k). |
|  |  |  | Domain-invariance penalty weight balancing covariance alignment vs response fit. |

Explanations
Bibliographic source
Nikzad-Langerodi, R., Zellinger, W., Saminger-Platz, S. & Moser, B. A. (2018). Domain-invariant partial-least-squares regression. Analytical Chemistry 90(11), 6693–6701.
Mathematical principle
Calibration transfer methods reconcile spectra acquired on different instruments or under different environmental conditions. di-PLS does this by augmenting the PLS objective with a domain-discrepancy penalty: \(\mathcal{L}(\mathbf{w}) = -\operatorname{Cov}(\mathbf{X}_s\mathbf{w}, \mathbf{y}_s)^2 + \lambda \,\mathrm{MMD}^2(\mathbf{X}_s\mathbf{w}, \mathbf{X}_t\mathbf{w})\), where \((\mathbf{X}_s, \mathbf{y}_s)\) is a labelled source domain, \(\mathbf{X}_t\) is an unlabelled target domain and MMD is the maximum mean discrepancy.
Minimising \(\mathcal{L}\) produces latent directions \(\mathbf{w}\) whose scores both predict \(y\) in the source domain and are distributed similarly across the two domains. The model is therefore designed to tolerate drift between calibration and prediction sets without requiring labels on the target domain.
Computational cost is dominated by the MMD term, which is \(O((n_s + n_t)^2)\) in a naive implementation, where \(n_s\) and \(n_t\) are the numbers of source and target samples. pls4all uses a linear-kernel MMD, which reduces this to \(O((n_s + n_t) p)\) for \(p\) variables.
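As a rough illustration of the objective and of why the linear kernel is cheap, the sketch below evaluates \(\mathcal{L}(\mathbf{w})\) for one candidate direction in NumPy. It is not the pls4all or diPLSlib implementation: the function names (`mmd2_linear_naive`, `mmd2_linear_fast`, `di_pls_loss`), the synthetic data and the choice of the biased MMD estimator are assumptions made for the example. With the linear kernel \(k(u, v) = uv\), the biased estimate of \(\mathrm{MMD}^2\) collapses to the squared difference of the projected means, so no Gram matrix is needed.

```python
import numpy as np

def mmd2_linear_naive(a, b):
    """Biased MMD^2 with the linear kernel k(u, v) = u * v, via full Gram matrices."""
    k_aa = np.outer(a, a)          # (n_s, n_s)
    k_bb = np.outer(b, b)          # (n_t, n_t)
    k_ab = np.outer(a, b)          # (n_s, n_t)
    return k_aa.mean() + k_bb.mean() - 2.0 * k_ab.mean()

def mmd2_linear_fast(a, b):
    """Same quantity without Gram matrices: squared gap between the score means."""
    return (a.mean() - b.mean()) ** 2

def di_pls_loss(w, X_s, y_s, X_t, lam):
    """Loss as written in the text: -Cov(X_s w, y_s)^2 + lam * MMD^2(X_s w, X_t w)."""
    t_s = X_s @ w                  # source scores
    t_t = X_t @ w                  # target scores
    cov = np.cov(t_s, y_s)[0, 1]   # sample covariance between scores and response
    return -cov**2 + lam * mmd2_linear_fast(t_s, t_t)

# Synthetic data (assumption): the target domain carries a constant offset.
rng = np.random.default_rng(0)
n_s, n_t, p = 200, 150, 50
X_s = rng.normal(size=(n_s, p))
X_t = rng.normal(size=(n_t, p)) + 0.3
y_s = X_s[:, :3].sum(axis=1) + 0.1 * rng.normal(size=n_s)

w = rng.normal(size=p)
w /= np.linalg.norm(w)

naive = mmd2_linear_naive(X_s @ w, X_t @ w)
fast = mmd2_linear_fast(X_s @ w, X_t @ w)
loss = di_pls_loss(w, X_s, y_s, X_t, lam=1.0)
print(f"naive={naive:.6g}  fast={fast:.6g}  loss={loss:.6g}")
```

On the same scores the two MMD estimates should agree to floating-point precision; the fast form needs only one matrix-vector product per domain.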
\(\lambda\) controls the bias–transferability trade-off: \(\lambda = 0\) recovers vanilla PLS on the source, while a large \(\lambda\) shrinks the solution toward a domain-aligned but potentially under-predictive model.
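The trade-off can be made concrete for a single component. Assuming the linear-kernel form of the penalty and a unit-norm constraint on \(\mathbf{w}\) (both assumptions of this sketch), the loss is the quadratic form \(-\mathbf{w}^\top(\mathbf{c}\mathbf{c}^\top - \lambda\,\mathbf{d}\mathbf{d}^\top)\mathbf{w}\), where \(\mathbf{c}\) is the covariance vector between the source columns and \(\mathbf{y}_s\) and \(\mathbf{d}\) is the difference of the domain column means. Its minimiser is the leading eigenvector of \(\mathbf{c}\mathbf{c}^\top - \lambda\,\mathbf{d}\mathbf{d}^\top\). The helper `first_direction` is hypothetical and only illustrates the formula above; it is not the NIPALS-based algorithm that pls4all or diPLSlib run.

```python
import numpy as np

def first_direction(X_s, y_s, X_t, lam):
    """Unit-norm w minimising -(c @ w)**2 + lam * (d @ w)**2 (one-component sketch)."""
    n_s = X_s.shape[0]
    # c @ w equals the sample covariance Cov(X_s w, y_s).
    c = (X_s - X_s.mean(axis=0)).T @ (y_s - y_s.mean()) / (n_s - 1)
    # d @ w equals mean(X_s w) - mean(X_t w), the linear-kernel MMD before squaring.
    d = X_s.mean(axis=0) - X_t.mean(axis=0)
    M = np.outer(c, c) - lam * np.outer(d, d)
    _, eigvecs = np.linalg.eigh(M)     # eigenvalues returned in ascending order
    return eigvecs[:, -1], c, d        # eigenvector of the largest eigenvalue

# Synthetic data (assumption): target spectra carry a sloped baseline shift.
rng = np.random.default_rng(1)
X_s = rng.normal(size=(200, 50))
X_t = rng.normal(size=(150, 50)) + np.linspace(0.0, 0.5, 50)
y_s = X_s[:, :3].sum(axis=1) + 0.1 * rng.normal(size=200)

for lam in (0.0, 1.0, 1e6):
    w, c, d = first_direction(X_s, y_s, X_t, lam)
    cos_c = abs(w @ c) / np.linalg.norm(c)   # alignment with the plain PLS weight
    gap = abs(w @ d)                          # remaining gap between domain score means
    print(f"lambda={lam:g}  |cos(w, c)|={cos_c:.3f}  |mean gap|={gap:.2e}")
```

At \(\lambda = 0\) the direction coincides with the first PLS1 weight (proportional to \(\mathbf{c}\)); as \(\lambda\) grows the direction is pushed orthogonal to \(\mathbf{d}\), which closes the mean gap at the price of a smaller covariance with \(\mathbf{y}_s\).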
Implementation
n4m_di_pls_fit requires X_target at fit time. Reference implementation: Python diPLSlib.models.DIPLS (by the Nikzad-Langerodi 2018 authors). The pls4all variant matches diPLSlib's rescale='Target' source-centred default.
MATLAB header (bindings/matlab/+pls4all/DiPlsRegression.m):
pls4all.DiPlsRegression Domain-Invariant PLS regression.
Usage
Every pls4all binding tab dispatches into the same C kernel; the external libraries listed at the bottom of the page are the parity references registered in benchmarks.parity_timing.registry. Switch tabs to read the same fit in your language. The R package now ships drop-in-compatible facades for the CRAN pls package (plsr, pcr, mvr) and for the mdatools::pls(x, y, ...) matrix idiom — those tabs appear only on the methods that have a meaningful equivalence.
pls4all bindings
/* C ABI — libn4m */
n4m_context_t* ctx = n4m_context_create();
n4m_config_t* cfg = n4m_config_create();
n4m_method_result_t* res = NULL;
n4m_di_pls_fit(ctx, cfg, &x_view, &y_view, /* hyperparams */, &res);
/* … read coefficients / mask / scores via */
/* n4m_method_result_get_double_matrix / vector / scalar … */
n4m_method_result_destroy(res);
n4m_config_destroy(cfg);
n4m_context_destroy(ctx);
import pls4all
from pls4all._methods import di_pls_fit
with pls4all.Context() as ctx, pls4all.Config() as cfg:
    res = di_pls_fit(ctx, cfg, X, y, n_components=4, X_target=X_target)
    # then: res.matrix("predictions"), res.matrix("coefficients"),
    # res.vector("mask"), res.scalar("intercept"), …
from pls4all.sklearn import DIPLSRegression
mdl = DIPLSRegression(n_components=2, di_lambda=1.0)
mdl.fit(X, y, X_target=X_target)
y_hat = mdl.predict(X_test)
library(pls4all)
# Unified low-level dispatcher (May 2026 R cleanup):
res <- pls4all_method("di_pls", X, y,
                      n_components = 4L, params = list(di_lambda = 1.0))
# res is a named list with MethodResult arrays/scalars.
# selected_indices / top_k_intervals are 1-based.
res = pls4all.di_pls(X, y, 4);
% see header of bindings/matlab/+pls4all/di_pls.m for full
% parameter surface:
% res = di_pls(X_source, Y_source, n_components, X_target, di_lambda)
yhat = predict(res, Xtest);
mdl = pls4all.fit("di_pls", X, y, "NumComponents", 4);
yhat = predict(mdl, Xtest);
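The snippets above assume that the arrays `X`, `y`, `X_target` and `X_test` already exist. For a quick trial run, placeholder inputs can be generated with NumPy as sketched below. Everything here is an assumption made for illustration: the sizes, the `drift` baseline shift and the sparse coefficient vector `beta` are synthetic and do not come from the pls4all test suite.

```python
import numpy as np

rng = np.random.default_rng(42)
n_source, n_target, n_test, p = 200, 100, 50, 50

beta = np.zeros(p)
beta[:5] = 1.0                       # only the first five variables carry signal
drift = np.linspace(0.0, 0.5, p)     # hypothetical baseline shift of the target instrument

# Labelled source domain.
X = rng.normal(size=(n_source, p))
y = X @ beta + 0.05 * rng.normal(size=n_source)

# Unlabelled target domain: same kind of samples, shifted baseline.
X_target = rng.normal(size=(n_target, p)) + drift

# Held-out target-domain samples with known responses, for checking predictions.
X_test_clean = rng.normal(size=(n_test, p))
X_test = X_test_clean + drift
y_test = X_test_clean @ beta + 0.05 * rng.normal(size=n_test)
```

Only `X`, `y` and `X_target` are passed to the fit; `y_test` is kept aside to score predictions on `X_test`.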
Registry parity references 📐
📐 ref.python_diplslib (python · python): diPLSlib 2.5.0 · strict (rmse_rel ≤ 1e-06). Python diPLSlib.models.DIPLS (B-Analytics; Nikzad-Langerodi 2018 authors). Same di-PLS penalty applied during deflation; centering / target rescaling differ slightly, so tolerance is widened.
Benchmarks
Adaptive wall-clock per cell measured against full_matrix.csv. Only backends that implement this method are listed; libraries without the method are omitted.
Verdict · ✓ ref / ≈ ref / ~ shape mark a reference-gate pass at strict / relaxed / qualitative tolerance · ✓ bind = pls4all binding agrees with the C++ baseline · ⇄ cross-check = documented by-design selector/RNG/model, noncanonical API/facade convention, or secondary oracle · ✗ divergent · ⚠ error · — not run. The fastest backend per column is marked 🏆.
Reference gate: strict — numeric equivalence (rmse_rel_tol ≤ 1e-06).
Rows tagged with 📐 are the canonical parity references for this method (declared in parity_timing.registry). C++ and external rows show reference parity; pls4all language bindings show binding parity against the C++ backend. Hover the icon for role and tolerance band.
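To make the strict gate tangible, the sketch below shows one plausible way to compute a relative RMSE and compare it with the 1e-06 tolerance. The exact definition used by parity_timing.registry is not given on this page, so the normalisation (RMSE of the difference divided by the RMS of the reference) and the names `rmse_rel` and `passes_strict_gate` are assumptions.

```python
import numpy as np

def rmse_rel(candidate, reference):
    """RMSE of (candidate - reference), normalised by the RMS of the reference (assumed definition)."""
    candidate = np.asarray(candidate, dtype=float)
    reference = np.asarray(reference, dtype=float)
    rmse = np.sqrt(np.mean((candidate - reference) ** 2))
    scale = np.sqrt(np.mean(reference ** 2))
    # Fall back to the absolute RMSE when the reference is identically zero.
    return rmse / scale if scale > 0 else rmse

def passes_strict_gate(candidate, reference, tol=1e-6):
    """True when the relative RMSE is within the strict tolerance quoted above."""
    return rmse_rel(candidate, reference) <= tol

reference = np.linspace(1.0, 2.0, 200)
close = reference * (1.0 + 1e-9)     # tiny relative perturbation
off = reference * (1.0 + 1e-3)       # perturbation far above the tolerance

print(passes_strict_gate(close, reference), passes_strict_gate(off, reference))
```

Under this definition, parity values such as 8e-15 in the tables below sit many orders of magnitude inside the strict band.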
| Backend | Parity | 200×50 (ms) |
|---|---|---|
| C++ native · libn4m | | |
| pls4all.cpp.blas+omp | ✓ ref 8e-15 | 2.58 ms 🏆 |
| Python · pls4all | | |
| pls4all.python | ✓ bind | 2.74 ms |
| pls4all.sklearn | ✓ 4e-15 | 2.82 ms |
| R · pls4all | | |
| pls4all.R | ✓ 6e-15 | 9.69 ms |
| pls4all.R.formula | ✓ 6e-15 | 12.4 ms |
| pls4all.R.mdatools | ✓ 6e-15 | 12.8 ms |
| pls4all.R.pls | ✓ 6e-15 | 12.4 ms |
| Python · external | | |
| 📐 ref.python_diplslib | source | 3.78 ms |

| Backend | Parity | 200×50 (ms) |
|---|---|---|
| C++ native · libn4m | | |
| pls4all.cpp.blas+omp | ✓ ref 8e-15 | 2.80 ms |
| Python · pls4all | | |
| pls4all.python | ✓ bind | 2.70 ms 🏆 |
| pls4all.sklearn | ✓ 4e-15 | 2.98 ms |
| R · pls4all | | |
| pls4all.R | ✓ 6e-15 | 9.91 ms |
| pls4all.R.formula | ✓ 6e-15 | 12.2 ms |
| pls4all.R.mdatools | ✓ 6e-15 | 12.9 ms |
| pls4all.R.pls | ✓ 6e-15 | 11.2 ms |
| Python · external | | |
| 📐 ref.python_diplslib | source | 3.79 ms |

| Backend | Parity | 200×50 (ms) |
|---|---|---|
| C++ native · libn4m | | |
| pls4all.cpp.blas+omp | ✓ ref 8e-15 | 2.74 ms 🏆 |
| Python · pls4all | | |
| pls4all.python | ✓ bind | 2.82 ms |
| pls4all.sklearn | ✓ 4e-15 | 2.95 ms |
| R · pls4all | | |
| pls4all.R | ✓ 6e-15 | 11.1 ms |
| pls4all.R.formula | ✓ 6e-15 | 14.0 ms |
| pls4all.R.mdatools | ✓ 6e-15 | 13.0 ms |
| pls4all.R.pls | ✓ 6e-15 | 12.4 ms |
| Python · external | | |
| 📐 ref.python_diplslib | source | 3.92 ms |
