`pls_glm` — PLS-GLM (Generalised Linear Model PLS)¶

Group: Classification & GLM · Registry tolerance: 1e-06

Description¶

PLS-GLM (§5) — softmax/Poisson IRLS on PLS scores

From the pls4all.sklearn.PLSGLMRegressor docstring:

PLS + Generalised Linear Model head (Bastien 2005).

Registry note — R plsRglm::plsRglm (Bastien, Vinzi & Tenenhaus 2005) with scaleX=FALSE. pls4all’s default now mirrors the plsRglm algorithm exactly: per-component partial-regression weights (Gaussian-identity uses closed-form OLS; Poisson-log uses IRLS), score-space GLM coefficients, and per-target stacking. The legacy single-pass C++ kernel (centred SIMPLS + column-mean intercept) is opt-in via legacy=True.
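The per-component partial-regression idea in the registry note can be illustrated for the Gaussian-identity case with a short NumPy sketch. This is not the pls4all kernel or the plsRglm source: the function name `partial_regression_scores`, the centring-only preprocessing (chosen to echo scaleX=FALSE) and the deflation step are assumptions made for illustration.

```python
import numpy as np

def partial_regression_scores(X, y, n_components):
    """Hypothetical sketch: Gaussian-identity partial-regression PLS scores."""
    Xh = X - X.mean(axis=0)            # centre only (assumed to mirror scaleX=FALSE)
    yc = y - y.mean()
    n, p = Xh.shape
    T = np.zeros((n, 0))               # scores extracted so far
    for _ in range(n_components):
        a = np.empty(p)
        for j in range(p):
            # Closed-form OLS of y on [previous scores, x_j];
            # the weight is the coefficient attached to x_j.
            D = np.column_stack([T, Xh[:, j]])
            coef, *_ = np.linalg.lstsq(D, yc, rcond=None)
            a[j] = coef[-1]
        w = a / np.linalg.norm(a)      # normalised weight vector
        t = Xh @ w                     # new score (w'w = 1)
        T = np.column_stack([T, t])
        # Deflate X so that later scores are orthogonal to t.
        Xh = Xh - np.outer(t, t @ Xh) / (t @ t)
    # Score-space coefficients: regress y on the scores.
    c, *_ = np.linalg.lstsq(T, yc, rcond=None)
    return T, c, y.mean()

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
y = X @ np.array([1.0, -0.5, 0.0, 0.3, 0.0, 0.2]) + 0.1 * rng.normal(size=50)
T, c, intercept = partial_regression_scores(X, y, n_components=2)
y_hat = intercept + T @ c              # in-sample fitted values
```

For the Poisson-log family, the inner OLS fit would be replaced by an IRLS fit of the same small design matrix; the surrounding loop stays the same under these assumptions.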

Parameters¶

Name	Type	Default	Notes
`n_components`	`int`	`2`	Number of latent components extracted (k).
`poisson`	`bool`	`False`	If True, fit a Poisson (log-link) PLS-GLM; the default False fits the Gaussian family with the identity link.
`n_targets`	`int`	`3`	registry benchmark cell value

Explanations¶

Bibliographic source¶

Marx, B. D. (1996). Iteratively reweighted partial least squares estimation for generalized linear regression. Technometrics 38(4), 374–381.

Mathematical principle¶

PLS-GLM generalises PLS-logistic to any GLM family. The IRLS recipe is the same in every case: derive a working response from the current linear predictor, fit PLS with the GLM weights, and iterate until the coefficients stabilise. Only the link function changes: identity for Gaussian, log for Poisson, logit for Bernoulli/binomial.
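The IRLS loop for the Poisson-log case can be sketched as follows. To keep the example short it uses ordinary weighted least squares on a fixed score matrix `T` in place of the weighted PLS step, so it illustrates the working response and weights rather than the full algorithm. The function name `poisson_irls` and the simulated data are assumptions, not part of pls4all.

```python
import numpy as np

def poisson_irls(T, y, n_iter=50, tol=1e-10):
    """Hypothetical sketch: Poisson log-link GLM of y on scores T via IRLS."""
    D = np.column_stack([np.ones(len(y)), T])   # intercept + scores
    beta = np.zeros(D.shape[1])
    beta[0] = np.log(y.mean())                  # start from the null model
    for _ in range(n_iter):
        eta = D @ beta                          # current linear predictor
        mu = np.exp(eta)                        # inverse of the log link
        z = eta + (y - mu) / mu                 # working response
        w = mu                                  # IRLS weights for Poisson-log
        # Weighted least squares step: (D' W D) beta = D' W z
        beta_new = np.linalg.solve(D.T @ (w[:, None] * D), D.T @ (w * z))
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

rng = np.random.default_rng(1)
T = rng.normal(size=(200, 2))                   # stand-in for PLS scores
y = rng.poisson(np.exp(0.5 + 0.3 * T[:, 0] - 0.2 * T[:, 1])).astype(float)
beta = poisson_irls(T, y)
```

For the Gaussian-identity family the working response equals `y` and the weights are constant, so the loop reduces to a single ordinary least-squares fit.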

pls4all currently supports Gaussian and Poisson families (controlled by the poisson flag). The Poisson case is useful for count regression on spectroscopy data where the response is an integer abundance (cell counts, particle counts) rather than a continuous concentration.

Compared with running ordinary PLS on \(\log(y+1)\), the Poisson formulation models the mean–variance relationship directly (the variance of a count grows with its mean) and is less biased for low counts.
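The low-count bias can be seen in the simplest possible setting, an intercept-only model. The sketch below uses made-up counts and no PLS step. It compares the back-transformed mean of \(\log(y+1)\) with the Poisson maximum-likelihood estimate, which for an intercept-only model is the sample mean.

```python
import numpy as np

# Hypothetical low-count data (for example, particle counts per sample).
y = np.array([0, 0, 1, 0, 2, 5, 1, 0], dtype=float)

# Intercept-only least squares on log(y + 1), mapped back to the count scale.
back_transformed = np.expm1(np.log1p(y).mean())

# Intercept-only Poisson GLM: the fitted mean is the sample mean.
poisson_mean = y.mean()

print(f"log1p back-transform: {back_transformed:.3f}")
print(f"Poisson fitted mean:  {poisson_mean:.3f}")
```

Because the logarithm is concave, the back-transformed value can never exceed the sample mean, and it falls strictly below it whenever the counts are not all equal. The gap is largest when many counts are zero or close to zero.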

Implementation¶

n4m_estimators_pls_glm_fit. Reference: R plsRglm 1.7.0.

R roxygen note (sklearn_methods.R::pls_glm):

PLS-GLM — formula entry point. Default is Gaussian; set family = "poisson" for Poisson IRLS.

MATLAB header (bindings/matlab/+pls4all/GlmRegression.m):

pls4all.GlmRegression — PLS-GLM (Gaussian / Poisson IRLS).
 Like MB-PLS, uses the stored intercept directly.

Usage¶

Every pls4all binding tab dispatches into the same C kernel; the external libraries listed at the bottom of the page are the parity references registered in benchmarks.parity_timing.registry. Switch tabs to read the same fit in your language. The R package now ships drop-in-compatible facades for the CRAN pls package (plsr, pcr, mvr) and for the mdatools::pls(x, y, ...) matrix idiom — those tabs appear only on the methods that have a meaningful equivalence.

pls4all bindings

C ABI · libn4m

/* C ABI — libn4m */
n4m_context_t* ctx = n4m_context_create();
n4m_config_t*  cfg = n4m_config_create();
n4m_method_result_t* res = NULL;
n4m_estimators_pls_glm_fit(ctx, cfg, &x_view, &y_view, /* hyperparams */, &res);
/* … read coefficients / mask / scores via */
/* n4m_method_result_get_double_matrix / vector / scalar … */
n4m_method_result_destroy(res);
n4m_config_destroy(cfg);
n4m_context_destroy(ctx);

Python · pls4all (raw)

import pls4all
from pls4all._methods import pls_glm_fit
with pls4all.Context() as ctx, pls4all.Config() as cfg:
    res = pls_glm_fit(ctx, cfg, X, y, n_components=4)
# then: res.matrix("predictions"), res.matrix("coefficients"),
# res.vector("mask"), res.scalar("intercept"), …

Python · pls4all.sklearn

from pls4all.sklearn import PLSGLMRegressor
mdl = PLSGLMRegressor(n_components=2, poisson=False)
mdl.fit(X, y)
y_hat = mdl.predict(X_test)

R · pls4all_method()

library(pls4all)
# Unified low-level dispatcher (May 2026 R cleanup):
res <- pls4all_method("pls_glm", X, y,
                      n_components = 4L, params = list(n_targets = 3L, poisson = 0L))
# res is a named list with MethodResult arrays/scalars.
# selected_indices / top_k_intervals are 1-based.

R · pls4all (raw fn)

library(pls4all)
res  <- pls_glm_fit(X, Y, n_components, poisson = FALSE)
yhat <- pls4all_predict(res, X_test)

R · pls4all (formula+S3)

library(pls4all)
fit  <- pls_glm(y ~ ., data = train, ncomp = 4L)
yhat <- predict(fit, newdata = test)
summary(fit)

MATLAB · pls4all (MEX)

res = pls4all.pls_glm(X, y, 4);
% see header of bindings/matlab/+pls4all/pls_glm.m for full
% parameter surface:
%   res = pls_glm(X, Y, n_components, family)
yhat = predict(res, Xtest);

MATLAB · pls4all (classdef)

mdl  = pls4all.fit("pls_glm", X, y, "NumComponents", 4);
yhat = predict(mdl, Xtest);

Registry parity references 📐

📐 ref.r_plsrglm (R · r) — plsRglm 1.5.1 · strict (rmse_rel ≤ 1e-06) — R plsRglm::plsRglm (Bastien, Vinzi & Tenenhaus 2005) with the pls-glm-gaussian / pls-glm-poisson family. pls4all implements a simpler PLS-then-link variant so predictions diverge substantially; the parity check is a presence flag for the external reference.

Benchmarks¶

Adaptive wall-clock per cell measured against full_matrix.csv. Only backends that implement this method are listed; libraries without the method are omitted.

Verdict · ✓ ref / ≈ ref / ~ shape mark a reference-gate pass at strict / relaxed / qualitative tolerance · ✓ bind = pls4all binding agrees with the C++ baseline · ⇄ cross-check = documented by-design selector/RNG/model, noncanonical API/facade convention, or secondary oracle · ✗ divergent · ⚠ error · — not run. The fastest backend per column is marked 🏆.

Reference gate: strict — numeric equivalence (rmse_rel_tol ≤ 1e-06).

Rows tagged with 📐 are the canonical parity references for this method (declared in parity_timing.registry). C++ and external rows show reference parity; pls4all language bindings show binding parity against the C++ backend.

1 thread

Backend	Parity	200×30 (ms)
C++ native · libn4m
`pls4all.cpp.blas+omp`	✓ ref 1e-15	3.57 ms
Python · pls4all
`pls4all.python`	✓ bind	7.99 ms
`pls4all.sklearn`	⇄ +4e-01	2.74 ms🏆
R · pls4all
`pls4all.R`	⇄ +4e-01	7.97 ms
`pls4all.R.formula`	⇄ +4e-01	22.7 ms
`pls4all.R.mdatools`	⇄ +4e-01	25.8 ms
`pls4all.R.pls`	⇄ +4e-01	18.4 ms
R · external
📐`ref.r_plsrglm`	source	240.6 ms

3 threads

Backend	Parity	200×30 (ms)
C++ native · libn4m
`pls4all.cpp.blas+omp`	✓ ref 1e-15	3.56 ms
Python · pls4all
`pls4all.python`	✓ bind	8.04 ms
`pls4all.sklearn`	⇄ +4e-01	1.50 ms🏆
R · pls4all
`pls4all.R`	⇄ +4e-01	4.56 ms
`pls4all.R.formula`	⇄ +4e-01	5.18 ms
`pls4all.R.mdatools`	⇄ +4e-01	5.40 ms
`pls4all.R.pls`	⇄ +4e-01	5.24 ms
R · external
📐`ref.r_plsrglm`	source	138.8 ms

10 threads

Backend	Parity	200×30 (ms)
C++ native · libn4m
`pls4all.cpp.blas+omp`	✓ ref 1e-15	2.07 ms
Python · pls4all
`pls4all.python`	✓ bind	2.05 ms
`pls4all.sklearn`	⇄ +4e-01	1.46 ms🏆
R · pls4all
`pls4all.R`	⇄ +4e-01	3.78 ms
`pls4all.R.formula`	⇄ +4e-01	5.59 ms
`pls4all.R.mdatools`	⇄ +4e-01	4.89 ms
`pls4all.R.pls`	⇄ +4e-01	4.78 ms
R · external
📐`ref.r_plsrglm`	source	134.0 ms
