# `mb_pls` — Multi-block PLS (Westerhuis 1998)

_Group_: **Multi-block / cross-modal** · _Registry tolerance_: `2.0`

## Description

MB-PLS: Multi-block PLS (§17 Phase 4)

From the `pls4all.sklearn.MBPLSRegression` docstring:

> Multi-block PLS (Westerhuis 1998).
>
> **Registry note**: In-tree `nirs4all.operators.models.sklearn.mbpls.MBPLS` is the sanctioned external reference (the mbpls PyPI package is broken against sklearn 1.8). pls4all's MB-PLS default now mirrors nirs4all NIPALS multi-block (standardize=False); the legacy block-balanced SIMPLS path is opt-in via cfg.scale_x=True.

### Parameters

| Name | Type | Default | Notes |
|------|------|---------|-------|
| `n_components` | `int` | `2` | Number of latent components extracted (k). |
| `block_sizes` | `—` | `None` | Sequence of contiguous block widths defining the X-block partition (columns of X). |
| `n_blocks` | `int` | `3` | registry benchmark cell value |

## Explanations

### Bibliographic source

Westerhuis, J. A., Kourti, T. & MacGregor, J. F. (1998). *Analysis of multiblock and hierarchical PCA and PLS models*. Journal of Chemometrics 12(5), 301–321.

### Mathematical principle

When predictors come from several distinct sources (NIR, MIR, Raman, process tags, lab assays), concatenating them into one wide matrix lets the block with the most variance dominate. Multi-block PLS instead **block-scales** each $\mathbf{X}_b$ so blocks contribute proportionally to their information content rather than their dimensionality.

Formally, each block is centred and autoscaled, then scaled by $1 / \sqrt{p_b}$ so its total variance is unit-normalised. PLS then runs on the concatenated $[\mathbf{X}_1, \ldots, \mathbf{X}_B]$ with optional per-block weights. Block-level *importance* statistics (block-VIP, block-RMSE) are recovered from the loadings by restriction to each block's columns.

Compared to plain concatenation, MB-PLS gives interpretable per-block contributions and is the standard approach in process spectroscopy.
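The block-scaling step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the pls4all kernel: `block_scale` is a hypothetical helper, the data are synthetic, and the sample standard deviation (`ddof=1`) is an arbitrary choice made here.

```python
import numpy as np


def block_scale(X, block_sizes):
    """Centre and autoscale every column, then divide each block by sqrt(p_b).

    Hypothetical helper for illustration only; not part of the pls4all API.
    """
    X = np.asarray(X, dtype=float)
    if sum(block_sizes) != X.shape[1]:
        raise ValueError("block_sizes must sum to the number of columns of X")

    Xc = X - X.mean(axis=0)          # column centring
    std = Xc.std(axis=0, ddof=1)     # sample standard deviation (assumption)
    std[std == 0.0] = 1.0            # constant columns stay at zero
    Z = Xc / std                     # autoscaled columns, variance 1 each

    out = np.empty_like(Z)
    start = 0
    for p_b in block_sizes:
        stop = start + p_b
        # After autoscaling a block has total variance p_b;
        # dividing by sqrt(p_b) brings that total to 1.
        out[:, start:stop] = Z[:, start:stop] / np.sqrt(p_b)
        start = stop
    return out


rng = np.random.default_rng(0)
X = np.hstack([
    rng.normal(scale=100.0, size=(50, 40)),  # wide, high-variance block
    rng.normal(scale=0.1, size=(50, 5)),     # narrow, low-variance block
])
block_sizes = [40, 5]

Xs = block_scale(X, block_sizes)
edges = np.cumsum([0] + block_sizes)
block_var = [
    Xs[:, a:b].var(axis=0, ddof=1).sum()
    for a, b in zip(edges[:-1], edges[1:])
]
print(block_var)  # both totals are close to 1
```

Without the $1 / \sqrt{p_b}$ step, the 40-column block would carry eight times the total variance of the 5-column block even after autoscaling, which is the dimensionality effect the text describes.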
### Implementation

`n4m_mb_pls_fit` requires a `block_sizes` integer vector summing to $p$. The C ABI materialises the intercept directly (no separate $\bar{\mathbf{y}}$ key) because the block scaling changes the centring semantics. Reference: sanctioned git-pinned port `nirs4all.operators.models.sklearn.mbpls`.

MATLAB header (`bindings/matlab/+pls4all/MbPlsRegression.m`):

```text
pls4all.MbPlsRegression — Multi-block PLS.
predict uses the stored intercept directly (coefficients are
already in original X scale + intercept folds in
y_mean - x_mean @ coef).
```

### Usage

Every pls4all binding tab dispatches into the same C kernel; the external libraries listed at the bottom of the page are the parity references registered in `benchmarks.parity_timing.registry`. Switch tabs to read the same fit in your language. The R package now ships drop-in-compatible facades for the CRAN `pls` package (`plsr`, `pcr`, `mvr`) and for the `mdatools::pls(x, y, ...)` matrix idiom; those tabs appear only on the methods that have a meaningful equivalence.
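The intercept folding quoted in the MATLAB header under Implementation (`y_mean - x_mean @ coef`) can be checked numerically. The sketch below is an illustration only: it uses ordinary least squares on centred data as a stand-in for the fitted coefficients, not MB-PLS, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 6)) + 5.0
y = X @ np.array([0.5, -1.0, 0.0, 2.0, 0.0, 0.3]) + 4.0

x_mean = X.mean(axis=0)
y_mean = y.mean()

# Stand-in for coefficients fitted on centred data
# (ordinary least squares here, NOT the MB-PLS kernel).
coef, *_ = np.linalg.lstsq(X - x_mean, y - y_mean, rcond=None)

# Fold the centring into a single scalar intercept.
intercept = y_mean - x_mean @ coef

X_new = rng.normal(size=(4, 6)) + 5.0
pred_folded = X_new @ coef + intercept            # uses raw X directly
pred_centred = (X_new - x_mean) @ coef + y_mean   # explicit centring

print(np.allclose(pred_folded, pred_centred))
```

Both prediction paths are algebraically identical, so a caller that holds the coefficients in original X scale plus the folded intercept never needs the training means.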
**pls4all bindings**

::::{tab-set}
:class: pls4all-bindings

:::{tab-item} C ABI · libn4m
:sync: c
:class-label: lang-c

```c
/* C ABI — libn4m */
n4m_context_t* ctx = n4m_context_create();
n4m_config_t* cfg = n4m_config_create();
n4m_method_result_t* res = NULL;
n4m_mb_pls_fit(ctx, cfg, &x_view, &y_view, /* hyperparams */, &res);
/* … read coefficients / mask / scores via */
/* n4m_method_result_get_double_matrix / vector / scalar … */
n4m_method_result_destroy(res);
n4m_config_destroy(cfg);
n4m_context_destroy(ctx);
```

:::

:::{tab-item} Python · pls4all (raw)
:sync: python-raw
:class-label: lang-python

```python
import pls4all
from pls4all._methods import mb_pls_fit

with pls4all.Context() as ctx, pls4all.Config() as cfg:
    res = mb_pls_fit(ctx, cfg, X, y, n_components=3)
    # then: res.matrix("predictions"), res.matrix("coefficients"),
    #       res.vector("mask"), res.scalar("intercept"), …
```

:::

:::{tab-item} Python · pls4all.sklearn
:sync: python-sklearn
:class-label: lang-python

```python
from pls4all.sklearn import MBPLSRegression

mdl = MBPLSRegression(n_components=2, block_sizes=None)
mdl.fit(X, y)
y_hat = mdl.predict(X_test)
```

:::

:::{tab-item} R · pls4all_method()
:sync: r-dispatcher
:class-label: lang-r

```r
library(pls4all)
# Unified low-level dispatcher (May 2026 R cleanup):
res <- pls4all_method("mb_pls", X, y, n_components = 3L,
                      params = list(n_blocks = 3L))
# res is a named list with MethodResult arrays/scalars.
# selected_indices / top_k_intervals are 1-based.
```

:::

:::{tab-item} MATLAB · pls4all (MEX)
:sync: matlab-mex
:class-label: lang-matlab

```matlab
res = pls4all.mb_pls(X, y, 3);
% see header of bindings/matlab/+pls4all/mb_pls.m for full
% parameter surface:
%   res = mb_pls(X, Y, n_components, block_sizes)
yhat = predict(res, Xtest);
```

:::

:::{tab-item} MATLAB · pls4all (classdef)
:sync: matlab-classdef
:class-label: lang-matlab

```matlab
mdl = pls4all.fit("mb_pls", X, y, "NumComponents", 3);
yhat = predict(mdl, Xtest);
```

:::

::::

**Registry parity references** 📐

:::{card}
:class-card: external-refs

- 📐 **`nirs4all`** (python · python): `nirs4all` in-tree · qualitative (rmse_rel ≤ 2e+00). In-tree Python MB-PLS (sanctioned external reference). The mbpls PyPI package is broken against sklearn 1.8 (uses the deprecated `force_all_finite` kwarg). nirs4all's implementation is a clean re-derivation of Westerhuis 1998.
:::

### Benchmarks

Adaptive wall-clock per cell measured against [`full_matrix.csv`](../benchmarks/overview.md). Only backends that implement this method are listed; libraries without the method are omitted.

**Verdict**  ·  ✓ ref / ≈ ref / ~ shape mark a reference-gate pass at strict / relaxed / qualitative tolerance  ·  ✓ bind = pls4all binding agrees with the C++ baseline  ·  ✗ divergent  ·  ⚠ error  ·  — not run. The fastest backend per column is marked 🏆.

**Reference gate**: qualitative (shape/smoke comparison only). The external library and pls4all do not produce numerically equivalent output for this method (see the MethodSpec notes); the `rmse_rel_tol ≤ 2e+00` budget is set wide on purpose. Treat ~ shape as *“we ran both, both finished”*, not as numerical agreement.

Rows tagged with **📐** are the canonical parity references for this method (declared in [`parity_timing.registry`](../benchmarks/methodology.md)). C++ and external rows show reference parity; pls4all language bindings show binding parity against the C++ backend. Hover the icon for role and tolerance band.
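To see why a budget of `2e+00` is only a smoke test, consider the sketch below. It assumes a definition of relative RMSE (RMSE of the difference divided by the RMS of the reference) that is not confirmed by this page; the registry may compute `rmse_rel` differently, and `rmse_rel` and `RMSE_REL_TOL` here are hypothetical names.

```python
import numpy as np


def rmse_rel(candidate, reference):
    """Assumed metric: RMSE(candidate - reference) / RMS(reference).

    Hypothetical definition for illustration; the registry's formula may differ.
    """
    candidate = np.asarray(candidate, dtype=float)
    reference = np.asarray(reference, dtype=float)
    diff_rmse = np.sqrt(np.mean((candidate - reference) ** 2))
    ref_rms = np.sqrt(np.mean(reference ** 2))
    return diff_rmse / ref_rms


RMSE_REL_TOL = 2.0  # tolerance value quoted on this page

reference = np.linspace(1.0, 2.0, 11)
close = reference + 1e-12             # numerically equivalent output
all_zero = np.zeros_like(reference)   # useless output

for name, cand in [("close", close), ("all_zero", all_zero)]:
    value = rmse_rel(cand, reference)
    print(name, value, value <= RMSE_REL_TOL)
```

Under this assumed metric an all-zero prediction scores 1.0 and still passes the gate, which matches the page's warning that a qualitative pass is not numerical agreement.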
::::{tab-set}
:class: parity-tabs

:::{tab-item} 1 thread
:sync: threads-1
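
| Backend | Parity | 50×250 | 100×50 | 100×500 | 100×2500 | 200×60 | 250×50 | 500×50 | 500×500 | 500×2500 | 2500×50 | 2500×500 | 2500×2500 | 10000×50 | 10000×500 |
|---------|--------|--------|--------|---------|----------|--------|--------|--------|---------|----------|---------|----------|-----------|----------|-----------|
| **C++ native · libn4m** | | | | | | | | | | | | | | | |
| pls4all.cpp.blas | ≈ +2e-15 | 3.12 ms | 1.20 ms | 10.1 ms 🏆 | 53.1 ms | 2.27 ms 🏆 | 3.04 ms | 4.85 ms | 51.9 ms | 271.9 ms | 24.7 ms | 345.6 ms 🏆 | 1.8 s 🏆 | 110.2 ms 🏆 | 1.6 s |
| pls4all.cpp.blas+omp | ≈ +2e-15 | 2.91 ms | 1.10 ms | 10.8 ms | 51.0 ms | 2.31 ms | 2.79 ms 🏆 | 6.73 ms | 51.0 ms 🏆 | 268.2 ms | 26.3 ms | 356.3 ms | 1.8 s | 142.7 ms | 1.4 s 🏆 |
| pls4all.cpp.omp | ≈ +2e-15 | 2.91 ms | 1.67 ms | 11.5 ms | 49.6 ms 🏆 | 2.43 ms | 3.00 ms | 5.21 ms | 51.9 ms | 269.3 ms | 23.6 ms 🏆 | 362.4 ms | 1.9 s | 127.6 ms | 1.5 s |
| pls4all.cpp.ref | ≈ +2e-15 | 3.35 ms | 1.03 ms 🏆 | 11.1 ms | 52.8 ms | 2.41 ms | 2.88 ms | 4.52 ms 🏆 | 53.9 ms | 260.5 ms 🏆 | 27.7 ms | 362.0 ms | 1.9 s | 122.6 ms | 1.5 s |

Each remaining backend reports three timings, listed in source order; the matrix sizes they belong to are not recoverable from this page.

| Backend | Parity | Timings |
|---------|--------|---------|
| **Python · pls4all** | | |
| pls4all.python | ✓ bind | 2.78 ms 🏆 · 2.46 ms · 3.04 ms |
| pls4all.sklearn | ✓ 4e-15 | 4.94 ms · 3.85 ms · 4.83 ms |
| **R · pls4all** | | |
| pls4all.R | ✗ +1e-01 | 13.9 ms · 8.28 ms · 14.4 ms |
| pls4all.R.formula | ✗ +1e-01 | 24.0 ms · 11.6 ms · 13.9 ms |
| pls4all.R.mdatools | ✗ +1e-01 | 20.9 ms · 12.4 ms · 10.8 ms |
| pls4all.R.pls | ✗ +1e-01 | 24.8 ms · 12.7 ms · 12.4 ms |
| **MATLAB · pls4all** | | |
| pls4all.matlab | ✗ +9e+00 | 5.08 ms · 3.64 ms · 4.16 ms |
| pls4all.matlab.classdef | ✗ +9e+00 | 4.72 ms · 4.78 ms · 5.06 ms |
| **Python · external** | | |
| 📐 nirs4all | source | 3.83 ms · 2.98 ms · 3.88 ms |
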
:::

:::{tab-item} 3 threads
:sync: threads-3

Matrix sizes in the table header: 50×250, 100×50, 100×500, 100×2500, 200×60, 250×50, 500×50, 500×500, 500×2500, 2500×50, 2500×500, 2500×2500, 10000×50, 10000×500. Each row reports a single timing; the matrix size it belongs to is not recoverable from this page.

| Backend | Parity | Time |
|---------|--------|------|
| **C++ native · libn4m** | | |
| pls4all.cpp.blas | ~ shape 7e-16 | 2.33 ms |
| pls4all.cpp.blas+omp | ~ shape 7e-16 | 2.30 ms |
| pls4all.cpp.omp | ~ shape 7e-16 | 2.14 ms |
| pls4all.cpp.ref | ~ shape 7e-16 | 3.49 ms |
| **Python · pls4all** | | |
| pls4all.python | ✓ bind | 2.10 ms 🏆 |
| pls4all.sklearn | ✓ 4e-15 | 2.67 ms |
| **R · pls4all** | | |
| pls4all.R | ✓ 7e-15 | 7.58 ms |
| pls4all.R.formula | ✓ 7e-15 | 10.0 ms |
| pls4all.R.mdatools | ✓ 7e-15 | 9.24 ms |
| pls4all.R.pls | ✓ 7e-15 | 10.1 ms |
| **MATLAB · pls4all** | | |
| pls4all.matlab | ✗ +9e+00 | 5.86 ms |
| pls4all.matlab.classdef | ✗ +9e+00 | 4.25 ms |
| **Python · external** | | |
| 📐 nirs4all | source | 3.10 ms |

:::

:::{tab-item} 10 threads
:sync: threads-10

Matrix sizes in the table header: 50×250, 100×50, 100×500, 100×2500, 200×60, 250×50, 500×50, 500×500, 500×2500, 2500×50, 2500×500, 2500×2500, 10000×50, 10000×500. Each row reports a single timing; the matrix size it belongs to is not recoverable from this page.

| Backend | Parity | Time |
|---------|--------|------|
| **C++ native · libn4m** | | |
| pls4all.cpp.blas | ~ shape 7e-16 | 2.08 ms |
| pls4all.cpp.blas+omp | ~ shape 7e-16 | 2.06 ms 🏆 |
| pls4all.cpp.omp | ~ shape 7e-16 | 2.10 ms |
| pls4all.cpp.ref | ~ shape 7e-16 | 2.08 ms |
| **Python · pls4all** | | |
| pls4all.python | ✓ bind | 2.98 ms |
| pls4all.sklearn | ✓ 4e-15 | 2.24 ms |
| **R · pls4all** | | |
| pls4all.R | ✓ 7e-15 | 5.81 ms |
| pls4all.R.formula | ✓ 7e-15 | 7.09 ms |
| pls4all.R.mdatools | ✓ 7e-15 | 7.28 ms |
| pls4all.R.pls | ✓ 7e-15 | 7.09 ms |
| **MATLAB · pls4all** | | |
| pls4all.matlab | ✗ +9e+00 | 3.29 ms |
| pls4all.matlab.classdef | ✗ +9e+00 | 3.69 ms |
| **Python · external** | | |
| 📐 nirs4all | source | 2.59 ms |

:::

::::

---

_See also_: [benchmark overview](../benchmarks/overview.md) · [methods index](index.md) · [interactive dashboard](../landing/dashboard.md)