# `pop_pls` — POP-PLS (per-component operator selection)

_Group_: **Adaptive** · _Registry tolerance_: `5.0`

## Description

POP-PLS: per-component adaptive operator selection

> **Registry note:** POPPLS/POP-PLS uses per-component operator selection over the same compact nirs4all bank. Reference is the in-tree nirs4all POPPLSRegressor; parity is qualitative.

### Parameters

| Name | Type | Default | Notes |
|------|------|---------|-------|
| `max_components` | `int` | `3` | registry benchmark cell value |
| `n_operators` | `int` | `9` | registry benchmark cell value |
| `cv` | `int` | `3` | registry benchmark cell value |

## Explanations

### Bibliographic source

Beurier, G., Reiter, R., Noûs, C., Rouan, L. & Cornet, D. (2026). *Reframing preprocessing selection as model-internal calibration in near-infrared spectroscopy: a large-scale benchmark of operator-adaptive PLS and Ridge models*. arXiv:2605.13587. https://arxiv.org/abs/2605.13587.

### Mathematical principle

POP-PLS (Per-Operator PLS) is the per-component ablation of AOM-PLS: each latent component may pick a *different* operator from the bank, rather than committing to one global operator. The setting is the same (centered $\mathbf{X} \in \mathbb{R}^{n\times p}$, response $\mathbf{Y}$, strict-linear bank $\{\mathbf{A}_b\}_{b=1}^{B}$, cross-covariance matrix $\mathbf{S} = \mathbf{X}^{\top}\mathbf{Y}$), but the selection rule is local to each component.

**Per-component greedy selection.** Initialise $\mathbf{S}^{(0)} \leftarrow \mathbf{S}$. For $a = 1, \dots, K$:

1. **Score the bank** on the *current* deflated cross-covariance: for every $b$, evaluate the criterion $\mathcal{C}_a(b)$ of the SIMPLS-covariance step that would result from picking operator $b$ at component $a$. The criterion comes from the same family as in AOM-PLS: the covariance proxy $\lVert\mathbf{A}_b\mathbf{S}^{(a-1)}\rVert$, K-fold CV-RMSE on the resulting prefix, or approximate PRESS.
2. **Pick the local minimiser** $b_a = \operatorname*{arg\,min}_b \mathcal{C}_a(b)$. CV-RMSE and PRESS are minimised directly; the covariance proxy is a larger-is-better quantity, so it presumably enters $\mathcal{C}_a$ with its sign flipped.
3. **Extract the component** $\mathbf{r}_a = \mathbf{u}_1\!\bigl(\mathbf{A}_{b_a}\mathbf{S}^{(a-1)}\bigr)$ in transformed space and lift it back through the *component-specific* adjoint:

   $$\mathbf{z}_a \;=\; \mathbf{A}_{b_a}^{\top}\,\mathbf{r}_a, \qquad \mathbf{t}_a = \mathbf{X}\mathbf{z}_a.$$

4. **Deflate in the original space** so that the next component sees a residual cross-covariance free of $\mathbf{t}_a$:

   $$\mathbf{S}^{(a)} \;=\; \bigl(\mathbf{I}_p - \mathbf{v}_a\mathbf{v}_a^{\top}\bigr)\mathbf{S}^{(a-1)}, \quad \mathbf{v}_a = \mathbf{p}_a / \lVert\mathbf{p}_a\rVert, \quad \mathbf{p}_a = \mathbf{X}^{\top}\mathbf{t}_a / \lVert\mathbf{t}_a\rVert^{2}.$$

**Closed-form coefficient.** With the selected sequence $(b_1, \dots, b_K)$ the model coefficients use exactly the same SIMPLS recovery formula as AOM-PLS, only with a *component-dependent* adjoint:

$$\mathbf{Z} = \bigl[\mathbf{A}_{b_1}^{\top}\mathbf{r}_1\;\cdots\;\mathbf{A}_{b_K}^{\top}\mathbf{r}_K\bigr], \qquad \mathbf{B} = \mathbf{Z}\bigl(\mathbf{P}^{\top}\mathbf{Z}\bigr)^{+}\mathbf{Q}^{\top}.$$

$\mathbf{B}$ lives in the original wavelength space, so, exactly as for AOM-PLS, predictions are a single dot product $\hat{\mathbf{Y}}(\mathbf{X}^{\star}) = \mathbf{X}^{\star}\mathbf{B}$, **with no preprocessing replay at predict time**. The relaxation buys wavelength-region adaptivity (the model can pick a smoother for one component and a derivative for the next), at the cost of $B$ extra cheap left actions per component, one per operator in the bank.

### Implementation

`n4m_aom_per_component_select` via the native C ABI. Python exposes this as `n4m.aom_per_component_select` and the catalog alias `n4m.pop_pls`; the wrapper uses the same compact strict-linear default bank as AOM-PLS and accepts caller-provided strict operators.
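The loop described under Mathematical principle can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the native kernel or the `n4m` API: the names `pop_pls_sketch` and `make_bank` are hypothetical, operators are dense $p \times p$ matrices, only the covariance proxy is used as the criterion (maximised), and the Y-loadings are taken as $\mathbf{q}_a = \mathbf{Y}^{\top}\mathbf{t}_a / \lVert\mathbf{t}_a\rVert^{2}$, which the text above does not spell out.

```python
import numpy as np


def make_bank(p):
    """Hypothetical toy bank of strict-linear p x p operators."""
    identity = np.eye(p)
    smooth = (np.eye(p, k=-1) + np.eye(p) + np.eye(p, k=1)) / 3.0  # 3-point average
    diff = np.eye(p) - np.eye(p, k=-1)                             # first difference
    return [identity, smooth, diff]


def pop_pls_sketch(X, Y, bank, n_components):
    """Per-component operator selection using the covariance proxy only."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    S = Xc.T @ Yc                                  # cross-covariance, shape (p, q)
    Z, P, Q, chosen = [], [], [], []
    for _ in range(n_components):
        # Step 1-2: score every operator on the current deflated S, keep the best.
        scores = [np.linalg.norm(A @ S) for A in bank]
        b = int(np.argmax(scores))                 # assumption: larger proxy is better
        # Step 3: leading left singular vector, lifted back through the adjoint.
        U, _, _ = np.linalg.svd(bank[b] @ S, full_matrices=False)
        z = bank[b].T @ U[:, 0]
        t = Xc @ z
        tt = t @ t
        p_load = Xc.T @ t / tt                     # X-loading p_a
        q_load = Yc.T @ t / tt                     # Y-loading q_a (assumed definition)
        # Step 4: deflate S in the original space.
        v = p_load / np.linalg.norm(p_load)
        S = S - np.outer(v, v @ S)
        Z.append(z)
        P.append(p_load)
        Q.append(q_load)
        chosen.append(b)
    Z, P, Q = np.column_stack(Z), np.column_stack(P), np.column_stack(Q)
    B = Z @ np.linalg.pinv(P.T @ Z) @ Q.T          # closed-form coefficients
    intercept = y_mean - x_mean @ B
    return B, intercept, chosen


rng = np.random.default_rng(0)
X = rng.normal(size=(40, 15))
y = X[:, 3] - 0.5 * X[:, 9] + 0.1 * rng.normal(size=40)
B, intercept, chosen = pop_pls_sketch(X, y, make_bank(15), n_components=3)
y_new = X[:5] @ B + intercept                      # no operator replay at predict time
```

With these loading definitions, $\mathbf{P}^{\top}\mathbf{Z}$ reduces to a scaled $\mathbf{T}^{\top}\mathbf{T}$, so the recovered coefficients give the least-squares fit of the centered response on the selected scores even when the scores are not mutually orthogonal.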
Result buffers include `input_coefficients` and `intercept`, so callers can reuse the selected per-component model on new spectra as `X_new @ input_coefficients + intercept`. The sklearn-style `n4m.sklearn.NativePOPPLSRegressor` wraps the same native result. Reference: git-pinned oracle `nirs4all.operators.models.sklearn.aom_pls.POPPLSRegressor` (sanctioned exception).

MATLAB header (`bindings/matlab/+pls4all/pop_pls.m`):

```text
pls4all.pop_pls POP-PLS per-component operator selection.
```

### Usage

Every pls4all binding tab dispatches into the same C kernel; the external libraries listed at the bottom of the page are the parity references registered in `benchmarks.parity_timing.registry`. Switch tabs to read the same fit in your language. The R package now ships drop-in-compatible facades for the CRAN `pls` package (`plsr`, `pcr`, `mvr`) and for the `mdatools::pls(x, y, ...)` matrix idiom; those tabs appear only on the methods that have a meaningful equivalence.

**pls4all bindings**

::::{tab-set}
:class: pls4all-bindings

:::{tab-item} C ABI · libn4m
:sync: c
:class-label: lang-c

```c
/* C ABI: libn4m AOM/POP selector path */
/* x_view and y_view are caller-prepared matrix views (setup not shown) */
n4m_context_t* ctx = n4m_context_create();
n4m_config_t* cfg = n4m_config_create();
n4m_operator_bank_t* bank = NULL;
n4m_validation_plan_t* plan = NULL;
n4m_aom_per_component_result_t* res = NULL;

n4m_operator_bank_create(&bank);
/* add compact nirs4all-style operators: identity, SG, detrend, FD */
n4m_validation_plan_create(&plan);
/* fill CV folds on plan */

n4m_aom_per_component_select(ctx, cfg, bank, &x_view, &y_view, plan,
                             /* max_components */ 2, &res);
/* read predictions and selection diagnostics via result getters */

n4m_aom_per_component_result_destroy(res);
n4m_validation_plan_destroy(plan);
n4m_operator_bank_destroy(bank);
n4m_config_destroy(cfg);
n4m_context_destroy(ctx);
```

:::

:::{tab-item} Python · pls4all (raw)
:sync: python-raw
:class-label: lang-python

```python
import n4m

res = n4m.pop_pls(
    X,
    y,
    max_components=2,
    cv=4,
    operators=[
        "identity",
        ("savgol_smooth", [5, 2]),
        ("finite_difference", [1]),
    ],
)
yhat = res["predictions"]
selected_ops = res["selected_operator_indices"]
coef = res["input_coefficients"]
intercept = res["intercept"]
# Reuse the fitted model on new spectra without replaying any operator.
yhat_new = X_new @ coef + intercept
```

:::

:::{tab-item} Python · pls4all.sklearn
:sync: python-sklearn
:class-label: lang-python

```python
from n4m.sklearn import NativePOPPLSRegressor

model = NativePOPPLSRegressor(max_components=2, cv=4).fit(X, y)
yhat_new = model.predict(X_new)
```

:::

:::{tab-item} R · pls4all_method()
:sync: r-dispatcher
:class-label: lang-r

```r
library(pls4all)
# Unified low-level dispatcher (May 2026 R cleanup):
res <- pls4all_method("pop_pls", X, y, n_components = 2L,
                      params = list(max_components = 3L, n_operators = 9L, cv = 3L))
# res is a named list with MethodResult arrays/scalars.
# selected_indices / top_k_intervals are 1-based.
```

:::

:::{tab-item} MATLAB · pls4all (MEX)
:sync: matlab-mex
:class-label: lang-matlab

```matlab
res = pls4all.pop_pls(X, y, 2);
% see header of bindings/matlab/+pls4all/pop_pls.m for full
% parameter surface:
%   res = pop_pls(X, Y, max_components, n_operators, cv)
yhat = predict(res, Xtest);
```

:::

:::{tab-item} MATLAB · pls4all (classdef)
:sync: matlab-classdef
:class-label: lang-matlab

_No idiomatic classdef wrapper; invoke `pls4all.fit("pop_pls", X, y, …)` directly from the unified MEX factory._

:::

::::

**Registry parity references** 📐

:::{card}
:class-card: external-refs

- 📐 **`nirs4all`** (python · python): `nirs4all` in-tree · qualitative (rmse_rel ≤ 5e+00). In-tree nirs4all AOM/POP estimator stack (sanctioned reference). The pls4all ABI uses the same compact strict-linear bank and contiguous folds for cross-binding determinism; nirs4all remains the qualitative algorithmic reference.

:::

### Benchmarks

Adaptive wall-clock per cell measured against [`full_matrix.csv`](../benchmarks/overview.md).
Only backends that implement this method are listed; libraries without the method are omitted.

**Verdict** · ✓ ref / ≈ ref / ~ shape mark a reference-gate pass at strict / relaxed / qualitative tolerance · ✓ bind = pls4all binding agrees with the C++ baseline · ✗ divergent · ⚠ error · — not run. The fastest backend per column is marked 🏆.

**Reference gate**: qualitative (shape/smoke comparison only). The external library and pls4all do not produce numerically equivalent output for this method (see the MethodSpec notes); the `rmse_rel_tol ≤ 5e+00` budget is set wide on purpose. Treat ~ shape as *“we ran both, both finished”*, not as numerical agreement.

Rows tagged with **📐** are the canonical parity references for this method (declared in [`parity_timing.registry`](../benchmarks/methodology.md)). C++ and external rows show reference parity; pls4all language bindings show binding parity against the C++ backend. Hover the icon for role and tolerance band.

::::{tab-set}
:class: parity-tabs

:::{tab-item} 1 thread
:sync: threads-1

BackendParity50×250 (ms)100×50 (ms)100×500 (ms)100×2500 (ms)200×40 (ms)250×50 (ms)500×50 (ms)500×500 (ms)500×2500 (ms)2500×50 (ms)2500×500 (ms)2500×2500 (ms)10000×50 (ms)10000×500 (ms)
C++ native · libn4m
pls4all.cpp.blas≈ +5e-1525.7 ms5.48 ms174.4 ms8.7 s4.60 ms🏆7.39 ms19.4 ms311.4 ms🏆9.9 s128.8 ms🏆1.9 s19.5 s928.0 ms13.3 s🏆
pls4all.cpp.blas+omp≈ +5e-1524.0 ms3.98 ms🏆163.8 ms8.5 s4.70 ms7.60 ms19.2 ms316.8 ms9.5 s143.2 ms1.8 s🏆18.8 s🏆915.4 ms🏆13.3 s
pls4all.cpp.omp≈ +5e-1528.8 ms4.93 ms158.0 ms🏆8.4 s🏆4.83 ms7.90 ms20.6 ms322.1 ms9.4 s🏆148.4 ms1.9 s19.6 s1.0 s13.4 s
pls4all.cpp.ref≈ +5e-1524.2 ms4.07 ms160.1 ms8.8 s4.69 ms7.66 ms17.8 ms🏆325.9 ms9.6 s138.2 ms1.9 s19.3 s925.9 ms13.3 s
Python · pls4all
pls4all.python✓ bind25.2 ms4.70 ms7.12 ms🏆
pls4all.sklearn✓ bind23.1 ms🏆5.33 ms8.47 ms
R · pls4all
pls4all.R✓ bind32.7 ms8.05 ms18.5 ms
pls4all.R.formula✓ bind48.4 ms10.7 ms17.7 ms
pls4all.R.mdatools✓ bind42.8 ms8.75 ms14.9 ms
pls4all.R.pls✓ bind48.6 ms9.93 ms14.8 ms
MATLAB · pls4all
pls4all.matlab✗ +2e+0037.2 ms5.25 ms9.38 ms
pls4all.matlab.classdef✗ +2e+0030.3 ms6.18 ms14.8 ms
Python · external
📐nirs4all35.7 ms

:::

:::{tab-item} 3 threads
:sync: threads-3

BackendParity50×250 (ms)100×50 (ms)100×500 (ms)100×2500 (ms)200×40 (ms)250×50 (ms)500×50 (ms)500×500 (ms)500×2500 (ms)2500×50 (ms)2500×500 (ms)2500×2500 (ms)10000×50 (ms)10000×500 (ms)
C++ native · libn4m
pls4all.cpp.blas~ shape 5e-154.67 ms
pls4all.cpp.blas+omp~ shape 5e-154.58 ms
pls4all.cpp.omp~ shape 5e-154.75 ms
pls4all.cpp.ref~ shape 5e-154.48 ms🏆
Python · pls4all
pls4all.python✓ bind5.75 ms
pls4all.sklearn✓ bind4.98 ms
R · pls4all
pls4all.R✓ bind7.30 ms
pls4all.R.formula✓ bind8.75 ms
pls4all.R.mdatools✓ bind8.12 ms
pls4all.R.pls✓ bind8.31 ms
MATLAB · pls4all
pls4all.matlab✗ +2e+005.30 ms
pls4all.matlab.classdef✗ +2e+005.43 ms
Python · external
📐nirs4allsource31.5 ms

:::

:::{tab-item} 10 threads
:sync: threads-10

BackendParity50×250 (ms)100×50 (ms)100×500 (ms)100×2500 (ms)200×40 (ms)250×50 (ms)500×50 (ms)500×500 (ms)500×2500 (ms)2500×50 (ms)2500×500 (ms)2500×2500 (ms)10000×50 (ms)10000×500 (ms)
C++ native · libn4m
pls4all.cpp.blas~ shape 5e-154.39 ms
pls4all.cpp.blas+omp~ shape 5e-154.25 ms
pls4all.cpp.omp~ shape 5e-154.50 ms
pls4all.cpp.ref~ shape 5e-154.25 ms
Python · pls4all
pls4all.python✓ bind4.28 ms
pls4all.sklearn✓ bind4.18 ms🏆
R · pls4all
pls4all.R✓ bind5.83 ms
pls4all.R.formula✓ bind6.72 ms
pls4all.R.mdatools✓ bind6.93 ms
pls4all.R.pls✓ bind6.95 ms
MATLAB · pls4all
pls4all.matlab✗ +2e+004.52 ms
pls4all.matlab.classdef✗ +2e+004.77 ms
Python · external
📐nirs4allsource24.8 ms

:::

::::

---

_See also_: [benchmark overview](../benchmarks/overview.md) · [methods index](index.md) · [interactive dashboard](../landing/dashboard.md)