# `pop_pls` — POP-PLS (per-component operator selection)
_Group_: **Adaptive** · _Registry tolerance_: `5.0`
## Description
POP-PLS — per-component adaptive operator selection
> **Registry note** — POPPLS/POP-PLS uses per-component operator selection over the same compact nirs4all bank. Reference is the in-tree nirs4all POPPLSRegressor; parity is qualitative.
### Parameters
| Name | Type | Default | Notes |
|------|------|---------|-------|
| `max_components` | `int` | `3` | registry benchmark cell value |
| `n_operators` | `int` | `9` | registry benchmark cell value |
| `cv` | `int` | `3` | registry benchmark cell value |
## Explanations
### Bibliographic source
Beurier, G., Reiter, R., Noûs, C., Rouan, L. & Cornet, D. (2026). *Reframing preprocessing selection as model-internal calibration in near-infrared spectroscopy: a large-scale benchmark of operator-adaptive PLS and Ridge models*. arXiv:2605.13587. https://arxiv.org/abs/2605.13587.
### Mathematical principle
POP-PLS (Per-Operator PLS) is the per-component ablation of AOM-PLS: each latent component may pick a *different* operator from the bank, rather than committing to one global operator. The setting is the same — centered $\mathbf{X} \in \mathbb{R}^{n\times p}$, response $\mathbf{Y}$, strict-linear bank $\{\mathbf{A}_b\}_{b=1}^{B}$, cross-covariance matrix $\mathbf{S} = \mathbf{X}^{\top}\mathbf{Y}$ — but the selection rule is local to each component.
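The reason the bank can be scored cheaply is that a strict-linear operator commutes with the cross-covariance: preprocessing the spectra and then forming $\mathbf{X}_b^{\top}\mathbf{Y}$ gives the same matrix as applying $\mathbf{A}_b$ to $\mathbf{S}$. The NumPy sketch below illustrates this under the assumption that an operator acts on each spectrum (row of $\mathbf{X}$) as $\mathbf{X}\mathbf{A}_b^{\top}$, which is consistent with the lift $\mathbf{t}_a = \mathbf{X}\mathbf{A}_{b_a}^{\top}\mathbf{r}_a$ used in the selection steps. The helper `finite_difference_operator` and the synthetic data are illustrative only and are not part of the library API.

```python
import numpy as np


def finite_difference_operator(p):
    """Hypothetical helper: first-difference matrix of shape (p - 1, p).

    (A @ s)[i] = s[i + 1] - s[i] for a length-p vector s.
    """
    return np.diff(np.eye(p), axis=0)


rng = np.random.default_rng(0)
n, p = 30, 12
X = rng.normal(size=(n, p))
Y = rng.normal(size=(n, 1))
X -= X.mean(axis=0)  # centre the spectra
Y -= Y.mean(axis=0)  # centre the response

A = finite_difference_operator(p)
S = X.T @ Y  # cross-covariance in the original space, shape (p, 1)

# Route 1: preprocess every spectrum, then form the cross-covariance.
X_pre = X @ A.T  # identical to np.diff(X, axis=1)
S_pre = X_pre.T @ Y

# Route 2: apply the operator to S directly (one small left action).
S_left = A @ S

routes_agree = np.allclose(S_pre, S_left)
print(routes_agree)
```

Route 2 touches a $p \times q$ matrix instead of the full $n \times p$ data, which is what makes evaluating every operator at every component affordable.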
**Per-component greedy selection.** Initialise $\mathbf{S}^{(0)} \leftarrow \mathbf{S}$. For $a = 1, \dots, K$:
1. **Score the bank** on the *current* deflated cross-covariance: for every $b$, evaluate the criterion $\mathcal{C}_a(b)$ of the SIMPLS-covariance step that would result from picking operator $b$ at component $a$. The criteria are the same family as in AOM-PLS: the covariance proxy $\lVert\mathbf{A}_b\mathbf{S}^{(a-1)}\rVert$ (larger is better, so it enters $\mathcal{C}_a$ with a negative sign), K-fold CV-RMSE on the resulting prefix, or approximate PRESS.
2. **Pick the local minimiser** $b_a = \operatorname*{arg\,min}_b \mathcal{C}_a(b)$.
3. **Extract the component** $\mathbf{r}_a = \mathbf{u}_1\!\bigl(\mathbf{A}_{b_a}\mathbf{S}^{(a-1)}\bigr)$ in transformed space, where $\mathbf{u}_1(\cdot)$ denotes the dominant left singular vector, and lift it back through the *component-specific* adjoint:
$$\mathbf{z}_a \;=\; \mathbf{A}_{b_a}^{\top}\,\mathbf{r}_a, \qquad \mathbf{t}_a = \mathbf{X}\mathbf{z}_a.$$
4. **Deflate in the original space** so that the next component sees a residual cross-covariance free of $\mathbf{t}_a$:
$$\mathbf{S}^{(a)} \;=\; \bigl(\mathbf{I}_p - \mathbf{v}_a\mathbf{v}_a^{\top}\bigr)\mathbf{S}^{(a-1)}, \quad \mathbf{v}_a = \mathbf{p}_a / \lVert\mathbf{p}_a\rVert, \quad \mathbf{p}_a = \mathbf{X}^{\top}\mathbf{t}_a / \lVert\mathbf{t}_a\rVert^{2}.$$
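The four steps above can be sketched in NumPy as follows. This is a minimal illustration, not the native kernel: it uses only the covariance proxy, negated so that the arg-min rule applies (an assumption about sign convention), it does not rescale operators with different gains, and the function name `pop_pls_select` and the two-operator bank are hypothetical.

```python
import numpy as np


def pop_pls_select(X, Y, bank, n_components):
    """Greedy per-component operator selection (covariance-proxy sketch).

    X: (n, p) centred spectra; Y: (n, q) centred responses;
    bank: list of operator matrices, each of shape (p_b, p).
    Returns the selected indices, the weights Z (p, K) and scores T (n, K).
    """
    S = X.T @ Y  # S^(0)
    selected, Z, T = [], [], []
    for _ in range(n_components):
        # Steps 1-2: negated covariance proxy, then local arg min.
        criteria = [-np.linalg.norm(A @ S) for A in bank]
        b = int(np.argmin(criteria))
        A = bank[b]
        # Step 3: dominant left singular vector, lifted by the adjoint.
        U, _, _ = np.linalg.svd(A @ S, full_matrices=False)
        r = U[:, 0]
        z = A.T @ r
        t = X @ z
        # Step 4: deflate S in the original space along the X-loading.
        p_load = X.T @ t / (t @ t)
        v = p_load / np.linalg.norm(p_load)
        S = S - np.outer(v, v @ S)
        selected.append(b)
        Z.append(z)
        T.append(t)
    return selected, np.column_stack(Z), np.column_stack(T)


rng = np.random.default_rng(0)
n, p = 40, 10
X = rng.normal(size=(n, p))
Y = rng.normal(size=(n, 1))
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

# Illustrative bank: identity and a first-difference operator.
bank = [np.eye(p), np.diff(np.eye(p), axis=0)]
selected, Z, T = pop_pls_select(X, Y, bank, n_components=3)
print(selected)
```

Because every weight vector $\mathbf{z}_a$ is lifted back to the original $p$ wavelengths, operators with different output lengths (here $p$ and $p-1$) can be mixed freely across components.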
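**Closed-form coefficient.** With the selected sequence $(b_1, \dots, b_K)$, the model coefficients use exactly the same SIMPLS recovery formula as AOM-PLS, only with a *component-dependent* adjoint. Here $\mathbf{P} = [\mathbf{p}_1 \cdots \mathbf{p}_K]$ stacks the X-loadings from step 4 and $\mathbf{Q}$ stacks the corresponding Y-loadings: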
$$\mathbf{Z} = \bigl[\mathbf{A}_{b_1}^{\top}\mathbf{r}_1\;\cdots\;\mathbf{A}_{b_K}^{\top}\mathbf{r}_K\bigr], \qquad \mathbf{B} = \mathbf{Z}\bigl(\mathbf{P}^{\top}\mathbf{Z}\bigr)^{+}\mathbf{Q}^{\top}.$$
$\mathbf{B}$ lives in the original wavelength space, so — exactly as for AOM-PLS — predictions are a single dot product $\hat{\mathbf{Y}}(\mathbf{X}^{\star}) = \mathbf{X}^{\star}\mathbf{B}$, **with no preprocessing replay at predict time**. The relaxation buys wavelength-region adaptivity (the model can pick a smoother for one component and a derivative for the next), at the cost of $B$ extra cheap left actions per component.
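As a sanity check on the recovery formula, the sketch below builds $\mathbf{B} = \mathbf{Z}(\mathbf{P}^{\top}\mathbf{Z})^{+}\mathbf{Q}^{\top}$ from an arbitrary weight matrix and compares it with a least-squares fit of $\mathbf{Y}$ on the scores $\mathbf{T} = \mathbf{X}\mathbf{Z}$. Two things are assumptions here: the Y-loadings are taken as $\mathbf{q}_a = \mathbf{Y}^{\top}\mathbf{t}_a / \lVert\mathbf{t}_a\rVert^{2}$ (the usual SIMPLS convention, not spelled out above), and `Z` is a random stand-in rather than the output of a real selection. The function name `recover_coefficients` is hypothetical.

```python
import numpy as np


def recover_coefficients(X, Y, Z):
    """B = Z (P^T Z)^+ Q^T for centred X (n, p), Y (n, q), weights Z (p, K)."""
    T = X @ Z                          # scores t_a = X z_a
    norms2 = np.sum(T * T, axis=0)     # ||t_a||^2 for each component
    P = (X.T @ T) / norms2             # X-loadings p_a, shape (p, K)
    Q = (Y.T @ T) / norms2             # assumed Y-loadings q_a, shape (q, K)
    return Z @ np.linalg.pinv(P.T @ Z) @ Q.T


rng = np.random.default_rng(1)
n, p, q, K = 40, 15, 2, 3
X = rng.normal(size=(n, p))
Y = rng.normal(size=(n, q))
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

Z = rng.normal(size=(p, K))  # stand-in weights; scores are NOT orthogonal
B = recover_coefficients(X, Y, Z)

# Reference: regress Y on the scores, then map back through Z.
T = X @ Z
B_ls = Z @ np.linalg.lstsq(T, Y, rcond=None)[0]

formula_matches_ls = np.allclose(B, B_ls)
print(B.shape, formula_matches_ls)

# Prediction on new (already centred) spectra is one matrix product.
X_star = rng.normal(size=(5, p))
Y_hat = X_star @ B
```

The pseudo-inverse matters because mixing operators across components does not guarantee mutually orthogonal scores; when they happen to be orthogonal, $\mathbf{P}^{\top}\mathbf{Z}$ reduces to the identity.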
### Implementation
The kernel is `n4m_aom_per_component_select`, reached through the native C ABI. Python exposes this as
`n4m.aom_per_component_select` and the catalog alias `n4m.pop_pls`; the wrapper
uses the same compact strict-linear default bank as AOM-PLS and accepts
caller-provided strict operators. Result buffers include `input_coefficients`
and `intercept`, so callers can reuse the selected per-component model on new
spectra as `X_new @ input_coefficients + intercept`. The sklearn-style
`n4m.sklearn.NativePOPPLSRegressor` wraps the same native result. Reference:
git-pinned oracle
`nirs4all.operators.models.sklearn.aom_pls.POPPLSRegressor` (sanctioned
exception).
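The affine form `X_new @ input_coefficients + intercept` follows from fitting on centred data. The sketch below assumes the intercept is the usual centring offset $\bar{y} - \bar{\mathbf{x}}^{\top}\mathbf{b}$; how the native result buffers are actually filled is not documented here, and an ordinary least-squares fit stands in for the POP-PLS coefficients.

```python
import numpy as np

rng = np.random.default_rng(2)
X_raw = rng.normal(size=(25, 8)) + 3.0   # uncentred spectra
y_raw = rng.normal(size=25) + 10.0       # uncentred response

x_mean = X_raw.mean(axis=0)
y_mean = y_raw.mean()
Xc = X_raw - x_mean
yc = y_raw - y_mean

# Stand-in for coefficients fitted on centred data (plain least squares).
coef = np.linalg.lstsq(Xc, yc, rcond=None)[0]

# Fold the centring into a single offset so raw spectra can be used directly.
intercept = y_mean - x_mean @ coef

X_new = rng.normal(size=(4, 8)) + 3.0
pred_affine = X_new @ coef + intercept
pred_centred = (X_new - x_mean) @ coef + y_mean

same_prediction = np.allclose(pred_affine, pred_centred)
print(same_prediction)
```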
MATLAB header (`bindings/matlab/+pls4all/pop_pls.m`):
```text
pls4all.pop_pls POP-PLS per-component operator selection.
```
### Usage
Every pls4all binding tab dispatches into the same C kernel; the external libraries listed at the bottom of the page are the parity references registered in `benchmarks.parity_timing.registry`. Switch tabs to read the same fit in your language. The R package now ships drop-in-compatible facades for the CRAN `pls` package (`plsr`, `pcr`, `mvr`) and for the `mdatools::pls(x, y, ...)` matrix idiom — those tabs appear only on the methods that have a meaningful equivalence.
**pls4all bindings**
::::{tab-set}
:class: pls4all-bindings
:::{tab-item} C ABI · libn4m
:sync: c
:class-label: lang-c
```c
/* C ABI — libn4m AOM/POP selector path */
n4m_context_t* ctx = n4m_context_create();
n4m_config_t* cfg = n4m_config_create();
n4m_operator_bank_t* bank = NULL;
n4m_validation_plan_t* plan = NULL;
n4m_aom_per_component_result_t* res = NULL;
n4m_operator_bank_create(&bank);
/* add compact nirs4all-style operators: identity, SG, detrend, FD */
n4m_validation_plan_create(&plan);
/* fill CV folds on plan */
/* x_view / y_view: caller-initialised views over X and y (setup omitted) */
n4m_aom_per_component_select(ctx, cfg, bank, &x_view, &y_view, plan,
                             /* max_components */ 2, &res);
/* read predictions and selection diagnostics via result getters */
n4m_aom_per_component_result_destroy(res);
n4m_validation_plan_destroy(plan);
n4m_operator_bank_destroy(bank);
n4m_config_destroy(cfg);
n4m_context_destroy(ctx);
```
:::
:::{tab-item} Python · pls4all (raw)
:sync: python-raw
:class-label: lang-python
```python
import n4m
res = n4m.pop_pls(
    X,
    y,
    max_components=2,
    cv=4,
    operators=[
        "identity",
        ("savgol_smooth", [5, 2]),
        ("finite_difference", [1]),
    ],
)
yhat = res["predictions"]
selected_ops = res["selected_operator_indices"]
coef = res["input_coefficients"]
intercept = res["intercept"]
yhat_new = X_new @ coef + intercept
```
:::
:::{tab-item} Python · pls4all.sklearn
:sync: python-sklearn
:class-label: lang-python
```python
from n4m.sklearn import NativePOPPLSRegressor
model = NativePOPPLSRegressor(max_components=2, cv=4).fit(X, y)
yhat_new = model.predict(X_new)
```
:::
:::{tab-item} R · pls4all_method()
:sync: r-dispatcher
:class-label: lang-r
```r
library(pls4all)
# Unified low-level dispatcher (May 2026 R cleanup):
res <- pls4all_method("pop_pls", X, y,
n_components = 2L, params = list(max_components = 3L, n_operators = 9L, cv = 3L))
# res is a named list with MethodResult arrays/scalars.
# selected_indices / top_k_intervals are 1-based.
```
:::
:::{tab-item} MATLAB · pls4all (MEX)
:sync: matlab-mex
:class-label: lang-matlab
```matlab
res = pls4all.pop_pls(X, y, 2);
% see header of bindings/matlab/+pls4all/pop_pls.m for full
% parameter surface:
% res = pop_pls(X, Y, max_components, n_operators, cv)
yhat = predict(res, Xtest);
```
:::
:::{tab-item} MATLAB · pls4all (classdef)
:sync: matlab-classdef
:class-label: lang-matlab
_No idiomatic classdef wrapper — invoke `pls4all.fit("pop_pls", X, y, …)` directly from the unified MEX factory._
:::
::::
**Registry parity references** 📐
:::{card}
:class-card: external-refs
- 📐 **`nirs4all`** (python · python) — `nirs4all` in-tree · qualitative (rmse_rel ≤ 5e+00) — In-tree nirs4all AOM/POP estimator stack (sanctioned reference). The pls4all ABI uses the same compact strict-linear bank and contiguous folds for cross-binding determinism; nirs4all remains the qualitative algorithmic reference.
:::
### Benchmarks
Adaptive wall-clock per cell measured against [`full_matrix.csv`](../benchmarks/overview.md). Only backends that implement this method are listed; libraries without the method are omitted.
**Verdict** · ✓ ref / ≈ ref / ~ shape mark a reference-gate pass at strict / relaxed / qualitative tolerance · ✓ bind = pls4all binding agrees with the C++ baseline · ✗ divergent · ⚠ error · — not run. The fastest backend per column is marked 🏆.
**Reference gate**: qualitative — shape/smoke comparison only. The external library and pls4all do not produce numerically equivalent output for this method (see the MethodSpec notes); the `rmse_rel_tol ≤ 5e+00` budget is set wide on purpose. Treat ~ shape as *“we ran both, both finished”*, not as numerical agreement.
Rows tagged with **📐** are the canonical parity references for this method (declared in [`parity_timing.registry`](../benchmarks/methodology.md)). C++ and external rows show reference parity; pls4all language bindings show binding parity against the C++ backend. Hover the icon for role and tolerance band.
::::{tab-set}
:class: parity-tabs
:::{tab-item} 1 thread
:sync: threads-1
| Backend | Parity | 50×250 | 100×50 | 100×500 | 100×2500 | 200×40 | 250×50 | 500×50 | 500×500 | 500×2500 | 2500×50 | 2500×500 | 2500×2500 | 10000×50 | 10000×500 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| **C++ native · libn4m** | | | | | | | | | | | | | | | |
| pls4all.cpp.blas | ≈ +5e-15 | 25.7 ms | 5.48 ms | 174.4 ms | 8.7 s | 4.60 ms🏆 | 7.39 ms | 19.4 ms | 311.4 ms🏆 | 9.9 s | 128.8 ms🏆 | 1.9 s | 19.5 s | 928.0 ms | 13.3 s🏆 |
| pls4all.cpp.blas+omp | ≈ +5e-15 | 24.0 ms | 3.98 ms🏆 | 163.8 ms | 8.5 s | 4.70 ms | 7.60 ms | 19.2 ms | 316.8 ms | 9.5 s | 143.2 ms | 1.8 s🏆 | 18.8 s🏆 | 915.4 ms🏆 | 13.3 s |
| pls4all.cpp.omp | ≈ +5e-15 | 28.8 ms | 4.93 ms | 158.0 ms🏆 | 8.4 s🏆 | 4.83 ms | 7.90 ms | 20.6 ms | 322.1 ms | 9.4 s🏆 | 148.4 ms | 1.9 s | 19.6 s | 1.0 s | 13.4 s |
| pls4all.cpp.ref | ≈ +5e-15 | 24.2 ms | 4.07 ms | 160.1 ms | 8.8 s | 4.69 ms | 7.66 ms | 17.8 ms🏆 | 325.9 ms | 9.6 s | 138.2 ms | 1.9 s | 19.3 s | 925.9 ms | 13.3 s |
| **Python · pls4all** | | | | | | | | | | | | | | | |
| pls4all.python | ✓ bind | 25.2 ms | — | — | — | 4.70 ms | 7.12 ms🏆 | — | — | — | — | — | — | — | — |
| pls4all.sklearn | ✓ bind | 23.1 ms🏆 | — | — | — | 5.33 ms | 8.47 ms | — | — | — | — | — | — | — | — |
| **R · pls4all** | | | | | | | | | | | | | | | |
| pls4all.R | ✓ bind | 32.7 ms | — | — | — | 8.05 ms | 18.5 ms | — | — | — | — | — | — | — | — |
| pls4all.R.formula | ✓ bind | 48.4 ms | — | — | — | 10.7 ms | 17.7 ms | — | — | — | — | — | — | — | — |
| pls4all.R.mdatools | ✓ bind | 42.8 ms | — | — | — | 8.75 ms | 14.9 ms | — | — | — | — | — | — | — | — |
| pls4all.R.pls | ✓ bind | 48.6 ms | — | — | — | 9.93 ms | 14.8 ms | — | — | — | — | — | — | — | — |
| **MATLAB · pls4all** | | | | | | | | | | | | | | | |
| pls4all.matlab | ✗ +2e+00 | 37.2 ms | — | — | — | 5.25 ms | 9.38 ms | — | — | — | — | — | — | — | — |
| pls4all.matlab.classdef | ✗ +2e+00 | 30.3 ms | — | — | — | 6.18 ms | 14.8 ms | — | — | — | — | — | — | — | — |
| **Python · external** | | | | | | | | | | | | | | | |
| 📐nirs4all | ⚠ | — | — | — | — | 35.7 ms | — | — | — | — | — | — | — | — | — |
:::
:::{tab-item} 3 threads
:sync: threads-3
| Backend | Parity | 50×250 | 100×50 | 100×500 | 100×2500 | 200×40 | 250×50 | 500×50 | 500×500 | 500×2500 | 2500×50 | 2500×500 | 2500×2500 | 10000×50 | 10000×500 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| **C++ native · libn4m** | | | | | | | | | | | | | | | |
| pls4all.cpp.blas | ~ shape 5e-15 | — | — | — | — | 4.67 ms | — | — | — | — | — | — | — | — | — |
| pls4all.cpp.blas+omp | ~ shape 5e-15 | — | — | — | — | 4.58 ms | — | — | — | — | — | — | — | — | — |
| pls4all.cpp.omp | ~ shape 5e-15 | — | — | — | — | 4.75 ms | — | — | — | — | — | — | — | — | — |
| pls4all.cpp.ref | ~ shape 5e-15 | — | — | — | — | 4.48 ms🏆 | — | — | — | — | — | — | — | — | — |
| **Python · pls4all** | | | | | | | | | | | | | | | |
| pls4all.python | ✓ bind | — | — | — | — | 5.75 ms | — | — | — | — | — | — | — | — | — |
| pls4all.sklearn | ✓ bind | — | — | — | — | 4.98 ms | — | — | — | — | — | — | — | — | — |
| **R · pls4all** | | | | | | | | | | | | | | | |
| pls4all.R | ✓ bind | — | — | — | — | 7.30 ms | — | — | — | — | — | — | — | — | — |
| pls4all.R.formula | ✓ bind | — | — | — | — | 8.75 ms | — | — | — | — | — | — | — | — | — |
| pls4all.R.mdatools | ✓ bind | — | — | — | — | 8.12 ms | — | — | — | — | — | — | — | — | — |
| pls4all.R.pls | ✓ bind | — | — | — | — | 8.31 ms | — | — | — | — | — | — | — | — | — |
| **MATLAB · pls4all** | | | | | | | | | | | | | | | |
| pls4all.matlab | ✗ +2e+00 | — | — | — | — | 5.30 ms | — | — | — | — | — | — | — | — | — |
| pls4all.matlab.classdef | ✗ +2e+00 | — | — | — | — | 5.43 ms | — | — | — | — | — | — | — | — | — |
| **Python · external** | | | | | | | | | | | | | | | |
| 📐nirs4all | source | — | — | — | — | 31.5 ms | — | — | — | — | — | — | — | — | — |
:::
:::{tab-item} 10 threads
:sync: threads-10
| Backend | Parity | 50×250 | 100×50 | 100×500 | 100×2500 | 200×40 | 250×50 | 500×50 | 500×500 | 500×2500 | 2500×50 | 2500×500 | 2500×2500 | 10000×50 | 10000×500 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| **C++ native · libn4m** | | | | | | | | | | | | | | | |
| pls4all.cpp.blas | ~ shape 5e-15 | — | — | — | — | 4.39 ms | — | — | — | — | — | — | — | — | — |
| pls4all.cpp.blas+omp | ~ shape 5e-15 | — | — | — | — | 4.25 ms | — | — | — | — | — | — | — | — | — |
| pls4all.cpp.omp | ~ shape 5e-15 | — | — | — | — | 4.50 ms | — | — | — | — | — | — | — | — | — |
| pls4all.cpp.ref | ~ shape 5e-15 | — | — | — | — | 4.25 ms | — | — | — | — | — | — | — | — | — |
| **Python · pls4all** | | | | | | | | | | | | | | | |
| pls4all.python | ✓ bind | — | — | — | — | 4.28 ms | — | — | — | — | — | — | — | — | — |
| pls4all.sklearn | ✓ bind | — | — | — | — | 4.18 ms🏆 | — | — | — | — | — | — | — | — | — |
| **R · pls4all** | | | | | | | | | | | | | | | |
| pls4all.R | ✓ bind | — | — | — | — | 5.83 ms | — | — | — | — | — | — | — | — | — |
| pls4all.R.formula | ✓ bind | — | — | — | — | 6.72 ms | — | — | — | — | — | — | — | — | — |
| pls4all.R.mdatools | ✓ bind | — | — | — | — | 6.93 ms | — | — | — | — | — | — | — | — | — |
| pls4all.R.pls | ✓ bind | — | — | — | — | 6.95 ms | — | — | — | — | — | — | — | — | — |
| **MATLAB · pls4all** | | | | | | | | | | | | | | | |
| pls4all.matlab | ✗ +2e+00 | — | — | — | — | 4.52 ms | — | — | — | — | — | — | — | — | — |
| pls4all.matlab.classdef | ✗ +2e+00 | — | — | — | — | 4.77 ms | — | — | — | — | — | — | — | — | — |
| **Python · external** | | | | | | | | | | | | | | | |
| 📐nirs4all | source | — | — | — | — | 24.8 ms | — | — | — | — | — | — | — | — | — |
:::
::::
---
_See also_: [benchmark overview](../benchmarks/overview.md) · [methods index](index.md) · [interactive dashboard](../landing/dashboard.md)