# `aom_pls` — AOM-PLS (global adaptive operator selection)
_Group_: **Adaptive** · _Registry tolerance_: `5.0`
## Description
AOM-PLS — global adaptive operator selection
> **Registry note** — Global AOMPLS/AOM-PLS selector with the compact strict-linear nirs4all bank: identity, Savitzky-Golay smooth/derivative, detrend and finite-difference operators. Reference is the in-tree nirs4all estimator stack; parity remains qualitative because selection tie-breaking and CV scoring details differ across implementations.
### Parameters
| Name | Type | Default | Notes |
|------|------|---------|-------|
| `max_components` | `int` | `3` | registry benchmark cell value |
| `n_operators` | `int` | `9` | registry benchmark cell value |
| `cv` | `int` | `3` | registry benchmark cell value |
## Explanations
### Bibliographic source
Beurier, G., Reiter, R., Noûs, C., Rouan, L. & Cornet, D. (2026). *Reframing preprocessing selection as model-internal calibration in near-infrared spectroscopy: a large-scale benchmark of operator-adaptive PLS and Ridge models*. arXiv:2605.13587. https://arxiv.org/abs/2605.13587.
### Mathematical principle
AOM-PLS treats spectroscopic preprocessing as a learnable step *inside* the PLS calibration. Let $\mathbf{X} \in \mathbb{R}^{n\times p}$ be the centered spectral matrix (rows = samples), $\mathbf{Y} \in \mathbb{R}^{n\times q}$ the centered response, and $\{\mathbf{A}_b\}_{b=1}^{B} \subset \mathbb{R}^{p\times p}$ a finite bank of **strict-linear** spectral operators. An operator is strict-linear when its action $\mathbf{X}_b = \mathbf{X}\mathbf{A}_b^{\top}$ depends only on the fixed wavelength grid (identity, Savitzky–Golay smoothing and derivatives, finite differences, polynomial detrending, Norris–Williams gap derivatives, Whittaker smoothing); SNV, MSC, EMSC, ASLS and OSC are *not* strict-linear and are handled as fold-local branches in `nirs4all`.
**Cross-covariance identity** (Eq. 2 of the paper). For centered $\mathbf{X}, \mathbf{Y}$ and any strict-linear $\mathbf{A}$,
$$\bigl(\mathbf{X}\mathbf{A}^{\top}\bigr)^{\top}\mathbf{Y} \;=\; \mathbf{A}\,\mathbf{X}^{\top}\mathbf{Y}.$$
Writing $\mathbf{S} = \mathbf{X}^{\top}\mathbf{Y}$, every operator can therefore be *screened* by the cheap left action $\mathbf{S}_b = \mathbf{A}_b\mathbf{S}$ ($O(p q)$ per candidate for a banded operator of fixed bandwidth; $O(p^{2} q)$ if $\mathbf{A}_b$ is dense) instead of materializing $\mathbf{X}_b$ ($O(n p)$ under the same banded assumption).
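The identity can be checked numerically in a few lines. The sketch below is illustrative only: it uses NumPy with random data, and the backward first-difference matrix `A` is an assumed stand-in for one strict-linear operator, not the operator construction used by the library.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 40, 30, 2

# Centered spectra and responses (random stand-ins for real data).
X = rng.normal(size=(n, p))
Y = rng.normal(size=(n, q))
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

# Assumed example operator: backward first difference along the wavelength axis.
A = np.eye(p) - np.eye(p, k=-1)
A[0, :] = 0.0  # no left neighbour for the first wavelength

S = X.T @ Y               # cross-covariance, computed once
lhs = (X @ A.T).T @ Y     # materialize X_b, then take its cross-covariance
rhs = A @ S               # cheap left action on S

identity_holds = np.allclose(lhs, rhs)
print(identity_holds)
```

The right-hand side never touches the $n \times p$ matrix again, which is why screening many candidates stays cheap once $\mathbf{S}$ is available.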
**Global selection (the AOM in AOM-PLS).** A single operator index $b^{\star}$ is chosen for *all* $K$ components by minimizing a selection criterion $\mathcal{C}$ over $b$:
$$b^{\star} \;=\; \operatorname*{arg\,min}_{b\in\{1,\dots,B\}} \; \mathcal{C}\!\bigl(\text{SIMPLS}(\mathbf{X}_b, \mathbf{Y}; K)\bigr).$$
The default criterion is K-fold CV-RMSE (`criterion='cv'`); alternatives include the covariance proxy $-\lVert\mathbf{A}_b\mathbf{S}\rVert$ (fast prescreen), leverage-corrected approximate PRESS, and a hybrid covariance-then-CV refinement. The optimal prefix length $k \le K$ is selected jointly when `auto_prefix=True`.
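As a rough illustration of the covariance-proxy prescreen, the following NumPy sketch scores a tiny bank by $-\lVert\mathbf{A}_b\mathbf{S}\rVert$ and takes the arg min. Several things here are assumptions rather than library behaviour: the helper `banded_operator`, the three-operator bank, the synthetic data, and the choice of the Frobenius norm (the norm is not specified above).

```python
import numpy as np

def banded_operator(kernel, p):
    """Hypothetical helper: p x p matrix applying `kernel` along the
    wavelength axis, with zero-padded edges."""
    half = len(kernel) // 2
    A = np.zeros((p, p))
    for offset, weight in enumerate(kernel):
        A += weight * np.eye(p, k=offset - half)
    return A

rng = np.random.default_rng(0)
n, p = 50, 40
X = rng.normal(size=(n, p)).cumsum(axis=1)            # smooth-ish synthetic spectra
y = X[:, 10] - X[:, 9] + 0.1 * rng.normal(size=n)     # response driven by a local slope
X = X - X.mean(axis=0)
Y = (y - y.mean())[:, None]

# Assumed three-operator bank; the real bank is larger and built by the library.
bank = {
    "identity": banded_operator([1.0], p),
    "moving_average_3": banded_operator([1 / 3, 1 / 3, 1 / 3], p),
    "first_difference": banded_operator([-1.0, 1.0, 0.0], p),
}

S = X.T @ Y                                            # computed once for the whole bank
scores = {name: -np.linalg.norm(A @ S) for name, A in bank.items()}  # Frobenius norm
best = min(scores, key=scores.get)                     # arg min of the criterion
print(best, scores)
```

The proxy is not scale-invariant: an operator with larger gain yields a larger norm regardless of predictive value, which is one reason to treat it as a prescreen ahead of a CV-based criterion.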
**SIMPLS-covariance engine.** With $\mathbf{S}_b = \mathbf{A}_b\mathbf{S}$ precomputed, SIMPLS extracts component $a$ from the leading left singular vector $\mathbf{r}_a = \mathbf{u}_1(\mathbf{S}_b)$ and maps it back to the original wavelength grid through the operator's adjoint:
$$\mathbf{z}_a \;=\; \mathbf{A}_{b^{\star}}^{\top}\,\mathbf{r}_a, \qquad \mathbf{t}_a = \mathbf{X}\mathbf{z}_a.$$
Stacking $\mathbf{Z} = [\mathbf{z}_1\;\cdots\;\mathbf{z}_K]$, with original-space loadings $\mathbf{P} = \mathbf{X}^{\top}\mathbf{T}\,\operatorname{diag}(1/\lVert\mathbf{t}_a\rVert^{2})$ and $\mathbf{Q} = \mathbf{Y}^{\top}\mathbf{T}\,\operatorname{diag}(1/\lVert\mathbf{t}_a\rVert^{2})$, the recovered coefficient matrix is
$$\mathbf{B} \;=\; \mathbf{Z}\,\bigl(\mathbf{P}^{\top}\mathbf{Z}\bigr)^{+}\mathbf{Q}^{\top}, \qquad \hat{\mathbf{Y}}(\mathbf{X}^{\star}) = \mathbf{X}^{\star}\mathbf{B}.$$
Because $\mathbf{B}$ lives in the *original* feature space, **the fitted model is a single linear calibration on the raw wavelength grid: there is no preprocessing stage to replay at predict time** — the operator has been absorbed into the coefficients (paper §3.2). Computationally the bank exploration cost is roughly that of a single SIMPLS fit on $\mathbf{S}$ plus $B$ tiny left actions, which is the algorithmic gain that makes AOM-PLS comparable to vanilla PLS even with a $\sim$77-operator default bank.
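The absorption of the operator into $\mathbf{B}$ can be illustrated with a compact NumPy sketch. It follows the formulas above with a textbook SIMPLS deflation; the function name `simpls_operator_space`, the first-difference operator and the synthetic data are assumptions for illustration, and this is not the library's kernel. The check compares raw-grid prediction against a model fitted on the materialized $\mathbf{X}_b$ and replayed through the operator.

```python
import numpy as np

def simpls_operator_space(X, Y, A, n_components):
    """Sketch: SIMPLS driven by S_b = A @ X.T @ Y, coefficients returned on the raw grid."""
    p = X.shape[1]
    S = A @ (X.T @ Y)                       # operator-space cross-covariance
    V = np.zeros((p, 0))                    # orthonormal basis of operator-space loadings
    Z, P, Q = [], [], []
    for _ in range(n_components):
        r = np.linalg.svd(S, full_matrices=False)[0][:, 0]  # leading left singular vector
        z = A.T @ r                         # adjoint map back to the raw grid
        t = X @ z                           # score, equal to (X @ A.T) @ r
        tt = t @ t
        p_raw = X.T @ t / tt                # original-space X loading
        Z.append(z); P.append(p_raw); Q.append(Y.T @ t / tt)
        v = A @ p_raw                       # loading in operator space
        v -= V @ (V.T @ v)                  # orthogonalize against earlier loadings
        v /= np.linalg.norm(v)
        V = np.column_stack([V, v])
        S = S - np.outer(v, v @ S)          # SIMPLS deflation of the cross-covariance
    Z, P, Q = (np.column_stack(m) for m in (Z, P, Q))
    return Z @ np.linalg.pinv(P.T @ Z) @ Q.T

rng = np.random.default_rng(0)
n, p, q, K = 60, 25, 2, 3
X = rng.normal(size=(n, p))
Y = X @ rng.normal(size=(p, q)) + 0.1 * rng.normal(size=(n, q))
x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
Xc, Yc = X - x_mean, Y - y_mean

A = np.eye(p) - np.eye(p, k=-1)             # assumed operator: backward first difference
A[0, :] = 0.0

B_raw = simpls_operator_space(Xc, Yc, A, K)               # absorbed coefficients
B_op = simpls_operator_space(Xc @ A.T, Yc, np.eye(p), K)  # fit on materialized X_b

X_new = rng.normal(size=(5, p)) - x_mean
pred_raw = X_new @ B_raw + y_mean                 # no preprocessing at predict time
pred_replay = (X_new @ A.T) @ B_op + y_mean       # replaying the operator
same = np.allclose(pred_raw, pred_replay)
print(same)
```

Because SIMPLS scores are mutually orthogonal, $\mathbf{P}^{\top}\mathbf{Z}$ is the identity here up to rounding, so the pseudo-inverse in the coefficient formula acts as a numerical safeguard rather than changing the result.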
### Implementation
The kernel is `n4m_aom_global_select`, called through the native C ABI. Python exposes this as
`n4m.aom_global_select` and the catalog alias `n4m.aom_pls`; the wrapper builds
the compact strict-linear bank by default and also accepts caller-provided
strict operators. Result buffers include `input_coefficients` and `intercept`,
so callers can reuse the selected model on new spectra as
`X_new @ input_coefficients + intercept` without replaying the selected
operator. The sklearn-style `n4m.sklearn.NativeAOMPLSRegressor` wraps the same
native result. Reference: git-pinned oracle
`nirs4all.operators.models.sklearn.aom_pls.AOMPLSRegressor` (sanctioned
exception).
MATLAB header (`bindings/matlab/+pls4all/aom_pls.m`):
```text
pls4all.aom_pls AOM-PLS global operator selection.
```
### Usage
Every pls4all binding tab dispatches into the same C kernel; the external libraries listed at the bottom of the page are the parity references registered in `benchmarks.parity_timing.registry`. Switch tabs to read the same fit in your language. The R package now ships drop-in-compatible facades for the CRAN `pls` package (`plsr`, `pcr`, `mvr`) and for the `mdatools::pls(x, y, ...)` matrix idiom — those tabs appear only on the methods that have a meaningful equivalence.
**pls4all bindings**
::::{tab-set}
:class: pls4all-bindings
:::{tab-item} C ABI · libn4m
:sync: c
:class-label: lang-c
```c
/* C ABI — libn4m AOM/POP selector path */
n4m_context_t* ctx = n4m_context_create();
n4m_config_t* cfg = n4m_config_create();
n4m_operator_bank_t* bank = NULL;
n4m_validation_plan_t* plan = NULL;
n4m_aom_global_result_t* res = NULL;
n4m_operator_bank_create(&bank);
/* add compact nirs4all-style operators: identity, SG, detrend, FD */
n4m_validation_plan_create(&plan);
/* fill CV folds on plan */
/* x_view / y_view: matrix views over X and y, assumed to be prepared by the caller */
n4m_aom_global_select(ctx, cfg, bank, &x_view, &y_view, plan,
                      /* max_components */ 2, &res);
/* read predictions and selection diagnostics via result getters */
n4m_aom_global_result_destroy(res);
n4m_validation_plan_destroy(plan);
n4m_operator_bank_destroy(bank);
n4m_config_destroy(cfg);
n4m_context_destroy(ctx);
```
:::
:::{tab-item} Python · pls4all (raw)
:sync: python-raw
:class-label: lang-python
```python
import n4m
res = n4m.aom_pls(
    X,
    y,
    max_components=2,
    cv=4,
    operators=[
        "identity",
        ("savgol_smooth", [5, 2]),
        ("finite_difference", [1]),
    ],
)
yhat = res["predictions"]
rmse_curves = res["rmse_curves"]
coef = res["input_coefficients"]
intercept = res["intercept"]
yhat_new = X_new @ coef + intercept
```
:::
:::{tab-item} Python · pls4all.sklearn
:sync: python-sklearn
:class-label: lang-python
```python
from n4m.sklearn import NativeAOMPLSRegressor
model = NativeAOMPLSRegressor(max_components=2, cv=4).fit(X, y)
yhat_new = model.predict(X_new)
```
:::
:::{tab-item} R · pls4all_method()
:sync: r-dispatcher
:class-label: lang-r
```r
library(pls4all)
# Unified low-level dispatcher (May 2026 R cleanup):
res <- pls4all_method("aom_pls", X, y,
                      n_components = 2L,
                      params = list(max_components = 3L, n_operators = 9L, cv = 3L))
# res is a named list with MethodResult arrays/scalars.
# selected_indices / top_k_intervals are 1-based.
```
:::
:::{tab-item} MATLAB · pls4all (MEX)
:sync: matlab-mex
:class-label: lang-matlab
```matlab
res = pls4all.aom_pls(X, y, 2);
% see header of bindings/matlab/+pls4all/aom_pls.m for full
% parameter surface:
% res = aom_pls(X, Y, max_components, n_operators, cv)
yhat = predict(res, Xtest);
```
:::
:::{tab-item} MATLAB · pls4all (classdef)
:sync: matlab-classdef
:class-label: lang-matlab
_No idiomatic classdef wrapper — invoke `pls4all.fit("aom_pls", X, y, …)` directly from the unified MEX factory._
:::
::::
**Registry parity references** 📐
:::{card}
:class-card: external-refs
- 📐 **`nirs4all`** (python · python) — `nirs4all` in-tree · qualitative (rmse_rel ≤ 5e+00) — In-tree nirs4all AOM/POP estimator stack (sanctioned reference). The pls4all ABI uses the same compact strict-linear bank and contiguous folds for cross-binding determinism; nirs4all remains the qualitative algorithmic reference.
:::
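The contiguous folds mentioned in the reference note can be pictured with a short NumPy sketch. The helper name `contiguous_folds` is hypothetical, and how the native library distributes a remainder across folds is not stated on this page; `np.array_split` is used here only as one plausible convention (earlier folds take the extra samples).

```python
import numpy as np

def contiguous_folds(n_samples, n_folds):
    """Hypothetical helper: unshuffled, contiguous validation blocks."""
    return np.array_split(np.arange(n_samples), n_folds)

n_samples = 10
folds = contiguous_folds(n_samples, 3)
for k, val_idx in enumerate(folds):
    # Training indices are everything outside the validation block.
    train_idx = np.setdiff1d(np.arange(n_samples), val_idx)
    print(k, val_idx.tolist(), train_idx.tolist())
```

Since no shuffling is involved, every binding that builds folds this way sees the same splits for the same row order, without depending on a random number generator.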
### Benchmarks
Adaptive wall-clock per cell measured against [`full_matrix.csv`](../benchmarks/overview.md). Only backends that implement this method are listed; libraries without the method are omitted.
**Verdict** · ✓ ref / ≈ ref / ~ shape mark a reference-gate pass at strict / relaxed / qualitative tolerance · ✓ bind = pls4all binding agrees with the C++ baseline · ✗ divergent · ⚠ error · — not run. The fastest backend per column is marked 🏆.
**Reference gate**: qualitative — shape/smoke comparison only. The external library and pls4all do not produce numerically equivalent output for this method (see the MethodSpec notes); the `rmse_rel_tol ≤ 5e+00` budget is set wide on purpose. Treat ~ shape as *“we ran both, both finished”*, not as numerical agreement.
Rows tagged with **📐** are the canonical parity references for this method (declared in [`parity_timing.registry`](../benchmarks/methodology.md)). C++ and external rows show reference parity; pls4all language bindings show binding parity against the C++ backend. Hover the icon for role and tolerance band.
::::{tab-set}
:class: parity-tabs
:::{tab-item} 1 thread
:sync: threads-1
| Backend | Parity | 50×250 | 100×50 | 100×500 | 100×2500 | 200×40 | 250×50 | 500×50 | 500×500 | 500×2500 | 2500×50 | 2500×500 | 2500×2500 | 10000×50 | 10000×500 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| **C++ native · libn4m** | | | | | | | | | | | | | | | |
| pls4all.cpp.blas | ≈ +6e-16 | 6.91 ms | 4.67 ms | 41.9 ms | 266.5 ms | 4.42 ms | 8.21 ms | 17.0 ms | 210.2 ms🏆 | 1.2 s | 116.0 ms | 1.2 s | 7.7 s | 490.2 ms | 6.0 s |
| pls4all.cpp.blas+omp | ≈ +6e-16 | 7.00 ms | 4.09 ms🏆 | 31.9 ms🏆 | 241.5 ms🏆 | 4.16 ms🏆 | 6.55 ms🏆 | 16.2 ms🏆 | 228.9 ms | 1.2 s🏆 | 109.5 ms🏆 | 1.2 s🏆 | 7.7 s | 456.2 ms🏆 | 5.9 s🏆 |
| pls4all.cpp.omp | ≈ +1e-15 | 7.22 ms | 4.21 ms | 40.5 ms | 251.8 ms | 4.41 ms | 7.75 ms | 20.1 ms | 224.8 ms | 1.3 s | 123.3 ms | 1.3 s | 7.7 s🏆 | 473.2 ms | 6.1 s |
| pls4all.cpp.ref | ≈ +1e-15 | 7.04 ms | 4.74 ms | 39.3 ms | 259.6 ms | 4.39 ms | 7.39 ms | 19.0 ms | 222.0 ms | 1.3 s | 116.1 ms | 1.3 s | 7.9 s | 489.6 ms | 6.1 s |
| **Python · pls4all** | | | | | | | | | | | | | | | |
| pls4all.python | ✓ bind | 6.54 ms🏆 | — | — | — | 4.59 ms | 7.17 ms | — | — | — | — | — | — | — | — |
| pls4all.sklearn | ✓ bind | 6.75 ms | — | — | — | 6.81 ms | 6.91 ms | — | — | — | — | — | — | — | — |
| **R · pls4all** | | | | | | | | | | | | | | | |
| pls4all.R | ✓ 6e-15 | 15.2 ms | — | — | — | 7.80 ms | 19.1 ms | — | — | — | — | — | — | — | — |
| pls4all.R.formula | ✓ 6e-15 | 29.3 ms | — | — | — | 10.5 ms | 17.0 ms | — | — | — | — | — | — | — | — |
| pls4all.R.mdatools | ✓ 6e-15 | 28.6 ms | — | — | — | 8.82 ms | 13.6 ms | — | — | — | — | — | — | — | — |
| pls4all.R.pls | ✓ 6e-15 | 27.6 ms | — | — | — | 10.5 ms | 16.9 ms | — | — | — | — | — | — | — | — |
| **MATLAB · pls4all** | | | | | | | | | | | | | | | |
| pls4all.matlab | ✗ +1e+01 | 10.5 ms | — | — | — | 4.58 ms | 9.08 ms | — | — | — | — | — | — | — | — |
| pls4all.matlab.classdef | ✗ +1e+01 | 11.7 ms | — | — | — | 5.10 ms | 9.72 ms | — | — | — | — | — | — | — | — |
| **Python · external** | | | | | | | | | | | | | | | |
| 📐nirs4all | ⚠ | — | — | — | — | 12.4 ms | — | — | — | — | — | — | — | — | — |
:::
:::{tab-item} 3 threads
:sync: threads-3
| Backend | Parity | 50×250 | 100×50 | 100×500 | 100×2500 | 200×40 | 250×50 | 500×50 | 500×500 | 500×2500 | 2500×50 | 2500×500 | 2500×2500 | 10000×50 | 10000×500 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| **C++ native · libn4m** | | | | | | | | | | | | | | | |
| pls4all.cpp.blas | ~ shape 6e-16 | — | — | — | — | 6.31 ms | — | — | — | — | — | — | — | — | — |
| pls4all.cpp.blas+omp | ~ shape 6e-16 | — | — | — | — | 5.93 ms | — | — | — | — | — | — | — | — | — |
| pls4all.cpp.omp | ~ shape 1e-15 | — | — | — | — | 4.28 ms | — | — | — | — | — | — | — | — | — |
| pls4all.cpp.ref | ~ shape 1e-15 | — | — | — | — | 5.40 ms | — | — | — | — | — | — | — | — | — |
| **Python · pls4all** | | | | | | | | | | | | | | | |
| pls4all.python | ✓ 6e-15 | — | — | — | — | 4.15 ms🏆 | — | — | — | — | — | — | — | — | — |
| pls4all.sklearn | ✓ 6e-15 | — | — | — | — | 4.25 ms | — | — | — | — | — | — | — | — | — |
| **R · pls4all** | | | | | | | | | | | | | | | |
| pls4all.R | ✓ bind | — | — | — | — | 7.68 ms | — | — | — | — | — | — | — | — | — |
| pls4all.R.formula | ✓ bind | — | — | — | — | 10.4 ms | — | — | — | — | — | — | — | — | — |
| pls4all.R.mdatools | ✓ bind | — | — | — | — | 9.05 ms | — | — | — | — | — | — | — | — | — |
| pls4all.R.pls | ✓ bind | — | — | — | — | 8.40 ms | — | — | — | — | — | — | — | — | — |
| **MATLAB · pls4all** | | | | | | | | | | | | | | | |
| pls4all.matlab | ✗ +1e+01 | — | — | — | — | 4.60 ms | — | — | — | — | — | — | — | — | — |
| pls4all.matlab.classdef | ✗ +1e+01 | — | — | — | — | 5.65 ms | — | — | — | — | — | — | — | — | — |
| **Python · external** | | | | | | | | | | | | | | | |
| 📐nirs4all | source | — | — | — | — | 13.7 ms | — | — | — | — | — | — | — | — | — |
:::
:::{tab-item} 10 threads
:sync: threads-10
| Backend | Parity | 50×250 | 100×50 | 100×500 | 100×2500 | 200×40 | 250×50 | 500×50 | 500×500 | 500×2500 | 2500×50 | 2500×500 | 2500×2500 | 10000×50 | 10000×500 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| **C++ native · libn4m** | | | | | | | | | | | | | | | |
| pls4all.cpp.blas | ~ shape 6e-16 | — | — | — | — | 3.97 ms | — | — | — | — | — | — | — | — | — |
| pls4all.cpp.blas+omp | ~ shape 6e-16 | — | — | — | — | 3.93 ms | — | — | — | — | — | — | — | — | — |
| pls4all.cpp.omp | ~ shape 1e-15 | — | — | — | — | 4.11 ms | — | — | — | — | — | — | — | — | — |
| pls4all.cpp.ref | ~ shape 1e-15 | — | — | — | — | 4.11 ms | — | — | — | — | — | — | — | — | — |
| **Python · pls4all** | | | | | | | | | | | | | | | |
| pls4all.python | ✓ 6e-15 | — | — | — | — | 3.97 ms | — | — | — | — | — | — | — | — | — |
| pls4all.sklearn | ✓ 6e-15 | — | — | — | — | 3.85 ms🏆 | — | — | — | — | — | — | — | — | — |
| **R · pls4all** | | | | | | | | | | | | | | | |
| pls4all.R | ✓ bind | — | — | — | — | 6.42 ms | — | — | — | — | — | — | — | — | — |
| pls4all.R.formula | ✓ bind | — | — | — | — | 7.29 ms | — | — | — | — | — | — | — | — | — |
| pls4all.R.mdatools | ✓ bind | — | — | — | — | 7.37 ms | — | — | — | — | — | — | — | — | — |
| pls4all.R.pls | ✓ bind | — | — | — | — | 7.14 ms | — | — | — | — | — | — | — | — | — |
| **MATLAB · pls4all** | | | | | | | | | | | | | | | |
| pls4all.matlab | ✗ +1e+01 | — | — | — | — | 4.25 ms | — | — | — | — | — | — | — | — | — |
| pls4all.matlab.classdef | ✗ +1e+01 | — | — | — | — | 4.51 ms | — | — | — | — | — | — | — | — | — |
| **Python · external** | | | | | | | | | | | | | | | |
| 📐nirs4all | source | — | — | — | — | 11.1 ms | — | — | — | — | — | — | — | — | — |
:::
::::
---
_See also_: [benchmark overview](../benchmarks/overview.md) · [methods index](index.md) · [interactive dashboard](../landing/dashboard.md)