`aom_pls`: AOM-PLS (global adaptive operator selection)

Group: Adaptive · Registry tolerance: 1e-08

Description

AOM-PLS — global adaptive operator selection

Registry note — Global AOMPLS/AOM-PLS selector with the compact strict-linear nirs4all bank: identity, Savitzky-Golay smooth/derivative, detrend and finite-difference operators. Reference is the in-tree nirs4all estimator stack; parity remains qualitative because selection tie-breaking and CV scoring details differ across implementations.

Parameters

Name	Type	Default	Notes
`max_components`	`int`	`3`	registry benchmark cell value
`n_operators`	`int`	`9`	registry benchmark cell value
`cv`	`int`	`3`	registry benchmark cell value

Explanations

Bibliographic source

Beurier, G., Reiter, R., Noûs, C., Rouan, L. & Cornet, D. (2026). Reframing preprocessing selection as model-internal calibration in near-infrared spectroscopy: a large-scale benchmark of operator-adaptive PLS and Ridge models. arXiv:2605.13587. https://arxiv.org/abs/2605.13587.

Mathematical principle

AOM-PLS treats spectroscopic preprocessing as a learnable step inside the PLS calibration. Let \(\mathbf{X} \in \mathbb{R}^{n\times p}\) be the centered spectral matrix (rows = samples), \(\mathbf{Y} \in \mathbb{R}^{n\times q}\) the centered response, and \(\{\mathbf{A}_b\}_{b=1}^{B} \subset \mathbb{R}^{p\times p}\) a finite bank of strict-linear spectral operators. An operator is strict-linear when its action \(\mathbf{X}_b = \mathbf{X}\mathbf{A}_b^{\top}\) depends only on the fixed wavelength grid (identity, Savitzky–Golay smoothing and derivatives, finite differences, polynomial detrending, Norris–Williams gap derivatives, Whittaker smoothing); SNV, MSC, EMSC, ASLS and OSC are not strict-linear and are handled as fold-local branches in nirs4all.

Cross-covariance identity (Eq. 2 of the paper). For centered \(\mathbf{X}, \mathbf{Y}\) and any strict-linear \(\mathbf{A}\),

\[\bigl(\mathbf{X}\mathbf{A}^{\top}\bigr)^{\top}\mathbf{Y} \;=\; \mathbf{A}\,\mathbf{X}^{\top}\mathbf{Y}.\]

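Writing \(\mathbf{S} = \mathbf{X}^{\top}\mathbf{Y}\), every operator can therefore be screened by the cheap left action \(\mathbf{S}_b = \mathbf{A}_b\mathbf{S}\) instead of materializing \(\mathbf{X}_b\). When \(\mathbf{A}_b\) is applied as a banded or otherwise structured operator, as for the filters in the bank, the left action costs \(O(p q)\) per candidate against \(O(n p)\) for \(\mathbf{X}_b\).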
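The identity can be checked numerically. The following is a minimal NumPy sketch, not the pls4all kernel: the helper `first_difference_operator` and the synthetic data are assumptions made for illustration, with a first-difference matrix standing in for any strict-linear operator.

```python
import numpy as np


def first_difference_operator(p: int) -> np.ndarray:
    """Hypothetical helper: p x p matrix with (A @ x)[j] = x[j] - x[j-1], row 0 set to zero."""
    A = np.eye(p) - np.eye(p, k=-1)
    A[0, :] = 0.0
    return A


rng = np.random.default_rng(0)
n, p, q = 50, 12, 2

# Synthetic, column-centered X (n x p) and Y (n x q).
X = rng.normal(size=(n, p))
Y = rng.normal(size=(n, q))
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

A = first_difference_operator(p)

S = X.T @ Y              # cross-covariance, p x q
lhs = (X @ A.T).T @ Y    # materialize X_b = X A^T first, then form the cross-covariance
rhs = A @ S              # left action of the operator on S

identity_holds = bool(np.allclose(lhs, rhs))
print(identity_holds)
```

The dense matrix product is used here only for readability; the cost advantage described above relies on applying the operator as a filter rather than as a full \(p \times p\) matrix.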

Global selection (the AOM in AOM-PLS). A single operator index \(b^{\star}\) is chosen for all \(K\) components by minimising a selection criterion \(\mathcal{C}\) over \(b\):

\[b^{\star} \;=\; \operatorname*{arg\,min}_{b\in\{1,\dots,B\}} \; \mathcal{C}\!\bigl(\text{SIMPLS}(\mathbf{X}_b, \mathbf{Y}; K)\bigr).\]

The default criterion is K-fold CV-RMSE (`criterion='cv'`); alternatives include the covariance proxy \(-\lVert\mathbf{A}_b\mathbf{S}\rVert\) (fast prescreen), leverage-corrected approximate PRESS, and a hybrid covariance-then-CV refinement. The optimal prefix length \(k \le K\) is selected jointly when `auto_prefix=True`.
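A rough sketch of global selection with the CV-RMSE criterion follows. It is an illustration under stated assumptions, not the pls4all or nirs4all implementation: `simpls_coefficients` and `cv_rmse` are hypothetical helpers, the three-operator bank and the synthetic data are made up, and \(\mathbf{X}_b\) is materialized explicitly for clarity instead of being screened through \(\mathbf{S}_b\).

```python
import numpy as np


def simpls_coefficients(X, Y, n_components):
    """Minimal SIMPLS on centered X (n x p) and Y (n x q); returns B (p x q)."""
    S = X.T @ Y
    p, q = S.shape
    R = np.zeros((p, n_components))   # weights
    Q = np.zeros((q, n_components))   # Y loadings
    V = np.zeros((p, n_components))   # orthonormal basis of the X loadings
    for a in range(n_components):
        r = np.linalg.svd(S, full_matrices=False)[0][:, 0]  # leading left singular vector
        t = X @ r
        scale = np.linalg.norm(t)
        r, t = r / scale, t / scale                         # unit-norm score
        load = X.T @ t
        v = load - V[:, :a] @ (V[:, :a].T @ load)           # Gram-Schmidt step
        v /= np.linalg.norm(v)
        S = S - np.outer(v, v @ S)                          # deflate the cross-covariance
        R[:, a], Q[:, a], V[:, a] = r, Y.T @ t, v
    return R @ Q.T


def cv_rmse(X, Y, A, n_components, n_folds):
    """K-fold CV-RMSE of SIMPLS fitted on X A^T, using contiguous folds."""
    n = X.shape[0]
    sq_err = 0.0
    for test_idx in np.array_split(np.arange(n), n_folds):
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        x_mean, y_mean = X[train].mean(axis=0), Y[train].mean(axis=0)
        B = simpls_coefficients((X[train] - x_mean) @ A.T, Y[train] - y_mean, n_components)
        pred = (X[test_idx] - x_mean) @ A.T @ B + y_mean
        sq_err += np.sum((Y[test_idx] - pred) ** 2)
    return float(np.sqrt(sq_err / Y.size))


rng = np.random.default_rng(1)
n, p, K = 60, 30, 2
X = np.cumsum(rng.normal(size=(n, p)), axis=1)   # synthetic drifting "spectra"
Y = (X[:, 10] - X[:, 9] + 0.1 * rng.normal(size=n)).reshape(-1, 1)

D1 = np.eye(p) - np.eye(p, k=-1)
D1[0, :] = 0.0
bank = [np.eye(p), D1, D1 @ D1]                  # identity, first difference, difference applied twice

scores = [cv_rmse(X, Y, A, K, n_folds=3) for A in bank]
b_star = int(np.argmin(scores))                  # one operator shared by all K components
print(b_star, scores)
```

The same loop structure applies to the other criteria: only the scoring function changes, for example to the covariance proxy \(-\lVert\mathbf{A}_b\mathbf{S}\rVert\).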

SIMPLS-covariance engine. With \(\mathbf{S}_b = \mathbf{A}_b\mathbf{S}\) precomputed, SIMPLS extracts component \(a\) from the leading left singular vector \(\mathbf{r}_a = \mathbf{u}_1(\mathbf{S}_b)\) and maps it back to the original wavelength grid through the operator’s adjoint:

\[\mathbf{z}_a \;=\; \mathbf{A}_{b^{\star}}^{\top}\,\mathbf{r}_a, \qquad \mathbf{t}_a = \mathbf{X}\mathbf{z}_a.\]

Stacking \(\mathbf{Z} = [\mathbf{z}_1\;\cdots\;\mathbf{z}_K]\), with original-space loadings \(\mathbf{P} = \mathbf{X}^{\top}\mathbf{T}\,\operatorname{diag}(1/\lVert\mathbf{t}_a\rVert^{2})\) and \(\mathbf{Q} = \mathbf{Y}^{\top}\mathbf{T}\,\operatorname{diag}(1/\lVert\mathbf{t}_a\rVert^{2})\), the recovered coefficient matrix is

\[\mathbf{B} \;=\; \mathbf{Z}\,\bigl(\mathbf{P}^{\top}\mathbf{Z}\bigr)^{+}\mathbf{Q}^{\top}, \qquad \hat{\mathbf{Y}}(\mathbf{X}^{\star}) = \mathbf{X}^{\star}\mathbf{B}.\]

Because \(\mathbf{B}\) lives in the original feature space, the fitted model is a single linear calibration on the raw wavelength grid: there is no preprocessing stage to replay at predict time, because the operator has been absorbed into the coefficients (paper §3.2). Computationally, exploring the bank costs roughly one SIMPLS fit on \(\mathbf{S}\) plus \(B\) cheap left actions, which is what keeps AOM-PLS comparable in cost to vanilla PLS even with a \(\sim\)77-operator default bank.
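The absorption of the operator into \(\mathbf{B}\) can be illustrated with a small NumPy sketch. The helper `simpls_weights`, the first-difference operator and the synthetic data are assumptions for illustration only. The sketch builds \(\mathbf{Z}\), \(\mathbf{P}\), \(\mathbf{Q}\) and \(\mathbf{B}\) from the formulas above and compares raw-grid predictions with the conventional route that replays the preprocessing on new samples.

```python
import numpy as np


def simpls_weights(Xb, Y, n_components):
    """Minimal SIMPLS on centered Xb; returns weights R (p x K) giving unit-norm scores."""
    S = Xb.T @ Y
    p = S.shape[0]
    R = np.zeros((p, n_components))
    V = np.zeros((p, n_components))
    for a in range(n_components):
        r = np.linalg.svd(S, full_matrices=False)[0][:, 0]  # leading left singular vector
        r = r / np.linalg.norm(Xb @ r)                      # scale so the score has unit norm
        load = Xb.T @ (Xb @ r)
        v = load - V[:, :a] @ (V[:, :a].T @ load)           # Gram-Schmidt step
        v /= np.linalg.norm(v)
        S = S - np.outer(v, v @ S)                          # deflate the cross-covariance
        R[:, a], V[:, a] = r, v
    return R


rng = np.random.default_rng(2)
n, p, q, K = 60, 15, 1, 3
X = rng.normal(size=(n, p))
Y = rng.normal(size=(n, q))
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

A = np.eye(p) - np.eye(p, k=-1)   # stand-in for the selected operator A_{b*}
A[0, :] = 0.0

Xb = X @ A.T
R = simpls_weights(Xb, Y, K)      # weights in operator space

Z = A.T @ R                       # map weights back through the adjoint
T = X @ Z                         # scores computed from the raw X
norms2 = np.sum(T ** 2, axis=0)
P = X.T @ T / norms2              # original-space X loadings
Q = Y.T @ T / norms2              # Y loadings
B = Z @ np.linalg.pinv(P.T @ Z) @ Q.T

X_new = rng.normal(size=(5, p))   # new samples, assumed centered with the training mean
pred_raw = X_new @ B                                     # no preprocessing replay
pred_replay = (X_new @ A.T) @ (R @ (Y.T @ (Xb @ R)).T)   # preprocess, then operator-space PLS
print(np.allclose(pred_raw, pred_replay))
```

In this sketch the scores are orthonormal, so \(\mathbf{P}^{\top}\mathbf{Z}\) is numerically the identity and \(\mathbf{B}\) reduces to \(\mathbf{Z}\mathbf{Q}^{\top}\); the pseudo-inverse is kept to match the formula as written.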

Implementation

Implemented by `n4m_model_selection_aom_pls_select`, reached through the Python/R/MATLAB dispatchers. The compact strict-linear bank shipped by pls4all (`compact_bank`: identity, Savitzky–Golay smooth/derivative, polynomial detrending, finite differences) mirrors the `default_bank` used in the paper experiments. Reference: git-pinned oracle `nirs4all.operators.models.sklearn.aom_pls.AOMPLSRegressor` (sanctioned exception).

R roxygen note (methods_extra.R::aom_pls):

AOM-PLS with global operator selection.

MATLAB header (bindings/matlab/+pls4all/aom_pls.m):

pls4all.aom_pls  AOM-PLS global operator selection.

Usage

Every pls4all binding tab dispatches into the same C kernel; the external libraries listed at the bottom of the page are the parity references registered in benchmarks.parity_timing.registry. Switch tabs to read the same fit in your language. The R package now ships drop-in-compatible facades for the CRAN pls package (plsr, pcr, mvr) and for the mdatools::pls(x, y, ...) matrix idiom — those tabs appear only on the methods that have a meaningful equivalence.

pls4all bindings

C ABI · libn4m

/* C ABI — libn4m AOM/POP selector path */
n4m_context_t* ctx = n4m_context_create();
n4m_config_t*  cfg = n4m_config_create();
n4m_operator_bank_t* bank = NULL;
n4m_validation_plan_t* plan = NULL;
n4m_aom_global_result_t* res = NULL;
n4m_operator_bank_create(&bank);
/* add compact nirs4all-style operators: identity, SG, detrend, FD */
n4m_validation_plan_create(&plan);
/* fill CV folds on plan */
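/* x_view / y_view: matrix views over X and Y, assumed to be prepared by the caller (setup not shown) */
n4m_model_selection_aom_pls_select(ctx, cfg, bank, &x_view, &y_view, plan,
              /* max_components */ 2, &res);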
/* read predictions and selection diagnostics via result getters */
n4m_model_selection_aom_pls_result_destroy(res);
n4m_validation_plan_destroy(plan);
n4m_operator_bank_destroy(bank);
n4m_config_destroy(cfg);
n4m_context_destroy(ctx);

Python · pls4all (raw)

import pls4all

with pls4all.Context() as ctx, pls4all.Config() as cfg:
    bank = pls4all.OperatorBank()
    plan = pls4all.ValidationPlan()
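    # Add compact nirs4all-style operators to bank and CV folds to plan (calls not shown).
    # X and y are assumed to be NumPy arrays of shape (n, p) and (n,).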
    res = pls4all.aom_global_select(
        ctx, cfg, bank, X.ravel(), y.ravel(), plan,
        max_components=2,
        x_rows=X.shape[0], x_cols=X.shape[1],
        y_rows=y.shape[0], y_cols=1,
    )
    values, rows, cols = res.predictions

Python · pls4all.sklearn

No tier-2 sklearn-style class yet — exposed via the pls4all.aom_global_select / pls4all.aom_per_component_select low-level ABI.

R · pls4all_method()

library(pls4all)
# Unified low-level dispatcher (May 2026 R cleanup):
res <- pls4all_method("aom_pls", X, y,
                      n_components = 2L, params = list(max_components = 3L, n_operators = 9L, cv = 3L))
# res is a named list with MethodResult arrays/scalars.
# selected_indices / top_k_intervals are 1-based.

R · pls4all (raw fn)

library(pls4all)
res  <- aom_pls(X, Y, max_components = 3L, n_operators = 9L, cv = 3L)
yhat <- pls4all_predict(res, X_test)

MATLAB · pls4all (MEX)

res = pls4all.aom_pls(X, y, 2);
% see header of bindings/matlab/+pls4all/aom_pls.m for full
% parameter surface:
%   res = aom_pls(X, Y, max_components, n_operators, cv)
yhat = predict(res, Xtest);

MATLAB · pls4all (classdef)

No idiomatic classdef wrapper — invoke pls4all.fit("aom_pls", X, y, …) directly from the unified MEX factory.

Registry parity references 📐

📐 nirs4all (python · python) — nirs4all in-tree · strict (rmse_rel ≤ 1e-08) — In-tree nirs4all AOM/POP estimator stack (sanctioned reference). The pls4all ABI uses the same compact strict-linear bank and contiguous folds for cross-binding determinism; nirs4all remains the qualitative algorithmic reference.

Benchmarks

Adaptive wall-clock per cell measured against full_matrix.csv. Only backends that implement this method are listed; libraries without the method are omitted.

Verdict · ✓ ref / ≈ ref / ~ shape mark a reference-gate pass at strict / relaxed / qualitative tolerance · ✓ bind = pls4all binding agrees with the C++ baseline · ⇄ cross-check = documented by-design selector/RNG/model, noncanonical API/facade convention, or secondary oracle · ✗ divergent · ⚠ error · — not run. The fastest backend per column is marked 🏆.

Reference gate: strict — numeric equivalence (rmse_rel_tol ≤ 1e-08).

Rows tagged with 📐 are the canonical parity references for this method (declared in parity_timing.registry). C++ and external rows show reference parity; pls4all language bindings show binding parity against the C++ backend. Hover the icon for role and tolerance band.

1 thread

Backend	Parity	200×40 (ms)
C++ native · libn4m
`pls4all.cpp.blas+omp`	✓ ref 6e-16	4.20 ms
Python · pls4all
`pls4all.python`	✓ bind	4.16 ms
`pls4all.sklearn`	✓ bind	4.15 ms🏆
R · pls4all
`pls4all.R`	✓ 6e-15	7.84 ms
`pls4all.R.formula`	✓ 6e-15	9.43 ms
`pls4all.R.mdatools`	✓ 6e-15	17.6 ms
`pls4all.R.pls`	✓ 6e-15	9.87 ms
Python · external
📐`nirs4all`	source	41.9 ms

3 threads

Backend	Parity	200×40 (ms)
C++ native · libn4m
`pls4all.cpp.blas+omp`	✓ ref 6e-16	14.6 ms
Python · pls4all
`pls4all.python`	✓ bind	8.48 ms🏆
`pls4all.sklearn`	✓ bind	14.7 ms
R · pls4all
`pls4all.R`	✓ 6e-15	27.0 ms
`pls4all.R.formula`	✓ 6e-15	40.4 ms
`pls4all.R.mdatools`	✓ 6e-15	50.0 ms
`pls4all.R.pls`	✓ 6e-15	47.9 ms
Python · external
📐`nirs4all`	source	59.7 ms

10 threads

Backend	Parity	200×40 (ms)
C++ native · libn4m
`pls4all.cpp.blas+omp`	✓ ref 6e-16	23.7 ms
Python · pls4all
`pls4all.python`	✓ bind	23.7 ms
`pls4all.sklearn`	✓ bind	15.5 ms
R · pls4all
`pls4all.R`	✓ 6e-15	13.4 ms
`pls4all.R.formula`	✓ 6e-15	16.4 ms
`pls4all.R.mdatools`	✓ 6e-15	12.9 ms
`pls4all.R.pls`	✓ 6e-15	12.4 ms🏆
Python · external
📐`nirs4all`	source	16.6 ms
