aom_operator_pls_stack - native AOM operator PLS score stack

Group: Diagnostic / AOM · ABI: n4m_aom_operator_pls_stack_fit

aom_operator_pls_stack builds strict-linear AOM operator views from the compact or wide profile, compresses each view through a fold-local PLS1 score projector, concatenates the resulting scores, and fits a Ridge head on the stacked score matrix. Because the selected final stack is linear in the original input spectra, the native result also exposes the coefficients folded back into input space for direct reuse.

The sklearn-style AOMOperatorPLSStack reference estimator remains available as n4m.AOMOperatorPLSStack and n4m.sklearn.AOMOperatorPLSStack for custom operator matrices, shuffled/both CV modes, and optional baseline admission gating.

Status

  • API surface: C ABI, Python function n4m.aom_operator_pls_stack, and native sklearn wrapper NativeAOMOperatorPLSStackRegressor.

  • Native ABI: yes, since ABI 1.18.0.

  • Catalog status: aom_pop.operator_pls_stack.

  • CPU: tested (n4m_tests and full Python bindings).

  • CUDA: builds and smoke-tests with the CUDA-enabled library; native v1 is not a fused batched GPU stack.

  • Target shape: single-target Y only, matching the PLS1 reference design.

  • Dataset/source routing: forbidden. Any metadata passed to fit is audit-only.

  • Native/default operator bank: fixed strict-linear matrices only. Compact has 12 operators; wide has 31 operators, including Gaussian, FCK and Whittaker variants. Stateful scatter operators such as SNV, MSC, EMSC, OSC, or EPO are not included by default.

Python Function

n4m.aom_operator_pls_stack(
    X,
    y,
    profile="compact",
    cv=5,
    fold_ids=None,
    components=(2, 4, 8),
    alphas=(1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0),
    std_penalty=0.0,
    gap_penalty=0.0,
    scale_x=False,
)

The selection criterion is:

mean_oof_rmse
  + std_penalty * std_oof_rmse
  + gap_penalty * max(0, mean_oof_rmse - mean_train_rmse)
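
As an illustration, the criterion can be recomputed from the mean_oof_rmse, std_oof_rmse and mean_train_rmse columns of candidate_scores. The sketch below uses a made-up score table (the numbers are illustrative, not library output), and the helper name selection_criterion is hypothetical. It also assumes that the spec with the smallest criterion is the one selected.

```python
import numpy as np

def selection_criterion(mean_oof, std_oof, mean_train, std_penalty=0.0, gap_penalty=0.0):
    """Penalized OOF RMSE: mean + std penalty + overfitting-gap penalty."""
    mean_oof = np.asarray(mean_oof, dtype=float)
    std_oof = np.asarray(std_oof, dtype=float)
    mean_train = np.asarray(mean_train, dtype=float)
    # Only a positive OOF-minus-train gap is penalized.
    gap = np.maximum(0.0, mean_oof - mean_train)
    return mean_oof + std_penalty * std_oof + gap_penalty * gap

# Hypothetical per-spec diagnostics (illustrative numbers only).
mean_oof = [0.50, 0.40, 0.38]
std_oof = [0.02, 0.05, 0.20]
mean_train = [0.48, 0.30, 0.10]

crit = selection_criterion(mean_oof, std_oof, mean_train,
                           std_penalty=0.05, gap_penalty=0.25)
best = int(np.argmin(crit))  # ASSUMPTION: the lowest criterion wins

# With both penalties at zero the criterion reduces to mean_oof_rmse.
best_unpenalized = int(np.argmin(selection_criterion(mean_oof, std_oof, mean_train)))
```

In this toy table the third spec has the lowest raw OOF RMSE, but its large train/OOF gap and fold variance push the penalized choice to the second spec.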

If fold_ids is omitted, contiguous balanced folds are generated from cv. For reproducible campaigns, pass explicit train-only fold ids.
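
A minimal sketch of contiguous balanced fold ids, for building an explicit fold_ids vector. The helper name is hypothetical, and placing the larger blocks first is an assumption; the native generator may distribute the remainder differently.

```python
import numpy as np

def contiguous_balanced_fold_ids(n_samples, cv):
    """Assign samples to cv contiguous blocks whose sizes differ by at most one."""
    if cv < 2 or cv > n_samples:
        raise ValueError("cv must satisfy 2 <= cv <= n_samples")
    # np.array_split yields near-equal contiguous chunks; larger chunks come first.
    chunks = np.array_split(np.arange(n_samples), cv)
    fold_ids = np.empty(n_samples, dtype=np.int64)
    for fold, idx in enumerate(chunks):
        fold_ids[idx] = fold
    return fold_ids

fold_ids = contiguous_balanced_fold_ids(10, 4)
```

Passing a vector like this through fold_ids pins the split, so repeated runs compare candidates on identical folds.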

Native Example

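import n4m

# X_train: (n_samples, n_features) spectra; y_train: single-target response.
# fold_ids: explicit train-only integer fold ids, one per sample.
res = n4m.aom_operator_pls_stack(
    X_train,
    y_train,
    profile="compact",
    cv=5,
    fold_ids=fold_ids,
    components=[1, 2, 4],
    alphas=[1e-3, 1e-1, 10.0],
    std_penalty=0.05,
    gap_penalty=0.25,
    scale_x=False,
)

# Selected grid point and the full per-spec score table.
print(res["selected_components"], res["selected_alpha"])
print(res["candidate_scores"])

# Replay the selected stack directly in input space; the result has shape (n_samples, 1).
y_hat = X_train @ res["input_coefficients"] + res["input_intercept"]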

Native Outputs

Double matrices:

  • candidate_scores (n_specs, 7): spec_id, n_components, alpha, mean_oof_rmse, std_oof_rmse, mean_train_rmse, criterion

  • fold_scores (n_specs, cv)

  • oof_predictions (n_samples, 1)

  • predictions (n_samples, 1)

  • stack_features (n_samples, n_operator_features)

  • coefficients (n_operator_features, 1): final Ridge head on stack_features

  • intercept (1, 1): final Ridge head intercept on stack_features

  • input_coefficients (n_features, 1): selected stack folded into the original input feature space

  • input_intercept (1, 1): folded input-space intercept

Int vectors:

  • fold_ids

  • operator_feature_offsets

Scalars include selected_spec_id, selected_components, selected_alpha, selected_oof_rmse, selected_train_rmse, selected_criterion, n_operator_features, n_specs, n_operators, profile, cv, n_samples, n_features and n_targets.
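
The sketch below shows one way operator_feature_offsets could be used to split stack_features into per-operator score blocks. The layout is an assumption: the offsets are taken to be cumulative column starts with a trailing total (length n_operators + 1). Check the actual vector length before relying on this. The arrays are synthetic stand-ins, not library output, and split_by_operator is a hypothetical helper.

```python
import numpy as np

# Synthetic stand-ins: 3 operators with 2 scores each (not library output).
stack_features = np.arange(5 * 6, dtype=float).reshape(5, 6)
# ASSUMPTION: cumulative column starts followed by the total feature count.
operator_feature_offsets = np.array([0, 2, 4, 6])

def split_by_operator(features, offsets):
    """Return one score block per operator, sliced by consecutive offsets."""
    return [features[:, a:b] for a, b in zip(offsets[:-1], offsets[1:])]

blocks = split_by_operator(stack_features, operator_feature_offsets)
```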

The direct replay contract is:

pred = X_new @ res["input_coefficients"] + res["input_intercept"]
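
To illustrate why a single coefficient vector and intercept are enough for replay, the sketch below folds a toy linear stack (fixed operator, centered score projection, linear head) back into input space with NumPy. All matrices are random stand-ins rather than fitted PLS or Ridge parameters, and the centering convention is an assumption; the native implementation may organize the fold differently.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 20, 6
X = rng.normal(size=(n_samples, n_features))

# Two hypothetical fixed linear operator views, applied as X @ A.
A_list = [np.eye(n_features), np.diff(np.eye(n_features), axis=0).T]
# Stand-ins for per-view score projectors (2 scores per view).
R_list = [rng.normal(size=(A.shape[1], 2)) for A in A_list]
# Per-view centering of the operator outputs.
mu_list = [(X @ A).mean(axis=0) for A in A_list]

# Stacked score matrix and a stand-in linear head on top of it.
T = np.hstack([(X @ A - mu) @ R for A, mu, R in zip(A_list, mu_list, R_list)])
beta = rng.normal(size=(T.shape[1], 1))
b0 = 0.7
pred_stack = T @ beta + b0

# Fold operator, projector and head into input-space coefficients.
input_coef = np.zeros((n_features, 1))
input_intercept = b0
start = 0
for A, mu, R in zip(A_list, mu_list, R_list):
    stop = start + R.shape[1]
    w = R @ beta[start:stop]            # head weights mapped to operator outputs
    input_coef += A @ w                 # mapped further to input features
    input_intercept -= (mu @ w).item()  # centering moves into the intercept
    start = stop

pred_input = X @ input_coef + input_intercept
```

Both prediction paths agree to floating-point precision, which is the property the replay contract relies on.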

Python Estimator

NativeAOMOperatorPLSStackRegressor is the sklearn-style wrapper over the native compact/wide ABI. It stores the selected stack diagnostics and predicts from input_coefficients/input_intercept on new spectra.

AOMOperatorPLSStack remains the flexible Python/sklearn reference layer for custom operator matrices and baseline gates.

Constructor

AOMOperatorPLSStack(
    operator_bank=None,
    components=(2, 4, 8),
    alphas=(1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0),
    cv=5,
    cv_mode="shuffled",
    std_penalty=0.0,
    gap_penalty=0.0,
    baseline_estimator=None,
    min_relative_oof_gain=0.0,
    random_state=2026,
    drop_failed_specs=True,
)

operator_bank may be a mapping from name to a fixed matrix, a transformer, or a callable taking n_features and returning a matrix. Matrices may be shaped (n_features, n_outputs) or (n_outputs, n_features).
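
A minimal sketch of building fixed operator matrices in the (n_features, n_outputs) orientation with NumPy. The helper names and the specific operators are illustrative choices, not part of n4m. A mapping like bank below is the kind of name-to-matrix dictionary that could be passed as operator_bank.

```python
import numpy as np

def first_difference_operator(n_features):
    """(n_features, n_features - 1) matrix D with X @ D == np.diff(X, axis=1)."""
    return np.diff(np.eye(n_features), axis=0).T

def moving_average_operator(n_features, window=3):
    """(n_features, n_features - window + 1) matrix averaging adjacent channels."""
    n_out = n_features - window + 1
    M = np.zeros((n_features, n_out))
    for j in range(n_out):
        M[j:j + window, j] = 1.0 / window
    return M

n_features = 8
bank = {
    "raw": np.eye(n_features),
    "diff1": first_difference_operator(n_features),
    "ma3": moving_average_operator(n_features, window=3),
}

# Each view is a fixed linear map of the spectra: no statistics are learned from X.
X_demo = np.random.default_rng(1).normal(size=(4, n_features))
views = {name: X_demo @ A for name, A in bank.items()}
```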

When operator_bank=None, the estimator builds a fixed strict-linear bank from raw, detrend, finite differences, Gaussian smoothing, Savitzky-Golay variants, and Norris-Williams variants that are valid for the current feature count.

If baseline_estimator is provided, the operator stack is admitted only when its train-only OOF RMSE improves on the baseline's OOF RMSE by at least min_relative_oof_gain; otherwise accepted_operator_stack_ is False.
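
The gate can be pictured as a relative-gain check. The sketch below assumes the gain is measured as (baseline - stack) / baseline; the exact formula used by the estimator is not stated here, and admit_operator_stack is a hypothetical helper.

```python
def admit_operator_stack(stack_oof_rmse, baseline_oof_rmse, min_relative_oof_gain=0.0):
    """Return True when the stack's relative OOF RMSE gain meets the threshold."""
    if baseline_oof_rmse <= 0.0:
        raise ValueError("baseline_oof_rmse must be positive")
    # ASSUMPTION: gain is relative to the baseline's OOF RMSE.
    relative_gain = (baseline_oof_rmse - stack_oof_rmse) / baseline_oof_rmse
    return relative_gain >= min_relative_oof_gain

# About 10% better than the baseline: admitted under a 5% threshold.
accepted = admit_operator_stack(0.45, 0.50, min_relative_oof_gain=0.05)
# About 2% better than the baseline: rejected under the same threshold.
rejected = admit_operator_stack(0.49, 0.50, min_relative_oof_gain=0.05)
```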

Example

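import n4m

# operator_bank is left at None, so the default strict-linear bank is built.
model = n4m.AOMOperatorPLSStack(
    components=(1, 2, 4, 8),
    alphas=(1e-4, 1e-2, 1.0, 100.0),
    cv=5,
    cv_mode="both",
    std_penalty=0.05,
    gap_penalty=0.25,
)
# Fit on training data only; the CV diagnostics are train-only.
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# JSON-serializable diagnostic report.
print(model.stack_report_)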

Outputs

After fit, the Python estimator exposes:

  • selected_spec_: AOMOperatorPLSSpec(n_components, alpha).

  • accepted_operator_stack_: False when the OOF baseline gate rejects the operator stack.

  • operator_names_: fixed operator views used by the stack.

  • n_operator_features_: concatenated score feature count.

  • cv_scores_: per-spec train-only CV diagnostics.

  • stack_report_: JSON-serializable diagnostic report.

Validation

The native tests cover:

  • compact result shapes and selected-spec criterion;

  • explicit fold ids;

  • final prediction reconstruction from stack_features, coefficients and intercept;

  • final prediction reconstruction from input_coefficients and input_intercept;

  • operator feature offsets;

  • rejection of multi-output Y;

  • execution under both the CPU and CUDA-enabled C++ test suites.

The Python estimator tests cover:

  • custom fixed operators with finite predictions and score transforms;

  • false-positive rejection by the OOF baseline gate;

  • metadata perturbation invariance;

  • default strict-linear operator bank smoke behavior;

  • error behavior when predict is called before fit.

2026-06-04 validation:

  • CPU n4m_tests: 328 passed, 0 failed.

  • CUDA-enabled n4m_tests with CUDA_VISIBLE_DEVICES=0: 328 passed, 0 failed.

  • Targeted wrapper pytest through N4M_LIB_PATH: 13 passed.

  • Full Python binding pytest against packaged ABI 1.18.0: 254 passed, 4 existing UVE warnings.

  • Catalog/ABI gates: 196 methods, 558/558 method symbols attributed, 123 infra symbols, split method files up to date.

  • Import smoke without N4M_LIB_PATH: ABI (1, 18, 0) and n4m.aom_operator_pls_stack exported.

Benchmarks

Timing script:

PYTHONPATH=bindings/python/src \
N4M_LIB_PATH=build/dev-release/cpp/src/libn4m.so \
python3 benchmarks/cross_binding/bench_aom_operator_pls_stack_timing.py

The CPU timing smoke uses the compact profile with cv=4, components [1, 2], and alphas [0.01, 1.0]. The generated CSV records the current ABI, library path, elapsed medians, replay error, and fit-count telemetry.

CUDA-build smoke:

CUDA_VISIBLE_DEVICES=0 \
PYTHONPATH=bindings/python/src \
N4M_LIB_PATH=build/cuda-on/cpp/src/libn4m.so \
python3 benchmarks/cross_binding/bench_aom_operator_pls_stack_timing.py \
  --output benchmarks/cross_binding/aom_operator_pls_stack_timing_cuda_smoke.csv \
  --mode both

The CUDA smoke CSV includes both the ABI-close function row native_aom_operator_pls_stack and the sklearn replay wrapper row native_aom_operator_pls_stack_sklearn. The wrapper row records prediction_replay_max_abs_error to verify that predict(X) replays the native folded input-space stack on the CUDA build path.

The timing rows also expose deterministic cost accounting for the internal fit_stack calls. For the compact smoke grid (cv=4, n_specs=4, n_operators=12), n_operator_pls_stack_fit_calls=21, n_operator_pls_stack_pls_fit_calls=252, and n_operator_pls_stack_ridge_fit_calls=21. The CV/final breakdown is also recorded as n_pls_stack_cv_fits=240, n_pls_stack_final_fits=12, n_ridge_stack_cv_fits=20, and n_ridge_stack_final_fits=1.
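
The reported counters are mutually consistent if each stack fit performs one PLS fit per operator and one Ridge fit. The short check below reproduces the totals from the CV/final breakdown under that reading, which is an inference from the numbers above rather than a documented rule.

```python
n_operators = 12
cv_stack_fits = 20    # n_ridge_stack_cv_fits
final_stack_fits = 1  # n_ridge_stack_final_fits

# One Ridge head per stack fit.
stack_fit_calls = cv_stack_fits + final_stack_fits

# One PLS1 projector per operator per stack fit (inferred).
pls_cv_fits = cv_stack_fits * n_operators
pls_final_fits = final_stack_fits * n_operators
pls_fit_calls = pls_cv_fits + pls_final_fits
```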