moment_stack — OOF stack over native moment heads

Group: Ensemble · Registry tolerance: n/a

Description

NativeMomentStackRegressor builds a small linear meta-model over out-of-fold (OOF) predictions from native moment-compatible base models:

  • Ridge (NativeRidgeRegressor)

  • PLS via NativeMomentSweepRegressor(heads=("pls", ...))

  • PCR (NativePCRRegressor)

  • Continuum regression (NativeContinuumRegressionRegressor)

  • ECR (NativeECRRegressor)

  • CPPLS (NativeCPPLSRegressor)

When producing OOF predictions, each base model is fit only on the training folds and then predicts the held-out fold. The final reusable estimator refits the same base models on all training rows, then applies the small Ridge meta-model to their predictions.
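The two-phase procedure above can be sketched in plain NumPy. This is a minimal illustration, not the library's implementation: the helper names (ridge_fit, ridge_predict, oof_stack, stack_predict) are hypothetical, and closed-form ridge models with different penalties stand in for the native bases.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge with an unpenalized intercept; returns (coef, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef

def ridge_predict(X, model):
    coef, intercept = model
    return X @ coef + intercept

def oof_stack(X, y, base_alphas, cv=5, meta_alpha=1e-6, seed=0):
    """Hypothetical stand-in: ridge bases with different penalties replace the native bases."""
    n = X.shape[0]
    fold_ids = np.random.default_rng(seed).permutation(n) % cv  # near-equal random folds
    oof = np.zeros((n, len(base_alphas)))
    for k in range(cv):
        held_out = fold_ids == k
        for j, alpha in enumerate(base_alphas):
            # Each base sees only the training folds of this split.
            model = ridge_fit(X[~held_out], y[~held_out], alpha)
            oof[held_out, j] = ridge_predict(X[held_out], model)
    # Meta-model: a lightly penalized ridge on the OOF prediction matrix.
    meta = ridge_fit(oof, y, meta_alpha)
    # Final reusable estimator: refit every base on all training rows.
    final_bases = [ridge_fit(X, y, alpha) for alpha in base_alphas]
    return final_bases, meta, oof

def stack_predict(X, final_bases, meta):
    base_preds = np.column_stack([ridge_predict(X, m) for m in final_bases])
    return ridge_predict(base_preds, meta)

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
y = X @ rng.normal(size=8) + 0.1 * rng.normal(size=60)
final_bases, meta, oof = oof_stack(X, y, base_alphas=(0.1, 10.0))
y_hat = stack_predict(X, final_bases, meta)
```

Because the meta-model only ever sees predictions made on rows the base models were not trained on, its weights are not biased toward bases that overfit the training data.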

The stack is intentionally restricted to a strict moment-model composition. It does not add RFF/RBF lifts, trees, neural models, TabPFN, transformed-spectrum stacking, or dataset-name routing.

Parameters

| Name | Type | Default | Notes |
| --- | --- | --- | --- |
| base_models | sequence[str] | ("ridge", "pls", "pcr", "continuum", "ecr", "cppls") | Allowed bases. continuum_regression aliases continuum. |
| cv | int | 5 | Outer OOF folds used for the meta-model. |
| fold_ids | array-like or None | None | Explicit train-only fold ids. |
| inner_cv | int | 3 | Inner CV for the PLS sweep base. |
| meta_alpha | float | 1e-6 | Ridge penalty for the meta-model. |
| ridge_alpha | float | 0.1 | Ridge base penalty. |
| n_components | int | 3 | Component count for PCR/continuum/ECR/CPPLS and max PLS grid when pls_components=None. |
| pls_components | sequence[int] or None | None | Explicit PLS component grid. |
| cppls_gamma | float | 0.5 | CPPLS gamma. |
| continuum_tau | float | 0.5 | Continuum tau. |
| ecr_alpha | float | 0.5 | ECR alpha. |
| center_x, scale_x, center_y, scale_y | bool or None | mixed | Forwarded to bases that expose those options. |
| cuda_pls_parallel_folds, cuda_pls_min_device_features, cuda_pls_many_batched | optional | None | Forwarded to the PLS sweep base. |
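One common reason to pass fold_ids instead of relying on cv is to keep related rows (for example, replicate measurements of one sample) in the same fold. The sketch below builds such an array with NumPy. It assumes fold_ids is one integer label per training row, which this page does not spell out; grouped_fold_ids is a hypothetical helper, not part of the library.

```python
import numpy as np

def grouped_fold_ids(groups, n_folds):
    """Assign one fold id per row so that rows sharing a group share a fold."""
    groups = np.asarray(groups)
    unique_groups, inverse = np.unique(groups, return_inverse=True)
    # Round-robin over the distinct groups, then broadcast back to rows.
    group_to_fold = np.arange(unique_groups.size) % n_folds
    return group_to_fold[inverse]

# Hypothetical sample labels: two replicate rows per sample.
sample_ids = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"]
fold_ids = grouped_fold_ids(sample_ids, n_folds=2)
```

Under that assumption, the resulting array would be passed as fold_ids=fold_ids. How cv interacts with an explicit fold_ids is not stated here, so check the wrapper tests before relying on either behavior.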

Usage

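import n4m
from n4m.sklearn import NativeMomentStackRegressor

# X_train, y_train, and X_test are assumed to be existing arrays.

# Functional entry point: returns a fitted stack model.
model = n4m.moment_stack(
    X_train,
    y_train,
    base_models=("ridge", "pcr", "pls"),
    cv=5,
    inner_cv=3,
    n_components=3,
    scale_x=False,
)

# Estimator-style construction with the same settings.
same_model_type = NativeMomentStackRegressor(
    base_models=("ridge", "pcr", "pls"),
    cv=5,
    inner_cv=3,
    n_components=3,
    scale_x=False,
).fit(X_train, y_train)

# Predict on new rows and collect the audit diagnostics.
y_pred = model.predict(X_test)
diagnostics = model.get_diagnostics()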

Key fitted attributes:

  • base_model_names_

  • base_models_

  • base_oof_predictions_

  • oof_predictions_

  • meta_coefficients_

  • intercept_

  • oof_rmse_

  • rmse_

  • base_oof_diagnostics_

  • base_final_diagnostics_
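As a rough guide to reading these attributes, the sketch below shows how a linear meta-model combines base predictions and how an RMSE is computed. It assumes meta_coefficients_ holds one weight per base model and that intercept_ is the meta-model's additive term. This follows from the linear Ridge meta-model described above but is not confirmed attribute by attribute. All numeric values are made up.

```python
import numpy as np

# Hypothetical values standing in for fitted attributes.
base_predictions = np.array([[1.0, 1.2, 0.9],
                             [2.0, 1.8, 2.1]])  # rows: samples, columns: base models
meta_coefficients = np.array([0.5, 0.3, 0.2])   # one weight per base model
intercept = 0.05

# A linear meta-model is a weighted sum of base predictions plus an intercept.
y_hat = base_predictions @ meta_coefficients + intercept

def rmse(y_true, y_pred):
    """Root-mean-square error between two 1-D arrays."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

y_true = np.array([1.0, 2.0])
error = rmse(y_true, y_hat)
```

If oof_rmse_ is computed from oof_predictions_ and rmse_ from in-sample predictions (an assumption), the OOF figure is the more honest estimate of generalization error.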

get_diagnostics() includes aggregate PLS route counters for both phases, for example n_base_oof_pls_moment_cuda_device_cv_fits and n_base_final_pls_moment_cuda_device_cv_fits. These counters are audit-only; they do not affect the OOF split, meta-model fit, or production selection.
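For auditing, it can help to view the OOF and final counters side by side. The helper below is a hypothetical sketch that assumes get_diagnostics() returns a dict-like mapping and relies only on the n_base_oof_ and n_base_final_ prefixes seen in the counter names above. The values and the oof_rmse key in the example are invented.

```python
def split_counters_by_phase(diagnostics):
    """Group n_base_oof_* and n_base_final_* counters under a shared route name."""
    phases = {"oof": "n_base_oof_", "final": "n_base_final_"}
    by_route = {}
    for key, value in diagnostics.items():
        for phase, prefix in phases.items():
            if key.startswith(prefix):
                # Strip the phase prefix so both phases land under one route key.
                by_route.setdefault(key[len(prefix):], {})[phase] = value
    return by_route

# Invented example mapping; real counter values come from get_diagnostics().
example = {
    "n_base_oof_pls_moment_cuda_device_cv_fits": 4,
    "n_base_final_pls_moment_cuda_device_cv_fits": 1,
    "oof_rmse": 0.12,  # hypothetical non-counter entry, ignored by the helper
}
routes = split_counters_by_phase(example)
```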

CUDA smoke test used for release readiness:

CUDA_VISIBLE_DEVICES=0 \
PYTHONPATH=bindings/python/src \
N4M_LIB_PATH=build/cuda-on/cpp/src/libn4m.so \
  /home/delete/.venv/bin/python benchmarks/cross_binding/bench_moment_stack_timing.py \
  --output benchmarks/cross_binding/moment_stack_timing_cuda_smoke.csv \
  --repeats 1 --shapes 80x1024 --base-models pls --cv 4 --inner-cv 4 \
  --n-components 1 --cuda-pls-min-device-features 1 \
  --cuda-pls-parallel-folds

The smoke run should report nonzero n_base_*_pls_moment_cuda_device_cv_fits counters for both the OOF and final phases, and zero for the corresponding host PLS CV fit counters.

Validation

Covered by bindings/python/tests/test_moment_model_wrappers.py and benchmarks/cross_binding/bench_moment_stack_timing.py.