aom_ridge_blender - native AOM Ridge OOF simplex blender¶
Group: Diagnostic / AOM · ABI: n4m_aom_ridge_blender_fit
aom_ridge_blender runs a native strict-linear AOM Ridge candidate pool,
builds out-of-fold predictions for every chain/lambda candidate, solves a
non-negative simplex blend, then refits every candidate on the full training
set for final blended predictions. Because every native candidate is a
strict-linear chain plus Ridge head, the final simplex blend is also exported
as a reusable linear model in the original input feature space.
The sklearn-style AOMRidgeBlender reference estimator is still available as
n4m.AOMRidgeBlender and n4m.sklearn.AOMRidgeBlender for explicit custom
candidate estimators.
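The pipeline described above can be sketched in plain NumPy. This is an illustration under stated assumptions, not the native implementation: the candidates are plain Ridge heads on the raw features (no AOM chains), the simplex solve uses projected gradient descent, and the way regularizer enters the objective (an L2 penalty on the weights) is a guess. All helper names are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form Ridge with an unpenalized intercept; returns (coef, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean(axis=0)
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef

def project_simplex(v):
    """Euclidean projection of a 1-D vector onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def simplex_blend(P, y, regularizer=0.01, n_iter=2000):
    """Minimize mean((P @ w - y)**2) + regularizer * ||w||^2 over the simplex.

    ASSUMPTION: the native objective may use `regularizer` differently.
    """
    n, k = P.shape
    H = P.T @ P / n + regularizer * np.eye(k)
    g = P.T @ y / n
    step = 1.0 / np.linalg.eigvalsh(H)[-1]  # 1 / largest eigenvalue of H
    w = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        w = project_simplex(w - step * (H @ w - g))
    return w

def oof_simplex_blend(X, y, fold_ids, lambdas, regularizer=0.01):
    """OOF predictions per candidate, simplex weights, then a folded linear model."""
    oof = np.empty((X.shape[0], len(lambdas)))
    for j, lam in enumerate(lambdas):
        for f in np.unique(fold_ids):
            val = fold_ids == f
            # The validation fold is excluded from the candidate fit.
            coef, b = ridge_fit(X[~val], y[~val], lam)
            oof[val, j] = X[val] @ coef + b
    w = simplex_blend(oof, y, regularizer)
    # Refit every candidate on the full training set and fold the blend
    # into a single coefficient vector and intercept.
    fits = [ridge_fit(X, y, lam) for lam in lambdas]
    coef = sum(wk * c for wk, (c, _) in zip(w, fits))
    intercept = sum(wk * b for wk, (_, b) in zip(w, fits))
    return w, coef, intercept, oof

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
y = X @ rng.normal(size=8) + 0.1 * rng.normal(size=60)
fold_ids = np.arange(60) % 5
w, coef, intercept, oof = oof_simplex_blend(X, y, fold_ids, [1e-4, 1e-2, 1.0, 100.0])
y_hat = X @ coef + intercept  # blended in-sample predictions
```

Because every candidate is linear in X, the weighted sum of candidate predictions equals the prediction of the weighted sum of coefficients, which is why the blend can be exported as one linear model.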
Status¶
API surface: C ABI, Python function n4m.aom_ridge_blender, and native sklearn wrapper NativeAOMRidgeBlenderRegressor.
Native ABI: yes, since ABI 1.17.0.
Catalog status: aom_pop.ridge_blender.
CPU: tested.
CUDA: builds and smoke-tests with the CUDA-enabled library; native v1 is not a fused batched GPU blender.
Candidate scope: fixed compact/wide strict-linear AOM chain banks and strictly positive Ridge lambdas. Compact has 12 chains; wide has 31 chains, including Gaussian, FCK and Whittaker variants.
Dataset/source routing: forbidden.
metadata passed to fit is audit-only for the Python estimator and does not affect candidates, weights, or predictions.
Python Function¶
n4m.aom_ridge_blender(
    X,
    y,
    profile="compact",
    cv=5,
    fold_ids=None,
    ridge_lambdas=(1e-4, 1e-2, 1.0, 100.0),
    regularizer=0.01,
    scale_x=False,
)
Example¶
import n4m
res = n4m.aom_ridge_blender(
    X_train,
    y_train,
    profile="compact",
    cv=5,
    fold_ids=fold_ids,
    ridge_lambdas=[1e-4, 1e-2, 1.0, 100.0],
    regularizer=0.01,
    scale_x=False,
)
print(res["blend_oof_rmse"], res["selected_chain_id"], res["selected_param"])
print(res["weights"])
y_hat = X_train @ res["input_coefficients"] + res["intercept"]
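The example assumes that X_train, y_train and fold_ids already exist. A minimal sketch of building synthetic stand-ins with NumPy follows. The shapes, the 1-D target, and the fold-id encoding (one integer in 0..cv-1 per sample) are assumptions; the fold_ids format accepted by n4m should be checked against the bindings.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples, n_features, cv = 120, 200, 5

# Smooth synthetic "spectra": cumulative sums of noise along the feature axis.
X_train = np.cumsum(rng.normal(size=(n_samples, n_features)), axis=1)

# Single target as a noisy linear function of the spectra.
beta = rng.normal(size=n_features) / n_features
y_train = X_train @ beta + 0.05 * rng.normal(size=n_samples)

# Shuffled, balanced fold assignment: every sample gets one id in 0..cv-1.
# ASSUMPTION: this encoding is illustrative, not a documented n4m contract.
fold_ids = rng.permutation(np.arange(n_samples) % cv)
```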
Outputs¶
The native method returns a MethodResult dictionary with:
candidate_scores (n_candidates, 5): candidate_id, chain_id, lambda, cv_rmse, weight
weights (1, n_candidates): non-negative simplex weights
oof_predictions (n_samples, n_targets): blended OOF predictions
predictions (n_samples, n_targets): blended final in-sample predictions
input_coefficients (n_features, n_targets): final blended coefficients folded back into the original input feature space
intercept (1, n_targets): final blended intercept
oof_candidate_predictions (n_samples * n_targets, n_candidates)
candidate_predictions (n_samples * n_targets, n_candidates)
fold_ids
Scalar diagnostics include selected_candidate_id, selected_chain_id,
selected_param, selected_cv_rmse, blend_oof_rmse, regularizer,
n_candidates, n_chains, profile, cv, n_samples, n_features and
n_targets.
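The shapes listed above suggest a simple consistency check on a result dictionary: the weights should lie on the simplex, and each blended prediction matrix should equal the matching candidate matrix multiplied by the weights. The sketch below is hypothetical helper code, not part of n4m. How the (n_samples, n_targets) predictions are flattened into n_samples * n_targets rows is not stated here, so the check tries both row-major and column-major order. The demo dictionary is synthetic and only mimics the documented shapes.

```python
import numpy as np

def check_blend_outputs(res, atol=1e-8):
    """Return the max abs reconstruction error per blended output (hypothetical helper)."""
    w = np.asarray(res["weights"], dtype=float).ravel()
    if np.any(w < -atol) or abs(w.sum() - 1.0) > atol:
        raise ValueError("weights are not on the simplex")

    report = {}
    pairs = [
        ("oof_predictions", "oof_candidate_predictions"),
        ("predictions", "candidate_predictions"),
    ]
    for blended_key, cand_key in pairs:
        blended = np.asarray(res[blended_key], dtype=float)
        flat = np.asarray(res[cand_key], dtype=float) @ w
        # ASSUMPTION: the flattening order is unknown, so accept either one.
        errs = [np.max(np.abs(flat - blended.ravel(order=o))) for o in ("C", "F")]
        report[blended_key] = float(min(errs))
    return report

# Synthetic stand-in with the documented shapes (6 samples, 2 targets, 3 candidates).
rng = np.random.default_rng(1)
cand = rng.normal(size=(6 * 2, 3))
weights = np.array([[0.5, 0.0, 0.5]])
demo = {
    "weights": weights,
    "oof_candidate_predictions": cand,
    "candidate_predictions": cand,
    "oof_predictions": (cand @ weights.ravel()).reshape(6, 2),
    "predictions": (cand @ weights.ravel()).reshape(6, 2),
}
report = check_blend_outputs(demo)
```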
The selected blend can be replayed on compatible spectra as:
pred = X_new @ res["input_coefficients"] + res["intercept"]
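A small wrapper around that expression can guard against shape mismatches before replaying. This is a hypothetical helper, not an n4m function, and it only checks the feature count; whether new spectra need any further preprocessing to count as compatible is not covered here.

```python
import numpy as np

def replay_blend(X_new, input_coefficients, intercept):
    """Apply folded blend coefficients to new rows (hypothetical helper).

    X_new: (n_samples, n_features)
    input_coefficients: (n_features, n_targets)
    intercept: (1, n_targets)
    """
    X_new = np.asarray(X_new, dtype=float)
    coef = np.asarray(input_coefficients, dtype=float)
    b = np.asarray(intercept, dtype=float).reshape(1, -1)
    if X_new.ndim != 2 or coef.ndim != 2:
        raise ValueError("X_new and input_coefficients must be 2-D")
    if X_new.shape[1] != coef.shape[0]:
        raise ValueError(
            f"feature mismatch: X_new has {X_new.shape[1]}, "
            f"coefficients expect {coef.shape[0]}"
        )
    if b.shape[1] != coef.shape[1]:
        raise ValueError("intercept and coefficients disagree on n_targets")
    return X_new @ coef + b

# Synthetic usage: 4 samples, 3 features, 2 targets.
X_demo = np.arange(12, dtype=float).reshape(4, 3)
coef_demo = np.ones((3, 2))
pred_demo = replay_blend(X_demo, coef_demo, np.array([[0.5, -0.5]]))
```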
Python Estimator¶
NativeAOMRidgeBlenderRegressor is the sklearn-style wrapper over the native
compact/wide ABI. It stores coef_, intercept_, weights_,
candidate_scores_, OOF predictions, and diagnostics while predicting from
the folded native coefficients.
AOMRidgeBlender remains the flexible Python/sklearn reference layer. It can
blend explicit estimator/factory candidates or build a default AOM-Ridge pool
from build_aom_control_chain_bank(profile). It supports custom chains,
candidate labels, random_state, n_jobs, and candidate failure reporting.
Use the native function for the catalogued compact/wide strict-linear Ridge bank. Use the estimator when the candidates are Python objects or when a scikit-learn-compatible estimator API is needed.
Validation¶
The native tests cover:
compact and wide chain/lambda candidate counts and result shapes;
simplex non-negativity and unit-sum weights;
exact reconstruction of blended OOF/final predictions from candidate prediction matrices;
exact replay of final predictions from input_coefficients and intercept;
explicit fold ids;
rejection of non-positive Ridge lambdas;
CPU and CUDA-enabled C++ test suites.
The Python estimator tests cover:
simplex weights and exact OOF blend recovery on synthetic candidates;
fold-local OOF fitting, with the validation fold excluded from each candidate fit;
metadata perturbation invariance;
default chain+Ridge candidate construction;
predict-before-fit error behavior.
Benchmarks¶
Timing script:
PYTHONPATH=bindings/python/src \
N4M_LIB_PATH=build/dev-release/cpp/src/libn4m.so \
python3 benchmarks/cross_binding/bench_aom_ridge_blender_timing.py
CUDA-build smoke:
CUDA_VISIBLE_DEVICES=0 \
PYTHONPATH=bindings/python/src \
N4M_LIB_PATH=build/cuda-on/cpp/src/libn4m.so \
python3 benchmarks/cross_binding/bench_aom_ridge_blender_timing.py \
--output benchmarks/cross_binding/aom_ridge_blender_timing_cuda_smoke.csv \
--mode both
Current smoke CSVs:
benchmarks/cross_binding/aom_ridge_blender_timing.csv
benchmarks/cross_binding/aom_ridge_blender_timing_cuda_smoke.csv
The CUDA smoke CSV includes both the ABI-close function row
native_aom_ridge_blender and the sklearn replay wrapper row
native_aom_ridge_blender_sklearn. The wrapper row records
prediction_replay_max_abs_error to prove predict(X) replays the native
folded coefficients/intercept state on the CUDA build path.
The timing rows also expose deterministic Ridge fit accounting:
n_ridge_blender_cv_fits = n_candidates * cv,
n_ridge_blender_final_fits = n_candidates, and
n_ridge_blender_fit_calls = n_candidates * (cv + 1). For the compact smoke
grid (cv=5, 12 chains, 4 lambdas), those counters are 240, 48, and
288.
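The counter arithmetic above can be reproduced with a few lines of Python. The function name is hypothetical; it assumes the candidate pool is the full chain-by-lambda grid, which matches the compact figures quoted above.

```python
def ridge_fit_counts(n_chains, n_lambdas, cv):
    """Expected Ridge fit counters for a chain x lambda candidate grid."""
    n_candidates = n_chains * n_lambdas
    return {
        "n_candidates": n_candidates,
        "n_ridge_blender_cv_fits": n_candidates * cv,
        "n_ridge_blender_final_fits": n_candidates,
        "n_ridge_blender_fit_calls": n_candidates * (cv + 1),
    }

# Compact smoke grid: 12 chains, 4 lambdas, cv=5.
compact = ridge_fit_counts(12, 4, 5)
print(compact)
```

For the compact grid this yields 48 candidates, 240 CV fits, 48 final fits, and 288 fit calls in total.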
On the same synthetic cells, blend_oof_rmse is lower than the best single
candidate OOF RMSE, which indicates that the simplex blend combines several
candidates instead of collapsing onto the single best one in the smoke benchmark.
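The comparison can be read directly from a result dictionary. The sketch below is a hypothetical helper that assumes the column order listed under Outputs (candidate_id, chain_id, lambda, cv_rmse, weight) and treats cv_rmse as the per-candidate OOF RMSE. The scores in the demo are made up for illustration and are not benchmark results. Since each single candidate is a vertex of the simplex, an unregularized blend fitted on the OOF matrix cannot score worse than the best candidate on that same data, so a positive gain is an in-sample diagnostic rather than a held-out estimate.

```python
import numpy as np

def blend_gain(candidate_scores, blend_oof_rmse):
    """Compare the blend OOF RMSE with the best single candidate (hypothetical helper)."""
    scores = np.asarray(candidate_scores, dtype=float)
    best = scores[np.argmin(scores[:, 3])]  # column 3: cv_rmse (assumed order)
    return {
        "best_candidate_id": int(best[0]),
        "best_cv_rmse": float(best[3]),
        "gain": float(best[3] - blend_oof_rmse),
    }

# Made-up scores: candidate_id, chain_id, lambda, cv_rmse, weight.
demo_scores = [
    [0, 0, 1e-4, 0.31, 0.0],
    [1, 0, 1e-2, 0.27, 0.6],
    [2, 1, 1e-2, 0.29, 0.4],
]
summary = blend_gain(demo_scores, blend_oof_rmse=0.25)
```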