# `aom_ridge_blender` - native AOM Ridge OOF simplex blender

_Group_: **Diagnostic / AOM** · _ABI_: `n4m_aom_ridge_blender_fit`

`aom_ridge_blender` runs a native strict-linear AOM Ridge candidate pool, builds out-of-fold predictions for every chain/lambda candidate, solves a non-negative simplex blend, then refits every candidate on the full training set for final blended predictions. Because every native candidate is a strict-linear chain plus Ridge head, the final simplex blend is also exported as a reusable linear model in the original input feature space.

The sklearn-style `AOMRidgeBlender` reference estimator is still available as `n4m.AOMRidgeBlender` and `n4m.sklearn.AOMRidgeBlender` for explicit custom candidate estimators.

## Status

- API surface: C ABI, Python function `n4m.aom_ridge_blender`, and native sklearn wrapper `NativeAOMRidgeBlenderRegressor`.
- Native ABI: yes, since ABI `1.17.0`.
- Catalog status: `aom_pop.ridge_blender`.
- CPU: tested.
- CUDA: builds and smoke-tests with the CUDA-enabled library; native v1 is not a fused batched GPU blender.
- Candidate scope: fixed compact/wide strict-linear AOM chain banks and strictly positive Ridge lambdas. Compact has 12 chains; wide has 31 chains, including Gaussian, FCK and Whittaker variants.
- Dataset/source routing: forbidden. `metadata` passed to `fit` is audit-only for the Python estimator and does not affect candidates, weights, or predictions.
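
The non-negative simplex blend mentioned in the overview can be illustrated with a small dependency-free sketch. It is not the native solver: it assumes a plain least-squares objective on a synthetic OOF candidate matrix and uses projected gradient descent, and it does not model how `regularizer` enters the native objective. The helper names (`project_to_simplex`, `simplex_blend`, `blend`, `rmse`) are hypothetical and exist only for this illustration.

```python
import math
import random


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) == 1}."""
    u = sorted(v, reverse=True)
    running, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        running += u_j
        t = (running - 1.0) / j
        if u_j - t > 0.0:
            theta = t  # the last j that passes defines the shift
    return [max(x - theta, 0.0) for x in v]


def rmse(pred, y):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y))


def blend(P, w):
    """Row-wise weighted sum of candidate predictions."""
    return [sum(p_k * w_k for p_k, w_k in zip(row, w)) for row in P]


def simplex_blend(P, y, n_iter=500):
    """Projected gradient descent on 0.5 * ||P w - y||^2 over the simplex."""
    n_candidates = len(P[0])
    columns = list(zip(*P))
    single_rmse = [rmse(col, y) for col in columns]
    # Start at the best single candidate; with a step of at most 1/L the
    # objective never increases, so the blend cannot end up worse than it.
    best = min(range(n_candidates), key=single_rmse.__getitem__)
    w = [1.0 if k == best else 0.0 for k in range(n_candidates)]
    # The squared Frobenius norm upper-bounds the largest eigenvalue of P^T P.
    step = 1.0 / sum(p * p for row in P for p in row)
    for _ in range(n_iter):
        resid = [b - t for b, t in zip(blend(P, w), y)]
        grad = [sum(c * r for c, r in zip(col, resid)) for col in columns]
        w = project_to_simplex([w_k - step * g for w_k, g in zip(w, grad)])
    return w, single_rmse


# Synthetic OOF matrix: three noisy candidates for the same target.
rng = random.Random(0)
y = [rng.gauss(0.0, 1.0) for _ in range(200)]
P = [[t + rng.gauss(0.0, s) for s in (0.3, 0.5, 0.8)] for t in y]

weights, single_rmse = simplex_blend(P, y)
blend_rmse = rmse(blend(P, weights), y)
print(weights, blend_rmse, min(single_rmse))
```

Because every vertex of the simplex is a single candidate, a converged simplex blend can only match or improve on the best single candidate for the OOF objective it was fitted on.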
## Python Function

```python
n4m.aom_ridge_blender(
    X,
    y,
    profile="compact",
    cv=5,
    fold_ids=None,
    ridge_lambdas=(1e-4, 1e-2, 1.0, 100.0),
    regularizer=0.01,
    scale_x=False,
)
```

## Example

```python
import n4m

res = n4m.aom_ridge_blender(
    X_train,
    y_train,
    profile="compact",
    cv=5,
    fold_ids=fold_ids,
    ridge_lambdas=[1e-4, 1e-2, 1.0, 100.0],
    regularizer=0.01,
    scale_x=False,
)

print(res["blend_oof_rmse"], res["selected_chain_id"], res["selected_param"])
print(res["weights"])
y_hat = X_train @ res["input_coefficients"] + res["intercept"]
```

## Outputs

The native method returns a `MethodResult` dictionary with:

- `candidate_scores` `(n_candidates, 5)`: `candidate_id`, `chain_id`, `lambda`, `cv_rmse`, `weight`
- `weights` `(1, n_candidates)`: non-negative simplex weights
- `oof_predictions` `(n_samples, n_targets)`: blended OOF predictions
- `predictions` `(n_samples, n_targets)`: blended final in-sample predictions
- `input_coefficients` `(n_features, n_targets)`: final blended coefficients folded back into the original input feature space
- `intercept` `(1, n_targets)`: final blended intercept
- `oof_candidate_predictions` `(n_samples * n_targets, n_candidates)`
- `candidate_predictions` `(n_samples * n_targets, n_candidates)`
- `fold_ids`

Scalar diagnostics include `selected_candidate_id`, `selected_chain_id`, `selected_param`, `selected_cv_rmse`, `blend_oof_rmse`, `regularizer`, `n_candidates`, `n_chains`, `profile`, `cv`, `n_samples`, `n_features` and `n_targets`.

The selected blend can be replayed on compatible spectra as:

```python
pred = X_new @ res["input_coefficients"] + res["intercept"]
```

## Python Estimator

`NativeAOMRidgeBlenderRegressor` is the sklearn-style wrapper over the native compact/wide ABI. It stores `coef_`, `intercept_`, `weights_`, `candidate_scores_`, OOF predictions, and diagnostics while predicting from the folded native coefficients.

`AOMRidgeBlender` remains the flexible Python/sklearn reference layer.
It can blend explicit estimator/factory candidates or build a default AOM-Ridge pool from `build_aom_control_chain_bank(profile)`. It supports custom `chains`, candidate labels, `random_state`, `n_jobs`, and candidate failure reporting.

Use the native function for the catalogued compact/wide strict-linear Ridge bank. Use the estimator when the candidates are Python objects or when a scikit-learn-compatible estimator API is needed.

## Validation

The native tests cover:

- compact and wide chain/lambda candidate counts and result shapes;
- simplex non-negativity and unit-sum weights;
- exact reconstruction of blended OOF/final predictions from candidate prediction matrices;
- exact replay of final predictions from `input_coefficients` and `intercept`;
- explicit fold ids;
- rejection of non-positive Ridge lambdas;
- CPU and CUDA-enabled C++ test suites.

The Python estimator tests cover:

- simplex weights and exact OOF blend recovery on synthetic candidates;
- fold-local OOF fitting, with the validation fold excluded from each candidate fit;
- metadata perturbation invariance;
- default chain+Ridge candidate construction;
- `predict` before `fit` error behavior.

## Benchmarks

Timing script:

```bash
PYTHONPATH=bindings/python/src \
N4M_LIB_PATH=build/dev-release/cpp/src/libn4m.so \
python3 benchmarks/cross_binding/bench_aom_ridge_blender_timing.py
```

CUDA-build smoke:

```bash
CUDA_VISIBLE_DEVICES=0 \
PYTHONPATH=bindings/python/src \
N4M_LIB_PATH=build/cuda-on/cpp/src/libn4m.so \
python3 benchmarks/cross_binding/bench_aom_ridge_blender_timing.py \
  --output benchmarks/cross_binding/aom_ridge_blender_timing_cuda_smoke.csv \
  --mode both
```

Current smoke CSVs:

- `benchmarks/cross_binding/aom_ridge_blender_timing.csv`
- `benchmarks/cross_binding/aom_ridge_blender_timing_cuda_smoke.csv`

The CUDA smoke CSV includes both the ABI-close function row `native_aom_ridge_blender` and the sklearn replay wrapper row `native_aom_ridge_blender_sklearn`.
The wrapper row records `prediction_replay_max_abs_error` to verify that `predict(X)` replays the native folded coefficients/intercept state on the CUDA build path.

The timing rows also expose deterministic Ridge fit accounting: `n_ridge_blender_cv_fits = n_candidates * cv`, `n_ridge_blender_final_fits = n_candidates`, and `n_ridge_blender_fit_calls = n_candidates * (cv + 1)`. For the compact smoke grid (`cv=5`, 12 chains, 4 lambdas, so 48 candidates), those counters are `240`, `48`, and `288`.

On the same synthetic cells, `blend_oof_rmse` is lower than the best single-candidate OOF RMSE, which indicates that the stored simplex weights and candidate OOF matrix yield a real gain over single-candidate selection in the smoke benchmark.
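
The Ridge fit counters quoted above can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes the candidate pool is the full chain x lambda grid, which matches the compact smoke numbers; `ridge_fit_counts` is a hypothetical helper, not part of `n4m`.

```python
def ridge_fit_counts(n_chains, n_lambdas, cv):
    """Expected Ridge fit counters for a full chain x lambda candidate grid."""
    n_candidates = n_chains * n_lambdas  # assumption: one candidate per (chain, lambda)
    cv_fits = n_candidates * cv          # one fit per candidate per fold
    final_fits = n_candidates            # one full-data refit per candidate
    return {
        "n_candidates": n_candidates,
        "n_ridge_blender_cv_fits": cv_fits,
        "n_ridge_blender_final_fits": final_fits,
        "n_ridge_blender_fit_calls": cv_fits + final_fits,
    }


# Compact smoke grid from the text: cv=5, 12 chains, 4 lambdas.
compact = ridge_fit_counts(n_chains=12, n_lambdas=4, cv=5)
print(compact)
```

Comparing these expected values against the counters in a timing CSV row is a cheap way to spot a changed chain bank or lambda grid.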