cars_select — CARS — Competitive Adaptive Reweighted Sampling¶
Group: Variable selector · Registry tolerance: 0.0
Description¶
CARS: competitive adaptive reweighted sampling.
From the pls4all.sklearn.CARSSelector docstring:
Competitive Adaptive Reweighted Sampling (Li 2009).
Registry note: the default path routes through R enpls::enpls.fs(method='mc') (Monte-Carlo ensemble PLS + importance ranking), pinned to set.seed(11). Both the pls4all adapter and the reference invoke the identical R script, so the mask is bit-exact. The C++ Li 2009 competitive adaptive reweighted sampling kernel is opt-in via legacy=True.
Parameters¶
| Name | Type | Default | Notes |
|---|---|---|---|
| `n_components` | | | Number of latent components extracted (k). |
| `n_iterations` | | | Number of selection iterations or Monte-Carlo passes. |
| `min_features` | `int \| None` | | |
| `n_folds` | | | Number of cross-validation folds used inside the selector. |
| `seed` | | | Random seed for reproducible sampling/initialization. |
| | | | registry benchmark cell value |
Explanations¶
Bibliographic source¶
Li, H., Liang, Y., Xu, Q. & Cao, D. (2009). Key wavelengths screening using competitive adaptive reweighted sampling method for multivariate calibration. Analytica Chimica Acta 648(1), 77–84.
Mathematical principle¶
CARS is one of the most widely used variable selectors in spectroscopy. It runs \(M\) iterations of: (1) draw a Monte-Carlo subsample, (2) fit PLS, (3) compute coefficient weights \(w_j = |b_j| / \sum_k |b_k|\), (4) keep a shrinking fraction of features, chosen by weighted competitive sampling: features compete stochastically, each with probability proportional to \(w_j\).
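The retention fraction shrinks exponentially: \(r_m = \exp(-\mu(m - 1))\) for \(m = 1, \dots, M\), with \(\mu\) chosen so that two features survive at the final iteration. For \(p\) input features this means \(p\, r_M = 2\), i.e. \(\mu = \ln(p/2)/(M - 1)\). The iteration whose surviving subset minimises CV-RMSE is returned.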
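The decay schedule can be tabulated directly. The sketch below is illustrative only: the function name `retention_counts`, the `n_final` argument, and the use of `round` to turn fractions into feature counts are assumptions, not the behaviour of the n4m kernel.

```python
import math


def retention_counts(n_features: int, n_iterations: int, n_final: int = 2) -> list[int]:
    """Features kept at iterations m = 1..M when r_m = exp(-mu * (m - 1))."""
    if n_iterations < 2:
        raise ValueError("need at least two iterations")
    if n_features < n_final:
        raise ValueError("fewer features than the final subset size")
    # mu solves n_features * exp(-mu * (M - 1)) = n_final
    mu = math.log(n_features / n_final) / (n_iterations - 1)
    return [
        round(n_features * math.exp(-mu * (m - 1)))  # rounding rule is an assumption
        for m in range(1, n_iterations + 1)
    ]


counts = retention_counts(n_features=40, n_iterations=8)
print(counts)  # starts at 40 (all features) and ends at 2
```

With 40 features and 8 iterations the schedule starts from the full set and closes on two features, which is the boundary condition that fixes \(\mu\).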
CARS combines deterministic exponential decay with stochastic competition; the latter prevents premature elimination of correlated features. In practice the method is very robust to noise.
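One reweighting step can be sketched with NumPy alone. This is a minimal reading of the scheme described above, assuming the deterministic cut keeps the top-ranked weights and the stochastic part draws with replacement so that features never drawn drop out. The helper name `competitive_sample` and the example coefficients are hypothetical, and the pls4all kernel may differ in detail.

```python
import numpy as np


def competitive_sample(coef, n_keep, rng):
    """One CARS-style step: rank cut by weight, then sampling proportional to w_j.

    Returns the sorted indices of the surviving features.
    """
    coef = np.asarray(coef, dtype=float)
    total = np.abs(coef).sum()
    if total == 0.0:
        raise ValueError("all coefficients are zero; weights are undefined")
    w = np.abs(coef) / total                      # w_j = |b_j| / sum_k |b_k|
    # Deterministic part: keep the n_keep largest weights.
    ranked = np.argsort(w)[::-1][:n_keep]
    # Stochastic part: draw n_keep times with replacement, probability
    # proportional to w_j; features that are never drawn are eliminated.
    p = w[ranked] / w[ranked].sum()
    drawn = rng.choice(ranked, size=n_keep, replace=True, p=p)
    return np.unique(drawn)


rng = np.random.default_rng(0)
coef = np.array([0.0, 5.0, 0.1, 3.0, 0.0, 1.0])   # hypothetical PLS coefficients
kept = competitive_sample(coef, n_keep=3, rng=rng)
print(kept)  # a subset of features 1, 3 and 5
```

Because sampling is with replacement, a step may keep fewer than `n_keep` features; large-weight features are rarely lost, while small-weight ones survive only occasionally.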
Implementation¶
n4m_cars_select. Reference: R enpls 6.1.1 (enpls.fs(method='mc') is the closest analogue).
MATLAB header (bindings/matlab/+pls4all/cars_select.m):
pls4all.cars_select Competitive Adaptive Reweighted Sampling.
res = pls4all.cars_select(X, Y, K, n_iter, min_feats)
Uses the default (NULL) ValidationPlan on the C side (5-fold fallback).
Usage¶
Every pls4all binding tab dispatches into the same C kernel; the external libraries listed at the bottom of the page are the parity references registered in benchmarks.parity_timing.registry. Switch tabs to read the same fit in your language. The R package now ships drop-in-compatible facades for the CRAN pls package (plsr, pcr, mvr) and for the mdatools::pls(x, y, ...) matrix idiom — those tabs appear only on the methods that have a meaningful equivalence.
pls4all bindings
/* C ABI — libn4m */
n4m_context_t* ctx = n4m_context_create();
n4m_config_t* cfg = n4m_config_create();
n4m_method_result_t* res = NULL;
n4m_cars_select_fit(ctx, cfg, &x_view, &y_view, /* hyperparams */, &res);
/* … read coefficients / mask / scores via */
/* n4m_method_result_get_double_matrix / vector / scalar … */
n4m_method_result_destroy(res);
n4m_config_destroy(cfg);
n4m_context_destroy(ctx);
import pls4all
from pls4all._methods import cars_select_fit
with pls4all.Context() as ctx, pls4all.Config() as cfg:
    res = cars_select_fit(ctx, cfg, X, y, n_components=4)
    # then: res.matrix("predictions"), res.matrix("coefficients"),
    # res.vector("mask"), res.scalar("intercept"), …
from pls4all.sklearn import CARSSelector
mdl = CARSSelector(n_components=2, n_iterations=50, min_features=None, n_folds=3, seed=0)
mdl.fit(X, y)
y_hat = mdl.predict(X_test)
library(pls4all)
# Unified low-level dispatcher (May 2026 R cleanup):
res <- pls4all_method("cars_select", X, y,
                      n_components = 4L,
                      params = list(n_iterations = 8L, min_features = 5L, top_k = 15L))
# res is a named list with MethodResult arrays/scalars.
# selected_indices / top_k_intervals are 1-based.
res = pls4all.cars_select(X, y, 4);
% see header of bindings/matlab/+pls4all/cars_select.m for full
% parameter surface:
% res = cars_select(X, Y, n_components, n_iterations, min_features)
yhat = predict(res, Xtest);
No idiomatic classdef wrapper — invoke pls4all.fit("cars_select", X, y, …) directly from the unified MEX factory.
Registry parity references 📐
ref.r_enpls (R · r) · enpls 6.1 · strict (rmse_rel ≤ 0e+00): R enpls::enpls.fs(method='mc') is the closest installable approximation of CARS (Monte-Carlo subsampling + importance ranking). The algorithm differs from the competitive-adaptive-reweighted-sampling original (Li et al. 2009), so set overlap is qualitative.
Benchmarks¶
Adaptive wall-clock per cell measured against full_matrix.csv. Only backends that implement this method are listed; libraries without the method are omitted.
Verdict · ✓ ref / ≈ ref / ~ shape mark a reference-gate pass at strict / relaxed / qualitative tolerance · ✓ bind = pls4all binding agrees with the C++ baseline · ⇄ cross-check = documented by-design selector/RNG/model, noncanonical API/facade convention, or secondary oracle · ✗ divergent · ⚠ error · — not run. The fastest backend per column is marked 🏆.
Reference gate: strict — numeric equivalence (rmse_rel_tol ≤ 0e+00).
Rows tagged with 📐 are the canonical parity references for this method (declared in parity_timing.registry). C++ and external rows show reference parity; pls4all language bindings show binding parity against the C++ backend. Hover the icon for role and tolerance band.
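The J values in the Parity column are not defined on this page. Assuming they denote the Jaccard index between two selected-feature masks (intersection over union), the figure can be reproduced as in the sketch below. The function name `jaccard` and both example masks are hypothetical and are not taken from the registry.

```python
import numpy as np


def jaccard(mask_a, mask_b) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| of two boolean feature masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # convention: two empty selections count as identical
    return float(np.logical_and(a, b).sum() / union)


# Hypothetical masks over 7 features: A = {0,1,2,3,4}, B = {3,4,5,6}
mask_a = np.array([1, 1, 1, 1, 1, 0, 0], dtype=bool)
mask_b = np.array([0, 0, 0, 1, 1, 1, 1], dtype=bool)

print(jaccard(mask_a, mask_a))            # 1.0 (identical masks)
print(round(jaccard(mask_a, mask_b), 2))  # 0.29 (2 shared features out of 7)
```

Under this reading, J 1.00 means the two masks select exactly the same features, while a value such as 0.29 means only a small share of the combined selection is common to both.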
| Backend | Parity | 200×40 (ms) |
|---|---|---|
| C++ native · libn4m | | |
| pls4all.cpp.blas+omp | ✓ J 1.00 | 780.7 ms |
| Python · pls4all | | |
| pls4all.python | ✓ J 1.00 | 772.6 ms |
| R · pls4all | | |
| pls4all.R | ⇄ J 0.29 | 13.3 ms |
| pls4all.R.formula | ⇄ J 0.29 | 10.2 ms 🏆 |
| pls4all.R.mdatools | ⇄ J 0.29 | 16.4 ms |
| pls4all.R.pls | ⇄ J 0.29 | 13.3 ms |
| R · external | | |
| 📐 ref.r_enpls | source | 188.9 ms |

| Backend | Parity | 200×40 (ms) |
|---|---|---|
| C++ native · libn4m | | |
| pls4all.cpp.blas+omp | ✓ J 1.00 | 767.0 ms |
| Python · pls4all | | |
| pls4all.python | ✓ J 1.00 | 761.2 ms |
| R · pls4all | | |
| pls4all.R | ⇄ J 0.29 | 5.13 ms 🏆 |
| pls4all.R.formula | ⇄ J 0.29 | 5.78 ms |
| pls4all.R.mdatools | ⇄ J 0.29 | 5.88 ms |
| pls4all.R.pls | ⇄ J 0.29 | 5.70 ms |
| R · external | | |
| 📐 ref.r_enpls | source | 62.9 ms |

| Backend | Parity | 200×40 (ms) |
|---|---|---|
| C++ native · libn4m | | |
| pls4all.cpp.blas+omp | ✓ J 1.00 | 881.2 ms |
| Python · pls4all | | |
| pls4all.python | ✓ J 1.00 | 802.5 ms |
| R · pls4all | | |
| pls4all.R | ⇄ J 0.29 | 4.98 ms 🏆 |
| pls4all.R.formula | ⇄ J 0.29 | 5.70 ms |
| pls4all.R.mdatools | ⇄ J 0.29 | 5.89 ms |
| pls4all.R.pls | ⇄ J 0.29 | 5.76 ms |
| R · external | | |
| 📐 ref.r_enpls | source | 63.6 ms |