# `sparse_pls_da` — Sparse PLS-DA (Lê Cao 2008)

_Group_: **Sparse** · _Registry tolerance_: `2.0`

## Description

Sparse PLS-DA (§7)

From the `pls4all.sklearn.SparsePLSDAClassifier` docstring:

> Sparse PLS-DA classifier.

> **Registry note** — R `spls::splsda` uses an LDA classifier on PLS scores; pls4all and `SparsePlsDaPythonReference` use argmax of the regression decision scores. Both emit one-hot predictions; differences appear only at the decision boundary.

### Parameters

| Name | Type | Default | Notes |
|------|------|---------|-------|
| `n_components` | `int` | `2` | Number of latent components extracted (k). |
| `sparsity_lambda` | `float` | `0.05` | L1 soft-threshold magnitude applied to the PLS weight vectors. |
| `n_classes` | `int` | `3` | Registry benchmark cell value. |

## Explanations

### Bibliographic source

Lê Cao, K.-A., Rossouw, D., Robert-Granié, C. & Besse, P. (2008). *A sparse PLS for variable selection when integrating omics data*. Statistical Applications in Genetics and Molecular Biology 7(1).

### Mathematical principle

Discriminant variant of sparse PLS. Encode class labels $y \in \{0, 1, \ldots, C-1\}$ as a one-hot matrix $\mathbf{Y} \in \{0, 1\}^{n \times C}$, fit a sparse PLS regression on it, then assign each new sample to the class whose predicted score column is largest. The L1 penalty selects a discriminative subset of features along each latent direction.

In high-dimensional biomarker discovery (microarray, MALDI-TOF, NIR food classification), sparse PLS-DA is a standard choice because it builds the discriminant and shortlists the candidate markers in a single regularised fit. Class probabilities follow from a softmax over the predicted score columns.

### Implementation

C kernel entry point: `n4m_sparse_pls_da_fit`. Reference: Bioconductor `mixOmics::splsda`.

MATLAB header (`bindings/matlab/+pls4all/sparse_pls_da.m`):

```text
pls4all.sparse_pls_da  Sparse PLS-DA classifier (Chun & Keles 2010 + DA).
y_labels: integer class IDs in {0, …, n_classes-1}.
```

### Usage

Every pls4all binding tab dispatches into the same C kernel; the external libraries listed at the bottom of the page are the parity references registered in `benchmarks.parity_timing.registry`. Switch tabs to read the same fit in your language. The R package now ships drop-in-compatible facades for the CRAN `pls` package (`plsr`, `pcr`, `mvr`) and for the `mdatools::pls(x, y, ...)` matrix idiom; those tabs appear only on the methods that have a meaningful equivalence.

**pls4all bindings**

::::{tab-set}
:class: pls4all-bindings

:::{tab-item} C ABI · libn4m
:sync: c
:class-label: lang-c

```c
/* C ABI — libn4m */
n4m_context_t* ctx = n4m_context_create();
n4m_config_t* cfg = n4m_config_create();
n4m_method_result_t* res = NULL;
n4m_sparse_pls_da_fit(ctx, cfg, &x_view, &y_view, /* hyperparams */, &res);
/* … read coefficients / mask / scores via */
/* n4m_method_result_get_double_matrix / vector / scalar … */
n4m_method_result_destroy(res);
n4m_config_destroy(cfg);
n4m_context_destroy(ctx);
```
:::

:::{tab-item} Python · pls4all (raw)
:sync: python-raw
:class-label: lang-python

```python
import pls4all
from pls4all._methods import sparse_pls_da_fit

with pls4all.Context() as ctx, pls4all.Config() as cfg:
    res = sparse_pls_da_fit(ctx, cfg, X, y, n_components=4, y_labels=y_labels)
    # then: res.matrix("predictions"), res.matrix("coefficients"),
    # res.vector("mask"), res.scalar("intercept"), …
```
:::

:::{tab-item} Python · pls4all.sklearn
:sync: python-sklearn
:class-label: lang-python

```python
from pls4all.sklearn import SparsePLSDAClassifier

mdl = SparsePLSDAClassifier(n_components=2, sparsity_lambda=0.05)
mdl.fit(X, y)
y_hat = mdl.predict(X_test)
```
:::

:::{tab-item} R · pls4all_method()
:sync: r-dispatcher
:class-label: lang-r

```r
library(pls4all)
# Unified low-level dispatcher (May 2026 R cleanup):
res <- pls4all_method("sparse_pls_da", X, y, n_components = 4L,
                      params = list(sparsity_lambda = 0.05, n_classes = 3L))
# res is a named list with MethodResult arrays/scalars.
# selected_indices / top_k_intervals are 1-based.
```
:::

:::{tab-item} MATLAB · pls4all (MEX)
:sync: matlab-mex
:class-label: lang-matlab

```matlab
res = pls4all.sparse_pls_da(X, y, 4);
% see header of bindings/matlab/+pls4all/sparse_pls_da.m for full
% parameter surface:
%   res = sparse_pls_da(X, y_labels, n_components, sparsity_lambda)
yhat = predict(res, Xtest);
```
:::

:::{tab-item} MATLAB · pls4all (classdef)
:sync: matlab-classdef
:class-label: lang-matlab

_No idiomatic classdef wrapper — invoke `pls4all.fit("sparse_pls_da", X, y, …)` directly from the unified MEX factory._
:::

::::

**Registry parity references** 📐

:::{card}
:class-card: external-refs

- 📐 **`ref.python_chun_keles_splsda`** (python · python) — `chun_keles_splsda` 1.0 · qualitative (rmse_rel ≤ 2e+00) — Sparse SIMPLS (Chun & Keles 2010) on dummy-coded class labels, followed by argmax over decision scores. Mirrors pls4all's `n4m_sparse_pls_da_fit` (default, cfg.sparse_simpls_legacy = 0) bit-for-bit.
- 📐 **`ref.r_spls`** (R · r) — `spls` 2.3.2 · qualitative (rmse_rel ≤ 2e+00) — R `spls::splsda` (Chun & Keles). Predictions returned as hard class labels by the package; we one-hot encode them to match pls4all's soft-assignment prediction shape, so the parity check is on the classification *boundary* rather than continuous score values.
:::

### Benchmarks

Adaptive wall-clock per cell measured against [`full_matrix.csv`](../benchmarks/overview.md). Only backends that implement this method are listed; libraries without the method are omitted.

**Verdict** · ✓ ref / ≈ ref / ~ shape mark a reference-gate pass at strict / relaxed / qualitative tolerance · ✓ bind = pls4all binding agrees with the C++ baseline · ✗ divergent · ⚠ error · — not run. The fastest backend per column is marked 🏆.

**Reference gate**: qualitative — shape/smoke comparison only.
The external library and pls4all do not produce numerically equivalent output for this method (see the MethodSpec notes); the `rmse_rel_tol ≤ 2e+00` budget is set wide on purpose. Treat ~ shape as *“we ran both, both finished”*, not as numerical agreement.

Rows tagged with **📐** are the canonical parity references for this method (declared in [`parity_timing.registry`](../benchmarks/methodology.md)). C++ and external rows show reference parity; pls4all language bindings show binding parity against the C++ backend. Hover the icon for role and tolerance band.

::::{tab-set}
:class: parity-tabs

:::{tab-item} 1 thread
:sync: threads-1
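
Column headers are sample × feature grid cells; each timing carries its own unit (ms or s).

| Backend | Parity | 50×250 | 100×50 | 100×500 | 100×2500 | 200×50 | 250×50 | 500×50 | 500×500 | 500×2500 | 2500×50 | 2500×500 | 2500×2500 | 10000×50 | 10000×500 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| **C++ native · libn4m** | | | | | | | | | | | | | | | |
| `pls4all.cpp.blas` | | 6.28 ms | 1.35 ms | 32.9 ms 🏆 | 2.7 s | 2.14 ms 🏆 | 3.00 ms | 6.88 ms | 91.4 ms | 1.9 s | 38.6 ms 🏆 | 365.2 ms 🏆 | 4.4 s 🏆 | 159.2 ms | 1.8 s |
| `pls4all.cpp.blas+omp` | | 6.54 ms | 2.20 ms | 34.3 ms | 2.6 s 🏆 | 2.17 ms | 2.91 ms | 7.22 ms | 91.1 ms 🏆 | 1.8 s 🏆 | 41.4 ms | 388.2 ms | 4.5 s | 159.3 ms | 1.8 s 🏆 |
| `pls4all.cpp.omp` | | 8.23 ms | 1.26 ms 🏆 | 38.3 ms | 3.1 s | 2.28 ms | 4.24 ms | 6.16 ms 🏆 | 97.6 ms | 1.8 s | 41.5 ms | 428.9 ms | 4.4 s | 157.0 ms | 2.1 s |
| `pls4all.cpp.ref` | | 6.60 ms | 2.28 ms | 36.9 ms | 2.7 s | 2.48 ms | 4.02 ms | 7.42 ms | 94.8 ms | 1.9 s | 41.3 ms | 434.7 ms | 4.6 s | 145.1 ms 🏆 | 2.3 s |

The binding and external rows populate only three grid cells, listed here in grid-column order.

| Backend | Parity | Cell 1 | Cell 2 | Cell 3 |
|---|---|---|---|---|
| **Python · pls4all** | | | | |
| `pls4all.python` | ✓ bind | 5.99 ms 🏆 | 2.19 ms | 2.77 ms 🏆 |
| `pls4all.sklearn` | ✗ +7e-01 | 10.9 ms | 3.83 ms | 4.45 ms |
| **R · pls4all** | | | | |
| `pls4all.R` | ✗ +7e-01 | 17.1 ms | 7.05 ms | 12.5 ms |
| `pls4all.R.formula` | ✗ +7e-01 | 22.4 ms | 9.43 ms | 10.2 ms |
| `pls4all.R.mdatools` | ✗ +7e-01 | 23.7 ms | 8.14 ms | 10.6 ms |
| `pls4all.R.pls` | ✗ +7e-01 | 21.3 ms | 9.81 ms | 13.0 ms |
| **MATLAB · pls4all** | | | | |
| `pls4all.matlab` | ✗ +1e+00 | 9.28 ms | 3.69 ms | 5.07 ms |
| `pls4all.matlab.classdef` | ✗ +1e+00 | 9.44 ms | 4.00 ms | 5.41 ms |
| **Python · external** | | | | |
| 📐 `ref.python_chun_keles_splsda` | source | 14.9 ms | 4.20 ms | 4.67 ms |
| **R · external** | | | | |
| 📐 `ref.r_spls` | ~ shape 1e+00 | 55.6 ms | 32.8 ms | 35.3 ms |
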
:::

:::{tab-item} 3 threads
:sync: threads-3

Grid columns: 50×250, 100×50, 100×500, 100×2500, 200×50, 250×50, 500×50, 500×500, 500×2500, 2500×50, 2500×500, 2500×2500, 10000×50, 10000×500. Each backend has a single populated cell in this tab.

| Backend | Parity | Time |
|---|---|---|
| **C++ native · libn4m** | | |
| `pls4all.cpp.blas` | ~ shape | 2.43 ms |
| `pls4all.cpp.blas+omp` | ~ shape | 3.42 ms |
| `pls4all.cpp.omp` | ~ shape | 2.33 ms |
| `pls4all.cpp.ref` | ~ shape | 2.45 ms |
| **Python · pls4all** | | |
| `pls4all.python` | ✓ bind | 2.06 ms 🏆 |
| `pls4all.sklearn` | ✗ +7e-01 | 2.85 ms |
| **R · pls4all** | | |
| `pls4all.R` | ✗ +7e-01 | 6.82 ms |
| `pls4all.R.formula` | ✗ +7e-01 | 10.0 ms |
| `pls4all.R.mdatools` | ✗ +7e-01 | 8.34 ms |
| `pls4all.R.pls` | ✗ +7e-01 | 8.02 ms |
| **MATLAB · pls4all** | | |
| `pls4all.matlab` | ✗ +1e+00 | 5.53 ms |
| `pls4all.matlab.classdef` | ✗ +1e+00 | 6.46 ms |
| **Python · external** | | |
| 📐 `ref.python_chun_keles_splsda` | source | 4.69 ms |
| **R · external** | | |
| 📐 `ref.r_spls` | ~ shape 1e+00 | 32.2 ms |

:::

:::{tab-item} 10 threads
:sync: threads-10

Grid columns: 50×250, 100×50, 100×500, 100×2500, 200×50, 250×50, 500×50, 500×500, 500×2500, 2500×50, 2500×500, 2500×2500, 10000×50, 10000×500. Each backend has a single populated cell in this tab.

| Backend | Parity | Time |
|---|---|---|
| **C++ native · libn4m** | | |
| `pls4all.cpp.blas` | ~ shape | 1.94 ms |
| `pls4all.cpp.blas+omp` | ~ shape | 1.90 ms 🏆 |
| `pls4all.cpp.omp` | ~ shape | 2.00 ms |
| `pls4all.cpp.ref` | ~ shape | 2.02 ms |
| **Python · pls4all** | | |
| `pls4all.python` | ✓ bind | 2.07 ms |
| `pls4all.sklearn` | ✗ +7e-01 | 2.29 ms |
| **R · pls4all** | | |
| `pls4all.R` | ✗ +7e-01 | 5.09 ms |
| `pls4all.R.formula` | ✗ +7e-01 | 5.98 ms |
| `pls4all.R.mdatools` | ✗ +7e-01 | 5.98 ms |
| `pls4all.R.pls` | ✗ +7e-01 | 6.45 ms |
| **MATLAB · pls4all** | | |
| `pls4all.matlab` | ✗ +1e+00 | 3.17 ms |
| `pls4all.matlab.classdef` | ✗ +1e+00 | 3.62 ms |
| **Python · external** | | |
| 📐 `ref.python_chun_keles_splsda` | source | 3.31 ms |
| **R · external** | | |
| 📐 `ref.r_spls` | ~ shape 1e+00 | 22.1 ms |

:::

::::

---

_See also_: [benchmark overview](../benchmarks/overview.md) · [methods index](index.md) · [interactive dashboard](../landing/dashboard.md)
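
As a closing illustration of the Mathematical principle section, the three building blocks it names (one-hot label encoding, L1 soft-thresholding of a weight vector, and argmax / softmax over predicted score columns) can be sketched in NumPy. This is a minimal sketch and not the pls4all kernel: the helper names `one_hot`, `soft_threshold` and `softmax` are hypothetical, the weight vector and score matrix are made-up toy values, and no sparse PLS fit is performed.

```python
import numpy as np


def one_hot(y, n_classes):
    """Encode integer labels in {0, ..., n_classes-1} as an n x C 0/1 matrix."""
    Y = np.zeros((len(y), n_classes))
    Y[np.arange(len(y)), y] = 1.0
    return Y


def soft_threshold(w, lam):
    """L1 soft-threshold: shrink each entry toward zero by lam, zeroing small ones."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)


def softmax(scores):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


# Toy labels for C = 3 classes (hypothetical data).
y = np.array([0, 2, 1, 2])
Y = one_hot(y, n_classes=3)

# Toy weight vector; entries with magnitude below lam = 0.05 become exactly zero.
w = np.array([0.30, -0.04, 0.02, -0.20])
w_sparse = soft_threshold(w, lam=0.05)

# Toy predicted score columns for two new samples (one column per class).
scores = np.array([[0.9, 0.1, 0.0],
                   [0.2, 0.3, 0.5]])
labels = scores.argmax(axis=1)   # hard class assignment
proba = softmax(scores)          # soft class probabilities

print(Y)
print(w_sparse)
print(labels, proba.round(3))
```

Because softmax is monotone within each row, the most probable class always coincides with the argmax of the raw scores, so the hard and soft outputs agree on the predicted label.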