# What is pls4all? **pls4all** is a portable PLS / NIRS engine written in C++17, exposed through a stable C ABI, and packaged behind thin first-class bindings for the current target languages — Python, R, MATLAB / Octave, and JavaScript / WebAssembly. Additional language bindings (Go, Rust, Julia, Ruby, .NET, Lua, Nim, JNI, Android) exist as frozen proofs-of-concept under `bindings/_archive/` and are revived on request. It is built around a single claim: **the same numerical PLS result, in every language**, with timings that match or beat each language's established library. ## Why this project exists PLS (Partial Least Squares) and the wider chemometrics catalogue — sparse PLS, OPLS, CPPLS, MB-PLS, kernel PLS, AOM/POP, calibration transfer, variable selection (VIP / CARS / GA / SPA / …) — are *scattered* across an ecosystem: | Language | Where the algorithms live | |---|---| | Python | `sklearn.cross_decomposition`, `ikpls`, `diPLSlib`, `hoggorm`, `tensorly`, `pybaselines`, in-tree implementations of papers | | R | `pls`, `spls`, `OmicsPLS`, `prospectr`, `mdatools`, `multiway`, `kernlab`, `plsVarSel`, `enpls`, `mixOmics`, `chemometrics`, `ropls`, `sgPLS`, `multiblock`, `plsRglm`, `plsRcox`, `softImpute`, `mboost`, … | | MATLAB | `plsregress`, `libPLS` | Each library has its own numerical conventions (NIPALS vs SIMPLS, unit-variance vs centring-only, deflation policy, intercept handling). Comparing two methods across two languages quickly becomes a multi-month integration project. pls4all collapses that surface to a **single C++ kernel** with a single set of conventions, then exposes each language's idiomatic API on top. The benefits stack: - **Determinism across languages.** Same kernel and same generated datasets, with numerical parity checked by explicit gates instead of claiming byte-identical outputs. - **Performance** — BLAS / OpenMP / CUDA accelerated tiers, with a scalar reference tier kept around as the parity anchor. - **Reproducibility** — every binding ships a `.n4a` bundle format that round-trips through the C ABI; a model trained in Python can be loaded and parity-checked in R or MATLAB. - **Auditability** — the parity gate compares pls4all predictions to the external reference library that "owns" each algorithm (sklearn for PLS, ropls for OPLS, spls for sparse PLS, …) and publishes a verdict for every cell. ## The three layers ``` ┌──────────────────────────────────────────────────────────────┐ │ Tier-2 idiomatic API │ │ pls4all.sklearn.PLSRegression(...) (Python) │ │ pls(y ~ ., data, ncomp=) (R) │ │ pls4all.fitrpls(X, y, "NumComponents", k) (MATLAB) │ │ new pls4all.PLS({nComponents: k}) (JS) │ └────────────────────────┬─────────────────────────────────────┘ │ ┌────────────────────────▼─────────────────────────────────────┐ │ Tier-1 raw / canonical API │ │ pls4all._methods.pls_fit(ctx, cfg, X, y, k) (Python) │ │ pls4all_method("pls", X, y, n_components=k) (R) │ │ pls4all.pls_fit(X, y, k) (MATLAB) │ └────────────────────────┬─────────────────────────────────────┘ │ ┌────────────────────────▼─────────────────────────────────────┐ │ Tier-0 — C ABI (libn4m) │ │ n4m_* symbols (96 of them, frozen at ABI 1.x) │ └──────────────────────────────────────────────────────────────┘ ``` The C ABI is the only place numerical algorithms live. Every binding above is a *reformatter* — no PLS math is duplicated in Python or R. ## What's in the box - **~70 algorithms** — every PLS variant in mainstream use plus the full chemometrics variable-selection catalogue. The complete catalogue is the [methods index](../methods/index.md). - **A cross-binding parity gate** — for each `(algorithm, n, p, threads)` cell, every binding's predictions are compared element-wise to the reference library for that algorithm. See the [benchmark overview](../benchmarks/overview.md) and the [interactive dashboard](../landing/dashboard.md). - **A stable C ABI** — frozen at 1.x; semantic versioning enforced by a per-PR ABI symbol gate. See [abi/reference](../abi/reference.md). - **A `.n4a` bundle format** — content-addressed serialisation of a fitted model, portable across languages. - **Acceleration matrix** — five libn4m builds (`ref`, `blas`, `omp`, `blas+omp`, `cuda`) so every cell can also serve as a benchmark of the acceleration stack itself. ## What pls4all is *not* - **Not a data-loading framework.** pls4all assumes you arrive with `(X, y)` already in memory. Spectroscopy file formats, signal-type detection, dataset versioning live in upstream tooling (e.g. [`nirs4all`](https://github.com/GBeurier/nirs4all)). - **Not a pipeline DSL.** Pipelines are composed in the host language (sklearn `Pipeline`, R `caret` / `mlr3`, MATLAB function chains). - **Not a deep-learning library.** pls4all is strictly the PLS family plus the chemometrics adjuncts (variable selection, calibration transfer, diagnostics). ## Where to go next | If you want to… | Read | |---|---| | Run your first fit in your language | [Getting started](getting_started.md) | | Understand the data model and tiers | [Core concepts](concepts.md) | | See what's measured and how | [Benchmark overview](../benchmarks/overview.md) | | Browse the algorithm catalogue | [Methods index](../methods/index.md) | | Compare bindings in a live UI | [GitHub Pages dashboard](../landing/dashboard.md) | | Read pls4all in your language | [Python](../bindings/python.md) · [R](../bindings/r.md) · [MATLAB / Octave](../bindings/matlab.md) · [JS](../bindings/js.md) |