What is pls4all?
pls4all is a portable PLS / NIRS engine written in C++17, exposed
through a stable C ABI, and packaged behind thin first-class bindings
for the current target languages — Python, R, MATLAB / Octave, and
JavaScript / WebAssembly. Additional language bindings (Go, Rust, Julia,
Ruby, .NET, Lua, Nim, JNI, Android) exist as frozen proofs-of-concept
under bindings/_archive/ and are revived on request.
It is built around a single claim: the same numerical PLS result, in every language, with timings that match or beat each language’s established library.
Why this project exists
PLS (Partial Least Squares) and the wider chemometrics catalogue (sparse PLS, OPLS, CPPLS, MB-PLS, kernel PLS, AOM/POP, calibration transfer, and variable selection such as VIP / CARS / GA / SPA / …) are scattered across separate libraries in each language:
| Language | Where the algorithms live |
|---|---|
| Python | |
| R | |
| MATLAB | |
Each library has its own numerical conventions (NIPALS vs SIMPLS, unit-variance vs centring-only, deflation policy, intercept handling). Comparing two methods across two languages quickly becomes a multi-month integration project. pls4all collapses that surface to a single C++ kernel with a single set of conventions, then exposes each language’s idiomatic API on top.
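To make the point about conventions concrete, here is a minimal pure-Python sketch (not pls4all code) of the first PLS1 weight vector computed under two preprocessing conventions: centring only, and centring plus unit-variance scaling. The function names and the toy data are hypothetical; the only formula used is the standard first weight w = Xᵀy / ‖Xᵀy‖ on preprocessed data.

```python
import math


def column_stats(X):
    """Return per-column means and sample standard deviations."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / (n - 1))
        for j in range(p)
    ]
    return means, stds


def first_pls_weight(X, y, scale):
    """First PLS1 weight vector, w = X'y / ||X'y||, after preprocessing.

    scale=False applies centring only; scale=True also divides each
    column by its sample standard deviation (unit variance).
    """
    means, stds = column_stats(X)
    y_mean = sum(y) / len(y)
    Xc = [
        [(v - m) / (s if scale else 1.0) for v, m, s in zip(row, means, stds)]
        for row in X
    ]
    yc = [v - y_mean for v in y]
    w = [sum(Xc[i][j] * yc[i] for i in range(len(X))) for j in range(len(X[0]))]
    norm = math.sqrt(sum(v * v for v in w))
    return [v / norm for v in w]


# Toy data (hypothetical): two predictors on very different scales.
X = [[1.0, 100.0], [2.0, 300.0], [3.0, 200.0], [4.0, 500.0]]
y = [1.0, 2.0, 3.0, 4.0]

w_centred = first_pls_weight(X, y, scale=False)
w_scaled = first_pls_weight(X, y, scale=True)
print("centring only:", w_centred)
print("unit variance:", w_scaled)
```

With centring only, the large-scale column dominates the weight vector; with unit-variance scaling, both columns contribute comparably. Two libraries that differ only in this default will therefore report different loadings and coefficients for the same data.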
The benefits stack:
- Determinism across languages. Same kernel and same generated datasets, with numerical parity checked by explicit gates instead of claiming byte-identical outputs.
- Performance. BLAS / OpenMP / CUDA accelerated tiers, with a scalar reference tier kept around as the parity anchor.
- Reproducibility. Every binding reads and writes the .n4a bundle format, which round-trips through the C ABI; a model trained in Python can be loaded and parity-checked in R or MATLAB.
- Auditability. The parity gate compares pls4all predictions to the external reference library that “owns” each algorithm (sklearn for PLS, ropls for OPLS, spls for sparse PLS, …) and publishes a verdict for every cell.
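The idea of a parity gate that checks numerical agreement rather than byte identity can be sketched as follows. This is an illustration only: the function name, the verdict layout, and the tolerance values are assumptions, not the project's actual gate or thresholds.

```python
import math


def parity_verdict(predicted, reference, rel_tol=1e-8, abs_tol=1e-10):
    """Compare two prediction vectors element-wise within a tolerance.

    Returns a dict holding a PASS/FAIL verdict and the largest absolute
    difference. The tolerances are illustrative assumptions.
    """
    if len(predicted) != len(reference):
        return {"verdict": "FAIL", "reason": "length mismatch", "max_abs_diff": None}
    max_abs_diff = max(
        (abs(p - r) for p, r in zip(predicted, reference)), default=0.0
    )
    ok = all(
        math.isclose(p, r, rel_tol=rel_tol, abs_tol=abs_tol)
        for p, r in zip(predicted, reference)
    )
    return {
        "verdict": "PASS" if ok else "FAIL",
        "reason": "" if ok else "tolerance exceeded",
        "max_abs_diff": max_abs_diff,
    }


# Hypothetical predictions from a reference library and two candidates.
reference = [0.125, 1.5, -2.75]
same_up_to_rounding = [0.125 + 1e-13, 1.5, -2.75 - 1e-13]
drifted = [0.125, 1.5001, -2.75]

print(parity_verdict(same_up_to_rounding, reference))
print(parity_verdict(drifted, reference))
```

A tolerance-based check accepts differences that come from floating-point summation order (for example between a scalar build and a BLAS build) while still flagging genuine numerical drift.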
The three layers
┌──────────────────────────────────────────────────────────────┐
│  Tier-2 idiomatic API                                        │
│    pls4all.sklearn.PLSRegression(...)           (Python)     │
│    pls(y ~ ., data, ncomp=)                     (R)          │
│    pls4all.fitrpls(X, y, "NumComponents", k)    (MATLAB)     │
│    new pls4all.PLS({nComponents: k})            (JS)         │
└────────────────────────┬─────────────────────────────────────┘
                         │
┌────────────────────────▼─────────────────────────────────────┐
│  Tier-1 raw / canonical API                                  │
│    pls4all._methods.pls_fit(ctx, cfg, X, y, k)  (Python)     │
│    pls4all_method("pls", X, y, n_components=k)  (R)          │
│    pls4all.pls_fit(X, y, k)                     (MATLAB)     │
└────────────────────────┬─────────────────────────────────────┘
                         │
┌────────────────────────▼─────────────────────────────────────┐
│  Tier-0 C ABI (libn4m)                                       │
│    n4m_* symbols (96 of them, frozen at ABI 1.x)             │
└──────────────────────────────────────────────────────────────┘
The C ABI is the only place numerical algorithms live. Every binding above is a reformatter — no PLS math is duplicated in Python or R.
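The "binding as reformatter" pattern can be illustrated with a small self-contained sketch. Everything here is hypothetical: the `_kernel_*` functions stand in for the native kernel (a one-component PLS1 on centred data, written in Python only so the example runs), and `raw_pls_fit`, `raw_pls_predict`, and this `PLSRegression` class are illustrative names, not the pls4all API. What the sketch shows is that the upper tiers validate and reshape inputs and delegate all arithmetic downward.

```python
import math


def _kernel_pls1_fit(X, y):
    """Stand-in for the native kernel: one-component PLS1, centred data.

    Hypothetical; in the layering described above this work would sit
    behind the C ABI rather than in Python.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]                      # weight vector
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    coef = [q * wj for wj in w]                    # one-component coefficients
    return {"x_mean": x_mean, "y_mean": y_mean, "coef": coef}


def _kernel_pls1_predict(model, X):
    """Stand-in for the native prediction routine."""
    return [
        model["y_mean"]
        + sum((v - m) * b for v, m, b in zip(row, model["x_mean"], model["coef"]))
        for row in X
    ]


# Tier-1 style: plain data in, plain data out, no conveniences.
def raw_pls_fit(X, y):
    return _kernel_pls1_fit(X, y)


def raw_pls_predict(model, X):
    return _kernel_pls1_predict(model, X)


# Tier-2 style: an estimator-shaped wrapper that only reformats.
class PLSRegression:
    def fit(self, X, y):
        X = [[float(v) for v in row] for row in X]
        y = [float(v) for v in y]
        if len(X) != len(y):
            raise ValueError("X and y must have the same number of rows")
        self.model_ = raw_pls_fit(X, y)
        return self

    def predict(self, X):
        return raw_pls_predict(self.model_, [[float(v) for v in row] for row in X])


X = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2]]
y = [1.1, 1.9, 3.2, 3.9]
print(PLSRegression().fit(X, y).predict(X))
```

Because the wrapper contains no numerical code of its own, the idiomatic call and the raw call cannot disagree: both return whatever the kernel computed.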
What’s in the box
- ~70 algorithms: every PLS variant in mainstream use plus the full chemometrics variable-selection catalogue. The complete catalogue is in the methods index.
- A cross-binding parity gate: for each (algorithm, n, p, threads) cell, every binding’s predictions are compared element-wise to the reference library for that algorithm. See the benchmark overview and the interactive dashboard.
- A stable C ABI: frozen at 1.x, with semantic versioning enforced by a per-PR ABI symbol gate. See abi/reference.
- A .n4a bundle format: content-addressed serialisation of a fitted model, portable across languages.
- Acceleration matrix: five libn4m builds (ref, blas, omp, blas+omp, cuda), so every cell can also serve as a benchmark of the acceleration stack itself.
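As a rough illustration of what "content-addressed serialisation" means in general, the sketch below derives an address from the serialised bytes of a model and verifies it on load. The actual .n4a layout is not described on this page; the use of canonical JSON and SHA-256, the function names, and the model fields are all assumptions made for the example.

```python
import hashlib
import json


def pack_bundle(model):
    """Serialise a model dict to canonical JSON bytes and derive its address.

    Sorting keys and fixing separators makes the byte stream, and hence
    the hash, independent of dict insertion order.
    """
    payload = json.dumps(model, sort_keys=True, separators=(",", ":")).encode("utf-8")
    address = hashlib.sha256(payload).hexdigest()
    return address, payload


def unpack_bundle(address, payload):
    """Verify the payload against its address before decoding it."""
    if hashlib.sha256(payload).hexdigest() != address:
        raise ValueError("bundle content does not match its address")
    return json.loads(payload.decode("utf-8"))


# Hypothetical fitted-model fields, for illustration only.
model = {"method": "pls", "n_components": 2, "coef": [0.25, -1.5, 3.0]}
address, payload = pack_bundle(model)
restored = unpack_bundle(address, payload)
print(address)
print(restored == model)
```

The useful property is that the address depends only on the content: any host language that can hash the same bytes can confirm it loaded exactly the model that was saved.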
What pls4all is not
- Not a data-loading framework. pls4all assumes you arrive with (X, y) already in memory. Spectroscopy file formats, signal-type detection, and dataset versioning live in upstream tooling (e.g. nirs4all).
- Not a pipeline DSL. Pipelines are composed in the host language (sklearn Pipeline, R caret/mlr3, MATLAB function chains).
- Not a deep-learning library. pls4all is strictly the PLS family plus the chemometrics adjuncts (variable selection, calibration transfer, diagnostics).
Where to go next
| If you want to… | Read |
|---|---|
| Run your first fit in your language | |
| Understand the data model and tiers | |
| See what’s measured and how | |
| Browse the algorithm catalogue | |
| Compare bindings in a live UI | |
| Read pls4all in your language | Python · R · MATLAB / Octave · JS |