Benchmarks
pls4all is benchmarked across three axes. The benchmark matrix carries two separate parity gates: binding parity checks pls4all bindings against the native C++ row, while reference parity checks every successful implementation against the registry-declared external oracle.
Cross-binding parity + timing: same algorithm, same data, pls4all bindings and external references in one matrix. For each (algo, n, p, threads) cell we report adaptive wall-clock time and the relevant visible parity verdict: reference parity for C++/external rows, binding parity for internal binding rows.
Accelerated backend timing: same SIMPLS, libn4m built five ways (ref, blas, omp, blas_omp, cuda). Shows the speedup stack at the C++ layer.
Algorithm matrix: pls4all’s ~60 algorithms timed at representative sizes, with accuracy verified against the reference implementation per algorithm.
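As a rough illustration of how the two parity gates relate to the verdict shown for each row, the sketch below picks the comparison baseline by row kind. It is not the project's implementation: the row-kind labels, the verdict strings, the function names and the tolerance value are all assumptions made for this example. The actual tolerances are described in methodology.md.

```python
from typing import Optional, Sequence

# Hypothetical tolerance, chosen only for this sketch.
ASSUMED_TOL = 1e-8


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("prediction vectors must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def parity_verdict(candidate: Sequence[float],
                   baseline: Optional[Sequence[float]],
                   tol: float = ASSUMED_TOL) -> str:
    """Compare a candidate against a baseline; 'N/A' when no baseline exists."""
    if baseline is None:
        return "N/A"
    return "PASS" if max_abs_diff(candidate, baseline) <= tol else "FAIL"


def visible_verdict(row_kind: str,
                    preds: Sequence[float],
                    cpp_preds: Optional[Sequence[float]],
                    oracle_preds: Optional[Sequence[float]],
                    tol: float = ASSUMED_TOL) -> str:
    """Select the gate that is displayed for a row.

    Binding rows are checked against the native C++ row (binding parity);
    C++ and external rows are checked against the external oracle
    (reference parity).
    """
    if row_kind == "binding":
        return parity_verdict(preds, cpp_preds, tol)
    if row_kind in ("cpp", "external"):
        return parity_verdict(preds, oracle_preds, tol)
    raise ValueError(f"unknown row kind: {row_kind!r}")
```

A binding row whose predictions match the C++ row within tolerance would report a pass on binding parity here, independently of whether the C++ row itself agrees with the external oracle.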
Pages¶
Reproducing the benchmarks
# Canonical method/reference matrix, including build + docs render.
# Existing cells in results/full_matrix.csv are skipped by default.
benchmarks/cross_binding/run_overnight.sh
# Exhaustive stress matrix with registry-declared references.
FULL_MATRIX=1 REFERENCE_BACKENDS=registry benchmarks/cross_binding/run_overnight.sh
# Legacy fixed/all external-reference audit. NOT_IMPLEMENTED rows are expected.
FULL_MATRIX=1 REFERENCE_BACKENDS=all benchmarks/cross_binding/run_overnight.sh
# Include CUDA too when that preset is available.
FULL_MATRIX=1 LIBP4A_BUILD=all benchmarks/cross_binding/run_overnight.sh
# On the Pages branch (main), also publish the refreshed dashboard.
PUBLISH_WEB=1 benchmarks/cross_binding/run_overnight.sh
# Exhaustive run, then publish the refreshed dashboard from main.
FULL_MATRIX=1 PUBLISH_WEB=1 benchmarks/cross_binding/run_overnight.sh
# From a work branch, commit/push dashboard sources without live deploy.
PUBLISH_WEB=1 DEPLOY_PAGES=0 benchmarks/cross_binding/run_overnight.sh
# Render an existing CSV only.
python benchmarks/cross_binding/combine_and_render.py \
    --csvs benchmarks/cross_binding/results/full_matrix.csv \
    --out docs/benchmarks/cross_binding.md
See overview.md for the gate semantics and methodology.md for run mechanics, tolerances, threading conventions and hardware context.
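The default behaviour of skipping cells that already exist in results/full_matrix.csv can be pictured with a small sketch. This is only an assumption about how such a resume step might work, not the logic of run_overnight.sh: the column names (`impl`, `algo`, `n`, `p`, `threads`, `seconds`), the helper names and the sample rows are all invented for the example.

```python
import csv
import io

# Hypothetical key columns identifying one benchmark cell.
KEY_COLUMNS = ("impl", "algo", "n", "p", "threads")


def completed_cells(csv_text: str) -> set:
    """Return the set of cell keys already present in an existing results CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {tuple(row[col] for col in KEY_COLUMNS) for row in reader}


def pending_cells(planned: list, done: set) -> list:
    """Keep only planned cells whose key is not in the results yet.

    CSV values are strings, so planned values are stringified before lookup.
    """
    return [cell for cell in planned
            if tuple(str(v) for v in cell) not in done]


# Made-up sample data, for illustration only.
existing = "impl,algo,n,p,threads,seconds\ncpp,simpls,1000,50,1,0.012\n"
planned = [
    ("cpp", "simpls", 1000, 50, 1),
    ("python", "simpls", 1000, 50, 1),
]
todo = pending_cells(planned, completed_cells(existing))
print(todo)
```

With the sample data above, only the `python` cell remains to be run, because the `cpp` cell is already recorded.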