Development — Testing
This page is the operational checklist for the current pls4all test
surface. CONTRIBUTING.md describes policy; this page lists the commands
that maintainers should run before changing parity, bindings, dashboard
generation or release packaging.
Core C++ gate
cmake --preset dev-release
cmake --build --preset dev-release --parallel
ctest --preset dev-release --output-on-failure
May 2026 audit result: ctest --preset dev-release passed locally
(n4m_tests, 1/1).
Fixture determinism
cd parity/python_generator
python -m pip install -r requirements-lock.txt -e .
generate-fixtures --check --out ../fixtures
Current blocker: AOM/POP fixture generation still searches for the
historical AOM_v0 oracle under workstation paths. A clean clone without
that oracle fails before it can prove fixture determinism. The release gate
needs a pinned or vendored source for that oracle.
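One way to make this failure explicit on a clean clone is to resolve the oracle from a single configurable location and fail early with an actionable message. The sketch below is illustrative only: the environment variable AOM_V0_ORACLE_DIR and the function resolve_oracle_dir are hypothetical names, not part of the pls4all fixture generator.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_oracle_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the oracle directory or raise with an actionable message.

    AOM_V0_ORACLE_DIR is a hypothetical variable name used for illustration.
    """
    env = os.environ if env is None else env
    raw = env.get("AOM_V0_ORACLE_DIR")
    if not raw:
        raise FileNotFoundError(
            "AOM_V0_ORACLE_DIR is not set; point it at a pinned or vendored "
            "copy of the oracle"
        )
    path = Path(raw)
    if not path.is_dir():
        raise FileNotFoundError(f"oracle directory does not exist: {path}")
    return path
```

Failing before any fixture is generated keeps the error about the missing oracle separate from a genuine determinism failure.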
Lockfile
python -m benchmarks.parity_timing.lockfile --check
May 2026 audit result: passed locally for 71 methods. This check is structural; it does not prove the referenced libraries are importable or that the live reference parity gate is green.
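Because the lockfile check is structural, a separate importability probe can be useful. The following is a minimal sketch, not part of the pls4all tooling: missing_modules is a hypothetical helper, and the caller supplies the top-level module names to probe. It only asks Python's import system whether each module can be located; it does not execute the module, so it cannot catch a package that is present but fails at import time.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(module_names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be located."""
    missing = []
    for name in module_names:
        # find_spec returns None when no finder can locate a top-level module.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing
```

An empty result means every listed module was found on the current interpreter's path, which is a weaker claim than a green live reference parity gate.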
Python binding tests
PYTHONPATH=bindings/python/src \
python -m pytest bindings/python/tests -q
May 2026 stabilization result: the full Python suite passes locally. The
previous UVE pipeline failure is covered by UVESelector.min_features and
a deterministic fallback; min_features=0 remains available for raw-mask
parity benches.
The narrower sklearn parity script currently passes:
bindings/python/scripts/sklearn_parity_gate.sh
That script does not cover the full selector pipeline smoke test, so it cannot replace the full Python test suite.
Cross-binding samples
Small reference-parity sample:
python benchmarks/cross_binding/orchestrator.py \
--algorithms pls pcr --registry-cells --threads 1 \
--libn4m-build blas-omp --n-runs 2 \
--canonical-pls4all-only --reference-backends registry \
--timeout 180 --out-csv /tmp/pls4all_audit_cross_binding.csv --force
May 2026 audit result: the pls, pcr, sklearn and R references were mostly
green, but an external ikpls row was marked as a binding-parity failure
even though it passed reference parity. The orchestrator now records binding
parity for external rows as not applicable and keeps their reference-parity verdict.
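The rule described above can be sketched as a small verdict function. This is an illustration under assumed names, not the orchestrator's actual code: RowVerdict, judge_row and the string labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RowVerdict:
    binding_parity: str    # "pass", "fail", or "n/a"
    reference_parity: str  # "pass" or "fail"


def judge_row(
    is_external: bool, binding_ok: Optional[bool], reference_ok: bool
) -> RowVerdict:
    """Combine the two gates without letting one overwrite the other."""
    reference = "pass" if reference_ok else "fail"
    if is_external:
        # An external library is not a pls4all binding, so the binding
        # parity gate does not apply; the reference verdict is kept as-is.
        return RowVerdict("n/a", reference)
    return RowVerdict("pass" if binding_ok else "fail", reference)
```

Keeping the two verdicts in separate fields is what prevents an external row from being reported as a binding failure while it still passes reference parity.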
Dual-gate unit regression:
PYTHONPATH=. \
python -m pytest benchmarks/cross_binding/tests/test_orchestrator_parity.py -q
Slow-method smoke:
python benchmarks/cross_binding/orchestrator.py \
--algorithms pcr iriv_select vissa_select bve_select pso_select \
--registry-cells --threads 1 --libn4m-build blas-omp --n-runs 2 \
--only-pls4all --timeout 240 \
--out-csv /tmp/pls4all_audit_slow_methods.csv --force
Use this only as a timing smoke test. Because --only-pls4all omits external
oracle rows, the run relies on the oracle snapshots stored by a prior
registry-reference run. If a snapshot is missing, the run fails Gate 2 with
the message "reference oracle missing; run canonical reference backend".
Release preflight
scripts/bump_version.sh --check
nm -D --defined-only build/dev-release/cpp/src/libn4m.so.1.16.0 \
| awk '{print $3}' | sort -u \
| diff -u cpp/abi/expected_symbols_linux.txt -
readelf -d build/dev-release/cpp/src/libn4m.so.1.16.0 \
| grep -E 'SONAME|NEEDED|RPATH|RUNPATH'
The ABI symbol snapshot must match; any difference must be intentional.
If a new n4m_* symbol becomes public, update
cpp/abi/expected_symbols_linux.txt and explain the ABI change in
docs/abi/changes_log.md.
May 2026 audit result: version sync passed, SONAME was libn4m.so.1,
but the Linux symbol diff failed because the current shared library
exports additional public n4m_* symbols not present in
cpp/abi/expected_symbols_linux.txt.
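When the diff fails, it helps to see added and removed symbols separately. The sketch below is a hypothetical helper, not a script shipped with the repository. It parses the text printed by nm -D --defined-only, takes the third whitespace-separated field as the symbol name (the same field the awk command selects), and compares it with an expected set.

```python
from typing import List, Set, Tuple


def exported_symbols(nm_output: str) -> Set[str]:
    """Extract symbol names from `nm -D --defined-only` output."""
    names = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined symbols print as: <address> <type> <name>
        if len(fields) == 3:
            names.add(fields[2])
    return names


def abi_drift(expected: Set[str], actual: Set[str]) -> Tuple[List[str], List[str]]:
    """Return (added, removed) symbol names, each sorted."""
    return sorted(actual - expected), sorted(expected - actual)
```

A non-empty "added" list corresponds to the audit finding above: symbols exported by the library but absent from the snapshot. A non-empty "removed" list would indicate a symbol that disappeared from the public surface.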
Docs and dashboard
python -m json.tool docs/_static/bench-data.json >/dev/null
python docs/_extras/build_landing.py \
--results benchmarks/cross_binding/results \
--out /tmp/bench-data.json
python -m sphinx -b html docs docs/_build/html --keep-going
The dashboard page embeds benchmark JSON at Sphinx build time. Updating
only docs/_static/bench-data.json does not refresh an already-built
index.html.
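A simple staleness check can catch the case where the JSON was updated after the last Sphinx build. This is a minimal sketch under assumptions: dashboard_is_stale is a hypothetical helper, the built page is assumed to live at docs/_build/html/index.html (the output directory used in the command above), and file modification times are assumed to be a reliable proxy for build order.

```python
from pathlib import Path


def dashboard_is_stale(data_json: Path, built_html: Path) -> bool:
    """True when the built page is missing or older than the benchmark JSON."""
    if not built_html.exists():
        return True
    return built_html.stat().st_mtime < data_json.stat().st_mtime
```

If the function returns True for docs/_static/bench-data.json and the built index.html, rerun the Sphinx build so the embedded data is regenerated.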