split_kmeans — K Means Splitter

Group: Splitters · Binding: n4m.sklearn.KMeansSplitter · C ABI: n4m_split_kmeans_*

Description

K-means++ diversity splitter.

Parameters

Name       Type   Default
test_size  float  0.25
seed       int    0
max_iter   int    100

Explanations

Bibliographic source

Standard spectroscopic operator — see the nirs4all preprocessing / augmentation handbook and the cited literature within the binding docstring.

Mathematical principle

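K-means++ diversity splitter. K-means++ seeding draws the first cluster center uniformly at random from the samples, then draws each further center with probability proportional to its squared distance from the nearest center already chosen, so the initial centers spread across the data. Standard k-means (Lloyd) iterations then alternate between assigning samples to their nearest center and recomputing each center as the mean of its members.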
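
As an illustration of the idea, the sketch below runs k-means++ seeding and Lloyd iterations in pure Python, then holds out a fraction of every cluster. It is not the libn4m kernel. The function names and the `n_clusters` argument are hypothetical, and the rule used to derive the test set from the clusters (a per-cluster hold-out of `test_size`) is an assumption, since this page does not state how the splitter does it. Only `test_size`, `seed`, and `max_iter` mirror the parameter table.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeanspp_seed(X, k, rng):
    """k-means++ seeding: draw each new center with probability
    proportional to its squared distance to the nearest chosen center."""
    centers = [list(X[rng.randrange(len(X))])]
    d2 = [sq_dist(x, centers[0]) for x in X]
    while len(centers) < k:
        total = sum(d2)
        idx = rng.randrange(len(X))  # fallback, e.g. when all points sit on a center
        if total > 0.0:
            r, acc = rng.random() * total, 0.0
            for i, d in enumerate(d2):
                acc += d
                if acc > r:
                    idx = i
                    break
        centers.append(list(X[idx]))
        d2 = [min(d, sq_dist(x, centers[-1])) for d, x in zip(d2, X)]
    return centers

def kmeans_split_sketch(X, test_size=0.25, seed=0, max_iter=100, n_clusters=2):
    """Hypothetical cluster-stratified split; n_clusters is an assumed argument."""
    rng = random.Random(seed)
    centers = kmeanspp_seed(X, n_clusters, rng)

    def assign():
        # Label each sample with the index of its nearest center.
        return [min(range(n_clusters), key=lambda j: sq_dist(x, centers[j]))
                for x in X]

    labels = assign()
    for _ in range(max_iter):
        # Lloyd update: move each center to the mean of its members.
        for j in range(n_clusters):
            members = [x for x, lab in zip(X, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
        new_labels = assign()
        if new_labels == labels:  # converged
            break
        labels = new_labels

    # Assumed split rule: hold out a test_size fraction of every cluster.
    train, test = [], []
    for j in range(n_clusters):
        idx = [i for i, lab in enumerate(labels) if lab == j]
        rng.shuffle(idx)
        n_test = round(test_size * len(idx))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)

# Two well-separated groups of 8 samples each.
A = [[0.1 * i, 0.0] for i in range(8)]
B = [[10.0 + 0.1 * i, 5.0] for i in range(8)]
train_idx, test_idx = kmeans_split_sketch(A + B)
```

Holding out samples per cluster keeps both subsets spread over the regions the clustering found. Other rules, such as assigning whole clusters to one side, are equally plausible readings of "diversity splitter".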

Implementation

C ABI n4m_split_kmeans_* in libn4m (create / apply / destroy lifecycle), wrapped by n4m.sklearn.KMeansSplitter. The same numerical kernel backs every language binding.

Usage

from n4m.sklearn import KMeansSplitter
op = KMeansSplitter()
X_transformed = op.fit_transform(X)

Benchmarks

Adaptive wall-clock per cell measured against full_matrix.csv. Only backends that implement this method are listed; libraries without the method are omitted.

Verdict  ·  ✓ ref / ≈ ref / ~ shape mark a reference-gate pass at strict / relaxed / qualitative tolerance  ·  ✓ bind = pls4all binding agrees with the C++ baseline  ·  ⇄ cross-check = documented by-design selector/RNG/model, noncanonical API/facade convention, or secondary oracle  ·  ✗ divergent  ·  ⚠ error  ·  — not run. The fastest backend per column is marked 🏆.

Reference gate: strict — numeric equivalence (rmse_rel_tol 1e-12).
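
For orientation, a relative-RMSE gate of this kind can be sketched as follows. The exact definition behind `rmse_rel_tol` is not given on this page, so the normalisation used here (RMSE of the difference divided by the RMS of the reference) and both function names are assumptions.

```python
import math

def rmse_rel(candidate, reference):
    """RMSE of (candidate - reference), divided by the RMS of the reference.
    Assumed definition; falls back to the absolute RMSE for an all-zero reference."""
    if len(candidate) != len(reference):
        raise ValueError("inputs must have the same length")
    n = len(reference)
    rmse = math.sqrt(sum((c - r) ** 2 for c, r in zip(candidate, reference)) / n)
    scale = math.sqrt(sum(r * r for r in reference) / n)
    return rmse if scale == 0.0 else rmse / scale

def passes_strict_gate(candidate, reference, rmse_rel_tol=1e-12):
    """True when the candidate matches the reference within the strict tolerance."""
    return rmse_rel(candidate, reference) <= rmse_rel_tol

reference = [1.0, 2.0, 3.0]
exact_ok = passes_strict_gate([1.0, 2.0, 3.0], reference)
perturbed_ok = passes_strict_gate([1.0, 2.0, 3.0 + 1e-6], reference)
```

With a tolerance of 1e-12, a perturbation of 1e-6 in a single element is already far outside the gate, which is why the strict level amounts to numeric equivalence.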

Backend               Parity  50×250 (ms)  250×50 (ms)
C++ native · libn4m
pls4all.cpp.blas      ✓ ref
pls4all.cpp.blas+omp  ✓ ref   0.98         5.09
pls4all.cpp.omp       ✓ ref
pls4all.cpp.ref       ✓ ref   0.86         4.25
Python · pls4all
pls4all.sklearn               0.64 🏆      2.29 🏆
