Benchmark Results

How does ProTest compare to other combinatorial testing tools? We benchmarked ProTest's SIPO engine against the standard scenarios from pairwise.org, which maintains a collection of efficiency comparisons for pairwise testing tools.

All results below use pairwise (t=2) coverage. Each scenario was run with SIPO using the FullHorizontal enhancement strategy. Every model file below can be downloaded; the files include the saved results, so you can inspect the covering arrays directly and try to beat them.

Results Summary: pairwise.org Scenarios

These are the six standard benchmark scenarios from pairwise.org:

Scenario	Exhaustive	Best Known	SIPO	PICT	SIPO vs PICT	Download
3^4	81	9	9	9	Same	3_4.cahtt
3^13	1,594,323	15	15	18	-17%	3_13.cahtt
2^100	1.3 x 10^30	10	10	15	-33%	2_100.cahtt
10^20	10^20	180	192	210	-9%	10_20.cahtt
4^15 3^17 2^29	~7.4 x 10^25	29	28	37	-24%	4_15__3_17__2_29.cahtt
4^1 3^39 2^35	~5.6 x 10^29	21	22	27	-19%	4_1__3_39__2_35.cahtt

In these six scenarios, SIPO matched PICT on 3^4 and produced 9% to 33% fewer rows on the other five. Use SIPO when minimizing test case count matters, especially for models with many parameters. PICT may be faster for quick iterations. See the detailed results below for per-scenario comparisons.
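The "SIPO vs PICT" column appears to be the relative change in row count, (SIPO - PICT) / PICT, rounded to a whole percent. The sketch below recomputes it from the row counts in the summary table; the function name `relative_change` is illustrative and not part of ProTest.

```python
# Row counts copied from the summary table: scenario -> (SIPO rows, PICT rows)
ROWS = {
    "3^4": (9, 9),
    "3^13": (15, 18),
    "2^100": (10, 15),
    "10^20": (192, 210),
    "4^15 3^17 2^29": (28, 37),
    "4^1 3^39 2^35": (22, 27),
}


def relative_change(sipo: int, pict: int) -> str:
    """Percentage change in row count going from PICT to SIPO (assumed formula)."""
    if sipo == pict:
        return "Same"
    pct = round(100 * (sipo - pict) / pict)
    return f"{pct:+d}%"


for scenario, (sipo, pict) in ROWS.items():
    print(f"{scenario:16} {relative_change(sipo, pict)}")
```

A negative value means SIPO needed fewer rows than PICT for that scenario.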

How to Read This Table

Exhaustive: Total possible combinations (what you'd need without combinatorial testing)
Best Known: The smallest covering array reported on pairwise.org for any tool (ProTest's own results are not included)
SIPO: ProTest's SIPO engine result
PICT: Microsoft PICT's result, from pairwise.org
SIPO vs PICT: Relative difference in row count, (SIPO - PICT) / PICT; a negative value means SIPO needed fewer rows
Download: Click to download the .cahtt model file with saved results
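The exhaustive count for a scenario is the product of the number of values of every parameter, so a model written as v1^k1 v2^k2 ... has v1^k1 * v2^k2 * ... combinations. A minimal sketch of that arithmetic, with each model encoded as a hypothetical `{values: parameter_count}` mapping (this encoding is an assumption for illustration, not the .cahtt format):

```python
from math import prod


def exhaustive(model: dict[int, int]) -> int:
    """Number of full combinations for a model given as {values: parameter_count}."""
    return prod(values ** count for values, count in model.items())


SCENARIOS = {
    "3^4": {3: 4},
    "3^13": {3: 13},
    "2^100": {2: 100},
    "10^20": {10: 20},
    "4^15 3^17 2^29": {4: 15, 3: 17, 2: 29},
    "4^1 3^39 2^35": {4: 1, 3: 39, 2: 35},
}

for name, model in SCENARIOS.items():
    # Scientific notation keeps the very large counts readable.
    print(f"{name:16} {exhaustive(model):.2e}")
```

Python integers are arbitrary precision, so the counts are exact before they are formatted.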

Comparison with Other Tools

Full comparison from pairwise.org, with ProTest SIPO added:

Tool	3^4	3^13	4^15 3^17 2^29	4^1 3^39 2^35	2^100	10^20	Available?
ProTest SIPO	9	15	28	22	10	192	Yes
AETG	9	15	41	28	10	180	No (proprietary)
TestCover	9	15	29	21	10	181	No (website defunct)
EXACT	9	15	?	21	10	?	No (research only)
IPO-s	9	17	32	23	10	220	Via NIST ACTS
CoverTable	9	17	34	26	12	195	Unknown
PICT	9	18	37	27	15	210	Yes (open source)
CTS	9	15	39	29	10	210	Unknown
IPO	9	17	34	26	15	212	Via NIST ACTS
AllPairs	9	17	34	26	14	197	Unknown
Jenny	11	18	38	28	16	193	Yes (open source)
DDA	?	18	35	27	15	201	Unknown
ecFeed	10	19	37	28	16	203	Yes (open source)
TConfig	9	15	40	30	14	231	Unknown
JCUnit	10	23	49	33	18	245	Yes (open source)
LazyParams	10	20	45	33	16	288	Yes (open source)

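ProTest SIPO achieved 28 rows for 4^15 3^17 2^29, improving on the previous best of 29 (TestCover). It matches the best listed result on 3^4, 3^13, and 2^100, and trails the best listed result by 1 row on 4^1 3^39 2^35 and by 12 rows on 10^20. The "Available?" column notes which tools can still be downloaded or used.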

Detailed Results by Scenario

3^4: Four 3-Level Parameters


Parameters	4 parameters, 3 values each
Parameter pairs	C(4,2) = 6
Exhaustive	81 combinations
Best known	9 rows
SIPO	9 rows (optimal)
PICT	9 rows

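This is a small model: both PICT and SIPO reach the theoretical optimum of 9 rows. No smaller array is possible, because any two 3-value parameters already need 3 x 3 = 9 distinct rows to cover all of their value combinations.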
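The reasoning behind the optimum generalizes to a simple floor for any model: the two parameters with the most values need every one of their value combinations in a separate row. The sketch below computes that floor, along with the number of value pairs a covering array must contain (C(4,2) = 6 counts pairs of parameters; each parameter pair contributes 3 x 3 value pairs). Function names are illustrative, not part of ProTest.

```python
from itertools import combinations


def pairwise_lower_bound(levels: list[int]) -> int:
    """Floor on the row count: product of the two largest value counts."""
    largest, second = sorted(levels, reverse=True)[:2]
    return largest * second


def value_pairs(levels: list[int]) -> int:
    """Total number of value pairs that a pairwise covering array must cover."""
    return sum(x * y for x, y in combinations(levels, 2))


levels = [3, 3, 3, 3]  # the 3^4 scenario
print(pairwise_lower_bound(levels))  # 9
print(value_pairs(levels))           # 54
```

For 3^4, each row covers 6 value pairs, so 54 / 6 = 9 rows is enough only if no pair is covered twice. A 9-row result meets that condition exactly. For larger models the floor is usually not reachable.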

Download 3_4.cahtt

3^13: Thirteen 3-Level Parameters


Parameters	13 parameters, 3 values each
Parameter pairs	C(13,2) = 78
Exhaustive	1,594,323 combinations
Best known	15 rows
SIPO	15 rows (matches best known)
PICT	18 rows

SIPO matches the best known result of 15 rows. PICT produces 18 rows for this scenario.

Download 3_13.cahtt

2^100: One Hundred Binary Parameters


Parameters	100 parameters, 2 values each
Parameter pairs	C(100,2) = 4,950
Best known	10 rows
SIPO	10 rows (optimal)
PICT	15 rows

SIPO matches the best known result. PICT produces 15 rows for this scenario.
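For binary parameters the minimum pairwise row count is known exactly. A classical result, attributed to Katona and to Kleitman and Spencer, says it is the smallest N with C(N-1, ceil(N/2)) >= k for k binary parameters. The sketch below evaluates that condition and is consistent with 10 rows for k = 100; the function name is illustrative, not part of ProTest.

```python
from math import comb


def optimal_binary_pairwise_rows(k: int) -> int:
    """Smallest N with comb(N - 1, ceil(N / 2)) >= k, for k >= 2 binary parameters."""
    if k < 2:
        raise ValueError("pairwise coverage needs at least two parameters")
    n = 4  # two binary parameters already need all 4 value combinations
    while comb(n - 1, (n + 1) // 2) < k:  # (n + 1) // 2 is ceil(n / 2)
        n += 1
    return n


print(optimal_binary_pairwise_rows(100))  # 10
```

With N = 9 the bound allows at most C(8,5) = 56 parameters, and with N = 10 it allows C(9,5) = 126, so 100 binary parameters need exactly 10 rows.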

Download 2_100.cahtt

10^20: Twenty 10-Level Parameters


Parameters	20 parameters, 10 values each
Parameter pairs	C(20,2) = 190
Exhaustive	10^20 combinations
Best known	180 rows (AETG)
SIPO	192 rows
PICT	210 rows

SIPO produces 192 rows; PICT produces 210. The best known result of 180 was achieved by AETG.

Download 10_20.cahtt

4^15 3^17 2^29: Mixed Levels (61 Parameters)


Parameters	61 total: 15 with 4 values, 17 with 3 values, 29 with 2 values
Parameter pairs	C(61,2) = 1,830
Previous best known	29 rows (TestCover)
SIPO	28 rows (new best known)
PICT	37 rows

SIPO achieves 28 rows — one fewer than the previous pairwise.org best of 29 (TestCover).

Download 4_15__3_17__2_29.cahtt

4^1 3^39 2^35: Mixed Levels (75 Parameters)


Parameters	75 total: 1 with 4 values, 39 with 3 values, 35 with 2 values
Parameter pairs	C(75,2) = 2,775
Best known	21 rows (TestCover, EXACT)
SIPO	22 rows
PICT	27 rows

SIPO produces 22 rows, within 1 of the best known result of 21 (TestCover and EXACT). PICT produces 27 rows.

Download 4_1__3_39__2_35.cahtt

Reproducibility

All download files include saved results with full trial details. To regenerate the results, open any .cahtt file in the ProTest UI, select the SIPO engine, and click Generate, or run the CLI:

protest generate -i <model>.cahtt -o results.csv --engine sipo

Results will vary by random seed and number of parallel trials. More trials increase the chance of finding a smaller covering array.

Methodology

Algorithm: SIPO (Wagner, Kampel & Simos, IWOCA 2021)
Enhancement: FullHorizontal
Strength: 2 (pairwise)
SIPO Base (N): 10,000
Reference data: pairwise.org efficiency comparison
Verification: All covering arrays verified for complete pairwise coverage
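
The verification step can be reproduced independently of ProTest. The sketch below is one way to check complete pairwise coverage. It rests on two assumptions that are not documented here: the exported CSV has one header row followed by one test case per line, and every parameter's value set can be inferred from the values that appear in its column. The names `uncovered_pairs` and `load_rows` are hypothetical helpers, not ProTest APIs.

```python
import csv
from itertools import combinations, product


def uncovered_pairs(rows):
    """Return (col_i, col_j, (value_i, value_j)) for every value pair no row covers."""
    columns = list(zip(*rows))
    # Assumption: each column's value set is whatever appears in that column.
    domains = [sorted(set(col)) for col in columns]
    missing = []
    for i, j in combinations(range(len(columns)), 2):
        seen = set(zip(columns[i], columns[j]))
        for pair in product(domains[i], domains[j]):
            if pair not in seen:
                missing.append((i, j, pair))
    return missing


def load_rows(path):
    """Read test cases from a CSV file, skipping the (assumed) header row."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)
        return [row for row in reader if row]


# Self-check on a known 9-row array for 3^4, built from arithmetic modulo 3.
demo = [(a, b, (a + b) % 3, (a + 2 * b) % 3) for a in range(3) for b in range(3)]
print(len(demo), len(uncovered_pairs(demo)))  # 9 0
```

To check a generated file, call `uncovered_pairs(load_rows("results.csv"))`. An empty list means every value pair is covered, under the assumptions above.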