CPU Training Speed on HIGGS: OpenNN vs PyTorch
OpenNN trained the HIGGS benchmark faster than PyTorch on CPU when both frameworks were measured on the same machine with batch size 100. OpenNN completed one training epoch in 138.78 seconds, compared with 215.57 seconds for PyTorch. Both CPU runs used MKL-backed numerical execution.
This article is a CPU-focused addendum to the HIGGS benchmark comparison. The dataset, network architecture and GPU results are discussed in the main article, so here we focus only on the CPU timing result.
CPU Setup
The benchmark used the same HIGGS neural network configuration as the main comparison: a fully connected neural network with five hidden layers of 300 units and hyperbolic tangent activations. The batch size was 100.
| Item | Configuration |
|---|---|
| Processor | Intel Core i9-12900K |
| Threads | 16 CPU threads |
| OpenNN build | CPU FP32 with MKL-backed dense operations |
| PyTorch version | PyTorch 2.6.0, CPU execution with MKL backend active |
Throughput is the 10 million training samples divided by the measured epoch time. Both runs include the validation pass in the epoch timing, so the figures reflect end-to-end epoch time rather than the training steps alone.
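As a minimal sketch of how such a measurement could be taken, the snippet below times one epoch, validation pass included, and converts the elapsed time to throughput. The names `timed_epoch`, `throughput`, `train_step` and `validate` are hypothetical placeholders and are not part of either framework's API; the harness actually used for the benchmark is not shown in this article.

```python
import time


def timed_epoch(train_step, validate, num_batches):
    """Return wall-clock seconds for one epoch, validation included.

    `train_step` and `validate` are hypothetical zero-argument callables
    standing in for a framework's batch update and validation pass.
    """
    start = time.perf_counter()
    for _ in range(num_batches):
        train_step()
    validate()  # the validation pass is inside the timed region
    return time.perf_counter() - start


def throughput(num_training_samples, epoch_seconds):
    """Training samples processed per second of measured epoch time."""
    return num_training_samples / epoch_seconds
```

Applied to the reported timings, `throughput(10_000_000, 138.78)` and `throughput(10_000_000, 215.57)` round to the 72k and 46k samples/s figures used in the results.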
Results
| Framework | CPU backend | Epoch time | Throughput | Relative speed |
|---|---|---|---|---|
| OpenNN | CPU / MKL | 138.78 s | 72k samples/s | 1.55x faster than PyTorch |
| PyTorch | CPU / MKL | 215.57 s | 46k samples/s | 1.00x |
The OpenNN CPU epoch took 138.78 seconds against 215.57 seconds for PyTorch. That is a 35.6% reduction in training time for this benchmark run, equivalent to a 1.55x speedup.
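The relative figures follow directly from the two epoch times. The short calculation below is only an arithmetic check on the reported numbers; the variable names are illustrative.

```python
# Epoch times reported in the results table, in seconds.
opennn_seconds = 138.78
pytorch_seconds = 215.57

# Speedup: how many times faster the OpenNN epoch was.
speedup = pytorch_seconds / opennn_seconds

# Reduction: fraction of the PyTorch epoch time that was saved.
reduction = 1.0 - opennn_seconds / pytorch_seconds

print(f"{speedup:.2f}x faster")      # 1.55x faster
print(f"{reduction:.1%} less time")  # 35.6% less time
```

The two numbers describe the same ratio: the reduction equals `1 - 1 / speedup`, which is why a 1.55x speedup corresponds to roughly a third less training time rather than half.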
Discussion
The result shows that OpenNN can be highly competitive on CPU even when PyTorch also has MKL available. Dense neural networks spend most of their time in matrix operations, and this benchmark compares two MKL-backed CPU execution paths on the same processor.
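To give a rough sense of how much matrix work is involved, the sketch below counts the multiply-accumulate operations in one forward pass of a network with five hidden layers of 300 units. The input width of 28 and the single output unit are assumptions made for illustration; the exact input and output sizes are described in the main article, not here. The helper `dense_layer_shapes` is a hypothetical name.

```python
def dense_layer_shapes(n_inputs, hidden_units, n_hidden, n_outputs):
    """Return (fan_in, fan_out) pairs for a fully connected network."""
    widths = [n_inputs] + [hidden_units] * n_hidden + [n_outputs]
    return list(zip(widths[:-1], widths[1:]))


# ASSUMPTION: 28 input features and 1 output unit (illustrative values).
# Five hidden layers of 300 units and batch size 100 come from the article.
shapes = dense_layer_shapes(n_inputs=28, hidden_units=300, n_hidden=5, n_outputs=1)
batch_size = 100

# One multiply-accumulate per weight per sample in the forward pass.
macs_per_sample = sum(fan_in * fan_out for fan_in, fan_out in shapes)

# Weights plus one bias per output unit of every layer.
parameters = sum(fan_in * fan_out + fan_out for fan_in, fan_out in shapes)

macs_per_batch = macs_per_sample * batch_size

print(macs_per_sample)  # 368700
print(parameters)       # 370201
print(macs_per_batch)   # 36870000
```

Under these assumptions, almost all of the forward-pass arithmetic sits in the four 300 x 300 layers, which is the kind of dense matrix multiplication that an MKL-backed path handles. The count covers the forward pass only and ignores activations and the backward pass.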
The comparison is deliberately narrow: it measures CPU training speed for the HIGGS network, with batch size 100, on one machine. It does not repeat the dataset explanation or the GPU analysis from the main article. The important practical point is simple: under comparable optimized CPU conditions, OpenNN is faster than the PyTorch CPU run measured here.
Conclusions
- OpenNN completed the CPU epoch in 138.78 seconds.
- PyTorch completed the same CPU epoch in 215.57 seconds.
- OpenNN was 1.55x faster, reducing training time by 35.6%.
- The result suggests that, for CPU tabular workloads like this one, OpenNN can compete strongly with PyTorch when both use optimized MKL-backed CPU execution.