CPU Training Speed on HIGGS: OpenNN vs PyTorch

OpenNN trained the HIGGS benchmark faster than PyTorch on CPU when both frameworks were measured on the same machine with batch size 100. OpenNN completed the epoch in 138.78 seconds, compared with 215.57 seconds for PyTorch. Both CPU runs used MKL-backed numerical execution.

This article is a CPU-focused addendum to the HIGGS benchmark comparison. The dataset, network architecture and GPU results are discussed in the main article, so here we focus only on the CPU timing result.

Headline result: OpenNN was 1.55x faster than PyTorch in this CPU run, reducing epoch time by 35.6%.

CPU Setup

The benchmark used the same HIGGS neural network configuration as the main comparison: a fully connected neural network with five hidden layers of 300 units and hyperbolic tangent activations. The batch size was 100.
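To make the size of this network concrete, the sketch below lists the weight-matrix shapes and counts the trainable parameters. The hidden layout (five layers of 300 units) comes from the text. The input width of 28 (the number of features in the HIGGS dataset) and the single output unit are assumptions, since this addendum does not state them.

```python
# Layer widths for the network described above.
N_INPUTS = 28       # assumption: HIGGS feature count, not stated in this addendum
N_OUTPUTS = 1       # assumption: one output unit for binary classification
HIDDEN = [300] * 5  # from the text: five hidden layers of 300 units


def layer_shapes(n_inputs, hidden, n_outputs):
    """Return (fan_in, fan_out) for each dense layer, in order."""
    widths = [n_inputs, *hidden, n_outputs]
    return list(zip(widths[:-1], widths[1:]))


def parameter_count(shapes):
    """Weights plus one bias per output unit, summed over all layers."""
    return sum(fan_in * fan_out + fan_out for fan_in, fan_out in shapes)


SHAPES = layer_shapes(N_INPUTS, HIDDEN, N_OUTPUTS)
TOTAL_PARAMS = parameter_count(SHAPES)

for fan_in, fan_out in SHAPES:
    print(f"dense layer: {fan_in} -> {fan_out}")
print("trainable parameters:", TOTAL_PARAMS)
```

Under these assumptions the model has about 370 thousand parameters, almost all of them in the four 300 x 300 hidden-to-hidden layers.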

| Item | Configuration |
|---|---|
| Processor | Intel Core i9-12900K |
| Threads | 16 CPU threads |
| OpenNN build | CPU FP32 with MKL-backed dense operations |
| PyTorch version | PyTorch 2.6.0, CPU execution with MKL backend active |

Throughput is the number of training samples (10 million) divided by the measured epoch time. In both runs, the epoch time includes the validation pass.
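The measurement and the throughput formula can be sketched as follows. This is a minimal illustration, not the harness used for the benchmark: `time_epoch`, `train_fn` and `validate_fn` are hypothetical names standing in for whatever each framework runs during one epoch.

```python
import time

TRAINING_SAMPLES = 10_000_000  # from the text: 10 million training samples


def time_epoch(train_fn, validate_fn):
    """Wall-clock seconds for one epoch, validation pass included.

    train_fn and validate_fn are hypothetical zero-argument callables.
    """
    start = time.perf_counter()
    train_fn()
    validate_fn()
    return time.perf_counter() - start


def throughput(epoch_seconds, samples=TRAINING_SAMPLES):
    """Training samples processed per second of epoch time."""
    return samples / epoch_seconds


def to_nearest_thousand(value):
    """Round to the nearest thousand, as in the reported figures."""
    return round(value / 1000) * 1000


# Epoch times reported in this article.
print(to_nearest_thousand(throughput(138.78)))  # OpenNN  -> 72000
print(to_nearest_thousand(throughput(215.57)))  # PyTorch -> 46000
```

Because the validation pass counts toward the epoch time but not toward the sample count, the figures describe whole-epoch throughput rather than pure training-step throughput.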

Results

Figure: CPU training throughput on HIGGS, in samples per second (higher is better). Intel Core i9-12900K, batch size 100, one representative epoch. OpenNN: 72k. PyTorch (CPU): 46k. OpenNN completes the CPU epoch 1.55x faster in this run. Throughput is rounded to the nearest thousand samples per second.

| Framework | CPU backend | Epoch time | Throughput | Relative speed |
|---|---|---|---|---|
| OpenNN | CPU / MKL | 138.78 s | 72k samples/s | 1.55x faster than PyTorch |
| PyTorch | CPU / MKL | 215.57 s | 46k samples/s | 1.00x |

OpenNN reduced the CPU epoch time from 215.57 seconds to 138.78 seconds. That is a 35.6% reduction in training time for this benchmark run.
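The speedup and the percentage reduction are two views of the same ratio, which the short check below makes explicit. The function names are illustrative; the epoch times are the ones reported above.

```python
def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


def time_reduction(baseline_seconds, candidate_seconds):
    """Fraction of the baseline time saved by the candidate."""
    return 1 - candidate_seconds / baseline_seconds


PYTORCH_SECONDS = 215.57
OPENNN_SECONDS = 138.78

s = speedup(PYTORCH_SECONDS, OPENNN_SECONDS)
r = time_reduction(PYTORCH_SECONDS, OPENNN_SECONDS)

print(f"speedup:   {s:.2f}x")  # 1.55x
print(f"reduction: {r:.1%}")   # 35.6%
```

The two numbers are linked by reduction = 1 - 1/speedup, so a 1.55x speedup always corresponds to roughly a 35.6% shorter run.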

Discussion

The result shows that OpenNN holds a clear CPU advantage in this run even though PyTorch also has MKL available. Dense neural networks spend most of their time in matrix operations, and this benchmark compares two MKL-backed CPU execution paths on the same processor.
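A rough operation count illustrates why matrix operations dominate for a network of this shape. The sketch below counts multiply-accumulate operations (MACs) in the dense layers against tanh evaluations, for the forward pass only. It assumes 28 inputs and one output unit, which this addendum does not state, and it compares operation counts, not measured time.

```python
BATCH_SIZE = 100    # from the text
N_INPUTS = 28       # assumption: HIGGS feature count
N_OUTPUTS = 1       # assumption: one output unit
HIDDEN = [300] * 5  # from the text

widths = [N_INPUTS, *HIDDEN, N_OUTPUTS]

# Each dense layer multiplies a (batch x fan_in) matrix by a (fan_in x fan_out)
# matrix, which costs fan_in * fan_out multiply-accumulates per sample.
MACS_PER_SAMPLE = sum(a * b for a, b in zip(widths[:-1], widths[1:]))

# One tanh evaluation per hidden unit per sample.
TANH_PER_SAMPLE = sum(HIDDEN)

MACS_PER_BATCH = BATCH_SIZE * MACS_PER_SAMPLE

print("MACs per sample:", MACS_PER_SAMPLE)
print("tanh per sample:", TANH_PER_SAMPLE)
print("MACs per batch: ", MACS_PER_BATCH)
print("MACs per tanh:   %.1f" % (MACS_PER_SAMPLE / TANH_PER_SAMPLE))
```

Under these assumptions there are a few hundred multiply-accumulates for every activation evaluated, so the efficiency of the matrix-multiplication path is the main thing a CPU benchmark of this network exercises.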

The comparison is deliberately narrow: it measures CPU training speed for the HIGGS network, with batch size 100, on one machine. It does not repeat the dataset explanation or the GPU analysis from the main article. The important practical point is simple: under comparable optimized CPU conditions, OpenNN is faster than the PyTorch CPU run measured here.

Conclusions

  • OpenNN completed the CPU epoch in 138.78 seconds.
  • PyTorch completed the same CPU epoch in 215.57 seconds.
  • OpenNN was 1.55x faster, reducing training time by 35.6%.
  • On this CPU tabular benchmark, OpenNN outperformed PyTorch with both frameworks using optimized MKL-backed CPU execution.
