Octeract Benchmarks

Introduction

The Octeract benchmark consists of over 3,000 test problems from the literature: all problems in MINLPLib, 360 problems from QPLib, problems from PrincetonLib, and the entire COCONUT Benchmark. It also includes approximately 50 of our own problems, most of which are designed to break solvers in various ways.
The data presented here is meant to provide insight into the capabilities of the engine, with respect to its stability, ability to handle different types of mathematical structures, performance, and parallel scaling. It is important to understand that performance benchmarks are inherently biased in some way, even when performed by a third party.
Each solver is different and superior to all others in some way (even among similar types of solvers), and this matters to different degrees for different users. For instance, many solvers cannot parse certain types of mathematical forms, and the user must reformulate manually. How such problems are treated when generating benchmark results is up to whoever runs the benchmark.
Octeract Engine comes with a native symbolic engine and distributed processing capabilities. These advantages make its comparison with other solvers hard to define, because: (a) it can process any algebraic mathematical form without manual reformulation, and (b) it allows users to invest more computing power (potentially thousands of CPUs) to solve a problem more quickly. Such differences cannot really be captured properly in a benchmark because the interpretation is ambiguous. Therefore, the best way to gauge whether a solver is a good fit for your applications of interest is to try it out for yourself.

PARALLEL SCALING

In serial mode the engine needs 180s to solve 662 problems; with 8 cores, the same number of problems is solved three times faster (60s vs 180s). Overall, the number of problems solved within a given timeout improves linearly as more computing power is invested.
This is a very interesting observation because, as the logarithmic plots below show, the problems solved as we allow more timeout in parallel mode can be much larger on average.
Specifically, there are two main markers of good scaling that we can observe in the raw data:
  • For the same timeout, as the number of cores increases the point swarm becomes less dense on the upper left side, especially as we switch from 4 to 8 cores. This means that those same problems are solved more quickly with more cores.
  • For the same timeout, more points appear on the right-hand side of the plots as the number of cores increases. This means that large problems that were not solved previously are now solved due to the increase in computing power.
Overall, Octeract Engine performs very well across the entire test set, finding feasible solutions for 62% of the problems in just 3 minutes with 8 processors, with 37% of those solutions proven globally optimal. The largest problem solved in these runs with default settings was parabol5_2_1, at 40,400 variables, in about 25s.
More computing power is helpful here as well. It should be noted that for constraint satisfaction problems we are unlikely to see much benefit in MPI mode with default settings because multistart is off by default and most of these problems are solved at the root node. Nevertheless, parallel branch and bound does help considerably, as we see in the plot, in cases where a feasible point is not found at the root node.

BENCHMARK SETUP

– Hardware
All problems were run on an Ubuntu 18.04 cluster of 3 networked machines with 64GB of RAM and Intel Xeon Gold 6130 CPUs @ 2.10GHz, yielding 32 physical cores per machine.
– Scheduling
All jobs, including MPI jobs, were scheduled so that all 96 cores were saturated at all times.
– Timeout
Solver timeout was enforced by invoking octeract_solver -t, which sends a hard kill signal from the Linux shell. All times reported in the benchmarks are real (wall-clock) time as measured by the engine using std::chrono with microsecond resolution, rounded to two decimal places.
– Options
All problems were run using default options as described in the engine’s documentation.
– Third-party solvers
The sub-problems generated throughout the DGO procedure for this benchmark were solved using open source solvers (such as CLP and IPOPT). Because commercial solvers tend to scale better for large problems than their open source counterparts, performance for large problems improves significantly if a commercial LP and/or NLP solver is available.

RAW BENCHMARK METADATA

- Serial mode

- Parallel (MPI) mode
