This is the empirical evaluation of the paper Proving Non-Termination by Acceleration Driven Clause Learning.
We implemented our contributions in our tool LoAT. For more information about LoAT, we refer to the general LoAT website.
Getting LoAT
We provide a pre-compiled binary of LoAT (Linux, 64 bit). Moreover, you can find the source code of LoAT at GitHub.Evaluation
Examples
We evaluated our implementation on all 1222 examples from the category Termination of Integer Transition Systems and all 335 examples from the category Termination of C Integer Programs of the Termination Problems Data Base.
Tools
We compared our implementation (LoAT ADCL) with other leading termination analyzers: iRankFinder, T2, Ultimate, VeryMax, and the previous version of LoAT (LoAT '22).
For T2, VeryMax, and Ultimate, we took the versions of their last participations at the Termination and Complexity Competition (2015, 2019, and 2022). For iRankFinder, we used the configuration from the evaluation of our IJCAR '22 paper, which is tailored towards proving non-termination.
We excluded AProVE, as it cannot prove non-termination of ITSs, and it uses LoAT and T2 as backends when analyzing C programs. Moreover, we excluded Ultimate from the evaluation on ITSs, as it cannot parse them. For the same reason, we excluded T2 and LoAT '22 from the evaluation on C integer programs.
StarExec Bundles
We provide StarExec bundles for all tools used in our evaluation.- LoAT ADCL, configuration
loat_nonterm_proofout
andloat_c_nonterm_proofout
, respectively - LoAT '22, configuration
loat_nonterm_proofout
- T2, configuration
default fixed
- VeryMax, configuration
termcomp2019_ITS
andtermcomp2017
, respectively - iRankFinder, configuration
competition_its
andcompetition_c
, respectively - Ultimate, configuration
default
Results
The table below shows the results for our evaluation on ITSs. Here, avg, rt, and TO abberviate average, runtime, and Timeouts, respectively. Moreover, VB refers to the virtual best solver, i.e., the combination of all tools from our evaluation.
NO | unique NO | Yes | avg rt | median rt | TO | avg rt NO | median rt NO | |
---|---|---|---|---|---|---|---|---|
LoAT ADCL | 521 | 9 | 0 | 48.6s | 0.1s | 183 | 2.9s | 0.1s |
LoAT '22 | 494 | 2 | 0 | 7.4s | 0.1s | 0 | 6.2s | 0.1s |
T2 | 442 | 3 | 615 | 17.2s | 0.6s | 45 | 7.4s | 0.6s |
VeryMax | 421 | 6 | 631 | 28.3s | 0.5s | 30 | 30.5s | 14.5s |
iRankFinder | 409 | 0 | 642 | 32.0s | 2.0s | 93 | 12.3s | 1.7s |
VB | 536 | -- | 661 | -- | -- | -- | -- | -- |
Impressively, the virtual best solver can solve all but 25 of the 1222 examples. As a reference, the participants of the first run of the category Termination of ITSs on the same set of examples in 2015 could solve all but 105 examples.
The next table shows the results for our evaluation on C Integer Programs.
NO | unique NO | Yes | avg rt | median rt | TO | avg rt NO | median rt NO | |
---|---|---|---|---|---|---|---|---|
LoAT ADCL | 102 | 2 | 0 | 24.7s | 0.1s | 26 | 0.2s | 0.1s |
VeryMax | 103 | 1 | 213 | 5.5s | 0.1s | 0 | 12.0s | 8.3s |
Ultimate | 100 | 0 | 205 | 14.8s | 4.6s | 9 | 6.1s | 3.9s |
iRankFinder | 90 | 1 | 206 | 9.9s | 1.7s | 2 | 5.0s | 1.6s |
VB | 109 | -- | 215 | -- | -- | -- | -- | -- |
Finally, we provide tables with the detailed results of our benchmarks on separate sites.