Overview of performance values

The following statistics were calculated from the performance values of each algorithm:
algorithm obs nas min qu_1st med mean qu_3rd max sd coeff_var
abscon 4021 0 0.392 1.002 5.963 949.31 2967.39 3612 1551.58 1.63443
choco 4021 0 0.261 2.618 8.581 1076.59 3600.02 3610.86 1606.02 1.49176
claspcnf_direct 4021 0 0.028 1.812 6.436 202.223 26.956 3603.09 730.133 3.61054
claspcnf_directorder 4021 0 0.023 1.231 7.344 216.975 32.023 3602.92 769.969 3.54866
claspcnf_support 4021 0 0.028 2.258 8.826 212.555 41.936 3617.62 741.058 3.48642
cryptominisat_direct 4021 0 0.028 2.146 7.552 468.944 46.797 3605.94 1143.67 2.43882
cryptominisat_directorder 4021 0 0.027 1.9 10.462 479.732 52.66 3603.35 1161.56 2.42127
cryptominisat_support 4021 0 0.024 3.247 10.698 439.432 48.231 3600.09 1118.93 2.54631
gecode 4021 0 0.003 0.458 3600 2015.04 3600 3600.26 1764.96 0.875893
glucose_direct 4021 0 0.027 1.83 7.388 441.397 59.548 3600.32 1118.06 2.533
glucose_directorder 4021 0 0.026 0.672 8.689 478.461 62.809 3600.27 1157.83 2.41991
glucose_support 4021 0 0.026 2.509 10.837 511.964 119.396 3600.51 1170.19 2.28569
lingeling_direct 4021 0 0.026 1.729 7.083 440.029 45.216 3600.01 1114.62 2.53307
lingeling_directorder 4021 0 0.025 0.919 8.857 477.839 53.45 3600.01 1164.34 2.43668
lingeling_support 4021 0 0.024 2.967 10.115 450.657 49.004 3600.01 1131.87 2.5116
minisat22_direct 4021 0 0.027 1.731 7.517 446.305 74.835 3605.24 1073.77 2.40591
minisat22_directorder 4021 0 0.023 0.684 8.692 489.167 74.378 3605.44 1141.25 2.33304
minisat22_support 4021 0 0.028 2.316 10.954 549.702 135.477 3617.86 1188.88 2.16278
mistral_nj 4021 0 0.028 0.222 25.848 1433.51 3600 3600.32 1729.67 1.2066
riss3g_direct 4021 0 0.028 2.785 6.747 318.321 38.842 3603.1 951.78 2.99
riss3g_directorder 4021 0 0.024 1.719 8.495 379.948 59.813 3603.11 1035.12 2.72439
riss3g_support 4021 0 0.027 4.001 9.647 343.257 58.678 3610.34 975.73 2.84256
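
The columns above can be reproduced from raw performance values roughly as in the following minimal sketch. It is an illustration under assumptions, not the code that generated this report: the function name `summarize`, the use of `None` for missing values, the sample standard deviation, and the "inclusive" quantile method are all choices made here. The coefficient of variation is taken to be sd / mean, which is consistent with the abscon row (1551.58 / 949.31 ≈ 1.634).

```python
import statistics


def summarize(values):
    """Summary statistics for one algorithm (hypothetical helper).

    `values` holds one performance value per instance; None marks a
    missing value (an assumption of this sketch).
    """
    present = sorted(v for v in values if v is not None)
    # Quartiles via linear interpolation over the observed range.
    q1, med, q3 = statistics.quantiles(present, n=4, method="inclusive")
    mean = statistics.fmean(present)
    sd = statistics.stdev(present)  # sample standard deviation (n - 1)
    return {
        "obs": len(values),
        "nas": len(values) - len(present),
        "min": present[0],
        "qu_1st": q1,
        "med": med,
        "mean": mean,
        "qu_3rd": q3,
        "max": present[-1],
        "sd": sd,
        "coeff_var": sd / mean,  # relative spread: sd divided by mean
    }


if __name__ == "__main__":
    print(summarize([1.0, 2.0, 3.0, 4.0, 5.0]))
```

A coefficient of variation well above 1, as for most of the SAT-based algorithms in the table, indicates that a few very long runs dominate the mean while the median stays small.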

Summary of the runstatus per algorithm

The following table summarizes the runstatus of each algorithm over all instances (in %).

algorithm ok timeout memout not_applicable crash other
abscon 59.239 37.876 2.885 0.000 0.000 0.000
choco 62.795 37.180 0.000 0.000 0.025 0.000
claspcnf_direct 39.642 10.669 49.639 0.000 0.050 0.000
claspcnf_directorder 45.660 11.340 42.950 0.000 0.050 0.000
claspcnf_support 40.040 11.092 48.819 0.000 0.050 0.000
cryptominisat_direct 33.972 18.304 47.675 0.000 0.050 0.000
cryptominisat_directorder 40.537 18.453 40.960 0.000 0.050 0.000
cryptominisat_support 36.558 15.344 48.048 0.000 0.050 0.000
gecode 33.997 62.820 0.000 0.000 3.183 0.000
glucose_direct 36.185 15.643 48.122 0.000 0.050 0.000
glucose_directorder 42.129 16.911 40.885 0.000 0.075 0.000
glucose_support 36.459 17.060 46.431 0.000 0.050 0.000
lingeling_direct 35.911 17.583 46.456 0.000 0.050 0.000
lingeling_directorder 42.402 18.379 39.169 0.000 0.050 0.000
lingeling_support 36.757 17.085 46.108 0.000 0.050 0.000
minisat22_direct 35.638 15.021 49.266 0.000 0.075 0.000
minisat22_directorder 41.482 16.414 42.054 0.000 0.050 0.000
minisat22_support 35.961 16.936 47.053 0.000 0.050 0.000
mistral_nj 32.678 48.421 0.895 0.000 18.005 0.000
riss3g_direct 33.748 12.733 50.510 0.000 3.009 0.000
riss3g_directorder 38.100 14.424 45.536 0.000 1.940 0.000
riss3g_support 35.364 13.454 48.769 0.000 2.412 0.000
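
As a rough illustration of how such a table can be derived, the sketch below turns a list of per-instance run statuses into percentages. The function name `runstatus_percentages` and the input layout are assumptions made for this example; only the six status labels are taken from the table header.

```python
from collections import Counter

# Status labels as they appear in the table header above.
STATUSES = ["ok", "timeout", "memout", "not_applicable", "crash", "other"]


def runstatus_percentages(statuses):
    """Share of each run status in percent, rounded to three decimals.

    `statuses` is a list with one status label per instance for a single
    algorithm (hypothetical input layout).
    """
    counts = Counter(statuses)
    total = len(statuses)
    return {s: round(100.0 * counts[s] / total, 3) for s in STATUSES}


if __name__ == "__main__":
    print(runstatus_percentages(["ok", "ok", "timeout", "memout"]))
```

With 4021 observations per algorithm, a single run corresponds to about 0.025 %, so the smallest nonzero entries in the crash column are consistent with one crashed run.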

Dominated Algorithms

Here, you'll find an overview of dominating/dominated algorithms:
None of the algorithms was superior to any of the others.

An algorithm (A) is considered superior to another algorithm (B) if it performs at least as well as B on all instances and strictly better on at least one of them. A missing value automatically counts as a worse performance. However, instances that neither of the two algorithms could solve were not considered for the dominance relation.
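
The dominance definition can be sketched as a pairwise check. This is a minimal illustration, assuming a minimization problem (lower values are better) and `None` as the marker for a missing value; the function name `dominates` is hypothetical and not part of the tool that produced this report.

```python
def dominates(a, b):
    """Return True if algorithm A is superior to algorithm B.

    `a` and `b` are equal-length lists of performance values, one per
    instance. None marks a missing value; lower is better (assumption).
    """
    strictly_better = False
    for x, y in zip(a, b):
        if x is None and y is None:
            continue  # solved by neither algorithm: instance is ignored
        if x is None:
            return False  # A is missing where B is not: A is worse here
        if y is None or x < y:
            strictly_better = True  # A is better on this instance
        elif x > y:
            return False  # A is worse on this instance
    return strictly_better


if __name__ == "__main__":
    print(dominates([1.0, 2.0, None], [1.0, 3.0, None]))
```

Because a single instance on which A is worse is enough to break the relation, it is unsurprising that no dominated algorithm appears across several thousand instances.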


Visualisations

Important note w.r.t. some of the following plots:
Where appropriate, the performance values of failed or censored runs were imputed: with max + 0.3 * (max - min) for minimization problems, or with min - 0.3 * (max - min) for maximization problems.
In addition, a small amount of noise was added to the imputed values (except for the cluster matrix based on correlations, which is shown at the end of this page).
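
The imputation rule can be illustrated with the following minimal sketch. Several points are assumptions made here: `None` marks a failed or censored run, min and max are taken over the values passed to the function (the note does not say whether they are computed per algorithm or over all algorithms), and the added noise is left out. The function name `impute` is hypothetical.

```python
def impute(values, minimize=True, factor=0.3):
    """Replace missing values with a penalty outside the observed range.

    `values` is a list of performance values; None marks a failed or
    censored run (assumption of this sketch).
    """
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    spread = hi - lo
    # Worse than the worst observed value by 30 % of the observed range.
    penalty = hi + factor * spread if minimize else lo - factor * spread
    return [penalty if v is None else v for v in values]


if __name__ == "__main__":
    print(impute([0.0, 10.0, None]))
    print(impute([0.0, 10.0, None], minimize=False))
```

Placing the imputed value outside the observed range keeps failed runs visibly separated from successful ones in the plots.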


Boxplots of performance values


Imputing the performance values of failed or censored runs (as described in the note at the beginning of this section):
plot of chunk unnamed-chunk-4

Discarding the performance values of failed or censored runs:
## Warning: Removed 53268 rows containing non-finite values (stat_boxplot).
plot of chunk unnamed-chunk-5

Estimated densities of performance values


Imputing the performance values of failed or censored runs (as described in the note at the beginning of this section):
plot of chunk unnamed-chunk-6

Discarding the performance values of failed or censored runs:
plot of chunk unnamed-chunk-7

Estimated cumulative distribution functions of performance values


Imputing the performance values of failed runs (as described in the note at the beginning of this section):
plot of chunk unnamed-chunk-8

Discarding the performance values of failed or censored runs:
plot of chunk unnamed-chunk-9

Scatterplot matrix of the performance values

The figure below shows pairwise scatterplots of the performance values.

Imputing the performance values of failed and censored runs (as described in the note at the beginning of this section):
plot of chunk unnamed-chunk-10

Clustering algorithms based on their correlations

The following figure shows the correlations of the ranks of the performance values. By default, Spearman's rank correlation coefficient is used. Missing values were imputed prior to computing the correlation coefficients. The algorithms are ordered so that similar (highly correlated) algorithms are close to each other. By default, the ordering is obtained by hierarchical clustering using Ward's method.

plot of chunk unnamed-chunk-11
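
For readers unfamiliar with rank correlations, the following sketch computes Spearman's coefficient for two algorithms as the Pearson correlation of their ranks, with ties receiving average ranks. It covers only the correlation step; the hierarchical clustering with Ward's method is not shown. The helper names are hypothetical, and the inputs are assumed to be already imputed, so they contain no missing values.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    print(spearman([1.0, 2.0, 3.0, 4.0], [10.0, 25.0, 20.0, 40.0]))
```

Working on ranks rather than raw values makes the measure insensitive to the heavy right tail of the runtimes, including the imputed penalty values.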