22 Commits

Author SHA1 Message Date
Johannes Kanig
26ba30ddbd V411-001 fix cvc5 in bench scripts
* add fake_cvc5 script
* remove yet another list of "all" provers, replace by reference to
  config file
* improve setting of default 100% baseline, allow for partial
  baseline
* output all baseline violations instead of just one

Change-Id: Ib27f23f3046e9fa43dc2f32f74d6d71a624118fc
2022-04-14 07:39:20 +00:00
Johannes Kanig
d1479142d8 TB02-015 prover benchmarking testsuite
Commit scripts for a new testsuite that runs provers on all VCs for
a given test, using timeouts (and not steps). Criteria for success
are success percentages.

* Makefile:
  entry for new testsuite, to be used by anod script
* benchmark_script: removed or moved scripts elsewhere
* testsuite/gnatprove/MANIFEST.bench: list of tests to be run in
  this testsuite
* testsuite/gnatprove/bench: scripts for new testsuite
* testsuite/gnatprove/bench/create_benchmarks.py:
  script to run generate all VCs for selected tests
* testsuite/gnatprove/bench/benchtests.py:
  script to run provers on all VCs
* testsuite/gnatprove/bench/process_results.py:
  script to collect results and produce testsuite-like output
* testsuite/gnatprove/bench/prover_stats.py:
  moved from benchmark_script folder; fixed support for alt-ergo
* testsuite/gnatprove/lib/python/test_support.py:
  don't use replay in benchmark mode
* testsuite/gnatprove/tests/*/bench.yaml: baseline files for new
   testsuite, specify percentages for each prover

Change-Id: I936a928aaee3c7f94c7b8aefe5854f111b7c0763
2021-07-23 09:10:06 +02:00
Johannes Kanig
4afd648878 U618-007 add support for cvc5 to prover_stats script
cvc5 command line options and output changed from cvc4, so we add a
new class to this benchmarking script

Change-Id: I2cc09648c4f8326e612ad4abde1c2acb15c3695d
2021-06-18 18:09:48 +09:00
Johannes Kanig
568f1c5564 U618-007 fix prover_stats script for high parallelism
The prover stats script was reading from the why3server socket and
hoping to always get complete answers from the why3server. With
higher parallelism messages seem to be cut more often and we need to
deal with this by buffering the messages to find the newline.

Change-Id: I94a4d3e32347107021409bd3199c067a2d013886
2021-06-18 17:35:24 +09:00
Johannes Kanig
54ca895397 TB04-006 improve benchmark scripts
Previously, if a check had several VCs associated with it, only the
first one was generated. We now generate all of them for each
prover.

Change-Id: I53485a8e63a079baa51ffbff85bf0a0da1929c4b
2020-11-06 09:17:51 +01:00
Johannes Kanig
13c215204d TB02-015 checkin Makefile to run provers on all VCs
Change-Id: I364771f9030716fd1902c4032b556803b5c7481a
2020-11-03 10:26:05 +01:00
Johannes Kanig
1ca50be70f (no-tn-check) small fixes to benchmark scripts
Change-Id: I719a70c17cf14eac48663631973a9cc9d75e1d25
2020-10-30 00:36:49 +01:00
Johannes Kanig
9d7dd9bc81 (no-tn-check) fix benchmark script
provers should contain colibri

Change-Id: I29e711181bd42e732bf62261929b8e5444336387
2020-10-29 00:41:45 +01:00
Johannes Kanig
9a5d795868 R308-053 new collate_benchmarks script
new script gives users more control over set of tests to generate
benchs for

Change-Id: Ib2aea44094349f2c6cb832f56d8ea01fcfeb29b1
2020-09-17 09:53:35 +02:00
Johannes Kanig
203bfb14b8 (no-tn-check) fix benchmark scripts
scripts to extract (and run provers on) VCs from the testsuite were
still in python2, and not up to date with the recent testsuite
changes.

Change-Id: I39d4f301c62982ed4713fb0eedac91ca458c10c2
2020-09-15 04:41:53 +02:00
Johannes Kanig
48faadd430 S909-001 add --disable-weaks option to alt-ergo
Change-Id: If8dd1cbee83f6a435ebc8a0cd4546d3cac9d259a
2020-02-10 12:39:22 +09:00
Johannes Kanig
69a817d7c4 (no-tn-check) remove git clean command
This command is very aggressive and could delete user files

Change-Id: I7845fa867740a38bfc788f1bafa31282cc49a9e7
2020-01-30 09:25:16 +01:00
Johannes Kanig
1747178427 R308-053 move prover_stats script to benchmark folder
Change-Id: I62e824e3778a06cc984858e849c5f0e8e105166f
2020-01-30 09:24:58 +01:00
Johannes Kanig
d429ea151d R308-053 document how to analyze time/steps behavior of provers
Change-Id: I9a5dd27701b561417f2adf3a42d01b72e149adc7
2020-01-30 09:24:41 +01:00
Johannes Kanig
66c666bae2 (no-tn-check) fixes to benchmark scripts
make Florian's scripts work with current master

Change-Id: Iecdd08060fbd8b305f84806638ab9e19031c7dee
2019-04-09 11:07:14 +02:00
Johannes Kanig
dd5c378ec9 Q613-002 make why3.conf dynamic again
Replace the static why3.conf file by a JSON file "gnatprove.conf" with
roughly the same contents. The why3.conf file is now dynamically
generated on each invocation of gnatprove dynamically.  This will allow
for more flexibility when it comes to prover wrappers.

* why3.conf: removed
* gnatprove.conf: added
* why3.conf.fake: removed
* gnatprove.adb:
    (Write_Why3_Conf_File): read gnatprove.conf file and write
    why3.conf file in object directory

Change-Id: Ibd152fda35be56e460a945a8eb8c970fa1f49721
2017-06-23 09:15:26 +09:00
Yannick Moy
220aed8740 Q522-052 Remove unused imports in Python detected by pyflakes
Change-Id: I14455df466ba5e79505944d2538067d3a20b7e11
2017-05-23 10:03:24 +02:00
Johannes Kanig
4c86c187af P715-001 fix --benchmark mode
We now implement --benchmark mode by a why3.conf file that points to the
fake executables instead of the real ones. Gnatwhy3 will read that fake
file instead of the real one in benchmark mode.
2016-07-21 16:17:32 +09:00
Florian Schanda
bb1246d2bb P622-005 proof - sync with cvc4 master
Commit scripts I used for this ticket.
2016-06-30 13:56:36 +01:00
Florian Schanda
6970321e47 Minor update to benchmarking scripts for Z3 investigations. 2015-07-16 13:20:25 +01:00
Florian Schanda
0bb196b751 Minor fix for benchmark extraction script to tell apart z3 and cvc4. 2015-07-15 14:12:37 +01:00
Florian Schanda
2df8a44fa2 O424-010 test - benchmarking made easy
I have added a blend between Claire's and my scripts; added a hidden
--benchmark option to gnatprove, and a corresponding option to the
testsuite and a bunch of scripts (and some docs) to make use of all this.
2015-05-06 15:19:03 +01:00