Commit Graph

  • 397de2b799 Make output consistent when jobs > 1 Seb M'Caw 2025-08-06 10:41:36 +01:00
  • 824b9e2dd0 Make tool/eval selection CLI case-insensitive Seb M'Caw 2025-08-06 10:19:46 +01:00
  • 45913654d0 Remove redundant nesting from serialised DirectoryContents Seb M'Caw 2025-08-06 10:18:22 +01:00
  • cbcea8661e Add --canonical switch Seb M'Caw 2025-08-05 17:31:32 +01:00
  • f3c8e154f8 Make UnpackedDirectoryContextManager private Seb M'Caw 2025-08-04 16:03:26 +01:00
  • de9e965ed4 Move is_unpacked_sample() definition closer to Sample definition Seb M'Caw 2025-08-04 16:01:14 +01:00
  • f73a81a4d7 Make desc a SampleOperation property Seb M'Caw 2025-08-04 15:20:11 +01:00
  • 08891bde4e Improve handling of incompatible or failed samples Seb M'Caw 2025-08-04 15:03:01 +01:00
  • 526b2ef050 Remove redundant type parameter Seb M'Caw 2025-08-01 16:51:12 +01:00
  • 04ef2088db Add support for multiple evaluations Seb M'Caw 2025-08-01 16:48:24 +01:00
  • 294955bf24 Rename get_sample_files[_git_aware] Seb M'Caw 2025-07-31 16:43:00 +01:00
  • fd80fcb151 Clean up CLI Seb M'Caw 2025-07-31 16:38:32 +01:00
  • 88bf1f8629 Add PATH check for gnatprove Seb M'Caw 2025-07-31 16:27:55 +01:00
  • 026889906b Move UnsupportedSampleTypeError Seb M'Caw 2025-07-31 13:16:34 +01:00
  • b157e0af1d Enable strict Mypy checks Seb M'Caw 2025-07-31 13:15:06 +01:00
  • 62dff66b58 Separate evals from tools Seb M'Caw 2025-07-31 12:55:25 +01:00
  • 3881556801 Merge branch 'topic/Update-README' into 'main' Rowan Walshe 2025-07-31 09:42:01 +01:00
  • fd9e9a3ce1 Update README Rowan Walshe 2025-07-30 17:21:01 +01:00
  • 8651d09a94 Define abstract SampleOperation Seb M'Caw 2025-07-30 16:35:25 +01:00
  • 3f462cc84c Make Dataset generic in sample type Seb M'Caw 2025-07-30 15:38:45 +01:00
  • 101765169b Add basic GNATprove evaluation Seb M'Caw 2025-07-30 12:46:17 +01:00
  • c9b32b5fd7 Factor common logic out of generate_completions() Seb M'Caw 2025-07-29 16:23:14 +01:00
  • 8714b21d7f Mark required args as required=True Seb M'Caw 2025-07-29 15:45:21 +01:00
  • ee8db6004e Merge branch 'topic/Include-more-problems' into 'main' Rowan Walshe 2025-07-29 14:10:46 +01:00
  • 48669d0c68 Include more problems Rowan Walshe 2025-07-29 14:10:46 +01:00
  • c725f2ff37 Define EvaluationTool Seb M'Caw 2025-07-29 11:13:29 +01:00
  • ad2c389be6 Factor out run_cmd_with_timeout() from ShellScript Seb M'Caw 2025-07-29 10:42:36 +01:00
  • 2b65b6b0d4 Define Dataset.dirname() and Sample.working_dir() Seb M'Caw 2025-07-29 10:30:15 +01:00
  • b7ec3933d7 Define DirectoryContents Seb M'Caw 2025-07-28 11:37:37 +01:00
  • ad365cd7e7 Add placeholder for performing evaluation Rowan Walshe 2025-07-25 13:28:53 +01:00
  • 80538ce506 Add initial evaluation types Rowan Walshe 2025-07-25 13:02:10 +01:00
  • 89640a3e70 Update CI to use Python 3.12 Rowan Walshe 2025-07-25 12:45:43 +01:00
  • 16c981a773 Update types Rowan Walshe 2025-07-25 12:28:36 +01:00
  • 0e8585ec01 Added tqdm progress bar for the generation step Rowan Walshe 2025-07-24 16:21:27 +01:00
  • bb0df1847a Added dummy script for cheaper testing Rowan Walshe 2025-07-24 16:20:52 +01:00
  • ec558a1894 Fix issue where results were being written to the wrong file Rowan Walshe 2025-07-24 13:55:20 +01:00
  • 34ab914e01 Fix testsuite Rowan Walshe 2025-07-24 11:08:30 +01:00
  • ceeb8a0717 Run the formatter Rowan Walshe 2025-07-23 20:45:12 +01:00
  • 8fc2c1e471 Fix mypy issues Rowan Walshe 2025-07-23 20:44:29 +01:00
  • c0982ee043 Fix a number of ruff warnings Rowan Walshe 2025-07-23 19:49:04 +01:00
  • 06f4c6c8c7 Remove dummy spark-assistant Rowan Walshe 2025-07-23 19:28:58 +01:00
  • 54e2a00d50 Simplify dataset Rowan Walshe 2025-07-23 18:38:38 +01:00
  • e5e7b4da49 Add results Rowan Walshe 2025-07-23 17:57:33 +01:00
  • 5bc01bce43 Fix an issue where some samples are attempted multiple times Rowan Walshe 2025-07-23 17:57:19 +01:00
  • eee5de308f Add two more problems to the dataset Rowan Walshe 2025-07-23 17:48:25 +01:00
  • 3ab7c54135 Add a simple Shell script tool for calling out to Claude code Rowan Walshe 2025-07-23 17:11:52 +01:00
  • 46f6226c18 Add prompts for ineffective statement examples Rowan Walshe 2025-07-23 16:16:00 +01:00
  • 0559165777 Update dataset formatting Rowan Walshe 2025-07-23 16:13:28 +01:00
  • ad55b8d3ce Fix exceptions when running generate Rowan Walshe 2025-07-23 12:39:34 +01:00
  • 09734e7905 Add force arg to pack cli Rowan Walshe 2025-07-23 12:21:13 +01:00
  • 001bdc160d Move around utility functions Rowan Walshe 2025-07-23 12:18:54 +01:00
  • a7f04929de WIP Rowan Walshe 2025-07-23 12:07:18 +01:00
  • 0d28e85c13 Sort imports Rowan Walshe 2024-12-13 14:46:41 +00:00
  • 1733df3899 Add spark-assistant as a dependent Rowan Walshe 2024-12-13 14:33:30 +00:00
  • 8feadf960d Progress on script for generation completion Rowan Walshe 2024-12-12 17:27:01 +00:00
  • 1bd31dd658 Switch from dataclass-wizard to pydantic Rowan Walshe 2024-12-11 16:59:34 +00:00
  • 15ecd6cd71 Start working on script for generating completions Rowan Walshe 2024-12-11 12:59:51 +00:00
  • 7a36fa776c Refactor pack/unpack logic to pull out dataset load/save logic Rowan Walshe 2024-12-11 12:17:38 +00:00
  • 9182d06273 Iterate over dirs in the same order each time when packing Rowan Walshe 2024-12-05 10:35:24 +00:00
  • 6ee653f6e0 Move debugging Rowan Walshe 2024-12-05 10:26:48 +00:00
  • 6543d216a2 Add extra logging for failed tests Rowan Walshe 2024-12-05 10:23:57 +00:00
  • 95225687dc Set git author for testsuite Rowan Walshe 2024-12-05 10:11:51 +00:00
  • c8cced32d2 Try escaping quotes Rowan Walshe 2024-12-05 10:01:33 +00:00
  • e30dba7277 Remove spaces from subprocess calls in tests Rowan Walshe 2024-12-05 09:59:02 +00:00
  • 09b6531a76 Don't run ruff using uvx Rowan Walshe 2024-12-05 09:51:42 +00:00
  • 02932c73c3 Update default index so that build-system requires come from our GitLab Rowan Walshe 2024-12-05 09:43:12 +00:00
  • 18cc7a2626 Try setting UV_EXTRA_INDEX_URL Rowan Walshe 2024-12-04 16:49:54 +00:00
  • f9bdfd891e Update to allow python versions 3.11 and newer Rowan Walshe 2024-12-04 15:55:57 +00:00
  • 34a92bc738 Don't try to download python in CI Rowan Walshe 2024-12-04 15:44:47 +00:00
  • fd570d9bf3 Switch from poetry to uv Rowan Walshe 2024-12-04 15:39:01 +00:00
  • eec304c163 Move mypy job Rowan Walshe 2024-12-03 15:28:42 +00:00
  • 287cdf6ada Use ruff instead of flake8, black, isort, and pylint Rowan Walshe 2024-12-03 15:28:03 +00:00
  • 929a38bc02 Pack and Unpack script changes Rowan Walshe 2024-12-02 17:20:02 +00:00
  • b796a5e6d2 Finished first draft of pack and unpack Rowan Walshe 2024-11-28 18:31:48 +00:00
  • acf3bed84e Add utils module Rowan Walshe 2024-11-28 17:36:20 +00:00
  • dc0ef95c7c Remove references to template files from pack and unpack scripts Rowan Walshe 2024-11-28 17:33:49 +00:00
  • 4f91774f51 More work on unpack script Rowan Walshe 2024-11-28 17:31:35 +00:00
  • 957a2a576e Improve call to git ls-files Rowan Walshe 2024-11-28 13:36:35 +00:00
  • ed961c5bb2 Start working on unpack script Rowan Walshe 2024-11-28 11:24:31 +00:00
  • c5dd547746 Add Makefile Rowan Walshe 2024-11-28 11:24:13 +00:00
  • 0b0595ae28 Build compact dataset Rowan Walshe 2024-11-28 11:24:05 +00:00
  • 5feecd8f68 Add two samples to the unpacked spark dataset Rowan Walshe 2024-11-28 11:22:45 +00:00
  • 1eb653643e Update pack script to include template files Rowan Walshe 2024-11-27 18:32:17 +00:00
  • 815ccfd3b6 First draft of pack script Rowan Walshe 2024-11-27 13:28:13 +00:00
  • 2b16e6df04 Remove obj files Rowan Walshe 2024-11-27 11:09:38 +00:00
  • fabf94d7a5 Updated CI template to use GitLab report formats Rowan Walshe 2024-11-26 17:50:17 +00:00
  • 9abaa1797a Fix git ls-files Rowan Walshe 2024-11-26 17:50:04 +00:00
  • 980abeee8e Lots of project setup and initial work on packing/unpacking scripts Rowan Walshe 2024-11-25 17:38:10 +00:00
  • 8a7b597b9a Initial commit Rowan Walshe 2024-11-22 19:49:55 +00:00