The test breaks on UC24 in a peculiar way. The `netplan` CLI tool is a Python
script which attempts to load libnetplan.so.* through ctypes. However, in a
snap, the contents of /etc come from the host, so in a UC24 environment, the
netplan client invoked from e.g. core20 would observe the actual ld.so.cache
from UC24. Thus Python's ctypes library would call `ldconfig -p`, which then
consumes the ld.so.cache from UC24, listing an incorrect version of the
libnetplan library (specifically UC24 has libnetplan.so.1, while earlier
versions had libnetplan.so.0.0).
Attempt to fix this by providing a custom wrapper for ldconfig, which generates
a cache on the side under $SNAP_DATA, thus using the libraries which are
actually visible to the snap.
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
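The mismatch described above can be illustrated with a minimal Python sketch. It
assumes `ldconfig -p` output in its usual `name (flags) => path` form. The two
sample listings and the helper `find_soname` are hypothetical and only mimic the
kind of lookup a ctypes-based loader performs; they are not netplan's or ctypes'
actual code.

```python
import re

# Hypothetical excerpts of `ldconfig -p` output; paths are illustrative only.
HOST_UC24_CACHE = """\
2 libs found in cache `/etc/ld.so.cache'
\tlibnetplan.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libnetplan.so.1
\tlibc.so.6 (libc6,x86-64) => /lib/x86_64-linux-gnu/libc.so.6
"""

SNAP_SIDE_CACHE = """\
2 libs found in cache `ld.so.cache'
\tlibnetplan.so.0.0 (libc6,x86-64) => /lib/x86_64-linux-gnu/libnetplan.so.0.0
\tlibc.so.6 (libc6,x86-64) => /lib/x86_64-linux-gnu/libc.so.6
"""


def find_soname(ldconfig_output, stem):
    """Return the first listed soname starting with lib<stem>.so, or None."""
    pattern = re.compile(
        r"^\s*(lib%s\.so\S*)\s+\(" % re.escape(stem), re.MULTILINE
    )
    match = pattern.search(ldconfig_output)
    return match.group(1) if match else None


# A client from an older base that reads the host cache is told about a
# library version that does not exist inside the snap.
print(find_soname(HOST_UC24_CACHE, "netplan"))
# A cache generated from the libraries visible to the snap gives the right one.
print(find_soname(SNAP_SIDE_CACHE, "netplan"))
```

The point of the wrapper is only to change which cache the lookup consumes; the
lookup logic itself stays the same.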
* o/snapstate: when invoked for rollback, check for any changes related to snapd
When the snapd.service unit fails to be activated, an OnFailure handler will
execute snap-failure which in turn starts the snapd process from the previous
revision of the snap with SNAPD_REVERT_TO_REV set in its environment. It may
happen that the snapd unit fails at runtime without an associated change to the
snapd snap; however, snap-failure is not able to detect such a case, and so the
snapd process started in its context would continue to run. Avoid this by
extending the logic within snapd to check if it has been started by snap-failure
with the intention of handling a rollback, and if so, whether there is a change
related to the snapd snap in the state. When the conditions have not been met,
snapd exits and snap-failure continues to restart the snapd service.
Related issues: SNAPDENG-21605
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
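The decision described in the message above could be sketched roughly as follows.
This is a Python illustration rather than the Go code in snapd: the function
name `startup_action`, its return values, and the boolean standing in for "a
change related to the snapd snap exists in the state" are all hypothetical. Only
the SNAPD_REVERT_TO_REV variable comes from the commit message.

```python
def startup_action(environ, has_snapd_change):
    """Decide how a freshly started snapd process should proceed.

    environ          -- mapping of environment variables
    has_snapd_change -- hypothetical flag: True when the state contains a
                        change related to the snapd snap
    """
    if "SNAPD_REVERT_TO_REV" not in environ:
        # Not started by snap-failure: regular startup.
        return "start"
    if has_snapd_change:
        # Started by snap-failure and a snapd change exists: handle rollback.
        return "recover"
    # Started by snap-failure, but nothing in the state warrants a rollback:
    # exit so that snap-failure can go on and restart the snapd service.
    return "exit"


print(startup_action({}, False))
print(startup_action({"SNAPD_REVERT_TO_REV": "12"}, True))
print(startup_action({"SNAPD_REVERT_TO_REV": "12"}, False))
```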
* tests/core/snapd-failover: account for improved snap-failure behavior
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
* overlord: add Is() to startup error
Add Is() support to startupError, so that error can be introspected at runtime.
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
* o/snapstate: return an explicit error from StartUp() when no recovery was detected
When snapd is invoked in a context of recovery but the state does not reflect
this, return an explicit error indicating that further startup should not be
carried out.
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
* daemon: return an explicit error when startup in recovery context was aborted
Return an explicit error when startup in failure recovery context was aborted
due to lack of operations in the state which may have triggered it.
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
* cmd/snapd: handle unnecessary failure recovery
Gracefully handle unnecessary failure recovery by exiting with 0 status, so
that snap-failure may continue with cleanup and restart.
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
* o/snapstate: improve the check for asserting if restart was warranted
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
* overlord: add managers test for handling of runtime restart with failure handling
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
* many: tweak names and comments
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
* o/snapstate: tweak unit test names
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
* cmd/snapd: use fmt.Fprintln
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
* overlord/snapstate: tweak naming
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
---------
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
/etc/timezone is an old file that was only supported on Debian and
Ubuntu but has been removed in 24.04 and upcoming Debian.
Whether /etc/timezone works or not is a problem for systemd. We do not
have to test it.
* tests: deal with pre-release suffix in systemd versions
Debian sid is using an rc version of systemd which tests.session and
various other tests weren't dealing with. This change cuts the
pre-release suffix from all of them although some were only used by
distros in which this isn't a problem. This is more consistent and
robust IMO.
Signed-off-by: Miguel Pires <miguel.pires@canonical.com>
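As an illustration of cutting the pre-release suffix, here is a minimal Python
sketch. The tests themselves are shell; the helper name
`systemd_major_version` and the sample version lines are assumptions made for
this example, not taken from the test suite.

```python
import re


def systemd_major_version(version_line):
    """Extract the numeric systemd version, ignoring any pre-release suffix.

    Accepts a line shaped like the first line of `systemctl --version`.
    """
    match = re.match(r"systemd\s+(\d+)", version_line)
    if match is None:
        raise ValueError("unrecognized version line: %r" % version_line)
    return int(match.group(1))


# Hypothetical sample lines: a regular release and a release candidate.
print(systemd_major_version("systemd 255 (255.4-1ubuntu8)"))
print(systemd_major_version("systemd 256~rc3 (256~rc3-7)"))
```

Matching only the leading digits means the same code path works on distros
that never ship a pre-release build, which is the consistency argument made in
the message above.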
* tests: fix var ref
Signed-off-by: Miguel Pires <miguel.pires@canonical.com>
---------
Signed-off-by: Miguel Pires <miguel.pires@canonical.com>
* data/systemd/snapd.service: use RestartMode=direct
Systemd 254 introduced a new behavior in which a service, during an automatic
restart, goes through the failed/inactive state such that OnFailure/OnSuccess
dependencies get triggered. In previous releases those dependencies would only
be triggered when the unit had failed to activate (or finished). This results in
unexpected behavior when snapd.failure.service is invoked at runtime without an
ongoing snapd refresh. Snap-failure starts the snapd binary from the previous
revision of the snapd snap, but since there was no snap change in progress,
snapd just continues to run, however with the parent process being snap-failure
instead of systemd. Setting RestartMode=direct brings back the old behavior
when the service is automatically restarted.
See e67129e5e4/NEWS (L1796-L1801)
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
* tests/core/snapd-failover: verify snapd process cgroup (and hence systemd unit)
Make sure that snapd process is running within the context of the snapd.service
unit.
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
* tests/core/snapd-failover: update log match to catch the right event
Update the test to look for a log that actually matches what is logged in the
system. Specifically `Starting...` is logged when the unit gets activated, while
`Started...` is logged when the unit has completed activation. In case of
one-shot units, the
'starting' log comes first, while 'started' is logged after the unit has become
active.
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
* tests/core/snapd-failover: encode systemd unit failure behavior in the test
When running the test on UC6, verify that snapd.failure.service was indeed
triggered in the simplest scenario.
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
---------
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
* osutil: switch to -u UID:GID for strace-static
This moves us off the custom patch and onto an upstream feature
we've helped develop. The feature is not released yet but the
patch has been integrated into the strace-static snap.
Jira: SNAPDENG-19870
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
* tests: make strace test less fragile
The test tried to carefully match the error message to the version of strace
used, which in turn depends on the host OS. It's much easier to just check both
error messages.
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
* tests: avoid systemctl kill which has issues on systemd 255
On systemd 255 systemctl kill fails after attempting to kill snapd.service with
the following message:
$ sudo systemctl kill --signal=SIGKILL snapd.service
Failed to kill unit snapd.service: Failed to send signal SIGKILL to auxiliary processes: Invalid argument
Kill the pid by hand to avoid triggering this behavior.
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
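Killing the pid by hand could look roughly like the following Python sketch
(the actual test is a shell script). The helper names `main_pid` and
`kill_by_pid` are hypothetical; `systemctl show --property=MainPID --value` is
assumed to be available on the host.

```python
import os
import signal
import subprocess


def main_pid(unit):
    """Ask systemd for the main PID of a unit (needs systemd on the host)."""
    out = subprocess.check_output(
        ["systemctl", "show", "--property=MainPID", "--value", unit],
        text=True,
    )
    return int(out.strip())


def kill_by_pid(pid):
    """Send SIGKILL directly to the process, bypassing `systemctl kill`."""
    os.kill(pid, signal.SIGKILL)


# Usage on a system running snapd (not executed here):
#   kill_by_pid(main_pid("snapd.service"))
```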
---------
Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
* tests: first set of test fixes for uc24
* add details and 2 more fixes
* fix 2 more tests
* fix user-state test
* fix lp-1813365
* Add missing details
* fix listing test
this change needs to be done like this until the os.query is-core-* is
fixed
* fix shellcheck
* fix merge saving jobs
* tests: initial changes to run core suite in uc24
Changes needed to run core test suite in uc24
* Squashed 'tests/lib/external/snapd-testing-tools/' changes from 1c8efb77e1..1db5214d5f
1db5214d5f Improve the remote docs (#36)
2e4a3153a2 1 more comment
3a0fc57e1e add explanation about why we check for ( Do | Doing )
4cf8e635bf fix os.query test after merge
b89b4f8647 fix artifacts name
d30cee6da0 Merge remote-tracking branch 'upstream/main'
5ef5dcbe8f Tests use artifacts in spread tests (#51)
555c43d2ab Support auto-refresh with Do instead of Doing
96c2b0c19c remove tests support for ubuntu 23.04 (EoL)
74082c0c34 Tests improve remote wait (#49)
5121bfb659 remove support for opensuse leap 15.4 (#48)
30df700d08 Add new systems support (#47)
1f08938925 Support check amazon linux version (#46)
43533bdd97 Change the exit value checking for test formats (#45)
3c88244c04 Update check-test-format to support a dir and a list of files (#44)
510d95f429 add extra check for error in auto-refresh detection function
3289d4031b Try open the log with latin-1 encoding when utf-8 is not working
9db785499f improved how the tools are waiting for system reboot
2a5c4414a3 fix shellcheck errors
5e7b63883d Fixes for osquery and tests pkgs (#43)
4c9145e2ac support reboot waiting for auto-refresh
45768f5188 show changes in unknown status after refresh
8013c30c2a Remove support for ubuntu 22.10
b32b80bf54 Fix remote.wait-for test in bionic
5675c625e9 Enable fedora 38
55f4471957 Support for new oss
f2e88b357c New tool used to query spread json reports
cacd35ede0 utils/spread-shellcheck: explain disabled warnings (#42)
c82afb2dee Support --no-install-recommends parameter when installing dependencies with tests.pkgs
b84eea92e2 spread-shellcheck: fix quotes in environment variables (#41)
ab1e51c29f New comparison in os-query for core systems (#40)
e5ae22a5d4 systemd units can be overwritten
63540b845a Fix error messages in remote pull and push
75e8a426a5 make sure the unit is removed in tests.systemd test
9089ff5c02 Update tests to use the new tests.systemd stop-unit
44ecd5e56a Move tests.systemd stop-units to stop-unit
01a2a83b4b Update tests.systemd to have stop units as systemd.sh
162e93bd35 update tests.systemd CLI options to be the same than retry command
14aa43a405 new feature to re-run failed spread tests (#39)
604cb782db Fix shellcheck in systemd tool
bfc71082c8 Update the tests.systemd to allow parameters waiting for service status
8a2d0a99df Adding quiet tool and removing set +-x from tests.pkgs
d90935d2a4 A comment explaining about the default values for wait-for
3232c5dba7 Add support for ubuntu 23.04
a7164fba07 remove fedora 35 support, add fedora 37 support
89b9eb5301 Update systems supported
92bb6a0664 Include snap-suffix in the snaps.name tool
git-subtree-dir: tests/lib/external/snapd-testing-tools
git-subtree-split: 1db5214d5fe91d90b4ffcd4768db8080fcc245ab
* fix core version under test
* adding missing model
* add missing function in nested.sh
* fix keys used for uc24
* Squashed 'tests/lib/external/snapd-testing-tools/' changes from 1db5214d5f..dacfd81de9
dacfd81de9 fix is_core functions
git-subtree-dir: tests/lib/external/snapd-testing-tools
git-subtree-split: dacfd81de95e05a9e56d84be45e0611275b083f4
* use pc-kernel from beta channel
* removing file created for workflow tests
* remove more dirs created during automatic merge
* restore permissions for files in test snap-repair
* restore tools permissions merged incorrectly
* fix wording in test
* Squashed 'tests/lib/external/snapd-testing-tools/' changes from dacfd81de9..b89ec98b23
b89ec98b23 use local variables in os.query tool
git-subtree-dir: tests/lib/external/snapd-testing-tools
git-subtree-split: b89ec98b239dc9ef729b6af68ce1b5028b4eee23
* fix remove test details
The fakestore that was set up in this test was leaking into other tests, causing
frequent failures.
This test was caught using this debug PR (#13717).
Signed-off-by: Zeyad Gouda <zeyad.gouda@canonical.com>
* Add logic to support testflinger backend in spread tests
This change won't make tests fail, and it allows us to start using the
testflinger backend until it is merged into spread
* create user on testflinger backend
* skip adding user group on testflinger backend
* snap: fix doc string on SelfContainedSetPrereqTracker
* o/devicestate: make sure that snaps for pre-existing model are already installed in tests
* overlord: make sure that snaps for pre-existing model are already installed in tests
* overlord, o/devicestate: use SelfContainedSetPrereqTracker to track prereqs during remodel
* overlord, o/devicestate, o/s/snapstatetest: move common test helpers to snapstatetest
* snap: add SelfContainedSetPrereqTracker.Snaps method for getting all snaps tracked by tracker
* overlord, o/devicestate: prevent remodel to model with base that does not match gadget base
* tests, tests/core/remodel-base: update remodeling test to also swap gadget when swapping base
* Revert "daemon: move the closing of snapdListener"
This reverts commit fe9c662b1e.
* New procedure to determine when a nested vm is not booting as expected
The idea is to stop a nested test when:
. The nested service is not active
. The log hasn't changed during the last 60 seconds
By doing this, when a vm fails to boot, the test will fail much faster
than before, reducing the nested tests execution time.
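The two checks above can be sketched as a small watchdog. This is a Python
illustration of the idea, not the shell code in tests/lib/nested.sh. The class
name `BootWatchdog` and its parameters are hypothetical, and it assumes that
either condition on its own is enough to abort the wait.

```python
class BootWatchdog:
    """Tracks log growth and reports when waiting for a VM should stop."""

    def __init__(self, stall_limit=60.0):
        self.stall_limit = stall_limit  # seconds without log changes
        self.last_size = None           # last observed log size
        self.last_change = None         # time at which the size last changed

    def check(self, service_active, log_size, now):
        """Return a reason string if the wait should be aborted, else None."""
        if not service_active:
            return "service not active"
        if self.last_size is None or log_size != self.last_size:
            # First observation or the log moved: reset the stall timer.
            self.last_size = log_size
            self.last_change = now
            return None
        if now - self.last_change >= self.stall_limit:
            return "log stalled"
        return None


watchdog = BootWatchdog(stall_limit=60)
print(watchdog.check(True, 100, now=0))    # log seen for the first time
print(watchdog.check(True, 100, now=59))   # unchanged, still within the limit
print(watchdog.check(True, 100, now=60))   # unchanged for 60 seconds
```

Passing the current time in as an argument keeps the sketch deterministic; a
real loop would read a clock and the log file size on each retry.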
* fix shellcheck errors
* Fix comparison for log sizes
* fix issue retrieving the lines in the file
* add env vars to retry to make sure the evaluation is done while retrying
* fix shellcheck
* fix quotes
* Add check to detect infinite loops
* fix issues and improve checks
* Fix shellcheck errors
* fix misspelling
* make waiting for the machine to be ready less verbose
* Revert "Revert "daemon: move the closing of snapdListener""
This reverts commit b5231f05c18f889439d81d9e01eff5f7bf4a7dc3.
* move retry - 1 operation
* Update tests/lib/nested.sh
Co-authored-by: Maciej Borzecki <maciek.borzecki@gmail.com>
* Update tests/lib/nested.sh
Co-authored-by: Maciej Borzecki <maciek.borzecki@gmail.com>
* tests/lib/nested: fix shellcheck errors
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
* tests/core/snapd-maintenance-msg: add details
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
---------
Signed-off-by: Maciej Borzecki <maciej.borzecki@canonical.com>
Co-authored-by: Maciej Borzecki <maciek.borzecki@gmail.com>
Co-authored-by: Maciej Borzecki <maciej.borzecki@canonical.com>
* Fix basic20plus test for uc22 on arm
This test is failing because uc22 on arm is not using grub, it uses
piboot instead. The test is failing to find the kernel.efi
* split arm scenarios
* add details to basic20plus test
---------
Co-authored-by: Ernest Lotter <ernest.lotter@canonical.com>
* o/snapstate: try to make the check-rerefresh summary cleaner/clearer
@niemeyer remarked that the message was a bit mysterious and that listing all
snaps for auto-refresh got very noisy
we still need to address the mechanics of how the task is run but that needs
support from overlord/state and is a bigger change
* o/snapstate: rework rerefresh summary
Thanks @degville
Signed-off-by: Miguel Pires <miguel.pires@canonical.com>
* o/snapstate: unit test rerefresh summary
Signed-off-by: Miguel Pires <miguel.pires@canonical.com>
* tests: update spread test w/ new rerefresh summary
Signed-off-by: Miguel Pires <miguel.pires@canonical.com>
---------
Signed-off-by: Miguel Pires <miguel.pires@canonical.com>
Co-authored-by: Miguel Pires <miguel.pires@canonical.com>
* Fix system-snap-refresh in uc20
Simplify the test and validate that the base snap during restore is the same
revision as the initial one (no need to remove).
Also included is a new function to determine the revision of a snap,
and the test is updated accordingly.
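A function that determines the revision of a snap could be sketched as below.
This is a Python illustration that parses `snap list`-style tabular output; the
function name `snap_revision` and the sample listing are hypothetical, and the
helper actually added to the test library is shell and may work differently.

```python
def snap_revision(snap_list_output, name):
    """Return the Rev column for the named snap, or None if not listed."""
    lines = snap_list_output.strip().splitlines()
    if not lines:
        return None
    # Locate the Rev column from the header instead of hard-coding its index.
    rev_col = lines[0].split().index("Rev")
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == name and len(fields) > rev_col:
            return fields[rev_col]
    return None


# Hypothetical sample in the shape of `snap list` output.
SAMPLE = """\
Name    Version   Rev    Tracking       Publisher    Notes
core20  20240111  2182   latest/stable  canonical**  base
snapd   2.61.2    21184  latest/stable  canonical**  snapd
"""

print(snap_revision(SAMPLE, "core20"))
```

The revision is returned as a string rather than an integer because locally
installed snaps use revisions such as `x1`.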
* fix shellcheck
* Update tests/core/system-snap-refresh/task.yaml
Co-authored-by: Andrew Phelps <136256549+andrewphelpsj@users.noreply.github.com>
* Fix spread error in MATCH
* make sure no extra output is displayed when showing rev
---------
Co-authored-by: Andrew Phelps <136256549+andrewphelpsj@users.noreply.github.com>