snapd

mirror of https://github.com/token2/snapd.git synced 2026-03-13 11:15:47 -07:00

Author	SHA1	Message	Date
Samuele Pedroni	d92fa5844e	wrappers/services.go: do not restart disabled or inactive services Merge pull request #10266 from mardy/service-restart This implementes the suggestion described in the forum: https://forum.snapcraft.io/t/command-line-interface-to-manipulate-services/262/47 Fixes: https://bugs.launchpad.net/snapd/+bug/1803212	2021-06-18 09:48:18 +02:00
Samuele Pedroni	54e84fb8db	many: fix imports order (according to gci) had to make the comment in wrappers/services_test.go a one-liner otherwise half of it is lost last set of files needing changing (as per current master)	2021-06-16 09:54:31 +02:00
Ian Johnson	bcbc27a816	systemd: refactor "[not set]" handling in getPropertyUintValue As suggested by Michael Signed-off-by: Ian Johnson <ian.johnson@canonical.com>	2021-06-11 09:54:22 -05:00
Alberto Mardegan	256dafa6c2	many: address review comments	2021-06-11 11:36:40 +03:00
Ian Johnson	00c84e7e2e	systemd: refactor property parsers for int values in CurrentTasksCount, etc As suggested by Maciej, Michael and Samuele in snapcore/snapd#10354. Signed-off-by: Ian Johnson <ian.johnson@canonical.com>	2021-06-10 15:31:31 -05:00
Alberto Mardegan	8d7b3e68be	systemd: create a systemdtest package	2021-06-09 16:53:03 +03:00
Alberto Mardegan	0bb437a669	many: address review comments	2021-06-09 08:55:44 +03:00
Alberto Mardegan	1f6d7efd01	Merge branch 'master' into service-restart	2021-06-09 08:38:56 +03:00
Alberto Mardegan	f5fba92f9e	Merge branch 'master' into service-restart	2021-06-08 08:55:17 +03:00
Ian Johnson	e71b122c53	systemd/systemd: use ParseUint and parse a 64 bit max size for MemoryCurrent This is needed because on older systemd systems such as Xenial, CentOS 7 and Amazon Linux 2, systemd reports the current memory usage as 16 Exbibytes, which does not fit into a 32-bit unsigned int. Signed-off-by: Ian Johnson <ian.johnson@canonical.com>	2021-06-07 09:18:11 -05:00
Ian Johnson	14a709fe81	systemd: add CurrentTasksCount to get number of tasks in a slice/service This will be used for quota group tracking purposes. Also refactor to share some common code around getting various properties from systemctl like this that share some parsing code. Signed-off-by: Ian Johnson <ian.johnson@canonical.com>	2021-06-07 09:17:57 -05:00
Ian Johnson	5904235e4a	systemd: also recognize "unknown" is-active output as being inactive CentOS 7 and Amazon Linux 2 with systemd 219 report an unknown, non-existent slice unit when queried whether it is active or not a "unknown", so add this string to the cases that mean the unit is not active. Signed-off-by: Ian Johnson <ian.johnson@canonical.com>	2021-06-07 09:17:11 -05:00
Samuele Pedroni	ea65e1c4fc	systemd: rename Error.exitCodeErr to runErr for clarity	2021-06-03 14:33:36 +02:00
Michael Vogt	887a7f2523	systemd: improve systemctl error reporting We recently had a build failure on trusty in a schroot environment. It turned out that the issue was that no systemctl binary was installed there. But the error was a bit misleading: ``` [--version] failed with exit status 0: ``` This commit improves the error reporting for non-exit code errors and errors that do not have an associated exit status.	2021-06-02 18:08:16 +02:00
Ian Johnson	73b89dac11	Merge pull request #10306 from anonymouse64/feature/quota-groups-9 snap/quota: add CurrentMemoryUsage for current memory usage of a quota group The function returns 0 for slices that don't currently exist.	2021-05-26 12:32:41 -05:00
Ian Johnson	a93e9eff30	systemd/systemd.go: adjust error message for consistency Thanks to Samuele for the suggestion Signed-off-by: Ian Johnson <ian.johnson@canonical.com>	2021-05-26 11:57:32 -05:00
Ian Johnson	9f3904f674	systemd/systemd.go: adjust error messages Thanks to Michael and Samuele for the suggestions. Signed-off-by: Ian Johnson <ian.johnson@canonical.com>	2021-05-26 09:51:08 -05:00
Ian Johnson	42a8bc304e	systemd/systemd.go: use systemctlError interface instead of *systemd.Error This lets us actually mock an inactive service result outside of the systemd package without mocking the systemctl command itself. Signed-off-by: Ian Johnson <ian.johnson@canonical.com>	2021-05-25 21:03:47 -05:00
Ian Johnson	172ed92467	systemd: add CurrentMemoryUsage to get current memory usage for a unit The unit can be a service or a slice. If the unit or slice doesn't exist or is not active, then an error is returned. Signed-off-by: Ian Johnson <ian.johnson@canonical.com>	2021-05-25 14:44:20 -05:00
Alberto Mardegan	7ddeddf110	systemd: move mock to systemd package	2021-05-18 12:46:37 +03:00
Pawel Stolowski	c2479d7bdd	systemd: wait for zfs mounts * Wait for zfs mounts. * Add a comment linking to https://bugs.launchpad.net/ubuntu/+source/snapd/+bug/1922293 * Update mountunit test for After=zfs-mount.service.	2021-04-23 12:12:52 +02:00
Ian Johnson	406dbc0c96	o/servicestate/servicemgr.go: add ensure loop for snap service units (#10164 ) This fixes LP #1924805 with a few different parts. We stop using Requires=usr-lib-snapd.mount in snap app service unit files, since this causes those services to be killed when, as part of a normal snapd refresh (not an app snap refresh) we call systemctl stop usr-lib-snapd.mount). We introduce a new servicestate manager Ensure() function which will use EnsureSnapServices to re-write snap service unit files at most once per-snapd start. This ensures that units which were written with Requires=usr-lib-snapd.mount are re-written now to use Wants=usr-lib-snapd.mount instead. This new servicestate Ensure() function will also track what changes were made to services, and if any services were modified to no longer include Requires= (thus fixing the bug for future snapd refreshes), then we will go and decide if any of those modified units were running and killed by systemd during the snapd snap refresh process. If they were running, then we will restart them. If we identify services which were killed and we tried to start them back up again, but we failed for whatever reason, we will fall back to rebooting immediately to try and recover the system to a working state. Fixes: https://bugs.launchpad.net/snapd/+bug/1924805 * o/servicestate/servicemgr.go: add ensure loop for snap service units This is a first attempt to fix LP #1924805 by having snapd when it starts up refresh service unit definitions and also manually start services which were killed by systemd when we stopped usr-lib-snapd.mount during a snapd snap refresh. Also change Requires=usr-lib-snapd.mount to Wants=usr-lib-snapd.mount to prevent future snap services from being killed when the snapd snap is refreshed. Signed-off-by: Ian Johnson <ian.johnson@canonical.com> * wrappers,o/snapstate/backend: try Wants instead of Requires for usr-lib-snapd.mount * daemon/api_base_test.go: improve test comments This was confusing to debug when failing, including the actual response Result in the comment and also explaining that the revision must be of type string should help folks in the future debugging failures here. Signed-off-by: Ian Johnson <ian.johnson@canonical.com> * wrappers: export SnapdToolingMountUnit This might be better served in dirs long-term, but for now we ought to use it from servicestate regardless of where it lives to reduce duplication. Signed-off-by: Ian Johnson <ian.johnson@canonical.com> * systemd, o/servicestate: move InactiveEnter... to systemd package; refactor Refactor the bug fixing restarting bits of ServiceManager.Ensure() to a separate function so it can be more easily tested and is simpler to read. Also use the exported constant for usr-lib-snapd.mount, use the newly exported systemd.InactiveEnterTimestamp method to get the times for units, and fix some subtle bugs such as not running for the snapd snap, using snapstate for getting the device context for easier mocking, and not passing a nil timings to StartServices which panics on a nil timings. Also add a MockEnsureSnapServices to easily disable running this logic from other packages. Signed-off-by: Ian Johnson <ian.johnson@canonical.com> * tests/daemon, overlord: disable running servicemgr.Ensure() during tests The servicestate manager's Ensure() loop now tries to use systemctl during it's start, so it will race with the existing tests over which bit of code uses the mocked systemctl first, sometimes failing the test when the Ensure() loop wins. Instead, just disable running that Ensure() loop entirely. Signed-off-by: Ian Johnson <ian.johnson@canonical.com> * o/servicestate/servicemgr_test.go: add some basic unit tests This is not yet a complete set of unit tests, but is a good start. Signed-off-by: Ian Johnson <ian.johnson@canonical.com> * systemd: add basic unit tests Signed-off-by: Maciej Borzecki <maciej.zenon.borzecki@canonical.com> * o/servicestate: handle vitality-rank in service options, restart on failure Also refactor the unit tests so they are less duplicative and add cases for: * a service using VitalityRank * needing to reboot because of failures * disabled services not getting restarted Signed-off-by: Ian Johnson <ian.johnson@canonical.com> * o/servicestate/servicemgr.go: rename helper This makes it more clear that this helper is only called when we have a snapd snap. Signed-off-by: Ian Johnson <ian.johnson@canonical.com> * o/servicestate/servicemgr_test.go: use a common time format variable Signed-off-by: Ian Johnson <ian.johnson@canonical.com> * o/servicestate: rename mock helper, check that it is only used from inside test Thanks to Samuele for the suggestion. Signed-off-by: Ian Johnson <ian.johnson@canonical.com> * o/servicestate/servicemgr.go: delete implemented TODO Signed-off-by: Ian Johnson <ian.johnson@canonical.com> * o/managers_test.go: add first high level managers test for service state More to follow... Signed-off-by: Ian Johnson <ian.johnson@canonical.com> * o/servicestate/servicemgr_test.go: check errors from Chtimes Signed-off-by: Ian Johnson <ian.johnson@canonical.com> * o/managers_test.go: refactor current ensure snap services test, add another Refactor the existing test to handle the case where snapd needs to restart services because they were killed, and add another test case where snapd fails to start the services and thus falls back to attempting a reboot instead. Signed-off-by: Ian Johnson <ian.johnson@canonical.com> * o/servicestate/servicemgr_test.go: refactor systemctl checks The issue is that the cleanup function that was deferred was being run as part of the tear down function, and thus the Check() was not being raised up. Instead making it something that is deferred directly by the Test() functions ensures that the fixtures are still properly setup and so they fail the test properly. Also fix 2 unit tests which were passing, but this was not caught due to the above issue, and would only fail with the tests/unit/go spread test. Signed-off-by: Ian Johnson <ian.johnson@canonical.com> systemd: handle extra whitespace at the end of output in InactiveEnterTimestamp In real systemd output, there is a newline at the end of this string. Signed-off-by: Ian Johnson <ian.johnson@canonical.com> * o/servicestate/servicemgr.go: truncate all times to second precision When we switched to using systemctl show --property InactiveEnterTimestamp instead of using d-bus, we lost a significant amount of precision of the timestamp, and now we only get times accurate to the second, while the time precision for the modification time of the usr-lib-snapd.mount unit file is much more accurate, ironically actually leading to the seemingly true fact that the usr-lib-snapd.mount file was modified _after_ usr-lib-snapd.mount was successfully stopped, since we only have the time that the usr-lib-snapd.mount unit being stopped to the second. For example, these are the times from a VM: usr-lib-snapd.mount file modification time = 2021-04-21 02:58:03.142412741 usr-lib-snapd.mount InactiveEnterTimestamp = 2021-04-21 02:58:03 snap.test-snap.svc1.service InactiveEnterTimestamp = 2021-04-21 02:58:03 This means that the time range we end up forming is the nonsensical range: [ 2021-04-21 02:58:03.142412741 , 2021-04-21 02:58:03 ] and the code is written such that the candidate timestamp of 2021-04-21 02:58:03 is less than the lower end of the range, and thus must be outside of the range. The fix implemented here is to truncate all times to the least precise time we have, which is that of systemctl, meaning that all times get truncated to seconds and the above example then becomes: usr-lib-snapd.mount file modification time = 2021-04-21 02:58:03 usr-lib-snapd.mount InactiveEnterTimestamp = 2021-04-21 02:58:03 snap.test-snap.svc1.service InactiveEnterTimestamp = 2021-04-21 02:58:03 and has the very specific range of a single second: [ 2021-04-21 02:58:03 , 2021-04-21 02:58:03 ] which at least does contain the time of the snap service being stopped and thus ends up fixing the bug. In addition, in order to have the most predictable behavior here, we should also always fix our time range here so that if the lower time bound somehow ends up being in the future significantly compared to the upper time bound, we just fix the range to be only the upper time bound, since that number at least should be of the same precision that the candidate snap service stop times. We should eventually switch back to using D-Bus for getting this property to have greater precision. Signed-off-by: Ian Johnson <ian.johnson@canonical.com> * overlord: test/adjust ServiceManager accessor * systemd: move method to match interface order * o/servicestate: move helper function * o/servicestate: adjust to new EnsureSnapServices signature/behavior * o/servicestate: try to restart services only conditonally if there was no Requires=usr-lib-snapd.mount there is no so far known reason for any service to have been stopped by snapd refresh code * o/servicestate: drop addressed TODOs * overlord/servicestate: implement shortcut for file mod time before boot time Implement a TODO in the code about taking a shortcut where we know that the usr-lib-snapd.mount unit file was written before the time of the current boot, we know that no services could have been killed due to a refresh of snapd during this boot (and if we rebooted during the change, but after usr-lib-snapd.mount was written then services would not be killed and we would revert anyways). Add some unit tests around this. Signed-off-by: Ian Johnson <ian.johnson@canonical.com> * o/servicestate/servicemgr.go: use only MockCommand tests over func mocking The two tests here were basically identical and there's not much to be gained from having the additional mocking style here when we can just use MockCommand for both cases in relatively simple implementations. Signed-off-by: Ian Johnson <ian.johnson@canonical.com> * overlord: fixup incomplete/confusing comments Signed-off-by: Ian Johnson <ian.johnson@canonical.com> Co-authored-by: Samuele Pedroni <pedronis@lucediurna.net> Co-authored-by: Maciej Borzecki <maciej.zenon.borzecki@canonical.com>	2021-04-22 16:33:04 +02:00
Paweł Stołowski	ebf9f687d2	Fix format.	2021-04-06 14:53:03 +00:00
Paweł Stołowski	21b0210ab1	Add doc comment for Installed flag.	2021-04-06 14:53:03 +00:00
Paweł Stołowski	25ff4f2b20	Enrich UnitStatus returned by systemd.Status() with Installed flag; it's true if UnitFileState property is not empty.	2021-04-06 14:53:03 +00:00

1 2 3 4 5 ...

302 Commits