Commit Graph

130 Commits

Author SHA1 Message Date
Yu Watanabe
5c19169fe9 tree-wide: fix typo 2022-08-10 19:43:38 +09:00
Michal Sekletar
03860190fe scope: allow unprivileged delegation on scopes
Previously it was possible to set delegate property for scope, but you
were not able to allow unprivileged process to manage the scope's cgroup
hierarchy. This is useful when launching manager process that  will run
unprivileged but is supposed to manage its own (scope) sub-hierarchy.

Fixes #21683
2022-08-04 17:01:13 +02:00
Frantisek Sumsal
e99b9285cb core: drop a stray %m specifier from a warning message
since in this specific case (r == 0) `errno` is irrelevant and most likely
set to zero, leading up to a confusing message:

```
[  120.595085] H systemd[1]: session-5.scope: No PIDs left to attach to the scope's control group, refusing: Success
[  120.595144] H systemd[1]: session-5.scope: Failed with result 'resources'.
```
2022-07-16 07:00:40 +09:00
Lennart Poettering
8d3e4ac7cd scope: refuse activation of scopes if no PIDs to add are left
If all processes we are supposed to add are gone by the time we are
ready to do so, let's fail.

THis is heavily based on Cunlong Li's work, who thankfully tracked this
down.

Replaces: #20577
2021-10-27 23:17:50 +02:00
Frantisek Sumsal
ecea250d77 core: fix the return type for xxx_running_timeout() functions
otherwise we might return an invalid value, since `usec_t` is 64-bit,
whereas `int` might not be.

Follow-up to: 5918a93
Fixes: #20872
2021-09-29 12:28:21 +09:00
Albert Brox
5918a93355 core: implement RuntimeMaxDeltaSec directive 2021-09-28 16:46:20 +02:00
Zbigniew Jędrzejewski-Szmek
04499a70fb Drop the text argument from assert_not_reached()
In general we almost never hit those asserts in production code, so users see
them very rarely, if ever. But either way, we just need something that users
can pass to the developers.

We have quite a few of those asserts, and some have fairly nice messages, but
many are like "WTF?" or "???" or "unexpected something". The error that is
printed includes the file location, and function name. In almost all functions
there's at most one assert, so the function name alone is enough to identify
the failure for a developer. So we don't get much extra from the message, and
we might just as well drop them.

Dropping them makes our code a tiny bit smaller, and most importantly, improves
development experience by making it easy to insert such an assert in the code
without thinking how to phrase the argument.
2021-08-03 10:05:10 +02:00
Zbigniew Jędrzejewski-Szmek
48d83e3368 core: align string tables 2021-07-19 11:33:52 +02:00
Zbigniew Jędrzejewski-Szmek
5291f26d4a tree-wide: add FORMAT_TIMESPAN() 2021-07-09 11:03:36 +02:00
Zbigniew Jędrzejewski-Szmek
5dcadb4c83 core: disable event sources before unreffing them
This mirrors the change done for systemd-resolved in
9793530228. Quoting that patch:

> We generally operate on the assumption that a source is "gone" as soon as we
> unref it. This is generally true because we have the only reference. But if
> something else holds the reference, our unref doesn't really stop the source
> and it could fire again.

In particular, we take temporary references from sd-event code, and when called
from an sd-event callback, we could temporarily see this elevated reference
count. This patch doesn't seem to change anything, but I think it's nicer to do
the same change as in other places and not rely on _unref() immediately
disabling the source.
2021-05-12 12:08:52 +02:00
Franck Bui
e9eec8b5d2 scope: on unified, make sure to unwatch all PIDs once they've been moved to the cgroup scope
Commit 428a9f6f1d freed u->pids which is
problematic since the references to this unit in m->watch_pids were no more
removed when the unit was freed.

This patch makes sure to clean all this refs up before freeing u->pids by
calling unit_unwatch_all_pids().
2020-12-01 09:33:14 +01:00
Yu Watanabe
614f57ed76 core/scope: use set_ensure_put() 2020-11-27 14:35:20 +09:00
Yu Watanabe
d85ff94477 core: use SYNTHETIC_ERRNO() macro 2020-11-27 14:35:20 +09:00
Franck Bui
428a9f6f1d core: serialize u->pids until the processes have been moved to the scope cgroup
Otherwise if a daemon-reload happens somewhere between the enqueue of the job
start for the scope unit and scope_start() then u->pids might be lost and none
of the processes specified by "PIDs=" will be moved into the scope cgroup.
2020-11-20 15:57:59 +01:00
Yu Watanabe
db9ecf0501 license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
Anita Zhang
e08dabfec7 core: clean up inactive/failed {service|scope}'s cgroups when the last process exits
If processes remain in the unit's cgroup after the final SIGKILL is
sent and the unit has exceeded stop timeout, don't release the unit's
cgroup information. Pid1 will have failed to `rmdir` the cgroup path due
to processes remaining in the cgroup and releasing would leave the cgroup
path on the file system with no tracking for pid1 to clean it up.

Instead, keep the information around until the last process exits and pid1
sends the cgroup empty notification. The service/scope can then prune
the cgroup if the unit is inactive/failed.
2020-10-27 13:20:40 +01:00
Anita Zhang
4d824a4e0b core: add ManagedOOM*= properties to configure systemd-oomd on the unit
This adds the hook ups so it can be read with the usual systemd
utilities. Used in later commits by sytemd-oomd.
2020-10-07 16:17:23 -07:00
Zbigniew Jędrzejewski-Szmek
f6e9aa9e45 pid1: convert to the new scheme
In all the other cases, I think the code was clearer with the static table.
Here, not so much. And because of the existing dump code, the vtables cannot
be made static and need to remain exported. I still think it's worth to do the
change to have the cmdline introspection, but I'm disappointed with how this
came out.
2020-05-05 22:40:37 +02:00
Michal Sekletár
d9e45bc3ab core: introduce support for cgroup freezer
With cgroup v2 the cgroup freezer is implemented as a cgroup
attribute called cgroup.freeze. cgroup can be frozen by writing "1"
to the file and kernel will send us a notification through
"cgroup.events" after the operation is finished and processes in the
cgroup entered quiescent state, i.e. they are not scheduled to
run. Writing "0" to the attribute file does the inverse and process
execution is resumed.

This commit exposes above low-level functionality through systemd's DBus
API. Each unit type must provide specialized implementation for these
methods, otherwise, we return an error. So far only service, scope, and
slice unit types provide the support. It is possible to check if a
given unit has the support using CanFreeze() DBus property.

Note that DBus API has a synchronous behavior and we dispatch the reply
to freeze/thaw requests only after the kernel has notified us that
requested operation was completed.
2020-04-30 19:02:51 +02:00
Lennart Poettering
c80a9a33d0 core: clearly refuse OnFailure= deps on units that can't fail
Similar, refuse triggering deps on units that cannot trigger.

And rework how we ignore After= dependencies on device units, to work
the same way.

See: #14142
2020-01-09 11:03:53 +01:00
Zbigniew Jędrzejewski-Szmek
a5f6f346d3 Merge pull request #13423 from pwithnall/12035-session-time-limits
Add `RuntimeMaxSec=` support to scope units (time-limited login sessions)
2019-10-28 14:57:00 +01:00
Philip Withnall
9ed7de605d scope: Support RuntimeMaxSec= directive in scope units
Just as `RuntimeMaxSec=` is supported for service units, add support for
it to scope units. This will gracefully kill a scope after the timeout
expires from the moment the scope enters the running state.

This could be used for time-limited login sessions, for example.

Signed-off-by: Philip Withnall <withnall@endlessm.com>

Fixes: #12035
2019-10-28 09:44:31 +01:00
Zbigniew Jędrzejewski-Szmek
75193d4128 core: adjust load functions for other unit types to be more like service
No functional change, just adjusting code to follow the same pattern
everywhere. In particular, never call _verify() on an already loaded unit,
but return early from the caller instead. This makes the code a bit easier
to follow.
2019-10-11 13:46:05 +02:00
Zbigniew Jędrzejewski-Szmek
c362077087 core: turn unit_load_fragment_and_dropin_optional() into a flag
unit_load_fragment_and_dropin() and unit_load_fragment_and_dropin_optional()
are really the same, with one minor difference in behaviour. Let's drop
the second function.

"_optional" in the name suggests that it's the "dropin" part that is optional.
(Which it is, but in this case, we mean the fragment to be optional.)
I think the new version with a flag is easier to understand.
2019-10-11 10:45:33 +02:00
Chris Down
bc0623df16 cgroup: analyze: Report memory configurations that deviate from systemd
This is the most basic consumer of the new systemd-vs-kernel checker,
both acting as a reasonable standalone exerciser of the code, and also
as a way for easy inspection of deviations from systemd internal state.
2019-10-03 15:06:25 +01:00