diff --git a/docs/CONTAINER_INTERFACE.md b/docs/CONTAINER_INTERFACE.md
index dcecdecc3e..4f59746ee7 100644
--- a/docs/CONTAINER_INTERFACE.md
+++ b/docs/CONTAINER_INTERFACE.md
@@ -165,10 +165,15 @@ manager, please consider supporting the following interfaces.
issuing `journalctl -m`. The container machine ID can be determined from
`/etc/machine-id` in the container.
-3. If the container manager wants to cleanly shutdown the container, it might
+3. If the container manager wants to cleanly shut down the container, it might
be a good idea to send `SIGRTMIN+3` to its init process. systemd will then
do a clean shutdown. Note however, that since only systemd understands
- `SIGRTMIN+3` like this, this might confuse other init systems.
+ `SIGRTMIN+3` like this, this might confuse other init systems. A container
+ manager may implement the `$NOTIFY_SOCKET` protocol mentioned below, in which
+ case it will receive a notification message `X_SYSTEMD_SIGNALS_LEVEL=2` that
+ indicates if and when these additional signal handlers are installed. If
+ these signals are sent to the container's PID 1 before this notification
+ message is sent, they might not be handled correctly yet.
4. To support [Socket Activated
Containers](https://0pointer.de/blog/projects/socket-activated-containers.html)
@@ -190,12 +195,14 @@ manager, please consider supporting the following interfaces.
unit they created for their container. That's private property of systemd,
and no other code should modify it.
-6. systemd running inside the container can report when boot-up is complete
- using the usual `sd_notify()` protocol that is also used when a service
- wants to tell the service manager about readiness. A container manager can
- set the `$NOTIFY_SOCKET` environment variable to a suitable socket path to
- make use of this functionality. (Also see information about
- `/run/host/notify` below.)
+6. systemd running inside the container can report when boot-up is complete,
+ its boot progress and available functionality, as well as various other bits
+ of system information using the `sd_notify()` protocol that is also used when
+ a service wants to tell the service manager about readiness. A container
+ manager can set the `$NOTIFY_SOCKET` environment variable to a suitable
+ socket path to make use of this functionality. (Also see information about
+ `/run/host/notify` below, as well as the Readiness Protocol section in
+ [systemd(1)](https://www.freedesktop.org/software/systemd/man/latest/systemd.html).)
## Networking
diff --git a/man/sd_notify.xml b/man/sd_notify.xml
index a56d039468..d8fe6468a2 100644
--- a/man/sd_notify.xml
+++ b/man/sd_notify.xml
@@ -446,9 +446,14 @@
The notification messages sent by services are interpreted by the service manager. Unknown
- assignments may be logged, but are otherwise ignored. Thus, it is not useful to send assignments which
- are not in this list. The service manager also sends some messages to its
- notification socket, which are then consumed by the machine or container manager.
+ assignments are ignored. Thus, it is safe (but often without effect) to send assignments which are not
+ in this list. The protocol is extensible, but care should be taken to ensure private extensions are
+ recognizable as such. Specifically, it is recommended to prefix them with X_ followed by
+ some namespace identifier. The service manager also sends some messages to its
+ notification socket, which may then be consumed by a supervising machine or container manager further up the
+ stack. The service manager sends a number of extension fields, for example
+ X_SYSTEMD_UNIT_ACTIVE=; for details see
+ systemd1.
diff --git a/man/systemd.xml b/man/systemd.xml
index 960df97f0b..b66707faba 100644
--- a/man/systemd.xml
+++ b/man/systemd.xml
@@ -372,6 +372,14 @@
Signals
+ The service manager listens to various UNIX process signals that can be used to request various actions
+ asynchronously. The signal handling is enabled very early during boot, before any further processes are
+ invoked. However, a supervising container manager or similar that intends to request these operations via
+ this mechanism must take into consideration that this functionality is not available during the earliest
+ initialization phase. An sd_notify() notification message carrying the
+ X_SYSTEMD_SIGNALS_LEVEL=2 field is emitted once the signal handlers are enabled, see
+ below. This may be used to schedule submission of these signals correctly.
+
SIGTERM
@@ -769,10 +777,11 @@
$NOTIFY_SOCKET
- Set by systemd for supervised processes for
- status and start-up completion notification. See
- sd_notify3
- for more information.
+ Set by the service manager for its services for status and readiness notifications. Also
+ consumed by the service manager for notifying supervising container managers or service managers up the
+ stack about its own progress. See
+ sd_notify3 and the
+ relevant section below for more information.
@@ -1109,7 +1118,7 @@
- System credentials
+ System Credentials
During initialization the service manager will import credentials from various sources into the
system's set of credentials, which can then be propagated into services and consumed by
@@ -1151,14 +1160,16 @@
vmm.notify_socket
Contains a AF_VSOCK or AF_UNIX address where to
- send a READY=1 notification datagram when the system has finished booting. See
- sd_notify3 for
- more information. Note that in case the hypervisor does not support SOCK_DGRAM
- over AF_VSOCK, SOCK_SEQPACKET will be tried instead. The
- credential payload for AF_VSOCK should be in the form
+ send a READY=1 notification message when the service manager has completed
+ booting. See
+ sd_notify3 and
+ the next section for more information. Note that in case the hypervisor does not support
+ SOCK_DGRAM over AF_VSOCK,
+ SOCK_SEQPACKET will be tried instead. The credential payload for
+ AF_VSOCK should be a string in the form
vsock:CID:PORT.
- This feature is useful for hypervisors/VMMs or other processes on the host to receive a
+ This feature is useful for machine managers or other processes on the host to receive a
notification via VSOCK when a virtual machine has finished booting.
@@ -1177,6 +1188,77 @@
+
+ For a list of system credentials various other components of systemd consume, see
+ systemd.system-credentials7.
+
+
+
+ Readiness Protocol
+
+ The service manager implements a readiness notification protocol both between the manager and its
+ services (i.e. down the stack), and between the manager and a potential supervisor further up the stack
+ (the latter could be a machine or container manager, or in case of a per-user service manager the system
+ service manager instance). The basic protocol (and the suggested API for it) is described in
+ sd_notify3.
+
+ The notification socket the service manager (including PID 1) uses for reporting readiness to its
+ own supervisor is set via the usual $NOTIFY_SOCKET environment variable (see
+ above). Since this is directly settable only for container managers and for the per-user instance of the
+ service manager, an additional mechanism to configure this is available, in particular intended for use
+ in VM environments: the vmm.notify_socket system credential may be set to a suitable
+ socket (typically an AF_VSOCK one) via SMBIOS Type 11 vendor strings. For details see the
+ description of that credential above.
+
+ The notification protocol from the service manager up the stack towards a supervisor supports a
+ number of extension fields that allow a supervisor to learn about specific properties of the system and
+ track its boot progress. Specifically, the following fields are sent:
+
+
+ An X_SYSTEMD_HOSTNAME=… message will be sent out once the initial
+ hostname for the system has been determined. Note that during later runtime the hostname might be
+ changed again programmatically, and (currently) no further notifications are sent out in that case.
+
+
+
+ An X_SYSTEMD_MACHINE_ID=… message will be sent out once the machine
+ ID of the system has been determined. See
+ machine-id5 for
+ details.
+
+
+
+ An X_SYSTEMD_SIGNALS_LEVEL=… message will be sent out once the
+ service manager has installed the various UNIX process signal handlers described above. The field's value
+ is an unsigned integer formatted as a decimal string, and indicates the supported UNIX process signal
+ feature level of the service manager. Currently, only a single feature level is defined:
+
+
+ X_SYSTEMD_SIGNALS_LEVEL=2 covers the various UNIX process signals
+ documented above – which are a superset of those supported by the historical SysV init
+ system.
+
+
+ Signals sent to PID 1 before this message is sent might not be handled correctly yet. A consumer
+ of these messages should parse the value as an unsigned integer indicating the level of support. For
+ now only the mentioned level 2 is defined, but later on additional levels might be defined with higher
+ integers that will implement a superset of the currently defined behaviour.
+
+
+
+ X_SYSTEMD_UNIT_ACTIVE=… and
+ X_SYSTEMD_UNIT_INACTIVE=… messages will be sent out for each target unit as it
+ becomes active or stops being active. This is useful to track boot progress and functionality. For
+ example, once the ssh-access.target unit is reported as started, SSH access is
+ typically available; see
+ systemd.special7 for
+ details.
+
+
+
+
+ Note that these extension fields are sent in addition to the regular READY=1 and
+ RELOADING=1 notifications.
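
The supervisor side of the protocol described in this patch can be sketched in a few lines of Python. This is an illustration only, not part of the patch or of any systemd API: the helper names `parse_notify_message`, `signals_level`, `may_request_shutdown` and `clean_shutdown_signal` are hypothetical, and the sketch assumes the supervisor has already received the datagrams from the socket it passed in `$NOTIFY_SOCKET`. It relies only on what the text above states: messages are newline-separated `KEY=VALUE` assignments, unknown assignments are ignored, and `X_SYSTEMD_SIGNALS_LEVEL=` carries an unsigned decimal integer where higher levels are supersets of lower ones.

```python
import signal


def parse_notify_message(datagram: bytes) -> dict:
    """Split one notification datagram into its KEY=VALUE assignments.

    Lines without '=' or with an empty key are skipped, mirroring the
    rule that unknown or malformed assignments are simply ignored.
    """
    fields = {}
    for line in datagram.decode("utf-8", errors="replace").splitlines():
        key, sep, value = line.partition("=")
        if sep and key:
            fields[key] = value
    return fields


def signals_level(fields: dict) -> int:
    """Return the advertised signal feature level, or 0 if absent or malformed."""
    raw = fields.get("X_SYSTEMD_SIGNALS_LEVEL", "")
    # isascii() guards against non-ASCII digits that int() would reject.
    return int(raw) if raw.isascii() and raw.isdigit() else 0


def may_request_shutdown(fields: dict) -> bool:
    """True once the payload reports level 2 or any higher (superset) level."""
    return signals_level(fields) >= 2


def clean_shutdown_signal() -> int:
    """Signal number for SIGRTMIN+3 (real-time signals are Linux-specific in Python)."""
    return signal.SIGRTMIN + 3
```

A supervisor would typically merge the fields of every received datagram into one state dictionary, and only call something like `os.kill(container_pid, clean_shutdown_signal())` once `may_request_shutdown(state)` returns true. Comparing with `>= 2` rather than `== 2` follows the statement that future levels implement a superset of the current behaviour.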