Exec'd processes cannot be stitched back to the original caller
and are killed after restore. So ignore failures
to restore host FDs (generally stdio) that belong
to them.
Fixes #11439
PiperOrigin-RevId: 732972054
We used to track the foreground process group & session on the
TTYFileOperation, but these are already tracked in kernel.TTY.ThreadGroup.
So remove TTYFileOperations.fgProcessGroup and .session, and replace them with
a kernel.TTY.
This is analogous to how sentry-internal TTYs already work.
Updates #10925
PiperOrigin-RevId: 681957240
The /dev/tty acts as a replica for the current thread group's controlling
terminal.
In a follow-up, I will make /dev/tty work for donated host ttys.
Updates #10925
PiperOrigin-RevId: 681629892
Recently printf.Analyzer has become stricter
(https://github.com/golang/go/issues/60529)
which led to new findings.
gvisor nogo tests run this analyzer and fail if it produces findings.
PiperOrigin-RevId: 671657227
This option, enabled via `runsc checkpoint --exclude-committed-zero-pages`,
instructs `pgalloc.MemoryFile.SaveTo()` to also exclude definitely-committed
zero pages from checkpointing (in addition to possibly-committed zero pages,
which are always scanned for and excluded). This is useful when the application
being checkpointed is known to have a large number of committed zero pages:
pages that (1) have been touched by application memory accesses, a syscall such
as read(), or page pinning by e.g. a driver, and (2) have not been subsequently
released by the application to the operating system by e.g. munmap() or
madvise(MADV_DONTNEED) (+ page unpinning if necessary), and (3) are filled with
zero bytes.
Minor changes:
- In `MemoryFile.updateUsageLocked()`, pass file offset to `checkCommitted` so
that `MemoryFile.SaveTo()`'s `checkCommitted` can use `FALLOC_FL_PUNCH_HOLE`
to decommit pages rather than `MADV_REMOVE` (which translates addresses to
file offsets and then invokes `FALLOC_FL_PUNCH_HOLE`).
- In `MemoryFile.SaveTo()`, buffer up to a hugepage worth of pages to decommit
rather than decommitting one page per syscall.
- Increment `MemoryFile.usageExpected` in `MemoryFile.LoadFrom()`, such that
the first following call to `MemoryFile.UpdateUsage()` might skip the call
to `MemoryFile.updateUsageLocked()` (if memory usage hasn't changed since
loading).
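
The batching idea from the second bullet can be sketched as follows. This is an illustration only: the `decommitBatcher` type, the 4 KiB page size, and the 2 MiB hugepage size are assumptions, and the `flush` callback stands in for the real `FALLOC_FL_PUNCH_HOLE` call:

```go
package main

import "fmt"

const (
	pageSize     = 4 << 10 // assumed base page size
	hugepageSize = 2 << 20 // assumed hugepage size
)

// decommitBatcher is a hypothetical helper that coalesces contiguous pages
// into one range so that a single call can decommit many pages.
type decommitBatcher struct {
	start, length uint64
	flush         func(off, length uint64) // stand-in for the punch-hole call
}

// add queues the page at file offset off, flushing first if the page is not
// contiguous with the current batch or the batch already spans a hugepage.
func (b *decommitBatcher) add(off uint64) {
	if b.length != 0 && (off != b.start+b.length || b.length >= hugepageSize) {
		b.done()
	}
	if b.length == 0 {
		b.start = off
	}
	b.length += pageSize
}

// done flushes whatever is still buffered.
func (b *decommitBatcher) done() {
	if b.length != 0 {
		b.flush(b.start, b.length)
		b.length = 0
	}
}

func main() {
	b := &decommitBatcher{flush: func(off, length uint64) {
		fmt.Printf("decommit offset=%d length=%d\n", off, length)
	}}
	for _, off := range []uint64{0, pageSize, 4 * pageSize} {
		b.add(off)
	}
	b.done()
}
```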
PiperOrigin-RevId: 632370455
Checkpoint of any container continues to trigger an entire pod
checkpoint, which includes the state of all containers.
Restore must be done for each of the containers, one at a time.
The actual restore is triggered when the last container is restored.
The set of flags and spec must be the same for the `restore` command
as they were for the `create` commands when the containers were created.
Containers are identified by their names and can be restored in any
order. If containers have no name, they must be restored in the same
order they were created. Container IDs and host FDs are rewired
correctly after restore.
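
The "restore is triggered by the last container" behavior can be sketched roughly as below. The `podRestorer` type and its fields are hypothetical and do not reflect runsc's real data structures:

```go
package main

import "fmt"

// podRestorer is a hypothetical collector for per-container restore requests.
type podRestorer struct {
	expected int      // number of containers saved in the pod checkpoint
	pending  []string // container names, in the order they were registered
}

// add registers one container for restore. It returns true when this was the
// last expected container, i.e. when the actual pod restore should start.
func (p *podRestorer) add(name string) (bool, error) {
	if len(p.pending) >= p.expected {
		return false, fmt.Errorf("all %d containers are already registered", p.expected)
	}
	p.pending = append(p.pending, name)
	return len(p.pending) == p.expected, nil
}

func main() {
	p := &podRestorer{expected: 3}
	// Named containers may arrive in any order.
	for _, name := range []string{"sidecar", "root", "app"} {
		ready, err := p.add(name)
		if err != nil {
			panic(err)
		}
		fmt.Printf("registered %s, trigger restore: %v\n", name, ready)
	}
}
```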
Updates #1956
PiperOrigin-RevId: 629272041
Processes that are exec'ed into a container cannot be properly
restored because the caller is no longer present. This change
tracks processes that are exec'ed and kills them upon restore.
Updates #1956
PiperOrigin-RevId: 623644184
Enables save/resume with the checkpoint command. Previously, when
--leave-running was set, the sandbox was destroyed after the checkpoint and
restored with the same ID. With this change, the sandbox is not destroyed and
resumes running after the checkpoint.
PiperOrigin-RevId: 623282685
This is done to allow remapping of container IDs after the
container has been restored. Upon restore, container names
remain the same, but container IDs may be (and likely are)
different.
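
As an illustration of name-keyed remapping (not the actual implementation), the sketch below builds an old-ID to new-ID table. `savedContainer` and `remapContainerIDs` are hypothetical names:

```go
package main

import "fmt"

// savedContainer is a hypothetical record of what the checkpoint stored.
type savedContainer struct {
	Name string
	ID   string
}

// remapContainerIDs maps each checkpointed container ID to the ID of the
// restored container with the same name.
func remapContainerIDs(saved []savedContainer, restoredByName map[string]string) (map[string]string, error) {
	remap := make(map[string]string, len(saved))
	for _, c := range saved {
		newID, ok := restoredByName[c.Name]
		if !ok {
			return nil, fmt.Errorf("container %q is not present at restore", c.Name)
		}
		remap[c.ID] = newID
	}
	return remap, nil
}

func main() {
	saved := []savedContainer{{Name: "root", ID: "old-1"}, {Name: "app", ID: "old-2"}}
	restored := map[string]string{"root": "new-7", "app": "new-8"}
	remap, err := remapContainerIDs(saved, restored)
	if err != nil {
		panic(err)
	}
	for _, c := range saved {
		fmt.Printf("%s: %s -> %s\n", c.Name, c.ID, remap[c.ID])
	}
}
```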
Updates #1956
PiperOrigin-RevId: 621996848
FD numbers can vary depending on the options used with the runsc
command. For example, extra FDs are passed to `runsc boot` if
`debug-log` is enabled. So instead of requiring all FDs to have the
exact same numbering during restore, provide a mechanism to remap
them. Each host FD has a unique identifier that maps to its
corresponding FD number. During restore, FD numbers are then
remapped to the correct ones.
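
A minimal sketch of the remapping mechanism, assuming string identifiers. The `remapFDs` helper and the identifier names are hypothetical, not runsc's API:

```go
package main

import (
	"fmt"
	"sort"
)

// remapFDs takes the identifier-to-FD map recorded at checkpoint time and the
// identifier-to-FD map of the restoring process, and returns a table from
// saved FD number to current FD number.
func remapFDs(saved, current map[string]int) (map[int]int, error) {
	remap := make(map[int]int, len(saved))
	for id, oldFD := range saved {
		newFD, ok := current[id]
		if !ok {
			return nil, fmt.Errorf("no host FD provided for identifier %q", id)
		}
		remap[oldFD] = newFD
	}
	return remap, nil
}

func main() {
	saved := map[string]int{"stdin": 3, "stdout": 4, "stderr": 5}
	// An extra FD (for example a debug log) shifts the numbering by one.
	current := map[string]int{"stdin": 4, "stdout": 5, "stderr": 6}
	remap, err := remapFDs(saved, current)
	if err != nil {
		panic(err)
	}
	olds := make([]int, 0, len(remap))
	for old := range remap {
		olds = append(olds, old)
	}
	sort.Ints(olds) // map iteration order is random; sort for stable output
	for _, old := range olds {
		fmt.Printf("saved FD %d -> restored FD %d\n", old, remap[old])
	}
}
```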
Updates #1956
PiperOrigin-RevId: 615215783
Adds support for per container stats in runsc based on cgroups.
1. Removed the 'cgroupfs' config flag.
2. Mounts the cgroups (/sys/fs/cgroup/<controller>) which will be shared
across all containers during root/pause container startup.
3. The container cgroups (e.g. /sys/fs/cgroup/<controller>/<container-id>) are
mounted along with other container mounts before starting the container
process if the cgroups mount is in the spec.
Updates #172
PiperOrigin-RevId: 590752853
The total (sandbox) memory usage returned by the GetContainerMemoryUsage API is
incorrect when it is requested before the API has been called for each
individual container in the sandbox. This is because the memory usage of the
containers' cgroups is not updated while calculating the total usage. This CL
fixes it by updating the usage for every child cgroup, so that the parent
cgroup reports the correct memory usage.
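
The effect of the fix can be illustrated with a toy cgroup tree. The `cgroup` type and its `own`/`cached` fields are hypothetical simplifications, not the sentry's real accounting:

```go
package main

import "fmt"

// cgroup is a hypothetical node in a cgroup hierarchy.
type cgroup struct {
	own      uint64 // usage charged directly to this cgroup
	cached   uint64 // total computed by the last update
	children []*cgroup
}

// updateUsage refreshes every child before summing, so the parent's total
// does not depend on whether the children were queried earlier.
func (c *cgroup) updateUsage() uint64 {
	total := c.own
	for _, child := range c.children {
		total += child.updateUsage()
	}
	c.cached = total
	return total
}

func main() {
	c1 := &cgroup{own: 100}
	c2 := &cgroup{own: 250}
	sandbox := &cgroup{children: []*cgroup{c1, c2}}
	// Summing the cached values before any update reproduces the stale result.
	fmt.Println("stale sum of cached child usage:", c1.cached+c2.cached)
	fmt.Println("total after updating children:", sandbox.updateUsage())
}
```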
PiperOrigin-RevId: 574300913