3739 Commits

Author SHA1 Message Date
Lucas Manning 3a9ba17351 Fix device FD reference leaks and add support for VFIO_GROUP_UNSET_CONTAINER.
Fixes #11545

PiperOrigin-RevId: 739274186
2025-03-21 13:07:59 -07:00
Ayush Ranjan 738e1d995f nvproxy: Add HasStatus.SetStatus and provide failWithStatus() util functions.
This creates a more centralized way for nvproxy to return errors to the
user mode driver via the NvStatus field in ioctl structs, as opposed to failing
the ioctl with mysterious EINVALs.

Also updated the following structs to NOT implement HasStatus interface:
- IoctlRegisterFD
- RMAPIVersion
- IoctlSysParams

These don't have a Status field so it is misleading for them to implement
HasStatus. Created frontendIoctlSimpleNoStatus() and
frontendIoctlInvokeNoStatus() for such structs to use.

PiperOrigin-RevId: 738959856
2025-03-20 15:29:44 -07:00
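As a rough illustration of the status-reporting pattern described in the nvproxy commit above: the ioctl itself succeeds, and the failure is carried in the struct's status field. This is a minimal sketch, not nvproxy's code. Only the names `HasStatus`, `SetStatus`, and `failWithStatus` come from the commit message; the signatures, the `exampleIoctlParams` struct, and the `exampleErrInvalidArgument` value are assumptions made up for illustration.

```go
package main

import "fmt"

// hasStatus is a stand-in for nvproxy's HasStatus interface. Only SetStatus
// is named in the commit message; the uint32 parameter type is an assumption.
type hasStatus interface {
	SetStatus(status uint32)
}

// exampleIoctlParams is a hypothetical ioctl parameter struct that carries a
// status field back to the user mode driver.
type exampleIoctlParams struct {
	Handle uint32
	Status uint32
}

// SetStatus stores the driver-visible status in the struct.
func (p *exampleIoctlParams) SetStatus(status uint32) { p.Status = status }

// exampleErrInvalidArgument is an illustrative status code, not a real
// NvStatus value.
const exampleErrInvalidArgument uint32 = 1

// failWithStatus (hypothetical signature) records the failure in the struct
// and returns a nil error, so the caller sees a successful ioctl whose
// parameters describe what went wrong, instead of a bare EINVAL.
func failWithStatus(params hasStatus, status uint32) error {
	params.SetStatus(status)
	return nil
}

func main() {
	p := &exampleIoctlParams{Handle: 42}
	err := failWithStatus(p, exampleErrInvalidArgument)
	fmt.Println("status:", p.Status, "err:", err)
}
```

Structs without a status field would simply not satisfy the interface, which is the reason the commit gives for adding separate NoStatus helpers.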
Jamie Liu c16d3fdfad pgalloc: log async page loading progress and info about awaited loads
PiperOrigin-RevId: 738555942
2025-03-19 15:17:36 -07:00
Ayush Ranjan 97820ce5c9 nvproxy: Add support for 570.124.06.
The following command does not report any changes in the structs we proxy:
```
make run TARGETS=//tools/nvidia_driver_differ:run_differ \
  ARGS="--base 570.86.15 --next 570.124.06"
```

PiperOrigin-RevId: 738093448
2025-03-18 12:06:36 -07:00
Jimmy Tran 8d5f3c982a Handle sighandling.KillItself() return error.
Call dumpAndPanicSyscallError for the rare case where we fail to kill the
Sentry upon detecting an unexpected stub exit. This will provide enough
information to determine whether a panic occurred due to a failed SIGKILL
attempt or an unexpected event.

PiperOrigin-RevId: 737751257
2025-03-17 14:29:52 -07:00
Jamie Liu 768c0364e4 kvm: unlock OS thread during machine.available.Wait()
PiperOrigin-RevId: 737750303
2025-03-17 14:25:23 -07:00
Jamie Liu fbca0560dd kvm: honor memmap.File.MemoryType()
Updates #11436

PiperOrigin-RevId: 737689743
2025-03-17 11:34:50 -07:00
Lucas Manning 8482715727 Enable save/restore with TPUproxy.
This change also adds some small cleanup to TPU code.

PiperOrigin-RevId: 737673712
2025-03-17 10:55:06 -07:00
Nicolas Lacasse 6b0a0af862 Implement basic packet mode support for ptys.
From man TIOCPKT:
"""
In packet mode, each subsequent read(2) will return a packet that either
contains a single nonzero control byte, or has a single byte containing zero
('\0') followed by data written on the slave side of the pseudoterminal.
"""

This CL implements only the data portion of packet mode, not the control bytes,
but that seems to be enough to get xfce4-terminal to work.

PiperOrigin-RevId: 737175092
2025-03-15 09:26:56 -07:00
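The data portion of packet mode described in the pty commit above can be sketched as follows: each read returns a zero control byte followed by the pending data. This is an illustrative sketch under stated assumptions, not gVisor's implementation. The helper `packetModeRead` and the constant name `tiocpktData` are hypothetical; the zero value for a data packet follows the quoted man page text.

```go
package main

import "fmt"

// tiocpktData is the control byte that marks a data packet. The man page
// excerpt describes it as a single byte containing zero.
const tiocpktData = 0

// packetModeRead is a hypothetical helper that fills dst with one data
// packet: a zero control byte followed by as much of pending as fits.
// It returns the number of bytes written to dst, or 0 if there is nothing
// to deliver or dst cannot hold the control byte plus at least one data byte.
func packetModeRead(dst, pending []byte) int {
	if len(dst) < 2 || len(pending) == 0 {
		return 0
	}
	dst[0] = tiocpktData
	n := copy(dst[1:], pending)
	return n + 1
}

func main() {
	buf := make([]byte, 8)
	n := packetModeRead(buf, []byte("hi\n"))
	fmt.Printf("read %d bytes: %q\n", n, buf[:n])
}
```

A reader in packet mode therefore has to strip the first byte of every read before treating the rest as terminal output. Control packets (a single nonzero byte) are not modeled here, matching the scope stated in the commit.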
Jamie Liu b01944883b Add memmap.File.MemoryType()
This has no effect (outside of debug logging) until cl/723723715.

Updates #11436

PiperOrigin-RevId: 736686635
2025-03-13 17:08:52 -07:00
Lucas Manning 11aeff69c2 Fix host-backed event FD restore.
Before this change, host-backed event FDs would always crash the sandbox
during exit when the sentry tried to wait on the fdnotifier for an FD that
wasn't there.

PiperOrigin-RevId: 736585573
2025-03-13 11:51:37 -07:00
Ayush Ranjan 906fb319cc nvproxy: Add option to use the device gofer.
We always use the device gofer in runsc, because the sandbox's filesystem
does not have the GPU devices mounted in it.

PiperOrigin-RevId: 736316547
2025-03-12 17:08:54 -07:00
Nicolas Lacasse f9b1ce2f7d Clean up tty.CheckChange and call it in SetForegroundProcessGroup.
Previously, CheckChange (corresponding to Linux's tty/tty_check_change()) was
only used by the host TTY implementation, not the devpts implementation.

Furthermore, ThreadGroup.SetForegroundProcessGroup() duplicated some of the
logic in CheckChange, notably sending SIGTTOU to background tasks. This means
that, for host TTYs, we could send SIGTTOU multiple times. In some
circumstances, this leads to the ioctl returning ERESTARTSYS in an infinite loop.

PiperOrigin-RevId: 735934036
2025-03-11 16:46:55 -07:00
Jamie Liu 8153170320 nvproxy: reduce kernel mmap_lock contention from rmAllocOSDescriptor()
PiperOrigin-RevId: 734667529
2025-03-07 13:23:48 -08:00
Fabricio Voznika c041d9bd58 Add missing binary_sha256 field
Fixes #11466

PiperOrigin-RevId: 734209881
2025-03-06 11:01:58 -08:00
Ayush Ranjan 156f457e28 Add kernel.ThreadGroup.ForEachTask().
PiperOrigin-RevId: 733991896
2025-03-05 22:23:11 -08:00
Ayush Ranjan 138e98fb7d nvproxy: Refactor DriverVersion out to nvconf package.
This allows for runsc to be able to use DriverVersion without having to depend
on the entirety of nvproxy.

PiperOrigin-RevId: 733912696
2025-03-05 16:43:03 -08:00
Jamie Liu cc69f4f190 pgalloc: no-op MemoryFile.UpdateUsage() during saving
During checkpointing, MemoryFile.SaveTo() narrows the set of pages known by
memory accounting to be "committed" to those containing non-zero bytes, in
order to avoid saving zero pages and therefore bloating the checkpoint. In the
process of doing so, it needs to touch pages in order to determine whether they
contain non-zero bytes, and does so without holding MemoryFile.mu (in a
MemoryFile.updateUsageLocked() callback). Thus, a concurrent call to
MemoryFile.UpdateUsage() => MemoryFile.updateUsageLocked() can racily observe
that touched zero pages are committed (via mincore()) and mark them
known-committed accordingly, causing them to be unintentionally saved in the
checkpoint.

When SaveOpts.ExcludeCommittedZeroPages is set, MemoryFile.SaveTo() does not
decommit previously-known-committed zero pages, since doing so would cost time;
the motivation for decommitting zero pages is to avoid increasing (real, host)
memory usage during checkpointing, but previously-known-committed pages must
have been using memory even before being touched. However, this significantly
widens the race window described above, since any future call to
MemoryFile.UpdateUsage() will observe said pages to be committed and mark them
known-committed again, effectively negating SaveOpts.ExcludeCommittedZeroPages.

To fix this, inhibit MemoryFile.UpdateUsage() during MemoryFile.SaveTo(); in
essence, when MemoryFile.SaveTo() is in progress, it exclusively defines what
pages are known-committed.

PiperOrigin-RevId: 733876220
2025-03-05 14:50:59 -08:00
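The fix described in the pgalloc commit above (making usage updates a no-op while a save is in progress) can be sketched with a flag guarded by a mutex. This is a simplified illustration, not the pgalloc code. The `memoryFile` type, its fields, the boolean return value, and the callback-style `SaveTo` signature are all assumptions for the sake of the example.

```go
package main

import (
	"fmt"
	"sync"
)

// memoryFile is a hypothetical, heavily simplified stand-in for
// pgalloc.MemoryFile.
type memoryFile struct {
	mu           sync.Mutex
	saving       bool // true while SaveTo is in progress
	usageUpdates int  // counts updates that actually ran
}

// UpdateUsage reports whether it did any work. While a save is in progress
// it returns immediately, so it cannot re-mark touched zero pages as
// known-committed.
func (f *memoryFile) UpdateUsage() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if f.saving {
		return false
	}
	f.usageUpdates++
	return true
}

// SaveTo marks the file as saving for the duration of the save callback,
// so the save path alone decides which pages are known-committed.
func (f *memoryFile) SaveTo(save func()) {
	f.mu.Lock()
	f.saving = true
	f.mu.Unlock()
	defer func() {
		f.mu.Lock()
		f.saving = false
		f.mu.Unlock()
	}()
	save()
}

func main() {
	f := &memoryFile{}
	fmt.Println("before save:", f.UpdateUsage())
	f.SaveTo(func() {
		fmt.Println("during save:", f.UpdateUsage())
	})
	fmt.Println("after save:", f.UpdateUsage())
}
```

The flag is released in a deferred function so that usage updates resume even if the save callback panics or returns early.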
Jamie Liu df6a537346 Make it possible to pass internal /dev/null via control.Proc.Exec*
PiperOrigin-RevId: 733614278
2025-03-05 00:22:48 -08:00
Andrei Vagin d844c7bbb7 Allow Sentry to kill itself even when it is the init process
PiperOrigin-RevId: 733399678
2025-03-04 11:25:32 -08:00
Jamie Liu 2247aceb99 kvm: enable CPUID faulting on all VCPUs
This feature is controlled by an MSR; MSRs are per-CPU.

The Intel SDM doesn't document CPUID faulting, at least as of the Dec 2024
revision; despite the deleted comment in ring0/kernel_amd64.go, there is no
Vol. 3 Table 2-43, and every table in Vol. 4 ("Model-Specific Registers") lists
bit 31 in MSR_PLATFORM_INFO as "reserved". The only documentation seems to be
that cited by Linux's e9ea1e7f53b85 ("x86/arch_prctl: Add
ARCH_[GET|SET]_CPUID"): "Intel Virtualization Technology FlexMigration
Application Note" 323850-004, 2012. This document positions CPUID faulting as
an alternative way to support cross-CPU migration for VMs that don't use VMX;
consequently it does not clarify if CPUID faulting is effective in guest ("VMX
non-root") mode, or if the CPUID VM exit takes precedence. If the former is the
case then CPUID faulting is probably faster than setting app CPUID with
KVM_SET_CPUID2, and vice versa. But regardless, this is much simpler.

PiperOrigin-RevId: 733113944
2025-03-03 17:15:39 -08:00
Ayush Ranjan f06d4e7ebe goferfs: Add S/R support for open FDs to deleted files.
This support is only needed when the gofer mount in question is writable.
By default, the rootfs has an overlayfs applied, so the gofer lower layer is
not writable. But if you are using --overlay2=none, then this change should
allow you to save a sandbox with open FDs to deleted files in the rootfs.

Updates #11425

PiperOrigin-RevId: 733021267
2025-03-03 12:38:10 -08:00
Fabricio Voznika 0c17600995 Fix restore with pending exec session
Exec'd processes cannot be stitched back to the original caller and are killed
after restore. So ignore failures to restore host FDs (generally stdio) that
belong to them.

Fixes #11439

PiperOrigin-RevId: 732972054
2025-03-03 10:30:25 -08:00
Jamie Liu d71a9b3df5 gofer: fix ref drop when racily-unlinked synthetic file is invalidated
PiperOrigin-RevId: 732340885
2025-02-28 20:25:53 -08:00
gVisor bot 86abc85f37 Merge pull request #11473 from Champ-Goblem:shim-add-cgroup-v2-metrics-support
PiperOrigin-RevId: 730560110
2025-02-25 14:52:09 -08:00