5997 Commits

Author SHA1 Message Date
Lucas Manning 3a9ba17351 Fix device FD reference leaks and add support for VFIO_GROUP_UNSET_CONTAINER.
Fixes #11545

PiperOrigin-RevId: 739274186
2025-03-21 13:07:59 -07:00
gVisor bot 225a7bc0d2 Internal change.
PiperOrigin-RevId: 739263572
2025-03-21 12:32:21 -07:00
Ayush Ranjan 738e1d995f nvproxy: Add HasStatus.SetStatus and provide failWithStatus() util functions.
This creates a more centralized way for nvproxy to return errors to the
user mode driver via the NvStatus field in ioctl structs, as opposed to failing
the ioctl with mysterious EINVALs.

Also updated the following structs to NOT implement HasStatus interface:
- IoctlRegisterFD
- RMAPIVersion
- IoctlSysParams

These don't have a Status field so it is misleading for them to implement
HasStatus. Created frontendIoctlSimpleNoStatus() and
frontendIoctlInvokeNoStatus() for such structs to use.
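
A minimal sketch of the pattern described above, not the actual nvproxy code. Only the names HasStatus, SetStatus and failWithStatus come from this commit message; the signatures, the NvStatus values and the exampleParams struct are assumptions made for illustration.

```go
package main

// NvStatus stands in for the driver status code. The values below are
// illustrative assumptions, not the real NVIDIA constants.
type NvStatus uint32

const (
	statusOK              NvStatus = 0
	statusInvalidArgument NvStatus = 1 // hypothetical value
)

// HasStatus is implemented only by ioctl parameter structs that actually
// carry a Status field.
type HasStatus interface {
	SetStatus(status NvStatus)
}

// exampleParams is a hypothetical ioctl struct with a Status field.
type exampleParams struct {
	Handle uint32
	Status NvStatus
}

// SetStatus stores the status that the user mode driver will read back.
func (p *exampleParams) SetStatus(status NvStatus) { p.Status = status }

// failWithStatus records the failure in the struct and lets the ioctl itself
// succeed, so the caller sees a specific status instead of a bare EINVAL.
// Copying the struct back to user memory is omitted in this sketch.
func failWithStatus(params HasStatus, status NvStatus) (uintptr, error) {
	params.SetStatus(status)
	return 0, nil
}
```

A struct without a Status field would simply not implement HasStatus in this sketch, which is the reason the commit gives for adding separate NoStatus helpers.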

PiperOrigin-RevId: 738959856
2025-03-20 15:29:44 -07:00
Jamie Liu c16d3fdfad pgalloc: log async page loading progress and info about awaited loads
PiperOrigin-RevId: 738555942
2025-03-19 15:17:36 -07:00
Ayush Ranjan 97820ce5c9 nvproxy: Add support for 570.124.06.
The following command does not report any changes in the structs we proxy:
```
make run TARGETS=//tools/nvidia_driver_differ:run_differ \
  ARGS="--base 570.86.15 --next 570.124.06"
```

PiperOrigin-RevId: 738093448
2025-03-18 12:06:36 -07:00
Jimmy Tran 8d5f3c982a Handle sighandling.KillItself() return error.
Call dumpAndPanicSyscallError for the rare case where we fail to kill the
Sentry upon detecting an unexpected stub exit. This will provide enough
information to determine whether a panic occurred due to a failed SIGKILL
attempt or an unexpected event.

PiperOrigin-RevId: 737751257
2025-03-17 14:29:52 -07:00
Jamie Liu 768c0364e4 kvm: unlock OS thread during machine.available.Wait()
PiperOrigin-RevId: 737750303
2025-03-17 14:25:23 -07:00
Lucas Manning 129d4b63a7 Add support for more TPU devices.
PiperOrigin-RevId: 737696637
2025-03-17 11:53:29 -07:00
Jamie Liu fbca0560dd kvm: honor memmap.File.MemoryType()
Updates #11436

PiperOrigin-RevId: 737689743
2025-03-17 11:34:50 -07:00
Lucas Manning 8482715727 Enable save/restore with TPUproxy.
This change also adds some small cleanup to TPU code.

PiperOrigin-RevId: 737673712
2025-03-17 10:55:06 -07:00
Nicolas Lacasse 6b0a0af862 Implement basic packet mode support for ptys.
From man TIOCPKT:
"""
In packet mode, each subsequent read(2) will return a packet that either
contains a single nonzero control byte, or has a single byte containing zero
('\0') followed by data written on the slave side of the pseudoterminal.
"""

This CL implements only the data portion of packet mode, not the control bytes,
but that seems to be enough to get xfce4-terminal to work.
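
The data portion of packet mode can be sketched as follows. This is an illustration under simplifying assumptions, not the code from this CL: readPacket and its queue argument are hypothetical names, and control bytes are not handled at all. The only fact taken from Linux is that the data control byte (TIOCPKT_DATA) is zero.

```go
package main

// tiocpktData is the control byte that marks a plain data packet
// (TIOCPKT_DATA is 0 on Linux).
const tiocpktData = 0

// readPacket fills dst with one packet: a zero control byte followed by as
// much queued output as fits. It returns the number of bytes written to dst
// and the part of the queue that was not consumed.
func readPacket(dst, queue []byte) (int, []byte) {
	// Simplification: with no room for a data byte or nothing queued,
	// report no data.
	if len(dst) < 2 || len(queue) == 0 {
		return 0, queue
	}
	dst[0] = tiocpktData
	n := copy(dst[1:], queue)
	return n + 1, queue[n:]
}
```

Every read in this sketch therefore costs one byte of buffer space for the control byte, which is why a 4-byte read buffer carries only 3 bytes of terminal output.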

PiperOrigin-RevId: 737175092
2025-03-15 09:26:56 -07:00
Jamie Liu b01944883b Add memmap.File.MemoryType()
This has no effect (outside of debug logging) until cl/723723715.

Updates #11436

PiperOrigin-RevId: 736686635
2025-03-13 17:08:52 -07:00
Lucas Manning 11aeff69c2 Fix host-backed event FD restore.
Before this change, host-backed event FDs would always crash the sandbox
during exit when the sentry tried to wait on the fdnotifier for an FD that
wasn't there.

PiperOrigin-RevId: 736585573
2025-03-13 11:51:37 -07:00
Ayush Ranjan 906fb319cc nvproxy: Add option to use the device gofer.
We always use the device gofer in runsc, because the sandbox's filesystem
does not have the GPU devices mounted in it.

PiperOrigin-RevId: 736316547
2025-03-12 17:08:54 -07:00
Nicolas Lacasse f9b1ce2f7d Clean up tty.CheckChange and call it in SetForegroundProcessGroup.
Previously, CheckChange (corresponding to Linux's tty/tty_check_change()) was
only used by the host TTY implementation, not the devpts implementation.

Furthermore, ThreadGroup.SetForegroundProcessGroup() duplicated some of the
logic in CheckChange, notably sending SIGTTOU to background tasks. This means
that, for host TTYs, we could send SIGTTOU multiple times. In some
circumstances, this leads to the ioctl returning ERESTARTSYS in an infinite loop.

PiperOrigin-RevId: 735934036
2025-03-11 16:46:55 -07:00
Jamie Liu 44b9737347 Increase GOMAXPROCS during aio.GoQueue usage
PiperOrigin-RevId: 735048540
2025-03-08 23:52:19 -08:00
Jamie Liu 8153170320 nvproxy: reduce kernel mmap_lock contention from rmAllocOSDescriptor()
PiperOrigin-RevId: 734667529
2025-03-07 13:23:48 -08:00
Lucas Manning 9581066c30 Clone the packet during ipv6 processing.
PiperOrigin-RevId: 734359605
2025-03-06 18:22:11 -08:00
Nayana Bidari d01514263b Do not process ACKs when endpoint is in error state.
PiperOrigin-RevId: 734333026
2025-03-06 16:37:04 -08:00
Ayush Ranjan e699298d58 Add Container.RestoreInTest() to handle known Docker bugs.
This can be used by all test users. Avoids duplicated code. We can handle all
known issues in one place.

There is a Docker bug which causes restore to fail sporadically. See
https://github.com/moby/moby/issues/42900. This has been broken at least since
Docker v19.03.12 (when the issue was reported) and was fixed in v25.0.4. Added
the handling for this issue.

Also got rid of the testutil.Poll() around restore. That can hide gVisor
restore flakiness issues. That was added in 0990ef7517 ("Make
checkpoint/restore e2e test less flaky"). The original sleep has been restored.

PiperOrigin-RevId: 734303878
2025-03-06 15:13:16 -08:00
Fabricio Voznika c041d9bd58 Add missing binary_sha256 field
Fixes #11466

PiperOrigin-RevId: 734209881
2025-03-06 11:01:58 -08:00
Michael Pratt 46833fbeee Allow prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME) through syscall filters
As of https://go.dev/cl/646095, the Go runtime calls
prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME) when mapping memory to annotate
mappings in /proc/self/maps. Since this is a system call made throughout the
application lifetime, it needs to be allowed through the system call filters.
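
As a rough illustration of what such a filter rule has to match, the sketch below checks the first two prctl arguments. It does not use gVisor's seccomp package; allowPrctl is a hypothetical helper. The two constants are the Linux values of PR_SET_VMA and PR_SET_VMA_ANON_NAME.

```go
package main

const (
	prSetVMA         = 0x53564d41 // PR_SET_VMA
	prSetVMAAnonName = 0          // PR_SET_VMA_ANON_NAME
)

// allowPrctl reports whether a prctl call with the given first two arguments
// matches the single combination this sketch permits. Any other option or
// sub-option is rejected.
func allowPrctl(option, arg2 uintptr) bool {
	return option == prSetVMA && arg2 == prSetVMAAnonName
}
```

Matching on both arguments keeps the rule narrow: only the anonymous-mapping naming request gets through in this sketch, not prctl in general.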

PiperOrigin-RevId: 734182524
2025-03-06 09:54:22 -08:00
Ayush Ranjan 156f457e28 Add kernel.ThreadGroup.ForEachTask().
PiperOrigin-RevId: 733991896
2025-03-05 22:23:11 -08:00
Ayush Ranjan 138e98fb7d nvproxy: Refactor DriverVersion out to nvconf package.
This allows for runsc to be able to use DriverVersion without having to depend
on the entirety of nvproxy.

PiperOrigin-RevId: 733912696
2025-03-05 16:43:03 -08:00
Jamie Liu cc69f4f190 pgalloc: no-op MemoryFile.UpdateUsage() during saving
During checkpointing, MemoryFile.SaveTo() narrows the set of pages known by
memory accounting to be "committed" to those containing non-zero bytes, in
order to avoid saving zero pages and therefore bloating the checkpoint. In the
process of doing so, it needs to touch pages in order to determine whether they
contain non-zero bytes, and does so without holding MemoryFile.mu (in a
MemoryFile.updateUsageLocked() callback). Thus, a concurrent call to
MemoryFile.UpdateUsage() => MemoryFile.updateUsageLocked() can racily observe
that touched zero pages are committed (via mincore()) and mark them
known-committed accordingly, causing them to be unintentionally saved in the
checkpoint.

When SaveOpts.ExcludeCommittedZeroPages is set, MemoryFile.SaveTo() does not
decommit previously-known-committed zero pages, since doing so would cost time;
the motivation for decommitting zero pages is to avoid increasing (real, host)
memory usage during checkpointing, but previously-known-committed pages must
have been using memory even before being touched. However, this significantly
widens the race window described above, since any future call to
MemoryFile.UpdateUsage() will observe said pages to be committed and mark them
known-committed again, effectively negating SaveOpts.ExcludeCommittedZeroPages.

To fix this, inhibit MemoryFile.UpdateUsage() during MemoryFile.SaveTo(); in
essence, when MemoryFile.SaveTo() is in progress, it exclusively defines what
pages are known-committed.
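
The fix can be pictured with a small sketch, assuming a flag guarded by the file's mutex. This is not the pgalloc implementation: memoryFile, saving, usageScans, updateUsage and saveTo are hypothetical stand-ins for the real types and methods.

```go
package main

import "sync"

// memoryFile is a hypothetical stand-in for the real MemoryFile.
type memoryFile struct {
	mu         sync.Mutex
	saving     bool // true while saveTo is in progress
	usageScans int  // number of usage scans that actually ran
}

// updateUsage is a no-op while a save is in progress, so it cannot mark
// pages touched by the save as known-committed. It reports whether a scan ran.
func (f *memoryFile) updateUsage() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if f.saving {
		return false
	}
	f.usageScans++
	return true
}

// saveTo marks the file as saving, runs the save callback without holding
// mu (as the page-touching step described above does), then clears the flag.
func (f *memoryFile) saveTo(save func()) {
	f.mu.Lock()
	f.saving = true
	f.mu.Unlock()
	defer func() {
		f.mu.Lock()
		f.saving = false
		f.mu.Unlock()
	}()
	save()
}
```

While saveTo runs in this sketch, any concurrent updateUsage call returns immediately, so only the save decides which pages count as known-committed.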

PiperOrigin-RevId: 733876220
2025-03-05 14:50:59 -08:00