Commit Graph

4947 Commits

Author SHA1 Message Date
Andrei Vagin aa2c8c33c6 Implement setns for mount namespaces
PiperOrigin-RevId: 552859231
2023-08-01 11:12:29 -07:00
Kevin Krakauer 1f14891734 removed now-renamed bufferv2 package
PiperOrigin-RevId: 552835803
2023-08-01 10:03:58 -07:00
Jing Chen ab7259268b Add a partially implemented devices cgroup. It can export the device cgroup through devices.deny and devices.allow; adding and removing device cgroup rules still needs to be implemented to fully support device cgroups.
PiperOrigin-RevId: 552673967
2023-07-31 21:22:07 -07:00
Andrei Vagin 41bb04c149 Implement mount namespaces
This change implements only the basic functions of mount namespaces.
All features that depend on user namespaces will be implemented separately.

PiperOrigin-RevId: 552673896
2023-07-31 21:12:21 -07:00
Lucas Manning e77ec6e719 Issue a panic in the case of a failed mount promise.
Failing to resolve a mount promise is never expected behavior. The
sandbox should crash in this case, since something has gone
fatally wrong.

PiperOrigin-RevId: 552652799
2023-07-31 18:57:25 -07:00
Lucas Manning 1d4792c566 Implement accel fd methods and gasket ioctls.
The implementation of memory mapping and interrupt registration is
very similar to what's already been done for nvproxy.

PiperOrigin-RevId: 552644264
2023-07-31 18:05:28 -07:00
Kevin Krakauer 8bccff393c netstack: fix flaky forwarding test
PiperOrigin-RevId: 552626894
2023-07-31 16:41:22 -07:00
Fabricio Voznika 500658dc81 Return correct number of PIDs with multi-container
Updates #172

PiperOrigin-RevId: 552620987
2023-07-31 16:20:47 -07:00
Jing Chen 7f067c7e1d Implement setns CLONE_NEWIPC namespace type.
PiperOrigin-RevId: 552619565
2023-07-31 16:12:45 -07:00
Andrei Vagin ef95be6e1c kernel: check that a task has a network namespace
task.GetNetworkNamespace has to be used when we try to access a remote task.

PiperOrigin-RevId: 552593738
2023-07-31 14:43:55 -07:00
Lucas Manning 5babda5341 Lock around endpoint info access in UDP onICMPError.
PiperOrigin-RevId: 552593077
2023-07-31 14:32:56 -07:00
Andrei Vagin 46115504ec Implement the setns syscall
This change introduces the nsfs file system. Each new namespace allocates
a new nsfs inode.

Here are reasons why we need these inodes:
* each namespace has to have a unique id.
* proc/pid/ns/ contains one entry for each namespace. Bind mounting one of
  the files in this directory to somewhere else in the filesystem keeps the
  corresponding namespace alive even if all processes currently in
  the namespace terminate.
* setns() allows the calling process to join an existing namespace specified
  by a file descriptor.

PiperOrigin-RevId: 550694515
2023-07-24 15:45:08 -07:00
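
The lifetime rules listed in commit 46115504ec above can be illustrated with a toy model. This is only a sketch under stated assumptions: `registry`, `namespace`, `IncRef` and `DecRef` are hypothetical names made up for illustration and are not the gVisor nsfs API. It shows a unique id per namespace and an extra reference (standing in for a bind mount or an open file descriptor) keeping a namespace alive after its last process leaves.

```go
package main

import "fmt"

// namespace is a hypothetical stand-in for a namespace backed by an inode.
type namespace struct {
	id   uint64 // unique per namespace
	refs int    // processes, bind mounts and open fds holding it alive
}

// registry hands out unique ids and tracks live namespaces (hypothetical).
type registry struct {
	nextID uint64
	live   map[uint64]*namespace
}

func newRegistry() *registry {
	return &registry{nextID: 1, live: make(map[uint64]*namespace)}
}

// New creates a namespace holding one reference for the creating process.
func (r *registry) New() *namespace {
	ns := &namespace{id: r.nextID, refs: 1}
	r.nextID++
	r.live[ns.id] = ns
	return ns
}

// IncRef models a bind mount, an open fd, or a process joining the namespace.
func (r *registry) IncRef(ns *namespace) { ns.refs++ }

// DecRef drops one reference and destroys the namespace at zero.
func (r *registry) DecRef(ns *namespace) {
	ns.refs--
	if ns.refs == 0 {
		delete(r.live, ns.id)
	}
}

// Alive reports whether the namespace with the given id still exists.
func (r *registry) Alive(id uint64) bool {
	_, ok := r.live[id]
	return ok
}

func main() {
	r := newRegistry()
	ns := r.New()
	r.IncRef(ns) // bind mount of the namespace file
	r.DecRef(ns) // the last process exits
	fmt.Println("alive after process exit:", r.Alive(ns.id))
	r.DecRef(ns) // the bind mount is removed
	fmt.Println("alive after unmount:", r.Alive(ns.id))
}
```

In this model, joining a namespace through a file descriptor would amount to looking up the namespace behind the descriptor and taking another reference for the calling process.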
Lucas Manning 19e04218b9 Add methods for generating PCI sysfs paths and registering accel devices.
The TPU userspace driver needs access to specific PCI device information
located in Linux sysfs. We mirror the sysfs paths the driver reads on the host
in the Sentry sysfs. This way we can ensure we only expose the host device
information that's strictly necessary for TPU to run.

PiperOrigin-RevId: 550005271
2023-07-21 11:43:55 -07:00
Michael Pratt f3e4a1fc3b Remove last remaining !go1.22 build tag
The last remaining !go1.22 build tag is protecting the definition of
pkg/sync.maptype, which is a copy of runtime.maptype. We need to ensure these
definitions match so we can safely access the hasher field.

At its core, this CL achieves this check by ensuring that
unsafe.Offsetof(maptype{}.Hasher) matches the offset in the runtime version of
the type.

Several things happen along the way to achieve this:

* As of May 2023, runtime.maptype is actually a type alias for
internal/abi.MapType. checkoffset was failing to record the offsets because it
skipped type aliases for no good reason. Simply removing the type alias check
is sufficient to make type aliases work. (This part of the CL is technically
unnecessary because this CL ultimately references internal/abi.MapType
directly in anticipation of removal of the type alias. But there is no reason
not to allow type aliases).

* The checkconst / checkoffset regexp unintentionally does not allow / in
package paths, even though the rest of the package supports /. Fix this.

* checkconst was comparing the literal AST expression string against the
runtime value (i.e., "unsafe.Offsetof(maptype{}.Hasher)" vs "72"), which fails
comparison. Switch to getting the resolved constant value from the type
checker.

* nogo/check.importer only loads package facts on direct import (stored in
importer.cache). If a package is not directly imported, ImportPackageFact will
not find the facts. Typically packages need to ensure they directly depend on
packages they want facts from (e.g., pkg/sync has a dummy import of runtime in
runtime.go). This doesn't work for internal/abi because we cannot directly
import an internal package. Work around this as a hack by unconditionally
"importing" internal/abi when analyzing any package.

With regard to the last point, note that the nogo/defs.bzl nogo integration only
provides facts from the direct dependencies and the entire stdlib (since the
stdlib is analyzed as one bundle). So this trick only works for a stdlib
package. A bazel package indirect dependency would be missing facts altogether.

PiperOrigin-RevId: 549999084
2023-07-21 11:22:36 -07:00
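
The difference between the literal expression text and the resolved constant, as described for checkconst in commit f3e4a1fc3b above, can be sketched with the standard `go/types` package. This is not the nogo analyzer itself. The `maptype` in `src` is a hypothetical three-field stand-in rather than the real runtime layout, and `constInfo` and `unsafeOnly` are names invented for this example. Sizes are pinned to gc/amd64 so the offset is deterministic.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// src is a hypothetical package; its maptype does not mirror the runtime.
const src = `package p

import "unsafe"

type maptype struct {
	Key    uintptr
	Elem   uintptr
	Hasher func(unsafe.Pointer, uintptr) uintptr
}

const hasherOffset = unsafe.Offsetof(maptype{}.Hasher)
`

// unsafeOnly is an importer that resolves only package unsafe.
type unsafeOnly struct{}

func (unsafeOnly) Import(path string) (*types.Package, error) {
	if path == "unsafe" {
		return types.Unsafe, nil
	}
	return nil, fmt.Errorf("unexpected import %q", path)
}

// constInfo returns the source text of the initializer of the package-level
// constant name, and the value the type checker resolves it to.
func constInfo(src, name string) (literal, resolved string, err error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		return "", "", err
	}
	conf := types.Config{Importer: unsafeOnly{}, Sizes: types.SizesFor("gc", "amd64")}
	pkg, err := conf.Check("p", fset, []*ast.File{file}, nil)
	if err != nil {
		return "", "", err
	}
	c, ok := pkg.Scope().Lookup(name).(*types.Const)
	if !ok {
		return "", "", fmt.Errorf("%s is not a constant", name)
	}
	// Find the initializer expression and slice its text out of the source.
	ast.Inspect(file, func(n ast.Node) bool {
		spec, ok := n.(*ast.ValueSpec)
		if !ok {
			return true
		}
		for i, id := range spec.Names {
			if id.Name == name && i < len(spec.Values) {
				v := spec.Values[i]
				literal = src[fset.Position(v.Pos()).Offset:fset.Position(v.End()).Offset]
			}
		}
		return true
	})
	return literal, c.Val().String(), nil
}

func main() {
	literal, resolved, err := constInfo(src, "hasherOffset")
	if err != nil {
		panic(err)
	}
	fmt.Printf("literal:  %s\nresolved: %s\n", literal, resolved)
}
```

Comparing `literal` against an expected offset can never succeed because it is expression text, whereas `resolved` is the number the type checker computed for the stand-in struct.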
Ayush Ranjan 41ec0d4189 Add nvproxy support for V100 Nvidia GPUs.
Tested on 1 V100 GPU:
```
$ docker run --runtime=runsc --rm --gpus all nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubi8
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
```

PiperOrigin-RevId: 549837326
2023-07-20 22:09:39 -07:00
gVisor bot 1ba123fff2 Merge pull request #9007 from andrew-d:andrew/tcp-forwarder-on-ignored
PiperOrigin-RevId: 549727271
2023-07-20 13:44:41 -07:00
Lucas Manning d4510e760b Add seccomp filters for TPU proxying and stub out accel fd methods.
PiperOrigin-RevId: 549718797
2023-07-20 13:16:46 -07:00
Nayana Bidari aff5168121 Plumb memory cgroup id in memmap.IncRef.
Update the memmap IncRef method to pass memory cgroup id and store it in the
FrameRefSet which will be used for memory accounting. During DecRef, the
memCgID from the FrameRefSet will be retrieved and passed to MemoryLocked.Dec
to remove the memory from the cgroup.

PiperOrigin-RevId: 549656411
2023-07-20 09:39:44 -07:00
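
The flow described in commit aff5168121 above (record the memory cgroup id on IncRef, read it back on DecRef) might look roughly like the following. This is a simplified sketch, not the gVisor code: the real FrameRefSet is a range-based set, while this example keys a plain map by frame offset. `frameRefSet`, `accounting`, `frameSize`, and the choice to charge on the first reference are all assumptions made for illustration.

```go
package main

import "fmt"

const frameSize = 4096 // assumed size of one frame in bytes

// accounting tracks bytes charged to each memory cgroup (hypothetical).
type accounting struct{ usage map[uint32]uint64 }

func (a *accounting) Inc(n uint64, memCgID uint32) { a.usage[memCgID] += n }
func (a *accounting) Dec(n uint64, memCgID uint32) { a.usage[memCgID] -= n }

// frameRef holds the reference count and the cgroup charged for one frame.
type frameRef struct {
	refs    uint64
	memCgID uint32
}

// frameRefSet is a map-based stand-in for a set of referenced frames.
type frameRefSet struct {
	acct   *accounting
	frames map[uint64]*frameRef
}

func newFrameRefSet() *frameRefSet {
	return &frameRefSet{
		acct:   &accounting{usage: make(map[uint32]uint64)},
		frames: make(map[uint64]*frameRef),
	}
}

// IncRef takes a reference on frame. The cgroup id is stored, and the frame
// is charged, only when the first reference is taken.
func (s *frameRefSet) IncRef(frame uint64, memCgID uint32) {
	if fr, ok := s.frames[frame]; ok {
		fr.refs++
		return
	}
	s.frames[frame] = &frameRef{refs: 1, memCgID: memCgID}
	s.acct.Inc(frameSize, memCgID)
}

// DecRef drops a reference. When the last one goes away, the stored cgroup
// id is used to remove the frame from that cgroup's usage.
func (s *frameRefSet) DecRef(frame uint64) {
	fr, ok := s.frames[frame]
	if !ok {
		return
	}
	fr.refs--
	if fr.refs == 0 {
		s.acct.Dec(frameSize, fr.memCgID)
		delete(s.frames, frame)
	}
}

func main() {
	s := newFrameRefSet()
	s.IncRef(0, 7)
	s.IncRef(0, 9) // second reference; the frame stays charged to cgroup 7
	fmt.Println("cgroup 7 usage:", s.acct.usage[7])
	s.DecRef(0)
	s.DecRef(0)
	fmt.Println("cgroup 7 usage after release:", s.acct.usage[7])
}
```

Storing the id next to the reference count means DecRef callers do not need to know which cgroup was originally charged.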
Lucas Manning 9f4df2187e Add accel and gasket ABI definitions.
PiperOrigin-RevId: 549376196
2023-07-19 11:33:20 -07:00
Nayana Bidari a87aa73698 Increment/decrement memory accounted per cgroup.
- Adds a new field in the usageInfo to store the memory cgroup id.
- Creates a map of cgroup ids and memory stats to track the memory per cgroup
in MemoryLocked struct.
- Introduces new methods to increment, decrement, move, copy and get the total
memory usage per cgroup.

PiperOrigin-RevId: 549148091
2023-07-18 16:50:09 -07:00
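
A map from cgroup id to usage, with increment, decrement, move and total operations as listed in commit a87aa73698 above, can be sketched as follows. The type and method names here are hypothetical and the real MemoryLocked struct tracks more detailed statistics; this example keeps a single byte counter per cgroup and omits the copy operation, whose semantics the commit message does not spell out.

```go
package main

import (
	"fmt"
	"sync"
)

// memoryAccounting tracks bytes charged per cgroup id (hypothetical type).
type memoryAccounting struct {
	mu        sync.Mutex
	perCgroup map[uint32]uint64
}

func newMemoryAccounting() *memoryAccounting {
	return &memoryAccounting{perCgroup: make(map[uint32]uint64)}
}

// Inc charges n bytes to the cgroup.
func (m *memoryAccounting) Inc(n uint64, cgID uint32) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.perCgroup[cgID] += n
}

// decLocked removes n bytes from the cgroup. The caller must hold m.mu.
func (m *memoryAccounting) decLocked(n uint64, cgID uint32) error {
	if cur := m.perCgroup[cgID]; cur < n {
		return fmt.Errorf("cgroup %d: cannot remove %d bytes, only %d charged", cgID, n, cur)
	}
	m.perCgroup[cgID] -= n
	return nil
}

// Dec removes n bytes from the cgroup, refusing to underflow.
func (m *memoryAccounting) Dec(n uint64, cgID uint32) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.decLocked(n, cgID)
}

// Move transfers n bytes of usage from one cgroup to another.
func (m *memoryAccounting) Move(n uint64, from, to uint32) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if err := m.decLocked(n, from); err != nil {
		return err
	}
	m.perCgroup[to] += n
	return nil
}

// Total returns the bytes currently charged to the cgroup.
func (m *memoryAccounting) Total(cgID uint32) uint64 {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.perCgroup[cgID]
}

func main() {
	m := newMemoryAccounting()
	m.Inc(8192, 1)
	if err := m.Move(4096, 1, 2); err != nil {
		panic(err)
	}
	fmt.Println(m.Total(1), m.Total(2))
}
```

Doing the decrement and increment of a move under one lock keeps the sum across cgroups consistent for concurrent readers.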
Jamie Liu ef410e665b Pass NV2080_CTRL_CMD_MC_SERVICE_INTERRUPTS through nvproxy.
Fixes #9176

PiperOrigin-RevId: 549125072
2023-07-18 15:18:19 -07:00
Jamie Liu f43a5fc63a Remove panic in ConsumeCoverageData() when no coverage is observed.
A call to ConsumeCoverageData() can observe zero incremental coverage
immediately after a concurrent call to ConsumeCoverageData() unlocks coverageMu
if sync.Mutex.Lock/Unlock are excluded from coverage instrumentation.

PiperOrigin-RevId: 549119637
2023-07-18 14:57:13 -07:00
Nicolas Lacasse 150831fad9 kernfs: Don't try to cache anonymous inodes.
They have no parent, so are not reachable again.

PiperOrigin-RevId: 548765107
2023-07-17 12:33:03 -07:00
Ayush Ranjan 05f62e5e66 Do not hold metadataMu on gofer O_DIRECT read path.
dentry.writeback() takes dataMu when it needs to. This lock seems to be
unnecessary.

PiperOrigin-RevId: 548763586
2023-07-17 12:16:28 -07:00
Andrew Dunham 057e0b7eae pkg/tcpip/transport/tcp: add statistics for dropped connections
When the TCP forwarder ignores a connection due to having too many
in-flight connections, it's not easy to log a message or update a metric
for later debugging. Add a metric that will be incremented in this case
so that the user of the Forwarder can observe this.

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
2023-07-17 15:07:55 -04:00
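
The idea in commit 057e0b7eae above, counting connections that a forwarder ignores because too many are already in flight, can be sketched like this. The `forwarder` type, its fields and its methods are hypothetical and do not reproduce the netstack TCP Forwarder API or its stats types; the sketch only shows a counter that is bumped on the drop path and can be read later by the caller.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// forwarder is a hypothetical stand-in for a connection forwarder with a cap
// on in-flight connection attempts.
type forwarder struct {
	maxInFlight int

	mu       sync.Mutex
	inFlight map[string]struct{}

	// dropped counts requests ignored because the cap was reached.
	dropped atomic.Uint64
}

func newForwarder(maxInFlight int) *forwarder {
	return &forwarder{maxInFlight: maxInFlight, inFlight: make(map[string]struct{})}
}

// handle admits the connection identified by id, or counts it as dropped
// when too many connections are already in flight.
func (f *forwarder) handle(id string) bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if _, ok := f.inFlight[id]; ok {
		return true // already being handled
	}
	if len(f.inFlight) >= f.maxInFlight {
		f.dropped.Add(1)
		return false
	}
	f.inFlight[id] = struct{}{}
	return true
}

// done marks the connection as no longer in flight.
func (f *forwarder) done(id string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	delete(f.inFlight, id)
}

// Dropped returns how many requests have been ignored so far.
func (f *forwarder) Dropped() uint64 { return f.dropped.Load() }

func main() {
	f := newForwarder(2)
	for _, id := range []string{"a", "b", "c"} {
		fmt.Printf("%s admitted: %v\n", id, f.handle(id))
	}
	fmt.Println("dropped:", f.Dropped())
}
```

Exposing the count through a method lets the user of the forwarder poll it for debugging without the forwarder having to log on a hot path.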