This change implements only the basic functions of mount namespaces.
All features that depend on user namespaces will be implemented separately.
PiperOrigin-RevId: 552673896
Failing to resolve a mount promise is never expected behavior. The
sandbox should crash in this case, since something has gone
fatally wrong.
PiperOrigin-RevId: 552652799
This change introduces the nsfs file system. Each new namespace allocates
a new nsfs inode.
Here are reasons why we need these inodes:
* each namespace has to have a unique ID.
* /proc/[pid]/ns/ contains one entry for each namespace. Bind mounting one of
the files in this directory to somewhere else in the filesystem keeps the
corresponding namespace alive even if all processes currently in
the namespace terminate.
* setns() allows the calling process to join an existing namespace specified
by a file descriptor.
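
As a rough illustration of the unique-ID point: on Linux, each entry under /proc/[pid]/ns/ is a symbolic link whose target has the form `type:[inode]`, and the inode number identifies the namespace. The sketch below parses such a target. `parseNSLink` is a hypothetical helper written for this example and is not part of the nsfs implementation.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseNSLink splits a namespace link target such as "mnt:[4026531840]"
// into the namespace type and the inode number that identifies it.
// Hypothetical helper for illustration only.
func parseNSLink(target string) (string, uint64, error) {
	nsType, rest, ok := strings.Cut(target, ":")
	if !ok || !strings.HasPrefix(rest, "[") || !strings.HasSuffix(rest, "]") {
		return "", 0, fmt.Errorf("malformed namespace link %q", target)
	}
	inode, err := strconv.ParseUint(rest[1:len(rest)-1], 10, 64)
	if err != nil {
		return "", 0, fmt.Errorf("bad inode in %q: %w", target, err)
	}
	return nsType, inode, nil
}

func main() {
	// Reading the link only works on a system that exposes /proc/self/ns.
	target, err := os.Readlink("/proc/self/ns/mnt")
	if err != nil {
		fmt.Println("cannot read namespace link:", err)
		return
	}
	nsType, inode, err := parseNSLink(target)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("namespace type=%s id=%d\n", nsType, inode)
}
```

Two processes share a namespace exactly when the corresponding links resolve to the same inode, which is why every namespace needs its own nsfs inode.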
PiperOrigin-RevId: 550694515
The TPU userspace driver needs access to specific PCI device information
located in Linux sysfs. We mirror the sysfs paths the driver reads on the host
in the Sentry sysfs. This way we can ensure we only expose the host device
information that's strictly necessary for TPU to run.
PiperOrigin-RevId: 550005271
The last remaining !go1.22 build constraint protects the definition of
pkg/sync.maptype, which is a copy of runtime.maptype. We need to ensure these
definitions match so we can safely access the Hasher field.
At its core, this CL achieves this check by ensuring that
unsafe.Offsetof(maptype{}.Hasher) matches the offset in the runtime version of
the type.
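
The idea of the offset check can be sketched in a single package as follows. This is a simplified stand-in, not the CL's mechanism: the CL performs the comparison through nogo static analysis across packages, whereas this sketch compares two local structs directly. The types `upstream` and `mirror` and their fields are made up for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// upstream stands in for the authoritative type. The fields are
// hypothetical and do not reflect the real runtime layout.
type upstream struct {
	Flags  uint32
	Size   uintptr
	Hasher func(unsafe.Pointer, uintptr) uintptr
}

// mirror is a local copy that must keep the same layout as upstream.
type mirror struct {
	Flags  uint32
	Size   uintptr
	Hasher func(unsafe.Pointer, uintptr) uintptr
}

// hasherOffsetsMatch reports whether the Hasher field sits at the same
// byte offset in both types.
func hasherOffsetsMatch() bool {
	return unsafe.Offsetof(upstream{}.Hasher) == unsafe.Offsetof(mirror{}.Hasher)
}

func main() {
	fmt.Printf("upstream offset=%d mirror offset=%d match=%v\n",
		unsafe.Offsetof(upstream{}.Hasher),
		unsafe.Offsetof(mirror{}.Hasher),
		hasherOffsetsMatch())
}
```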
Several things happen along the way to achieve this:
* As of May 2023, runtime.maptype is actually a type alias for
internal/abi.MapType. checkoffset was failing to record the offsets because it
skipped type aliases for no good reason. Simply removing the type alias check
is sufficient to make type aliases work. (This part of the CL is technically
unnecessary because this CL ultimately references internal/abi.MapType
directly in anticipation of removal of the type alias. But there is no reason
not to allow type aliases).
* The checkconst / checkoffset regexp unintentionally does not allow / in
package paths, even though the rest of the package supports /. Fix this.
* checkconst was comparing the literal AST expression string against the
runtime value (i.e., "unsafe.Offsetof(maptype{}.Hasher)" vs "72"), which fails
comparison. Switch to getting the resolved constant value from the type
checker.
* nogo/check.importer only loads package facts on direct import (stored in
importer.cache). If a package is not directly imported, ImportPackageFact will
not find the facts. Typically packages need to ensure they directly depend on
packages they want facts from (e.g., pkg/sync has a dummy import of runtime in
runtime.go). This doesn't work for internal/abi because we cannot directly
import an internal package. Work around this as a hack by unconditionally
"importing" internal/abi when analyzing any package.
With regard to the last point, note that the nogo/defs.bzl nogo integration only
provides facts from the direct dependencies and the entire stdlib (since the
stdlib is analyzed as one bundle). So this trick only works for a stdlib
package. An indirect dependency on a Bazel package would be missing facts altogether.
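
The difference between the literal AST expression string and the resolved constant value (the checkconst fix above) can be seen with a small go/types program. This is a minimal sketch, not the checkconst code: the source snippet, the `unsafeOnly` importer, and the `resolve` helper are assumptions made for this example. With a nil `Sizes` in `types.Config`, go/types uses the gc/amd64 size model, so the offset below is deterministic.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// src is a made-up package whose constant mirrors the example in the text.
const src = `package p

import "unsafe"

type maptype struct {
	pad    [72]byte
	Hasher uintptr
}

const HasherOffset = unsafe.Offsetof(maptype{}.Hasher)
`

// unsafeOnly is a hypothetical importer that resolves only "unsafe".
type unsafeOnly struct{}

func (unsafeOnly) Import(path string) (*types.Package, error) {
	if path == "unsafe" {
		return types.Unsafe, nil
	}
	return nil, fmt.Errorf("unexpected import %q", path)
}

// resolve returns the expression text of the named constant as written in
// the AST, and the value the type checker computed for it.
func resolve(name string) (exprStr, value string, err error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		return "", "", err
	}
	conf := types.Config{Importer: unsafeOnly{}}
	pkg, err := conf.Check("p", fset, []*ast.File{file}, nil)
	if err != nil {
		return "", "", err
	}
	// Literal expression text taken from the syntax tree.
	ast.Inspect(file, func(n ast.Node) bool {
		spec, ok := n.(*ast.ValueSpec)
		if ok && len(spec.Names) == 1 && spec.Names[0].Name == name && len(spec.Values) == 1 {
			exprStr = types.ExprString(spec.Values[0])
		}
		return true
	})
	// Resolved constant value taken from the type checker.
	c, ok := pkg.Scope().Lookup(name).(*types.Const)
	if !ok {
		return "", "", fmt.Errorf("%s is not a constant", name)
	}
	return exprStr, c.Val().String(), nil
}

func main() {
	exprStr, value, err := resolve("HasherOffset")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("AST expression: %s\nresolved value: %s\n", exprStr, value)
}
```

Comparing the first string against an expected offset can never succeed, while the second is the number the check actually needs.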
PiperOrigin-RevId: 549999084
Tested on 1 V100 GPU:
```
$ docker run --runtime=runsc --rm --gpus all nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubi8
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
```
PiperOrigin-RevId: 549837326
Update the memmap IncRef method to accept a memory cgroup ID and store it in
the FrameRefSet, where it is used for memory accounting. During DecRef, the
memCgID is retrieved from the FrameRefSet and passed to MemoryLocked.Dec
to remove the memory from the cgroup.
PiperOrigin-RevId: 549656411
- Adds a new field to usageInfo to store the memory cgroup ID.
- Creates a map of cgroup IDs and memory stats to track the memory per cgroup
in the MemoryLocked struct.
- Introduces new methods to increment, decrement, move, copy and get the total
memory usage per cgroup.
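
A minimal sketch of per-cgroup accounting along these lines is shown below. It is not the MemoryLocked implementation: the type `memoryByCgroup`, its method names, and the single byte counter per cgroup are simplifying assumptions, and the sketch does no locking of its own.

```go
package main

import "fmt"

// cgroupID is a hypothetical identifier type for a memory cgroup.
type cgroupID uint32

// memoryByCgroup tracks bytes in use per cgroup. Hypothetical type; callers
// are assumed to provide their own synchronization.
type memoryByCgroup struct {
	usage map[cgroupID]uint64
}

func newMemoryByCgroup() *memoryByCgroup {
	return &memoryByCgroup{usage: make(map[cgroupID]uint64)}
}

// inc charges n bytes to the cgroup.
func (m *memoryByCgroup) inc(id cgroupID, n uint64) {
	m.usage[id] += n
}

// dec removes n bytes from the cgroup and drops the entry once it is empty.
// Removing more than was charged indicates an accounting bug, so it panics.
func (m *memoryByCgroup) dec(id cgroupID, n uint64) {
	cur := m.usage[id]
	if n > cur {
		panic(fmt.Sprintf("cgroup %d: removing %d bytes but only %d charged", id, n, cur))
	}
	if n == cur {
		delete(m.usage, id)
		return
	}
	m.usage[id] = cur - n
}

// move transfers n bytes of accounting from one cgroup to another.
func (m *memoryByCgroup) move(from, to cgroupID, n uint64) {
	m.dec(from, n)
	m.inc(to, n)
}

// total returns the bytes currently charged to the cgroup.
func (m *memoryByCgroup) total(id cgroupID) uint64 {
	return m.usage[id]
}

// snapshot returns a copy of the per-cgroup totals.
func (m *memoryByCgroup) snapshot() map[cgroupID]uint64 {
	out := make(map[cgroupID]uint64, len(m.usage))
	for id, n := range m.usage {
		out[id] = n
	}
	return out
}

func main() {
	m := newMemoryByCgroup()
	m.inc(1, 3*4096)
	m.move(1, 2, 4096)
	fmt.Println("cgroup 1:", m.total(1), "cgroup 2:", m.total(2))
}
```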
PiperOrigin-RevId: 549148091
A call to ConsumeCoverageData() can observe zero incremental coverage
immediately after a concurrent call to ConsumeCoverageData() unlocks coverageMu
if sync.Mutex.Lock/Unlock are excluded from coverage instrumentation.
PiperOrigin-RevId: 549119637
When the TCP forwarder ignores a connection due to having too many
in-flight connections, it's not easy to log a message or update a metric
for later debugging. Add a metric that will be incremented in this case
so that the user of the Forwarder can observe this.
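
The shape of such a counter can be sketched as follows. This is an illustration under assumptions, not the Forwarder's code: the `forwarder` type, its fields, and the `accept`/`done` methods are hypothetical, and the sketch uses atomics where the real implementation may use different synchronization.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// forwarder is a hypothetical stand-in that limits in-flight connections
// and counts the ones it ignores because the limit was reached.
type forwarder struct {
	maxInFlight int64
	inFlight    atomic.Int64
	dropped     atomic.Uint64 // metric: connections ignored due to the limit
}

func newForwarder(maxInFlight int64) *forwarder {
	return &forwarder{maxInFlight: maxInFlight}
}

// accept reserves a slot for a new connection. When the limit is already
// reached it releases the reservation, bumps the metric, and returns false.
func (f *forwarder) accept() bool {
	if f.inFlight.Add(1) > f.maxInFlight {
		f.inFlight.Add(-1)
		f.dropped.Add(1)
		return false
	}
	return true
}

// done releases the slot held by a finished connection.
func (f *forwarder) done() {
	f.inFlight.Add(-1)
}

func main() {
	f := newForwarder(2)
	for i := 0; i < 3; i++ {
		fmt.Printf("connection %d accepted=%v\n", i, f.accept())
	}
	fmt.Println("ignored connections:", f.dropped.Load())
}
```

Exposing the count as a metric lets the caller notice dropped connections without the forwarder having to log on a hot path.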
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>