Some fixes:
* First argument of Task.CopyContext should always be the context.Context
derived from the currently running task, because it is used to get a
CopyScratchBuffer, which must be from the current task. This solved a bunch
of data races.
* Fix logic around which process is remote and which is local. These were
getting mixed up.
* Always read iovec structs (local and remote) from the local process's address
space, since they are syscall arguments. Only use the remote process address
space to read the memory pointed to by the remote iovecs.
* Added a ptrace permission check, matching Linux.
* Delete unused code from kernel/task_usermem.go
* Rewrote tests so that we read to (write from) a subprocess, rather than the
other way around, so we don't need CAP_SYS_PTRACE to run the tests.
* Also made tests async-signal-safe after the call to fork(). I think this was
the source of the flakiness on Linux previously.
PiperOrigin-RevId: 570506366
The intention of this change is to cover enough surface to support running
Docker within gVisor, rather than to provide a full implementation.
This implements the following features:
- Keys as a first-class concept in the kernel.
- Tracking keys in user namespaces.
- Task session keyrings: possession, inheritance.
- Key permission enforcement.
- The following `keyctl(2)` operations:
  - `KEYCTL_GET_KEYRING_ID`
  - `KEYCTL_DESCRIBE`
  - `KEYCTL_JOIN_SESSION_KEYRING`
  - `KEYCTL_SETPERM`
Notably, this does not implement:
- The ability to actually add any keys other than the session keyring
(which does not hold any cryptographic key data).
- Other special keyrings (thread keyring, process keyring, user session
keyring, etc.).
- Lots of `keyctl(2)` operations.
- Key expiration.
- Key garbage collection. Keys live until their user namespace is destroyed.
However, each user namespace is limited to 200 keys, so memory growth is
bounded.
- `add_key(2)`
- `request_key(2)`
... However, this makes design choices that seem odd given the limited scope
of this change, but make sense given the desire to eventually accommodate
the features listed above. For example, there are many
`switch` statements with only one option for session keyrings, which would get
more options when adding support for other special keyrings. Similarly, the
signature of `PossessedKeys` takes in all 3 special "possessed" keyrings, but
currently only ever gets the session keyring as non-nil.
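A minimal sketch of two of the points above: the per-namespace key limit and a `PossessedKeys`-style signature that already takes all three special keyrings. Every name here (`keySet`, `keyring`, `possessedKeys`, `errQuotaExceeded`) is hypothetical and only illustrates the shape of the design; it is not the code in this change.

```go
package main

import (
	"errors"
	"fmt"
)

// maxKeysPerNamespace mirrors the 200-key limit stated in the description.
const maxKeysPerNamespace = 200

// keyring is a hypothetical placeholder for a keyring key.
type keyring struct {
	id int32
}

// keySet tracks the keys owned by one user namespace. Nothing is ever removed
// in this sketch, so the cap is what bounds memory growth.
type keySet struct {
	keys map[int32]*keyring
	next int32
}

// errQuotaExceeded is an assumed error value, not one taken from the change.
var errQuotaExceeded = errors.New("key quota exceeded")

// add allocates a new keyring unless the namespace is already at its limit.
func (s *keySet) add() (*keyring, error) {
	if len(s.keys) >= maxKeysPerNamespace {
		return nil, errQuotaExceeded
	}
	if s.keys == nil {
		s.keys = make(map[int32]*keyring)
	}
	s.next++
	k := &keyring{id: s.next}
	s.keys[k.id] = k
	return k, nil
}

// possessedKeys accepts all three special keyrings even though only the
// session keyring is populated today; nil arguments are skipped.
func possessedKeys(session, process, thread *keyring) []*keyring {
	var out []*keyring
	for _, k := range []*keyring{session, process, thread} {
		if k != nil {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	var set keySet
	session, _ := set.add()
	fmt.Println("possessed keys:", len(possessedKeys(session, nil, nil)))
	for i := 0; i < maxKeysPerNamespace; i++ {
		if _, err := set.add(); err != nil {
			fmt.Println("stopped at", len(set.keys), "keys:", err)
			break
		}
	}
}
```

Keeping the three-argument signature means that adding process or thread keyrings later changes only the callers, not the function's shape.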
PiperOrigin-RevId: 567047896
This change introduces the nsfs file system. Each new namespace allocates
a new nsfs inode.
Here are reasons why we need these inodes:
* each namespace has to have a unique ID.
* proc/pid/ns/ contains one entry for each namespace. Bind mounting one of
the files in this directory to somewhere else in the filesystem keeps the
corresponding namespace alive even if all processes currently in
the namespace terminate.
* setns() allows the calling process to join an existing namespace specified
by a file descriptor.
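The first two reasons can be sketched as follows, under stated assumptions: a single counter hands out inode numbers, and a reference count decides when a namespace goes away. The names `namespace`, `newNamespace`, `incRef`, and `decRef` are hypothetical and are not the types introduced by this change.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// nextInodeID is a hypothetical allocator for nsfs inode numbers. Handing out
// one number per namespace is what gives each namespace a unique ID.
var nextInodeID atomic.Uint64

// namespace is a hypothetical stand-in for any namespace type.
type namespace struct {
	inodeID uint64
	refs    atomic.Int64
}

// newNamespace allocates a fresh inode ID and starts with one reference, held
// by the process that created the namespace.
func newNamespace() *namespace {
	ns := &namespace{inodeID: nextInodeID.Add(1)}
	ns.refs.Store(1)
	return ns
}

// incRef takes an extra reference, for example when the namespace file is
// bind mounted elsewhere in the filesystem.
func (ns *namespace) incRef() { ns.refs.Add(1) }

// decRef drops a reference and reports whether the namespace was destroyed.
func (ns *namespace) decRef() bool { return ns.refs.Add(-1) == 0 }

func main() {
	a, b := newNamespace(), newNamespace()
	fmt.Println("distinct IDs:", a.inodeID != b.inodeID)

	a.incRef() // a bind mount pins the namespace
	fmt.Println("destroyed after last process exits:", a.decRef())
	fmt.Println("destroyed after unmount:", a.decRef())
}
```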
PiperOrigin-RevId: 550694515
This will be used to plumb the syscall number through to a counter metric that
exports the number of times an unimplemented syscall has been called.
Plenty of syscall implementations call `EmitUnimplementedEvent` for flags and
settings that are not implemented. With `sysno` available, they will be able
to plumb that bit of information through.
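As a rough sketch of where this is heading, the snippet below passes a syscall number into a per-syscall counter. The counter type, `emitUnimplementedEvent`'s signature, and the use of `uintptr` for `sysno` are all assumptions made for illustration; the description does not specify the metric's real type or name.

```go
package main

import (
	"fmt"
	"sync"
)

// unimplementedCounter is a hypothetical per-syscall counter keyed by syscall
// number.
type unimplementedCounter struct {
	mu     sync.Mutex
	counts map[uintptr]uint64
}

// increment records one unimplemented call for sysno.
func (c *unimplementedCounter) increment(sysno uintptr) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.counts == nil {
		c.counts = make(map[uintptr]uint64)
	}
	c.counts[sysno]++
}

// value returns how many unimplemented calls were recorded for sysno.
func (c *unimplementedCounter) value(sysno uintptr) uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[sysno]
}

// emitUnimplementedEvent is a stand-in for a call site that now has sysno
// available and forwards it to the counter.
func emitUnimplementedEvent(c *unimplementedCounter, sysno uintptr) {
	c.increment(sysno)
}

func main() {
	var c unimplementedCounter
	const sysno = 42 // arbitrary syscall number, for illustration only
	emitUnimplementedEvent(&c, sysno)
	emitUnimplementedEvent(&c, sysno)
	fmt.Println(c.value(sysno))
}
```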
PiperOrigin-RevId: 518635831
This flag is added for tests that need to trigger a panic in the sentry
kernel. Only done for x86_64, which has a dedicated syscall number for
afs_syscall; ARM does not.
PiperOrigin-RevId: 512731631
Syzkaller has found several issues with the two syscalls and a rework is
required. Disable the tests and the syscalls until the issues are fixed.
PiperOrigin-RevId: 491716795
Added raw syscall points to all syscalls. Added schematized syscall
points to the following syscalls:
- Chdir
- Fchdir
- Setgid
- Setuid
- Setsid
- Setresuid
- Setresgid
PiperOrigin-RevId: 451001973
Added raw syscall points to all syscalls. Added schematized syscall
points to the following syscalls:
- read
- close
- socket
- connect
- execve
- creat
- openat
- execveat
Updates #4805
PiperOrigin-RevId: 446008358