49 Commits

Author SHA1 Message Date
Konstantin Bogomolov f7875f90fd systrap: Unset fpstate_changed if ctx did not change.
For some reason we only cleared this flag in the sighandler, where it doesn't
make a difference.

PiperOrigin-RevId: 707690365
2024-12-18 15:45:14 -08:00
Andrei Vagin 03a28d158e platform/systrap: return memory access type based on a page fault error code
Now we don't need to trigger a second fault to figure out whether it was a
write or a read access.
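
As a rough illustration of the idea (not the systrap code itself), the sketch
below classifies an access from an x86 page-fault error code, where bit 1 is
set for writes and bit 4 for instruction fetches. The function and constant
names are hypothetical, and arm64 reports this information differently.

```python
# Hypothetical sketch: derive the access type from an x86 page-fault error code.
PF_WRITE = 1 << 1  # set when the faulting access was a write
PF_INSTR = 1 << 4  # set when the fault was caused by an instruction fetch


def access_type(error_code: int) -> str:
    """Return "execute", "write", or "read" for the given error code."""
    if error_code & PF_INSTR:
        return "execute"
    if error_code & PF_WRITE:
        return "write"
    return "read"
```

With the error code available, a single fault is enough to tell the access
types apart.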

Fixes #11008

Co-developed-by: Jamie Liu <jamieliu@google.com>
PiperOrigin-RevId: 697677262
2024-11-18 10:33:59 -08:00
Andrei Vagin f9c7e51064 systrap/arm64: add the end header after FPSIMD state in signal frame
gVisor doesn't support any extension other than FPSIMD, so we have to be sure
that only FPSIMD state is restored from signal frames. Stub processes are forked
from the sentry, so they can have other extensions enabled, such as SVE, which
shares the FPSR and FPCR registers with FPSIMD.

Fixes #10900

PiperOrigin-RevId: 684878144
2024-10-11 10:22:33 -07:00
Konstantin Bogomolov a94f5e598f systrap: Replace all instances of unix.RawSyscall with pkg/hostsyscall variants.
PiperOrigin-RevId: 681941600
2024-10-03 10:48:09 -07:00
Andrei Vagin d432952bbe systrap: prevent corruption of spinning queues
The current synchronization is based on an assumption that a queue buffer can't
be recycled if the current thread itself is in the queue. Unfortunately, this
assumption is incorrect, and thus the queue buffer can be corrupted.

This change reworks the synchronization part so that the start index is updated
only after committing changes in the queue buffer.

Fixes #10000

PiperOrigin-RevId: 615226911
2024-03-12 17:32:04 -07:00
Jamie Liu aecf514158 Don't XSAVE PKRU state.
We don't implement any of the `pkey_*` syscalls, so applications can't use
protection keys.

Updates #10087

PiperOrigin-RevId: 611287272
2024-02-28 17:47:27 -08:00
Andrei Vagin 9a02a687f0 Set -g0 to make cc_pie_obj produce deterministic output to help Bazel caching.
The output object file contains debug information, which, by default, contains
the absolute path to source files. These absolute paths are volatile data
because they contain host/user specific paths, which makes
the files differ between invocations. This then negatively affects Bazel's
ability to accurately detect changes and cache them, which means people
need to rebuild it much more frequently.

To fix, specify `-g0`, which will omit debugging information entirely.

PiperOrigin-RevId: 609575453
2024-02-22 19:11:08 -08:00
Andrei Vagin bfd27a1e43 systrap: track the spinning queue length in a separate counter
The current implementation has a race condition resulting in the skipping of
one element in the queue array. When retrieving objects from the queue, the
stub code can get stuck in an infinite loop due to unexpected unused elements.

Updates #10000

PiperOrigin-RevId: 608679165
2024-02-20 11:35:21 -08:00
Konstantin Bogomolov fe66cae2ed Enumerate known systrap stub failures to exit process cleanly.
This helps to rectify a long standing problem of Systrap panicking
when encountering corrupted sysmsg stub memory.

These errors specifically are easier to notice and debug since we
check for them in the stub code and flag them to the sentry
explicitly. They are now very grep-able to make finding their origin
in the stub code easier.

PiperOrigin-RevId: 604743496
2024-02-06 13:19:26 -08:00
Konstantin Bogomolov bb84006816 Fixup AMX workaround for ptrace.
SETREGSET/GETREGSET expect AMX portions of fpstate to always be used.
For this reason we need to allocate enough memory for this to happen,
even if we never populate the AMX portions within initX86FPState.

PiperOrigin-RevId: 599702181
2024-01-18 20:12:31 -08:00
Konstantin Bogomolov e9bdc76c02 Exclude AMX extended state from being xsave/xrstor'd.
For now we are going to completely disable using AMX, so we will
always subtract the extended state size reserved for AMX from the rest
of the extended state size, and hardcode the AMX XCR0 bits to be
always off.
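
A minimal sketch of that masking, assuming the AMX components occupy XCR0
bits 17 (XTILECFG) and 18 (XTILEDATA). The helper name and its parameters are
hypothetical, and the sizes are passed in rather than read from CPUID, so this
is not the code from this change.

```python
# XCR0 bit positions of the two AMX state components.
XFEATURE_XTILECFG = 1 << 17
XFEATURE_XTILEDATA = 1 << 18
AMX_MASK = XFEATURE_XTILECFG | XFEATURE_XTILEDATA


def usable_xstate(xcr0: int, total_size: int, amx_size: int):
    """Return (feature mask, save-area size) with AMX excluded.

    total_size and amx_size are assumed to come from the caller (on real
    hardware they would be queried from CPUID).
    """
    if (xcr0 & AMX_MASK) == 0:
        # AMX is not enabled, so there is nothing to subtract.
        return xcr0, total_size
    return xcr0 & ~AMX_MASK, total_size - amx_size
```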

Fixes #9750.

PiperOrigin-RevId: 599302059
2024-01-17 15:11:03 -08:00
gVisor bot da0832d240 Merge pull request #9588 from avagin:typos
PiperOrigin-RevId: 576718473
2023-10-25 19:58:04 -07:00
Andrei Vagin 5f4abad306 Fix a few typos
The idea is to run codespell as part of our presubmit checks.
Before enabling it for new changes, let's fix what it has found.

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2023-10-25 12:13:42 -07:00
Konstantin Bogomolov cffce1a94a systrap: Revise slow-path enablement.
The current systrap fastpath heuristics do a good job getting high
performance when there are idle CPUs, but fail when there are not
enough and do much worse than even the "pure slowpath".

Here is a summary of changes made to remedy that:

1. Disable stub and dispatcher fastpath by default.
2. Decouple fastpath states to be separate between dispatcher and stub
   fastpath.
3. Implement response latency metrics for both sentry->stub and
   stub->sentry messages. Use these latency metrics in order to keep track of
   the baseline latency for both sides. With baseline latency established,
   compare fastpath latency to determine how beneficial it is to keep
   fastpath enabled.
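
The latency comparison in item 3 could be sketched roughly as follows. The
class name, the moving-average weighting, and the "faster than baseline" rule
are all assumptions made for illustration, not the heuristics implemented in
this change.

```python
class LatencyGovernor:
    """Hypothetical per-side tracker of baseline vs. fastpath latency."""

    def __init__(self):
        self.baseline_ns = None        # no baseline measured yet
        self.fastpath_enabled = False  # fastpath is disabled by default

    def record_baseline(self, latency_ns: int) -> None:
        # Smooth slow-path samples with a simple moving average (assumed).
        if self.baseline_ns is None:
            self.baseline_ns = latency_ns
        else:
            self.baseline_ns = (3 * self.baseline_ns + latency_ns) // 4

    def evaluate(self, fastpath_latency_ns: int) -> bool:
        # Keep the fastpath only while it beats the established baseline.
        self.fastpath_enabled = (
            self.baseline_ns is not None
            and fastpath_latency_ns < self.baseline_ns
        )
        return self.fastpath_enabled


# Decoupled state: one governor per side, as in item 2.
stub = LatencyGovernor()
dispatcher = LatencyGovernor()
```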

Some sampled benchmarks:
- sysbench-X-Y
- gettid_benchmark
- getpid_benchmark

Some benchmark results (5-run average):
- On a 4 core machine:

Benchmark          | HEAD     | This CL
-------------------|----------|----------
sysbench-1-8       | 48218ms  | 50282ms
sysbench-2-4       | 65900ms  | 72282ms
sysbench-4-2       | 427880ms | 175714ms
sysbench-1-2       | 12998ms  | 13688ms

getpid_benchmark: (HEAD)
```
Benchmark             Time             CPU   Iterations
-------------------------------------------------------
BM_Getpid          3471 ns         3441 ns       212121
BM_GetpidOpt       1039 ns         1029 ns       700000
```

getpid_benchmark: (This CL)
```
Benchmark             Time             CPU   Iterations
-------------------------------------------------------
BM_Getpid          3718 ns         3600 ns       200000
BM_GetpidOpt       1320 ns         1281 ns       538462
```

gettid_benchmark: Like getpid, this CL is slightly slower on the
                  lower-thread-count test variants.

- On a 1 core machine:
getpid_benchmark: (HEAD)

```
Benchmark             Time             CPU   Iterations
-------------------------------------------------------
BM_Getpid         74868 ns        75000 ns        10000
BM_GetpidOpt      74463 ns        74286 ns         8750
```

getpid_benchmark: (This CL)
```
Benchmark             Time             CPU   Iterations
-------------------------------------------------------
BM_Getpid         12425 ns        12443 ns        53846
BM_GetpidOpt       8645 ns         8686 ns        87500
```

gettid_benchmark: Same trend as for getpid_benchmark across the board.

  Another interesting case to look at for 1-core machines is copying one large file:
```
  ./runsc --rootless --network none --ignore-cgroups do --force-overlay=false sh -c "time head -c 1073741824 </dev/zero >full-file"
```
- file copy (HEAD):   36.07user 0.00system 0:36.44elapsed 98%CPU
- file copy (This CL): 2.96user 0.23system 0:07.14elapsed 44%CPU

Fixes #9119.

PiperOrigin-RevId: 576600019
2023-10-25 11:56:38 -07:00
Konstantin Bogomolov 821459c942 systrap: Enable using xsaveopt.
PiperOrigin-RevId: 554906814
2023-08-08 12:33:59 -07:00
Andrei Vagin ffcbc70b9a systrap: don't change an fpu state from the stub code
syshandler doesn't expect that the current fpu state can be changed.

Reported-by: syzbot+d51c9b676e1eebdfde0a@syzkaller.appspotmail.com
PiperOrigin-RevId: 547964140
2023-07-13 16:46:44 -07:00
Andrei Vagin 226f5145b6 systrap: preempt long running contexts
A context is preempted if it is running longer than 10ms
and there are other contexts in the queue.
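
The rule above amounts to a two-part check. This is a minimal sketch with
hypothetical names; only the 10ms threshold and the "other contexts are
queued" condition come from the description.

```python
PREEMPT_THRESHOLD_NS = 10_000_000  # 10ms, expressed in nanoseconds


def should_preempt(running_ns: int, queued_contexts: int) -> bool:
    """Preempt only if the context ran too long and others are waiting."""
    return running_ns > PREEMPT_THRESHOLD_NS and queued_contexts > 0
```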

PiperOrigin-RevId: 533376587
2023-05-19 00:35:21 -07:00
Andrei Vagin 74e63e9e29 Update packages
PiperOrigin-RevId: 532582853
2023-05-16 15:01:22 -07:00
Andrei Vagin cd358f833a systrap: don't wake up each thread separately
Now all threads wait on queue->num_thread_to_wakeup, which is a single
wait point for all threads.

This change allows us to avoid cases where num_active_threads
is inconsistent with thread states, because they can't be
changed atomically.

PiperOrigin-RevId: 531020012
2023-05-10 15:36:38 -07:00
Andrei Vagin bd0acf9da9 systrap: add wrappers for gcc atomic functions
It makes code a bit more readable.

PiperOrigin-RevId: 530758808
2023-05-09 17:39:33 -07:00
Konstantin Bogomolov d7f590dd00 Clean up context decoupling experiment.
This change removes code branches and variables only used in coupled-context
mode.

PiperOrigin-RevId: 529776383
2023-05-05 11:55:50 -07:00
Andrei Vagin ff424dce7f systrap: queue_get_context has to detect cases when a ring buffer is recycled
queue_get_context reads `start`, then it gets a value of ringbuffer[start] and
increments `start` if the value is a valid context ID. The issue is that
the ring buffer can be recycled between the first two operations.

This change puts an index into a buffer value. It allows us to detect when a
buffer is recycled and we read a value of a wrong index.
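
The tagging idea can be sketched as follows. This is a single-threaded
illustration with a hypothetical bit layout and names; the stub code uses
atomic operations, and capacity checks are omitted here.

```python
RING_SIZE = 4
INDEX_SHIFT = 32      # assumed layout: index in the upper bits
CTX_MASK = 0xFFFFFFFF
INVALID = CTX_MASK    # marker for "no context in this slot"


def pack(index: int, ctx_id: int) -> int:
    """Store the queue index alongside the context ID in one value."""
    return (index << INDEX_SHIFT) | ctx_id


def unpack(value: int):
    return value >> INDEX_SHIFT, value & CTX_MASK


class Queue:
    def __init__(self):
        self.ring = [pack(0, INVALID)] * RING_SIZE
        self.start = 0
        self.end = 0

    def add(self, ctx_id: int) -> None:
        # The slot remembers which index it was written for.
        self.ring[self.end % RING_SIZE] = pack(self.end, ctx_id)
        self.end += 1

    def get(self):
        start = self.start
        index, ctx_id = unpack(self.ring[start % RING_SIZE])
        if index != start or ctx_id == INVALID:
            # Empty slot, or the buffer was recycled and this slot
            # now belongs to a different index.
            return None
        self.start = start + 1
        return ctx_id
```

A reader holding a stale `start` sees a slot whose stored index no longer
matches and backs off instead of consuming the wrong context.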

PiperOrigin-RevId: 529552389
2023-05-04 17:03:19 -07:00
Andrei Vagin ac45274bd7 systrap: disable the fast path in stub threads when it isn't effective
Here are two conditions under which the fast path is effective:
* The side that has to change a state is running on a CPU while the other
  side is polling the state. This is why the Sentry threads have a higher
  priority than stub threads.
* The sentry handles events faster than the overhead of scheduling another
  stub thread.

This patch addresses the second condition. The fast path in stub threads is
disabled when we reach the limit of stub threads. The idea is that more stub
threads can generate more events to the sentry.

PiperOrigin-RevId: 529120681
2023-05-03 10:02:21 -07:00
Andrei Vagin c34261d265 systrap: limit the number of stub threads by the number of awake contexts
A context is awake if its guest thread isn't in the interruptible sleep state.

The idea here is that a task in the sleep state will not return within the fast
path timeout and so we don't need to hold a stub thread for it.

PiperOrigin-RevId: 529006216
2023-05-02 23:58:59 -07:00
Andrei Vagin 96f2aca71f systrap: don't copy an FPU state in sighandler
We can point ucontext->uc_mcontext.fpregs at the memory holding the target state.

PiperOrigin-RevId: 527340901
2023-04-26 12:27:42 -07:00