gvisor

mirror of https://github.com/netbirdio/gvisor.git synced 2026-05-22 17:12:49 -07:00

Author	SHA1	Message	Date
Jamie Liu	4f3f5d8412	Fix sync.SeqCount.ReadOk() on non-x86 architectures. sync.SeqCount relies on the following memory orderings: - All stores following BeginWrite() in program order happen after the atomic read-modify-write (RMW) of SeqCount.epoch. In the Go 1.19 memory model, this is implied by atomic loads being acquire-seqcst. - All stores preceding EndWrite() in program order happen before the RMW of SeqCount.epoch. In the Go 1.19 memory model, this is implied by atomic stores being release-seqcst. - All loads following BeginRead() in program order happen after the load of SeqCount.epoch. In the Go 1.19 memory model, this is implied by atomic loads being acquire-seqcst. - All loads preceding ReadOk() in program order happen before the load of SeqCount.epoch. The Go 1.19 memory model does not imply this property. The x86 memory model does imply this final property, and in practice the current Go compiler does not reorder memory accesses around the load of SeqCount.epoch, so sync.SeqCount behaves correctly on x86. However, on ARM64, the instruction that is actually emitted for the atomic load from SeqCount.epoch is LDAR: ``` gvisor/pkg/sentry/kernel/kernel.SeqAtomicTryLoadTaskGoroutineSchedInfo(): gvisor/pkg/sentry/kernel/seqatomic_taskgoroutineschedinfo_unsafe.go:34 56371c: f9400025 ldr x5, [x1] 563720: f9400426 ldr x6, [x1, #8] 563724: f9400822 ldr x2, [x1, #16] 563728: f9400c23 ldr x3, [x1, #24] gvisor/pkg/sentry/kernel/seqatomic_taskgoroutineschedinfo_unsafe.go:36 56372c: d503201f nop gvisor/pkg/sync/sync.(*SeqCount).ReadOk(): gvisor/pkg/sync/seqcount.go:107 563730: 88dffc07 ldar w7, [x0] 563734: 6b0400ff cmp w7, w4 ``` LDAR is explicitly documented as not implying the required memory ordering: https://developer.arm.com/documentation/den0024/latest/Memory-Ordering/Barriers/One-way-barriers Consequently, SeqCount.ReadOk() is incorrectly memory-ordered on weakly-ordered architectures. To fix this, we need to introduce an explicit memory fence. On ARM64, there is no way to implement the memory fence in question without resorting to assembly, so the implementation is straightforward. On x86, we introduce a compiler fence, since future compilers might otherwise reorder memory accesses to after atomic loads; the only apparent way to do so is also by using assembly, which unfortunately introduces overhead: - After the call to sync.MemoryFenceReads(), callers zero XMM15 and reload the runtime.g pointer from %fs:-8, reflecting the switch from ABI0 to ABIInternal. This is a relatively small cost. - Before the call to sync.MemoryFenceReads(), callers spill all registers to the stack, since ABI0 function calls clobber all registers. The cost of this depends on the state of the caller before the call, and is not reflected in BenchmarkSeqCountReadUncontended (which does not read any protected state between the calls to BeginRead() and ReadOk()). Both of these problems are caused by Go assembly functions being restricted to ABI0. Go provides a way to mark assembly functions as using ABIInternal instead, but restricts its use to functions in package runtime (https://github.com/golang/go/issues/44065). runtime.publicationBarrier(), which is semantically "sync.MemoryFenceWrites()", is implemented as a compiler fence on x86; defining sync.MemoryFenceReads() as an alias for that function (using go:linkname) would mitigate the former problem, but not the latter. Thus, for simplicity, we define sync.MemoryFenceReads() in (ABI0) assembly, and have no choice but to eat the overhead. ("Fence" and "barrier" are often used interchangeably in this context; Linux uses "barrier" (e.g. `smp_rmb()`), while C++ uses "fence" (e.g. `std::atomic_memory_fence()`). We choose "fence" to reduce ambiguity with "write barriers", since Go is a GC'd language.) PiperOrigin-RevId: 573861378	2023-10-16 10:50:58 -07:00
Ayush Ranjan	f6ed4523dc	Reformat codebase. PiperOrigin-RevId: 449358041	2022-05-17 17:48:35 -07:00
Jamie Liu	377cfd813d	Add sync.SeqCount.BeginWriteOk. PiperOrigin-RevId: 430268914	2022-02-22 12:25:20 -08:00
Jamie Liu	8a45c81616	Allow use of SeqAtomic with pointer-containing types. Per runtime.memmove, pointers are always copied atomically, as this is required by the GC. (Also, the init() safety check doesn't work because it gets renamed to <prefix>init() by template instantiation.) PiperOrigin-RevId: 345800302	2020-12-04 19:07:17 -08:00
Jamie Liu	9bd0164237	Improve sync.SeqCount performance. - Make sync.SeqCountEpoch not a struct. This allows sync.SeqCount.BeginRead() to be inlined. - Mark sync.SeqAtomicLoad<T> nosplit to mitigate the Go compiler's refusal to inline it. (Best I could get was "cost 92 exceeds budget 80".) - Use runtime-guided spinning in SeqCount.BeginRead(). Benchmarks: name old time/op new time/op delta pkg:pkg/sync/sync goos:linux goarch:amd64 SeqCountWriteUncontended-12 8.24ns ± 0% 11.40ns ± 0% +38.35% (p=0.000 n=10+10) SeqCountReadUncontended-12 0.33ns ± 0% 0.14ns ± 3% -57.77% (p=0.000 n=7+8) pkg:pkg/sync/seqatomictest/seqatomic goos:linux goarch:amd64 SeqAtomicLoadIntUncontended-12 0.64ns ± 1% 0.41ns ± 1% -36.40% (p=0.000 n=10+8) SeqAtomicTryLoadIntUncontended-12 0.18ns ± 4% 0.18ns ± 1% ~ (p=0.206 n=10+8) AtomicValueLoadIntUncontended-12 0.27ns ± 3% 0.27ns ± 0% -1.77% (p=0.000 n=10+8) (atomic.Value.Load is, of course, inlined. We would expect an uncontended inline SeqAtomicLoad<int> to perform identically to SeqAtomicTryLoad<int>.) The "regression" in BenchmarkSeqCountWriteUncontended, despite this CL changing nothing in that path, is attributed to microarchitectural subtlety; the benchmark loop is unchanged except for its address: Before this CL: :0 0x4e62d1 48ffc2 INCQ DX :0 0x4e62d4 48399110010000 CMPQ DX, 0x110(CX) :0 0x4e62db 7e26 JLE 0x4e6303 :0 0x4e62dd 90 NOPL :0 0x4e62de bb01000000 MOVL $0x1, BX :0 0x4e62e3 f00fc118 LOCK XADDL BX, 0(AX) :0 0x4e62e7 ffc3 INCL BX :0 0x4e62e9 0fbae300 BTL $0x0, BX :0 0x4e62ed 733a JAE 0x4e6329 :0 0x4e62ef 90 NOPL :0 0x4e62f0 bb01000000 MOVL $0x1, BX :0 0x4e62f5 f00fc118 LOCK XADDL BX, 0(AX) :0 0x4e62f9 ffc3 INCL BX :0 0x4e62fb 0fbae300 BTL $0x0, BX :0 0x4e62ff 73d0 JAE 0x4e62d1 After this CL: :0 0x4e6361 48ffc2 INCQ DX :0 0x4e6364 48399110010000 CMPQ DX, 0x110(CX) :0 0x4e636b 7e26 JLE 0x4e6393 :0 0x4e636d 90 NOPL :0 0x4e636e bb01000000 MOVL $0x1, BX :0 0x4e6373 f00fc118 LOCK XADDL BX, 0(AX) :0 0x4e6377 ffc3 INCL BX :0 0x4e6379 0fbae300 BTL $0x0, BX :0 0x4e637d 733a JAE 0x4e63b9 :0 0x4e637f 90 NOPL :0 0x4e6380 bb01000000 MOVL $0x1, BX :0 0x4e6385 f00fc118 LOCK XADDL BX, 0(AX) :0 0x4e6389 ffc3 INCL BX :0 0x4e638b 0fbae300 BTL $0x0, BX :0 0x4e638f 73d0 JAE 0x4e6361 PiperOrigin-RevId: 329754148	2020-09-02 11:37:31 -07:00
Ian Gudger	27500d529f	New sync package. * Rename syncutil to sync. * Add aliases to sync types. * Replace existing usage of standard library sync package. This will make it easier to swap out synchronization primitives. For example, this will allow us to use primitives from github.com/sasha-s/go-deadlock to check for lock ordering violations. Updates #1472 PiperOrigin-RevId: 289033387	2020-01-09 22:02:24 -08:00
Michael Pratt	fe1369ac98	Move package sync to third_party PiperOrigin-RevId: 231889261 Change-Id: I482f1df055bcedf4edb9fe3fe9b8e9c80085f1a0	2019-01-31 17:49:14 -08:00
Ian Gudger	8fce67af24	Use correct company name in copyright header PiperOrigin-RevId: 217951017 Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035	2018-10-19 16:35:11 -07:00
Googler	d02b74a5dc	Check in gVisor. PiperOrigin-RevId: 194583126 Change-Id: Ica1d8821a90f74e7e745962d71801c598c652463	2018-04-28 01:44:26 -04:00

9 Commits