`fdBitmap.FirstZero()` can return the `max` value; when it does,
recompute the max value so that the old max value is not reused twice.
The default bitmap size for file descriptors in gVisor is 65535.
Add a pipe test that attempts to create more than 65535 FDs to hit the edge
case where fdBitmap.FirstZero() returns the default bitmap max value of 65535.
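The edge case can be illustrated with a minimal sketch. toyBitmap, firstZero
and allocFD are hypothetical names, not gVisor's bitmap API, and growing the
bitmap is only one plausible way to handle a "max" result; the point is that
the caller must treat a result equal to the current size as "no free bit"
rather than as a valid index.

```go
package main

// toyBitmap is a hypothetical fixed-capacity bitmap used only for
// illustration; it is not gVisor's pkg/bitmap type.
type toyBitmap struct {
	bits []bool
}

// firstZero returns the index of the first clear bit at or after start.
// When every bit from start onward is set, it returns len(b.bits), which
// plays the role of the "max" value.
func (b *toyBitmap) firstZero(start int) int {
	for i := start; i < len(b.bits); i++ {
		if !b.bits[i] {
			return i
		}
	}
	return len(b.bits)
}

// allocFD reserves and returns the lowest free index. If firstZero reports
// the current size, the bitmap is doubled first, so the size (the "max")
// is different on the next call and the same value is not handed out twice.
func (b *toyBitmap) allocFD() int {
	fd := b.firstZero(0)
	if fd == len(b.bits) {
		n := len(b.bits) * 2
		if n == 0 {
			n = 1
		}
		grown := make([]bool, n)
		copy(grown, b.bits)
		b.bits = grown
	}
	b.bits[fd] = true
	return fd
}
```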
TESTED:
http://sponge2/4c12ce75-3763-4773-ad62-87c6b8fe0446
http://sponge2/9c9d6ea0-b69c-432c-a16b-9446214109ba
PiperOrigin-RevId: 724410846
- Add Kernel.IsPaused() which indicates whether the kernel is currently paused.
- Add TaskSet.ForEachThreadGroup() which allows callers to iterate through all
thread groups in the kernel.
- Export FDTable.ForEach() which allows other packages to iterate over all FDs.
PiperOrigin-RevId: 640760723
FDTable.fdBitmap.ForEach() conveniently allows this interruption. Just plumb
the caller function result to Bitmap.ForEach().
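A minimal sketch of this plumbing, assuming a callback that returns false to
stop iteration. fdSet, table and their forEach methods are hypothetical
stand-ins, not the real FDTable or Bitmap signatures.

```go
package main

// fdSet is a hypothetical stand-in for the bitmap of open descriptors.
type fdSet struct {
	open []bool
}

// forEach calls fn for every set index in ascending order and stops as soon
// as fn returns false.
func (s *fdSet) forEach(fn func(fd int) bool) {
	for fd, used := range s.open {
		if used && !fn(fd) {
			return
		}
	}
}

// table is a hypothetical stand-in for the descriptor table.
type table struct {
	set   fdSet
	files map[int]string
}

// forEach passes the caller's result straight through to fdSet.forEach, so
// the caller can interrupt the walk by returning false.
func (t *table) forEach(fn func(fd int, file string) bool) {
	t.set.forEach(func(fd int) bool {
		return fn(fd, t.files[fd])
	})
}
```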
PiperOrigin-RevId: 636775362
This allows for external information to be passed to restore code.
Similar to c087777e37 ("Plumb restore context to afterLoad()").
Updates #1956.
PiperOrigin-RevId: 614125262
dup(2) man page specifies:
    If the file descriptor newfd was previously open, it is closed
    before being reused; the close is performed silently (i.e., any
    errors during the close are not reported by dup2()).
Even though we were DecRef-ing and hence releasing the replaced FD, we were
not calling OnClose(). Compare fs/file.c:do_dup2() -> filp_close(tofree), which
in turn calls filp_flush(). In gVisor, FileDescription.OnClose() analogously
does such flush operations.
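The intended behavior can be sketched as follows. The file and fdTable types
and the dup2 method are hypothetical and heavily simplified (no reference
counting, no locking); the sketch only shows that the replaced file gets its
close hook called and that any error from it is dropped.

```go
package main

import "errors"

// file is a hypothetical stand-in for a file description.
type file struct {
	name   string
	closed bool
}

// onClose models the flush-on-close hook. It always fails here to show that
// dup2 does not report the error.
func (f *file) onClose() error {
	f.closed = true
	return errors.New("flush failed")
}

// fdTable maps descriptor numbers to files.
type fdTable map[int]*file

// dup2 makes newfd refer to the same file as oldfd. A file previously
// installed at newfd has onClose called, and its error is ignored.
func (t fdTable) dup2(oldfd, newfd int) error {
	src, ok := t[oldfd]
	if !ok {
		return errors.New("bad file descriptor")
	}
	if oldfd == newfd {
		return nil
	}
	if prev, ok := t[newfd]; ok {
		_ = prev.onClose() // closed silently, as dup(2) specifies
	}
	t[newfd] = src
	return nil
}
```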
PiperOrigin-RevId: 583147682
- Fixed up some documentation.
- Got rid of some redundant FDTable.get() calls from FDTable.Remove() and
FDTable.RemoveNextInRange().
- Consistently handled the result of FDTable.set().
- Added file != nil precondition to NewFDAt().
- Only call fdBitmap.Add() and fdBitmap.Remove() when necessary.
PiperOrigin-RevId: 583096816
FDTable.descriptorTable is a slice of unsafe.Pointer values whose maximum
length is MaxInt32, so it can require up to 16GB of memory. A process may use
only a few descriptors but assign one or more of them high numbers. In that
case, FDTable.descriptorTable is extended to the maximum size.
This causes two problems. First, the Go runtime zeroes memory regions when
they are reused; for the fdtable the region is 16GB, so this is a
time-consuming operation. Second, zeroing forces the kernel to allocate
physical pages for the entire region.
This change adds another level to descriptorTable, so the first level is
a slice of buckets where each bucket is a slice of descriptors. The bucket
size is fixed to 512 entries to fit one page.
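A minimal sketch of the two-level layout, assuming non-negative descriptor
numbers. twoLevel, entry and the method names are hypothetical, and the real
table stores unsafe.Pointer values with atomic access, which is omitted here.

```go
package main

// bucketSize matches the 512-entry bucket described above: 512 pointers of
// 8 bytes each fill one 4KB page.
const bucketSize = 512

// entry is a hypothetical stand-in for a descriptor slot.
type entry struct {
	name string
}

// twoLevel is a slice of buckets; each bucket is allocated only when a
// descriptor in its range is first set.
type twoLevel struct {
	buckets [][]*entry
}

// set stores e at fd, growing the first level and allocating the bucket on
// demand. fd must be non-negative.
func (t *twoLevel) set(fd int, e *entry) {
	b := fd / bucketSize
	if b >= len(t.buckets) {
		grown := make([][]*entry, b+1)
		copy(grown, t.buckets)
		t.buckets = grown
	}
	if t.buckets[b] == nil {
		t.buckets[b] = make([]*entry, bucketSize)
	}
	t.buckets[b][fd%bucketSize] = e
}

// get returns the entry at fd, or nil if its bucket was never allocated.
func (t *twoLevel) get(fd int) *entry {
	b := fd / bucketSize
	if b >= len(t.buckets) || t.buckets[b] == nil {
		return nil
	}
	return t.buckets[b][fd%bucketSize]
}

// allocatedBuckets counts the buckets that actually hold memory.
func (t *twoLevel) allocatedBuckets() int {
	n := 0
	for _, b := range t.buckets {
		if b != nil {
			n++
		}
	}
	return n
}
```

With this layout, setting a single high descriptor allocates one bucket plus
a first-level slice of nil bucket headers, instead of one contiguous region
covering every lower descriptor.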
Before:
BenchmarkFDLookupAndDecRef-12              50834290            23.70 ns/op
BenchmarkCreateWithMaxFD-12                       2       7194873988 ns/op
BenchmarkFDLookupAndDecRefConcurrent-12    23775555            49.68 ns/op
BenchmarkTableLookup-12                   412888780            2.835 ns/op
BenchmarkTableMapLookup-12                 87944782            12.84 ns/op
After:
BenchmarkFDLookupAndDecRef-12              46229940            25.03 ns/op
BenchmarkCreateWithMaxFD-12                      13         82573899 ns/op
BenchmarkFDLookupAndDecRefConcurrent-12    21889380            54.13 ns/op
BenchmarkTableLookup-12                   415851230            2.821 ns/op
BenchmarkTableMapLookup-12                 97236267            11.89 ns/op
Reported-by: syzbot+af17678e3bfb7ca7c65a@syzkaller.appspotmail.com
PiperOrigin-RevId: 539138632
There were 2 problems when trying to allocate a high FD value:
- Rlimit is stored as uint64 and could be truncated when converting to int32
to calculate the max value allowed for the FD.
- While trying to double the FD table size, the new length for the table could
end up short, again due to an invalid type conversion.
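Both conversions can be made safe by clamping and by doing the doubling in a
wider type. The sketch below is an illustration under those assumptions, not
the actual fix; clampLimit and nextTableLen are hypothetical helpers.

```go
package main

import "math"

// clampLimit converts a uint64 resource limit to int32 without truncation.
// A plain int32(limit) would turn a limit of 1<<32 + 5 into 5.
func clampLimit(limit uint64) int32 {
	if limit > math.MaxInt32 {
		return math.MaxInt32
	}
	return int32(limit)
}

// nextTableLen doubles cur until the table can hold index fd. The arithmetic
// is done in int64 so the doubling cannot overflow, and the result is capped
// at MaxInt32. It assumes 0 <= fd < MaxInt32.
func nextTableLen(cur, fd int32) int32 {
	n := int64(cur)
	if n == 0 {
		n = 1
	}
	for n <= int64(fd) {
		n *= 2
	}
	if n > math.MaxInt32 {
		n = math.MaxInt32
	}
	return int32(n)
}
```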
Reported-by: syzbot+e4a60cfb88b515cbd2b1@syzkaller.appspotmail.com
PiperOrigin-RevId: 518362257
Use a bitmap in fd_table to record which file descriptors are open.
This speeds up allocating and removing FDs in the fdtable.
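As a rough sketch of why a bitmap helps: finding the lowest free descriptor
becomes a scan over 64-bit words instead of over individual table slots.
wordBitmap and its methods are hypothetical and do not reflect the bitmap
package this change uses.

```go
package main

import "math/bits"

// wordBitmap tracks open descriptors, one bit per descriptor.
type wordBitmap struct {
	words []uint64
}

// add marks descriptor i as open, growing the bitmap if needed.
func (b *wordBitmap) add(i int) {
	for i/64 >= len(b.words) {
		b.words = append(b.words, 0)
	}
	b.words[i/64] |= uint64(1) << (uint(i) % 64)
}

// remove marks descriptor i as free.
func (b *wordBitmap) remove(i int) {
	if i/64 < len(b.words) {
		b.words[i/64] &^= uint64(1) << (uint(i) % 64)
	}
}

// firstZero returns the lowest free descriptor. Full words are skipped with
// one comparison each; within a word the free bit is found with a single
// trailing-zero count on the inverted word.
func (b *wordBitmap) firstZero() int {
	for w, word := range b.words {
		if word != ^uint64(0) {
			return w*64 + bits.TrailingZeros64(^word)
		}
	}
	return len(b.words) * 64
}
```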
Signed-off-by: Howard Zhang <howard.zhang@arm.com>
Add Equals method to compare syserror and unix.Errno errors to linuxerr errors.
This will facilitate removal of syserror definitions in a followup, and
finding needed conversions from unix.Errno to linuxerr.
PiperOrigin-RevId: 380909667
The syscall package has been deprecated in favor of golang.org/x/sys.
Note that syscall is still used in the following places:
- pkg/sentry/socket/hostinet/stack.go: some netlink related functionalities
are not yet available in golang.org/x/sys.
- syscall.Stat_t is still used in some places because os.FileInfo.Sys() still
returns it and not unix.Stat_t.
Updates #214
PiperOrigin-RevId: 360701387
IN_CLOSE should only be generated when a file description loses its last
reference, not when a file descriptor is closed.
See fs/file_table.c:__fput.
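The distinction can be sketched with a reference-counted description shared
by two descriptors. The description type and its events slice are
hypothetical; the sketch only shows that the event is tied to the final
reference drop, not to each close.

```go
package main

// description is a hypothetical stand-in for a file description that may be
// shared by several file descriptors (for example after dup).
type description struct {
	refs   int
	events []string
}

// incRef records one more descriptor referring to this description.
func (d *description) incRef() {
	d.refs++
}

// decRef drops one reference. The close event is recorded only when the
// last reference goes away.
func (d *description) decRef() {
	d.refs--
	if d.refs == 0 {
		d.events = append(d.events, "IN_CLOSE")
	}
}
```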
Updates #5348.
PiperOrigin-RevId: 353810697