Changes the API of tcpip.Clock to also provide a method for scheduling and
rescheduling work after a specified duration. This change also implements the
AfterFunc method for existing implementations of tcpip.Clock.
This is the groundwork required to mock time within tests. All references to
CancellableTimer has been replaced with the tcpip.Job interface, allowing for
custom implementations of scheduling work.
This is a BREAKING CHANGE for clients that implement their own tcpip.Clock or
use tcpip.CancellableTimer. Migration plan:
1. Add AfterFunc(d, f) to tcpip.Clock
2. Replace references of tcpip.CancellableTimer with tcpip.Job
3. Replace calls to tcpip.CancellableTimer#StopLocked with tcpip.Job#Cancel
4. Replace calls to tcpip.CancellableTimer#Reset with tcpip.Job#Schedule
5. Replace calls to tcpip.NewCancellableTimer with tcpip.NewJob.
PiperOrigin-RevId: 322906897
This change gates all FUSE commands (by gating /dev/fuse) behind a runsc
flag. In order to use FUSE commands, use the --fuse flag with the --vfs2
flag. Check if FUSE is enabled by running dmesg in the sandbox.
Previously, it was not possible to encode/decode an object graph which
contained a pointer to a field within another type. This was because the
encoder was previously unable to disambiguate a pointer to an object and a
pointer within the object.
This CL remedies this by constructing an address map tracking the full memory
range object occupy. The encoded Refvalue message has been extended to allow
references to children objects within another object. Because the encoding
process may learn about object structure over time, we cannot encode any
objects under the entire graph has been generated.
This CL also updates the state package to use standard interfaces intead of
reflection-based dispatch in order to improve performance overall. This
includes a custom wire protocol to significantly reduce the number of
allocations and take advantage of structure packing.
As part of these changes, there are a small number of minor changes in other
places of the code base:
* The lists used during encoding are changed to use intrusive lists with the
objectEncodeState directly, which required that the ilist Len() method is
updated to work properly with the ElementMapper mechanism.
* A bug is fixed in the list code wherein Remove() called on an element that is
already removed can corrupt the list (removing the element if there's only a
single element). Now the behavior is correct.
* Standard error wrapping is introduced.
* Compressio was updated to implement the new wire.Reader and wire.Writer
inteface methods directly. The lack of a ReadByte and WriteByte caused issues
not due to interface dispatch, but because underlying slices for a Read or
Write call through an interface would always escape to the heap!
* Statify has been updated to support the new APIs.
See README.md for a description of how the new mechanism works.
PiperOrigin-RevId: 318010298
In order to make sure all aio goroutines have stopped during S/R, a new
WaitGroup was added to TaskSet, analagous to runningGoroutines. This WaitGroup
is incremented with each aio goroutine, and waited on during kernel.Pause.
The old VFS1 aio code was changed to use this new WaitGroup, rather than
fs.Async. The only uses of fs.Async are now inode and mount Release operations,
which do not call fs.Async recursively. This fixes a lock-ordering violation
that can cause deadlocks.
Updates #1035.
PiperOrigin-RevId: 316689380
LockFD is the generic implementation that can be embedded in
FileDescriptionImpl implementations. Unique lock ID is
maintained in vfs.FileDescription and is created on demand.
Updates #1480
PiperOrigin-RevId: 315604825
This is needed to set up host fds passed through a Unix socket. Note that
the host package depends on kernel, so we cannot set up the hostfs mount
directly in Kernel.Init as we do for sockfs and pipefs.
Also, adjust sockfs to make its setup look more like hostfs's and pipefs's.
PiperOrigin-RevId: 308274053
Software GSO implementation currently has a complicated code path with
implicit assumptions that all packets to WritePackets carry same Data
and it does this to avoid allocations on the path etc. But this makes it
hard to reuse the WritePackets API.
This change breaks all such assumptions by introducing a new Vectorised
View API ReadToVV which can be used to cleanly split a VV into multiple
independent VVs. Further this change also makes packet buffers linkable
to form an intrusive list. This allows us to get rid of the array of
packet buffers that are passed in the WritePackets API call and replace
it with a list of packet buffers.
While this code does introduce some more allocations in the benchmarks
it doesn't cause any degradation.
Updates #231
PiperOrigin-RevId: 304731742
A socket mount where anonymous sockets will reside is added to the
VirtualFilesystem. Socketfs is built on top of kernfs.
Updates #1476, #1478, #1484, #1485.
PiperOrigin-RevId: 304095251
- When setting up the virtual filesystem, mount a host.filesystem to contain
all files that need to be imported.
- Make read/preadv syscalls to the host in cases where preadv2 may not be
supported yet (likewise for writing).
- Make save/restore functions in kernel/kernel.go return early if vfs2 is
enabled.
PiperOrigin-RevId: 300922353
/dev/net/tun does not currently work with hostinet. This has caused some
program starts failing because it thinks the feature exists.
PiperOrigin-RevId: 297876196
TCP/IP will work with netstack networking. hostinet doesn't work, and sockets
will have the same behavior as it is now.
Before the userspace is able to create device, the default loopback device can
be used to test.
/proc/net and /sys/net will still be connected to the root network stack; this
is the same behavior now.
Issue #1833
PiperOrigin-RevId: 296309389
- Added fsbridge package with interface that can be used to open
and read from VFS1 and VFS2 files.
- Converted ELF loader to use fsbridge
- Added VFS2 types to FSContext
- Added vfs.MountNamespace to ThreadGroup
Updates #1623
PiperOrigin-RevId: 295183950
FD table now holds both VFS1 and VFS2 types and uses the correct
one based on what's set.
Parts of this CL are just initial changes (e.g. sys_read.go,
runsc/main.go) to serve as a template for the remaining changes.
Updates #1487
Updates #1623
PiperOrigin-RevId: 292023223
Because the abi will depend on the core types for marshalling (usermem,
context, safemem, safecopy), these need to be flattened from the sentry
directory. These packages contain no sentry-specific details.
PiperOrigin-RevId: 291811289