This helps rectify the long-standing problem of Systrap panicking
when it encounters corrupted sysmsg stub memory.
These errors are now easier to notice and debug because we check for
them in the stub code and flag them to the sentry explicitly. The
error messages are also easy to grep for, which makes it easier to
find their origin in the stub code.
PiperOrigin-RevId: 604743496
This adds a per-task cache of seccomp actions to take for syscall numbers
where the filters return an action without depending on anything other than
the syscall number and the architecture code of the seccomp program input.
This avoids evaluating seccomp-bpf programs in the syscall hot path for
programs that use seccomp *within* gVisor (i.e. on themselves).
Benchmarks show that this removes about 50ns from the syscall hot path
for a trivial filter like the one in the benchmark.
Real-world filters are much longer, and the benefit is magnified the more
complex the filter is.
```
                    │ not_cached  │               cached               │
                    │   sec/op    │   sec/op     vs base               │
SyscallUnderSeccomp   1.282µ ± 3%   1.230µ ± 1%  -4.06% (p=0.002 n=6)
```
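As a rough illustration of the caching idea described above (not the actual
gVisor implementation), a per-task cache can be modeled as an array indexed by
syscall number. The names `Action`, `actionCache`, `evaluate` and the bound
`maxSysno` are hypothetical and exist only for this sketch.

```go
package main

import "fmt"

// Action is a hypothetical stand-in for a seccomp return action.
type Action uint32

const (
	actionUnknown Action = iota // not cacheable: the filter must be evaluated
	ActionAllow
	ActionErrno
)

// maxSysno is an assumed upper bound on cacheable syscall numbers.
const maxSysno = 512

// actionCache holds, for one task, the action for each syscall number whose
// outcome depends only on the syscall number and the architecture.
type actionCache [maxSysno]Action

// evaluate is a hypothetical slow path standing in for running the filter.
func evaluate(sysno uintptr) Action {
	if sysno == 1 {
		return ActionErrno
	}
	return ActionAllow
}

// lookup returns the cached action when there is one (hit == true) and
// otherwise falls back to evaluating the filter.
func (c *actionCache) lookup(sysno uintptr) (action Action, hit bool) {
	if sysno < maxSysno && c[sysno] != actionUnknown {
		return c[sysno], true
	}
	return evaluate(sysno), false
}

func main() {
	var c actionCache
	c[0] = ActionAllow // assumed to be precomputed when the filter is installed
	fmt.Println(c.lookup(0))
	fmt.Println(c.lookup(1))
}
```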
PiperOrigin-RevId: 586522068
This introduces a `metric.FieldValue` struct type that wraps a string.
All metric interfaces that deal with field values have been updated to use
pointers to this type instead of strings.
The intent of this change is to make it more obvious that field values must
be passed using references. Prior to this change, this was done using string
pointer comparisons. Now this must be done by using a pointer to the same
`metric.FieldValue` struct.
The struct type still externally exposes its string so that it can be referred
to in value function callbacks by "custom" metrics. (Though there are no
current uses of callback metrics with fields.)
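A minimal sketch of why pointer identity matters here. The `FieldValue` type
below is a local stand-in written for this example; the field name `Value` and
the helper `sameField` are assumptions, not the real `metric` package API.

```go
package main

import "fmt"

// FieldValue is a local stand-in for a struct that wraps a string.
// The exported field name is an assumption made for this sketch.
type FieldValue struct {
	Value string
}

// sameField reports whether two field values are the same pre-declared
// value, using pointer identity rather than string contents.
func sameField(a, b *FieldValue) bool {
	return a == b
}

func main() {
	read := &FieldValue{Value: "read"}
	alsoRead := &FieldValue{Value: "read"} // same text, different allocation

	fmt.Println(sameField(read, read))
	fmt.Println(sameField(read, alsoRead))
	fmt.Println(read.Value == alsoRead.Value)
}
```

With a dedicated struct type, passing a freshly built value with the same text
is visibly a different pointer, which is harder to see with plain strings.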
PiperOrigin-RevId: 527030738
This optimizes both increments and lookup operations.
It does so using the following:
- For metrics of potential cardinality larger than 48, it will switch to using
a map rather than linear search for mapping field value combinations to the
index that combination corresponds to in the flattened list of values.
For metrics of cardinality 48 or smaller, linear search is still used as it
is still faster (and also still faster than binary search), as determined by
benchmarks.
- All field values are checked for pointer uniqueness (i.e. the pointer to the
start of each field value string must be unique). This is then used for
faster matching: instead of comparing whole strings (and needing to hash
them, in the case of doing map-based lookups), it compares pointer values.
In the context of this metric library, because all field values must be
pre-declared ahead of time, keeping references to these pre-declared strings
should always be possible. This is enforced: it will `panic` if the metric
user does not do this. In practice, this is easy to do by using `const`
strings for all metric values. This change does just that for existing
metrics with string fields.
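The two techniques above can be sketched together as follows. This is a
simplified, single-field illustration: `fieldKey`, `index`, `newIndex` and
`lookup` are hypothetical names, and a pointer to a small struct stands in for
the pointer to the start of each field value string.

```go
package main

import "fmt"

// linearSearchLimit mirrors the cardinality threshold of 48 from the text.
const linearSearchLimit = 48

// fieldKey is a hypothetical pre-declared field value, identified by address.
type fieldKey struct{ name string }

// index maps field value pointers to positions in a flattened value list.
type index struct {
	keys   []*fieldKey       // scanned linearly at or below the threshold
	byAddr map[*fieldKey]int // populated only above the threshold
}

// newIndex panics if the same pointer is declared twice, since lookups rely
// on pointer uniqueness.
func newIndex(keys []*fieldKey) *index {
	seen := make(map[*fieldKey]int, len(keys))
	for i, k := range keys {
		if _, dup := seen[k]; dup {
			panic("field value pointers must be unique: " + k.name)
		}
		seen[k] = i
	}
	idx := &index{keys: keys}
	if len(keys) > linearSearchLimit {
		idx.byAddr = seen
	}
	return idx
}

// lookup compares pointers only; it never compares or hashes string contents.
// It returns -1 for a pointer that was not pre-declared.
func (x *index) lookup(k *fieldKey) int {
	if x.byAddr != nil {
		if i, ok := x.byAddr[k]; ok {
			return i
		}
		return -1
	}
	for i, c := range x.keys {
		if c == k {
			return i
		}
	}
	return -1
}

func main() {
	read, write := &fieldKey{"read"}, &fieldKey{"write"}
	idx := newIndex([]*fieldKey{read, write})
	fmt.Println(idx.lookup(write), idx.lookup(&fieldKey{"write"}))
}
```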
In benchmarks, this reduces the time to take a snapshot of existing
metrics by 8.35%. With the unimplemented syscall counter metric, this
optimization reduces the slowdown caused by adding this metric from +4,250%
to a still-large but much more manageable +290%. A further optimization
(cl/524419591) will reduce this overhead further before re-introducing the
unimplemented syscall counter metric.
PiperOrigin-RevId: 526134756
This will be used to plumb the syscall number through to a counter metric that
exports the number of times an unimplemented syscall has been called.
Plenty of syscall implementations call `EmitUnimplementedEvent` for flags and
settings that are not implemented. With `sysno` available, they will be able
to plumb that bit of information through.
PiperOrigin-RevId: 518635831
Prior to this CL, each of the 4 syscall points (raw enter, enter, raw exit,
exit) required 1 atomic read, on top of the existing atomic read of the
non-seccheck-related per-syscall flags.
This CL propagates seccheck's syscall-related points to the existing
per-syscall flags bitfield, such that this data can be loaded in a single
atomic read per syscall, rather than 5.
This shaves off a few precious nanoseconds (-3%) from the hot syscall path.
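A minimal sketch of the single-load idea, assuming one bit per point in a
shared 32-bit flags word. The flag names and the `syscallFlags` type are
hypothetical and do not reflect the real flag layout.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Hypothetical per-syscall flag bits: one pre-existing flag plus one bit per
// seccheck syscall point.
const (
	flagStrace uint32 = 1 << iota
	flagRawEnter
	flagEnter
	flagRawExit
	flagExit
)

// syscallFlags packs every per-syscall flag into one word so the hot path
// needs a single atomic load.
type syscallFlags struct {
	bits uint32
}

// enable sets the given bits with a compare-and-swap loop.
func (f *syscallFlags) enable(mask uint32) {
	for {
		old := atomic.LoadUint32(&f.bits)
		if atomic.CompareAndSwapUint32(&f.bits, old, old|mask) {
			return
		}
	}
}

// load performs the single atomic read used on the syscall path.
func (f *syscallFlags) load() uint32 {
	return atomic.LoadUint32(&f.bits)
}

func main() {
	var f syscallFlags
	f.enable(flagEnter | flagExit)

	bits := f.load() // one read covers all five checks below
	fmt.Println(bits&flagStrace != 0, bits&flagRawEnter != 0,
		bits&flagEnter != 0, bits&flagRawExit != 0, bits&flagExit != 0)
}
```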
PiperOrigin-RevId: 463190096
Enable all trace points while running syscall tests to catch possible
crashes and bugs. The tests do not verify that the data in the points
is correct, though. Since all trace points are platform agnostic for
now, only enable them for ptrace.
Updates #4805
PiperOrigin-RevId: 455232093
The use of protobuf.Any is convenient, but adds to proto serialization
time and to the number of memory allocations required to send a message.
Instead, we now use an enum to identify the message and use it to
determine how to unmarshal the message on the receiving end. It
also speeds up event consumption by not requiring a map from string
(proto names) to callbacks.
BenchmarkProtoAny-6 115.9 ns/op 210 B/op 4 allocs/op
BenchmarkProtoEnum-6 58.3 ns/op 2 B/op 1 allocs/op
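A rough sketch of enum-based dispatch on the receiving end, assuming a small
message type enum and a handler table indexed by it. `MessageType`, the enum
values and `dispatch` are hypothetical names used only for illustration, and
the handlers stand in for real unmarshalling.

```go
package main

import "fmt"

// MessageType is a hypothetical enum identifying the payload type.
type MessageType uint16

const (
	MsgUnknown MessageType = iota
	MsgSyscallRaw
	MsgContainerStart
	msgCount // number of known message types
)

// handler decodes a payload of a known type.
type handler func(payload []byte) string

// handlers is indexed directly by MessageType, so no string-keyed map lookup
// is needed to find the callback.
var handlers = [msgCount]handler{
	MsgSyscallRaw:     func(p []byte) string { return fmt.Sprintf("raw:%d", len(p)) },
	MsgContainerStart: func(p []byte) string { return fmt.Sprintf("start:%d", len(p)) },
}

// dispatch picks the decoder from the enum value carried with the message.
func dispatch(t MessageType, payload []byte) (string, error) {
	if int(t) >= len(handlers) || handlers[t] == nil {
		return "", fmt.Errorf("unknown message type %d", t)
	}
	return handlers[t](payload), nil
}

func main() {
	out, err := dispatch(MsgSyscallRaw, []byte("ab"))
	fmt.Println(out, err)
	_, err = dispatch(MessageType(99), nil)
	fmt.Println(err)
}
```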
Updates #4805
PiperOrigin-RevId: 446879057
Each syscall provides 4 different points. There is a raw syscall point that
contains the syscall number and all 6 arguments, nothing else. Some syscalls
can provide a schematized version of the syscall by defining a function that
converts the syscall into a proto representing the syscall. Each of these
flavors has a point for enter and another for exit. In both cases, the exit
event adds the return value and errno (if any).
Updates #4805
PiperOrigin-RevId: 445510907
Change linuxerr.ErrorFromErrno to return an error type rather than
a *errors.Error type. The latter causes problems when comparing to nil,
as <nil><nil> != <nil><*errors.Error>.
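The nil comparison problem is a general Go pitfall and can be reproduced with
a small stand-alone example. The `Error` type and the two constructor
functions below are hypothetical stand-ins, not the linuxerr code.

```go
package main

import "fmt"

// Error is a hypothetical stand-in for a concrete error struct.
type Error struct{ errno int }

func (e *Error) Error() string { return fmt.Sprintf("errno %d", e.errno) }

// fromErrnoPtr returns the concrete pointer type. A nil *Error stored in an
// error interface is not equal to nil, because the interface still carries
// the *Error type.
func fromErrnoPtr(errno int) *Error {
	if errno == 0 {
		return nil
	}
	return &Error{errno}
}

// fromErrno returns the error interface, so "no error" is a true nil.
func fromErrno(errno int) error {
	if errno == 0 {
		return nil
	}
	return &Error{errno}
}

func main() {
	var err error = fromErrnoPtr(0)
	fmt.Println(err == nil) // false: interface holds (type=*Error, value=nil)

	err = fromErrno(0)
	fmt.Println(err == nil) // true: interface holds (type=nil, value=nil)
}
```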
A follow-up change will remove *errors.Error.Errno(), which will also
discourage users from using Errnos to reference linuxerr.
PiperOrigin-RevId: 406444419
Removes package syserror and moves still-relevant code to either linuxerr
or syserr (to be removed later).
Internal errors are converted from assorted types to the *errors.Error type
used in linuxerr. Internal errors are in linuxerr/internal.go.
PiperOrigin-RevId: 390724202
Remove three syserror entries duplicated in linuxerr. Because of the
linuxerr.Equals method, this is a mere change of return values from
syserror to linuxerr definitions.
Only these three errnos are handled here because a CL removing all of them
would grow too large.
PiperOrigin-RevId: 382173835
Add an Equals method to compare syserror and unix.Errno errors to linuxerr
errors. This will facilitate the removal of syserror definitions in a
follow-up, and help find needed conversions from unix.Errno to linuxerr.
PiperOrigin-RevId: 380909667
The newly added Weirdness metric with fields should be used instead of them.
Simple query for weirdness metric: http://shortn/_DGNk0z2Up6
PiperOrigin-RevId: 370578132
The Weirdness metric contains fields to track the number of clock fallbacks,
partial results, and vsyscalls. This metric avoids the overhead of
having three different metrics (fallbackMetric, partialResultMetric,
vsyscallCount).
PiperOrigin-RevId: 369970218
Split usermem package to help remove syserror dependency in go_marshal.
New hostarch package contains code not dependent on syserror.
PiperOrigin-RevId: 365651233
The syscall package has been deprecated in favor of golang.org/x/sys.
Note that syscall is still used in the following places:
- pkg/sentry/socket/hostinet/stack.go: some netlink-related functionality
is not yet available in golang.org/x/sys.
- syscall.Stat_t is still used in some places because os.FileInfo.Sys() still
returns it and not unix.Stat_t.
Updates #214
PiperOrigin-RevId: 360701387
* Aggregate architecture Overview in "What is gVisor?" as it makes more sense
in one place.
* Drop "user-space kernel" and use "application kernel". The term "user-space
kernel" is confusing when some platform implementations do not run in
user-space (instead running in guest ring zero).
* Clear up the relationship between the Platform page in the user guide and the
Platform page in the architecture guide, and ensure they are cross-linked.
* Restore the call-to-action quick start link in the main page, and drop the
GitHub link (which also appears in the top-right).
* Improve image formatting by centering all doc and blog images, and move the
image captions to the alt text.
PiperOrigin-RevId: 311845158