39 Commits

Author SHA1 Message Date
Konstantin Bogomolov fe66cae2ed Enumerate known systrap stub failures to exit process cleanly.
This helps to rectify a long-standing problem of Systrap panicking
when encountering corrupted sysmsg stub memory.

These errors specifically are easier to notice and debug since we
check for them in the stub code and flag them to the sentry
explicitly. They are now very grep-able to make finding their origin
in the stub code easier.

PiperOrigin-RevId: 604743496
2024-02-06 13:19:26 -08:00
Etienne Perot dec37ea4ed gVisor seccomp: Implement in-Sentry seccomp cache.
This adds a per-task cache of seccomp actions to take for syscall numbers
where the filters return an action without depending on anything other than
the syscall number and the architecture code of the seccomp program input.

This avoids evaluating seccomp-bpf programs in the syscall hot path, for
programs that use seccomp *within* gVisor (aka on themselves).

Benchmarks show that this removes about 50ns from the syscall hot path
for a trivial filter like the one in the benchmark.
Real-world filters are much longer, and the benefit is magnified the more
complex the filter is.

```
                    │ not_cached  │              cached               │
                    │   sec/op    │   sec/op     vs base              │
SyscallUnderSeccomp   1.282µ ± 3%   1.230µ ± 1%  -4.06% (p=0.002 n=6)
```
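
As a rough illustration of the caching idea, the sketch below keeps one cached action per syscall number and only runs the filter on a miss. All names here (`Action`, `Filter`, `taskCache`, `exampleFilter`) are hypothetical and are not the Sentry's types. The architecture code is left out for brevity, and the cache is filled lazily, which is an assumption rather than a description of how the Sentry populates it.

```go
package main

import "fmt"

// Action is a hypothetical stand-in for a seccomp filter result.
type Action int

const (
	actionUncached Action = iota // No cached decision; the filter must run.
	ActionAllow
	ActionErrno
)

// Filter is a hypothetical filter. It reports the action and whether the
// decision depended only on the syscall number (and not on its arguments).
type Filter func(sysno int, args [6]uint64) (action Action, sysnoOnly bool)

// taskCache holds one cached action per syscall number for a single task.
type taskCache struct {
	actions []Action
	filter  Filter
	evals   int // Number of full filter evaluations, kept for illustration.
}

func newTaskCache(f Filter, maxSysno int) *taskCache {
	return &taskCache{actions: make([]Action, maxSysno+1), filter: f}
}

// Check returns the action for a syscall, consulting the cache first.
func (c *taskCache) Check(sysno int, args [6]uint64) Action {
	inRange := sysno >= 0 && sysno < len(c.actions)
	if inRange {
		if a := c.actions[sysno]; a != actionUncached {
			return a
		}
	}
	c.evals++
	a, sysnoOnly := c.filter(sysno, args)
	if sysnoOnly && inRange {
		c.actions[sysno] = a
	}
	return a
}

// exampleFilter allows syscall 1 unconditionally and decides syscall 2 by
// looking at its first argument, so only syscall 1 is cacheable.
func exampleFilter(sysno int, args [6]uint64) (Action, bool) {
	switch sysno {
	case 1:
		return ActionAllow, true
	case 2:
		if args[0] == 0 {
			return ActionAllow, false
		}
		return ActionErrno, false
	default:
		return ActionErrno, true
	}
}

func main() {
	c := newTaskCache(exampleFilter, 16)
	for i := 0; i < 3; i++ {
		c.Check(1, [6]uint64{})
		c.Check(2, [6]uint64{uint64(i)})
	}
	fmt.Println("filter evaluations:", c.evals) // prints "filter evaluations: 4"
}
```

Syscall 1 is evaluated once and then served from the cache, while syscall 2 is evaluated on every call because its result depends on an argument.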

PiperOrigin-RevId: 586522068
2023-11-29 20:00:16 -08:00
Etienne Perot a938259779 gVisor metric library: Change interface for passing in field values.
This introduces a `metric.FieldValue` struct type that wraps a string.
All metric interfaces that deal with field values have been updated to use
pointers to this type instead of strings.

The intent of this change is to make it more obvious that field values must
be passed using references. Prior to this change, this was done using string
pointer comparisons. Now this must be done by using a pointer to the same
`metric.FieldValue` struct.

The struct type still externally exposes its string so that it can be referred
to in value function callbacks by "custom" metrics. (Though there are no
current uses of callback metrics with fields.)
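
A minimal sketch of what passing field values by reference means in practice is shown here. The `FieldValue` type below only mirrors the shape described in this message, and the variable names and the `indexOf` helper are hypothetical; this is not the metric library's code.

```go
package main

import "fmt"

// FieldValue wraps a string so that values are passed around by pointer.
// It mirrors the shape described above; it is not the library's definition.
type FieldValue struct {
	Value string
}

// Pre-declared field values (hypothetical names).
var (
	platformKVM     = FieldValue{Value: "kvm"}
	platformSystrap = FieldValue{Value: "systrap"}
)

// indexOf returns the position of v among the allowed values, comparing
// pointers only, or -1 if v is not one of the pre-declared values.
func indexOf(allowed []*FieldValue, v *FieldValue) int {
	for i, a := range allowed {
		if a == v {
			return i
		}
	}
	return -1
}

func main() {
	allowed := []*FieldValue{&platformKVM, &platformSystrap}
	lookalike := FieldValue{Value: "systrap"} // Same string, different struct.

	fmt.Println(indexOf(allowed, &platformSystrap)) // prints 1
	fmt.Println(indexOf(allowed, &lookalike))       // prints -1
}
```

A struct with the same string but a different address does not match, which is the property that makes accidental by-value use visible.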

PiperOrigin-RevId: 527030738
2023-04-25 11:49:02 -07:00
Etienne Perot 79b38029d7 gVisor metric library: Optimize operations for large-cardinality metrics.
This optimizes both increments and lookup operations.

It does so using the following:

- For metrics of potential cardinality larger than 48, it will switch to using
  a map rather than linear search for mapping field value combinations to the
  index that combination corresponds to in the flattened list of values.
  For metrics of cardinality 48 or smaller, linear search is still used as it
  is still faster (and also still faster than binary search), as determined by
  benchmarks.
- All field values are checked for pointer uniqueness (i.e. the pointer to the
  start of each field value string must be unique). This is then used for
  faster matching: instead of comparing whole strings (and needing to hash
  them, in the case of doing map-based lookups), it compares pointer values.
  In the context of this metric library, because all field values must be
  pre-declared ahead of time, keeping references to these pre-declared strings
  should always be possible. This is enforced: it will `panic` if the metric
  user does not do this. In practice, this is easy to do by using `const`
  strings for all metric values. This change does just that for existing
  metrics with string fields.
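
The two techniques above can be sketched together as follows. The threshold of 48 comes from this message; everything else (`key`, `lookup`, `makeKeys`) is hypothetical. For simplicity the sketch handles a single field rather than combinations of fields, and it uses pointers to a small struct as identities instead of pointers to string data.

```go
package main

import "fmt"

// linearLimit is the cardinality above which the map is used. The value 48
// comes from the text above; everything else here is illustrative.
const linearLimit = 48

// key identifies one field value by pointer identity.
type key struct{ name string }

// lookup maps a field value pointer to its index in a flattened value list.
type lookup struct {
	keys  []*key
	byPtr map[*key]int // Only populated when len(keys) > linearLimit.
}

func newLookup(keys []*key) *lookup {
	l := &lookup{keys: keys}
	if len(keys) > linearLimit {
		l.byPtr = make(map[*key]int, len(keys))
		for i, k := range keys {
			l.byPtr[k] = i
		}
	}
	return l
}

// index returns the index for k. Like the behaviour described above, it
// panics when k is not one of the pre-declared pointers.
func (l *lookup) index(k *key) int {
	if l.byPtr != nil {
		if i, ok := l.byPtr[k]; ok {
			return i
		}
		panic("unknown field value: " + k.name)
	}
	for i, candidate := range l.keys {
		if candidate == k { // Pointer comparison, no string comparison.
			return i
		}
	}
	panic("unknown field value: " + k.name)
}

// makeKeys pre-declares n distinct field values.
func makeKeys(n int) []*key {
	keys := make([]*key, n)
	for i := range keys {
		keys[i] = &key{name: fmt.Sprintf("v%d", i)}
	}
	return keys
}

func main() {
	small, large := makeKeys(8), makeKeys(100)
	fmt.Println(newLookup(small).index(small[5]))  // linear search, prints 5
	fmt.Println(newLookup(large).index(large[73])) // map lookup, prints 73
}
```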

From benchmarks, this reduces the time to take a snapshot of existing
metrics by 8.35%. With the unimplemented syscall counter metric, this
optimization reduces the slowdown of adding this metric from +4,250% to
a still-large but much more manageable +290%. A further optimization
(cl/524419591) will reduce this overhead further before re-introducing the
unimplemented syscall counter metric.

PiperOrigin-RevId: 526134756
2023-04-21 14:21:18 -07:00
Etienne Perot 44e2d0fcfe gVisor: Refactor SyscallFn to take in the syscall number as argument.
This will be used to plumb the syscall number through to a counter metric that
exports the number of times an unimplemented syscall has been called.

Plenty of syscall implementations call `EmitUnimplementedEvent` for flags and
settings that are not implemented. With `sysno` available, they will be able
to plumb that bit of information through.
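
To make the plumbing concrete, here is a simplified sketch of a handler that receives its own syscall number and uses it when reporting an unimplemented flag. The signature is deliberately reduced and is not the Sentry's actual `SyscallFn`; `emitUnimplemented`, `unimplementedCalls`, and `withFlags` are hypothetical names.

```go
package main

import "fmt"

// SyscallFn is a hypothetical, simplified handler signature that receives
// the syscall number alongside the arguments.
type SyscallFn func(sysno uintptr, args [6]uint64) (uintptr, error)

// unimplementedCalls counts unimplemented-feature events per syscall number.
var unimplementedCalls = map[uintptr]uint64{}

// emitUnimplemented stands in for the event emission mentioned above.
func emitUnimplemented(sysno uintptr) {
	unimplementedCalls[sysno]++
}

// withFlags handles a syscall whose first argument is a flags word. Only
// flag bit 0 is supported in this sketch.
func withFlags(sysno uintptr, args [6]uint64) (uintptr, error) {
	if args[0]&^1 != 0 {
		emitUnimplemented(sysno)
		return 0, fmt.Errorf("syscall %d: unsupported flags %#x", sysno, args[0])
	}
	return 0, nil
}

func main() {
	// The same handler serves two syscall numbers; sysno tells them apart.
	table := map[uintptr]SyscallFn{42: withFlags, 43: withFlags}
	calls := []struct {
		sysno uintptr
		flags uint64
	}{{42, 1}, {42, 2}, {43, 4}, {42, 8}}
	for _, c := range calls {
		table[c.sysno](c.sysno, [6]uint64{c.flags})
	}
	fmt.Println(unimplementedCalls[42], unimplementedCalls[43]) // prints "2 1"
}
```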

PiperOrigin-RevId: 518635831
2023-03-22 12:06:26 -07:00
Kevin Krakauer d8aa09e04c convert uses of interface{} to any
Done via:
  find . -name "*.go" | xargs sed -i -E 's/interface\{\}/any/g'

PiperOrigin-RevId: 487033228
2022-11-08 13:14:06 -08:00
Etienne Perot a96af5dd61 gVisor: Read syscall seccheck enablement as part of per-syscall flags.
Prior to this CL, each of the 4 syscall points (raw enter, enter, raw exit,
exit) resulted in 1 atomic read, on top of the existing atomic read to read
non-seccheck-related per-syscall flags.

This CL propagates seccheck's syscall-related points to the existing
per-syscall flags bitfield, such that this data can be loaded in a single
atomic read per syscall, rather than 5.

This shaves off a few precious nanoseconds (-3%) from the hot syscall path.
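
The sketch below shows the general shape of folding several independent booleans into one bitfield so that a single atomic load answers all of them. The bit names, their positions, and the table size are illustrative assumptions, not the Sentry's actual layout.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Bit positions are illustrative, not the Sentry's actual layout.
const (
	flagStrace uint32 = 1 << iota
	flagPointRawEnter
	flagPointEnter
	flagPointRawExit
	flagPointExit
)

// syscallFlags holds one bitfield per syscall number.
type syscallFlags struct {
	bits [512]atomic.Uint32
}

// enable sets flag bits for sysno with a compare-and-swap loop.
func (f *syscallFlags) enable(sysno int, mask uint32) {
	for {
		old := f.bits[sysno].Load()
		if f.bits[sysno].CompareAndSwap(old, old|mask) {
			return
		}
	}
}

// load performs the single atomic read done on the syscall hot path.
func (f *syscallFlags) load(sysno int) uint32 {
	return f.bits[sysno].Load()
}

func main() {
	var f syscallFlags
	f.enable(1, flagPointEnter|flagPointExit)

	flags := f.load(1) // One atomic read covers every check below.
	fmt.Println(flags&flagStrace != 0, flags&flagPointEnter != 0, flags&flagPointExit != 0)
	// prints "false true true"
}
```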

PiperOrigin-RevId: 463190096
2022-07-25 15:24:56 -07:00
Fabricio Voznika da267f435f Rename seccheck.checkers to seccheck.sinks
Makes the naming consistent with the public configuration. There is no
behavior change.

Updates #4805

PiperOrigin-RevId: 462448505
2022-07-21 12:47:06 -07:00
Fabricio Voznika 21c757b60f Enable all trace points for syscall tests
Enable all trace points while running syscall tests to catch possible
crashes and bugs that may exist. These tests do not verify that the data
in the points is correct, though. Since all trace points are platform
agnostic for now, only enable them for ptrace.

Updates #4805

PiperOrigin-RevId: 455232093
2022-06-15 15:22:12 -07:00
Konstantin Bogomolov 7574e4f642 Add KVM specific metrics.
This change adds counter and timer metrics useful for analyzing the KVM
platform.

PiperOrigin-RevId: 447043888
2022-05-06 12:21:49 -07:00
Fabricio Voznika 2d6e64019b Faster proto serialization
The use of protobuf.Any is convenient, but adds to proto serialization
time and number of memory allocations required to send a message.
Instead, we now use an enum to identify the message and use it to
determine how to unmarshal the message on the receiving end. It
also speeds up event consumption by not requiring a map from string
(proto names) to callbacks.

BenchmarkProtoAny-6   115.9 ns/op        210 B/op       4 allocs/op
BenchmarkProtoEnum-6   58.3 ns/op          2 B/op       1 allocs/op
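
The dispatch side of this idea can be sketched without any protobuf dependency. In the example below a numeric tag selects the handler by array index, so no string-keyed map is needed. The wire format (a 2-byte little-endian tag), the `MessageType` values, and the handlers are all assumptions made for illustration and do not describe the actual message layout.

```go
package main

import "fmt"

// MessageType is a hypothetical enum identifying the payload type.
type MessageType uint16

const (
	MessageStart MessageType = iota
	MessageSyscallRaw
	messageCount
)

// handler decodes and consumes one payload.
type handler func(payload []byte) string

// handlers is indexed directly by MessageType, so dispatch needs no string
// hashing and no map lookup.
var handlers = [messageCount]handler{
	MessageStart:      func(p []byte) string { return "start:" + string(p) },
	MessageSyscallRaw: func(p []byte) string { return "raw:" + string(p) },
}

// encode prefixes the payload with a 2-byte little-endian type tag.
func encode(t MessageType, payload []byte) []byte {
	return append([]byte{byte(t), byte(t >> 8)}, payload...)
}

// dispatch reads the tag and calls the matching handler.
func dispatch(msg []byte) (string, error) {
	if len(msg) < 2 {
		return "", fmt.Errorf("message too short: %d bytes", len(msg))
	}
	t := MessageType(msg[0]) | MessageType(msg[1])<<8
	if t >= messageCount {
		return "", fmt.Errorf("unknown message type %d", t)
	}
	return handlers[t](msg[2:]), nil
}

func main() {
	out, err := dispatch(encode(MessageSyscallRaw, []byte("read")))
	fmt.Println(out, err) // prints "raw:read <nil>"
}
```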

Updates #4805

PiperOrigin-RevId: 446879057
2022-05-05 19:29:49 -07:00
Fabricio Voznika 575d76def2 Add support for syscall points
Each syscall provides 4 different points. There is a raw syscall point that
contains the syscall number and all 6 arguments, nothing else. Some syscalls
can provide a schematized version of the syscall by defining a function that
converts the syscall into a proto representing the syscall. Each of these
flavors has a point for enter and another for exit. In both cases, the exit
event adds return value and errno (if any).

Updates #4805

PiperOrigin-RevId: 445510907
2022-04-29 14:49:40 -07:00
Zach Koopmans b822923b70 [syserr] Convert all linuxerr returns to error type.
Change the linuxerr.ErrorFromErrno to return an error type and not
a *errors.Error type. The latter results in problems comparing to nil
as <nil><nil> != <nil><*errors.Error>.
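
The nil comparison problem is standard Go behaviour and can be reproduced in a few lines. The `Error` type and the two constructor functions below are hypothetical stand-ins, not the linuxerr package's code.

```go
package main

import "fmt"

// Error is a hypothetical concrete error type standing in for *errors.Error.
type Error struct{ errno int }

func (e *Error) Error() string { return fmt.Sprintf("errno %d", e.errno) }

// fromErrnoPtr returns the concrete pointer type; errno 0 yields a nil pointer.
func fromErrnoPtr(errno int) *Error {
	if errno == 0 {
		return nil
	}
	return &Error{errno: errno}
}

// fromErrno returns the error interface; errno 0 yields a nil interface.
func fromErrno(errno int) error {
	if errno == 0 {
		return nil
	}
	return &Error{errno: errno}
}

func main() {
	var viaPtr error = fromErrnoPtr(0) // Interface holding a nil *Error.
	viaIface := fromErrno(0)           // Nil interface.

	fmt.Println(viaPtr != nil, viaIface != nil) // prints "true false"
}
```

Once the nil `*Error` is stored in an `error` variable, the interface carries a type and no longer compares equal to `nil`. Returning `error` directly avoids this.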

In a follow up, there will be a change to remove *errors.Error.Errno(),
which will also encourage users to not use Errnos to reference linuxerr.

PiperOrigin-RevId: 406444419
2021-10-29 14:03:16 -07:00
Zach Koopmans ce58d71fd5 [syserror] Remove pkg syserror.
Removes package syserror and moves still relevant code to either linuxerr
or to syserr (to be later removed).

Internal errors are converted from random types to *errors.Error types used
in linuxerr. Internal errors are in linuxerr/internal.go.

PiperOrigin-RevId: 390724202
2021-08-13 17:16:52 -07:00
Jamie Liu 052eb90dc1 Replace kernel.ExitStatus with linux.WaitStatus.
PiperOrigin-RevId: 383705129
2021-07-08 13:39:15 -07:00
Zach Koopmans 54b71221c0 [syserror] Change syserror to linuxerr for E2BIG, EADDRINUSE, and EINVAL
Remove three syserror entries duplicated in linuxerr. Because of the
linuxerr.Equals method, this is a mere change of return values from
syserror to linuxerr definitions.

Only these three errnos are converted here because CLs that remove all of
them at once grow too large.

PiperOrigin-RevId: 382173835
2021-06-29 15:08:46 -07:00
Zach Koopmans e1dc1c78e7 [syserror] Add conversions to linuxerr with temporary Equals method.
Add Equals method to compare syserror and unix.Errno errors to linuxerr errors.
This will facilitate removal of syserror definitions in a followup, and
finding needed conversions from unix.Errno to linuxerr.

PiperOrigin-RevId: 380909667
2021-06-22 15:53:32 -07:00
Nayana Bidari 5b207fe783 Remove metrics: fallback, vsyscallCount and partialResult
The newly added Weirdness metric with fields should be used instead of them.

Simple query for weirdness metric: http://shortn/_DGNk0z2Up6

PiperOrigin-RevId: 370578132
2021-04-26 17:37:29 -07:00
Nayana Bidari 0a6eaed50b Add weirdness sentry metric.
The Weirdness metric contains fields to track the number of clock fallbacks,
partial results, and vsyscalls. This metric avoids the overhead of
having three different metrics (fallbackMetric, partialResultMetric,
vsyscallCount).

PiperOrigin-RevId: 369970218
2021-04-22 16:07:15 -07:00
Zach Koopmans 8a2f7e716d [syserror] Split usermem package
Split usermem package to help remove syserror dependency in go_marshal.
New hostarch package contains code not dependent on syserror.

PiperOrigin-RevId: 365651233
2021-03-29 13:30:21 -07:00
Ayush Ranjan a9441aea27 [op] Replace syscall package usage with golang.org/x/sys/unix in pkg/.
The syscall package has been deprecated in favor of golang.org/x/sys.

Note that syscall is still used in the following places:
- pkg/sentry/socket/hostinet/stack.go: some netlink related functionalities
  are not yet available in golang.org/x/sys.
- syscall.Stat_t is still used in some places because os.FileInfo.Sys() still
  returns it and not unix.Stat_t.

Updates #214

PiperOrigin-RevId: 360701387
2021-03-03 10:25:58 -08:00
Rahat Mahmood d201feb8c5 Enable automated marshalling for the syscall package.
PiperOrigin-RevId: 331940975
2020-09-15 23:38:57 -07:00
Dean Deng f2822da542 Move ERESTART* error definitions to syserror package.
This is needed to avoid circular dependencies between the vfs and kernel
packages.

PiperOrigin-RevId: 327355524
2020-08-18 19:28:53 -07:00
Adin Scannell 420b791a3d Minor formatting updates for gvisor.dev.
* Aggregate architecture Overview in "What is gVisor?" as it makes more sense
  in one place.

* Drop "user-space kernel" and use "application kernel". The term "user-space
  kernel" is confusing when some platform implementations do not run in
  user-space (instead running in guest ring zero).

* Clear up the relationship between the Platform page in the user guide and the
  Platform page in the architecture guide, and ensure they are cross-linked.

* Restore the call-to-action quick start link in the main page, and drop the
  GitHub link (which also appears in the top-right).

* Improve image formatting by centering all doc and blog images, and move the
  image captions to the alt text.

PiperOrigin-RevId: 311845158
2020-05-15 20:05:18 -07:00
Fabricio Voznika 28399818fc Make ExtractErrno a function
PiperOrigin-RevId: 306891171
2020-04-16 11:49:27 -07:00