77 Commits

Author SHA1 Message Date
Jamie Liu c16d3fdfad pgalloc: log async page loading progress and info about awaited loads
PiperOrigin-RevId: 738555942
2025-03-19 15:17:36 -07:00
Jamie Liu b01944883b Add memmap.File.MemoryType()
This has no effect (outside of debug logging) until cl/723723715.

Updates #11436

PiperOrigin-RevId: 736686635
2025-03-13 17:08:52 -07:00
Jamie Liu cc69f4f190 pgalloc: no-op MemoryFile.UpdateUsage() during saving
During checkpointing, MemoryFile.SaveTo() narrows the set of pages known by
memory accounting to be "committed" to those containing non-zero bytes, in
order to avoid saving zero pages and therefore bloating the checkpoint. In the
process of doing so, it needs to touch pages in order to determine whether they
contain non-zero bytes, and does so without holding MemoryFile.mu (in a
MemoryFile.updateUsageLocked() callback). Thus, a concurrent call to
MemoryFile.UpdateUsage() => MemoryFile.updateUsageLocked() can racily observe
that touched zero pages are committed (via mincore()) and mark them
known-committed accordingly, causing them to be unintentionally saved in the
checkpoint.

When SaveOpts.ExcludeCommittedZeroPages is set, MemoryFile.SaveTo() does not
decommit previously-known-committed zero pages, since doing so would cost time;
the motivation for decommitting zero pages is to avoid increasing (real, host)
memory usage during checkpointing, but previously-known-committed pages must
have been using memory even before being touched. However, this significantly
widens the race window described above, since any future call to
MemoryFile.UpdateUsage() will observe said pages to be committed and mark them
known-committed again, effectively negating SaveOpts.ExcludeCommittedZeroPages.

To fix this, inhibit MemoryFile.UpdateUsage() during MemoryFile.SaveTo(); in
essence, when MemoryFile.SaveTo() is in progress, it exclusively defines what
pages are known-committed.
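
The inhibition described above can be sketched roughly as follows. This is a
minimal illustration, not the code in this CL: the memoryFile type, its saving
and scans fields, and the method names are hypothetical stand-ins for
MemoryFile and its internals.

```go
package main

import (
	"fmt"
	"sync"
)

// memoryFile is a hypothetical stand-in for pgalloc.MemoryFile; the field
// names are illustrative and not taken from gVisor.
type memoryFile struct {
	mu     sync.Mutex
	saving bool // true while saveTo is in progress
	scans  int  // number of usage scans actually performed
}

// updateUsage performs a usage scan unless a save is in progress, in which
// case it is a no-op. It reports whether a scan happened.
func (f *memoryFile) updateUsage() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if f.saving {
		return false
	}
	f.scans++
	return true
}

// saveTo marks the file as saving for the duration of fn. fn runs without
// holding mu, mirroring the fact that pages are touched without the lock.
func (f *memoryFile) saveTo(fn func()) {
	f.mu.Lock()
	f.saving = true
	f.mu.Unlock()
	defer func() {
		f.mu.Lock()
		f.saving = false
		f.mu.Unlock()
	}()
	fn()
}

func main() {
	f := &memoryFile{}
	fmt.Println("before save:", f.updateUsage())
	f.saveTo(func() {
		fmt.Println("during save:", f.updateUsage())
	})
	fmt.Println("after save:", f.updateUsage())
}
```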

PiperOrigin-RevId: 733876220
2025-03-05 14:50:59 -08:00
Ayush Ranjan 1e5b6ec429 Add more context to errors during restore.
This will help with debugging restore failures.

PiperOrigin-RevId: 693436013
2024-11-05 12:18:57 -08:00
Jamie Liu a15559c56c mm: limit AddressSpace overmapping during async page loading
On platforms that do not create page table entries in
platform.AddressSpace.MapFile(precommit=false), i.e. platforms for which
platform.Platform.MapUnit() == 0, platform.AddressSpace.MapFile() is generally
implemented as some form of host mmap(), which only synchronously creates host
kernel VMAs (virtual memory areas) and creates page table entries lazily in
response to application faults. On such platforms, MM.mapAsLocked() creates the
largest possible host VMAs since doing so reduces future sentry-handled page
faults and has effectively no additional cost. However, when async page loading
is active, this must wait for all mapped pages to be loaded, which may result
in the faulting application blocking for significantly longer than expected (in
experiments, a single page fault could result in waiting for up to 64GB of data
to be loaded). In such cases, additionally constrain mapped sizes to limit wait
times.
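
One way to picture the constraint is as clamping the candidate mapping range to
a bounded window around the faulting address. The sketch below is only an
illustration under assumptions: addrRange, clampAround, and the window size are
hypothetical, and the actual policy in MM.mapAsLocked() may differ.

```go
package main

import "fmt"

// addrRange is a half-open [start, end) range of addresses.
type addrRange struct{ start, end uint64 }

// clampAround returns a sub-range of ar that contains fault and is at most
// maxBytes long. It assumes ar contains fault, ar.start is page-aligned, and
// maxBytes is a non-zero multiple of the page size.
func clampAround(ar addrRange, fault, maxBytes uint64) addrRange {
	if ar.end-ar.start <= maxBytes {
		return ar
	}
	// Start from the maxBytes-aligned window containing fault, then keep
	// the result inside ar.
	start := fault - fault%maxBytes
	if start < ar.start {
		start = ar.start
	}
	end := start + maxBytes
	if end > ar.end {
		end = ar.end
	}
	return addrRange{start, end}
}

func main() {
	vma := addrRange{0x1000, 0x100000}
	got := clampAround(vma, 0x42345, 0x10000)
	fmt.Printf("%#x-%#x\n", got.start, got.end)
}
```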

PiperOrigin-RevId: 680699946
2024-09-30 13:39:11 -07:00
Jamie Liu 41f01d8f9c pgalloc: integrate async page loading
When a pages file is provided to `runsc restore`, reads from that file are
asynchronous (via statefile.AsyncReader) in order to maximize throughput.
However, all such reads must complete before Kernel.LoadFrom() returns, so
applications cannot execute before MemoryFile loading is complete. The main
objective of this CL is to allow reads to continue after Kernel.LoadFrom()
returns, allowing applications to execute while MemoryFile loading is still in
progress. This behavior is user-visible: it affects whether deleting the pages
file frees disk space immediately on POSIX filesystems, may affect whether
deletion is possible on non-POSIX filesystems, and prevents unmounting
regardless. Thus it is flag-guarded as `runsc restore --background`.

MemoryFile ranges that have yet to be loaded, but that are being waited-for by
applications, should be prioritized over ranges for which no application is
waiting. This requires that application requests for data (calls to
MemoryFile.(memmap.File).DataFD/MapInternal()) are able to determine which
ranges have not yet been loaded, request reads for such ranges with elevated
priority, and wait for only those reads to be completed; none of these are
supported by the existing statefile.AsyncReader.

Thus:

- Add //pkg/sentry/pgalloc/aio, which provides an async I/O API that is
  designed to be easily implementable using a goroutine pool, Linux native AIO,
  or io_uring, though only includes a goroutine pool implementation. (io_uring
  is widely disabled due to security vulnerabilities. In my testing, Linux
  native AIO is slower than the goroutine pool, but this may change with lower
  GOMAXPROCS, which needs further testing.)

- Move I/O scheduling into pgalloc: introduce an async page loader goroutine
  that is started by MemoryFile.LoadFrom() when async page loading is requested
  (implicitly, via the existence of a pages file), which is responsible for
  driving submission of read requests and handling their completions.
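
The prioritization requirement can be illustrated with a toy scheduler that
serves application-requested pages before continuing a sequential background
scan. This is a simplified sketch under assumptions: the loader type, its
page-index granularity, and its method names are hypothetical and do not
reflect the pgalloc or aio APIs.

```go
package main

import "fmt"

// loader is a hypothetical scheduler over page indices [0, total).
type loader struct {
	total  int
	next   int          // next page of the sequential background scan
	urgent []int        // pages that an application is waiting for
	loaded map[int]bool // pages already handed out for loading
}

func newLoader(total int) *loader {
	return &loader{total: total, loaded: make(map[int]bool)}
}

// request marks page as awaited by an application, if not yet loaded.
func (l *loader) request(page int) {
	if !l.loaded[page] {
		l.urgent = append(l.urgent, page)
	}
}

// step returns the next page to load, preferring awaited pages. The second
// result is false once every page has been handed out.
func (l *loader) step() (int, bool) {
	for len(l.urgent) > 0 {
		p := l.urgent[0]
		l.urgent = l.urgent[1:]
		if !l.loaded[p] {
			l.loaded[p] = true
			return p, true
		}
	}
	for l.next < l.total {
		p := l.next
		l.next++
		if !l.loaded[p] {
			l.loaded[p] = true
			return p, true
		}
	}
	return 0, false
}

func main() {
	l := newLoader(4)
	p, _ := l.step()
	fmt.Println("background:", p)
	l.request(3) // an application faults on page 3
	p, _ = l.step()
	fmt.Println("prioritized:", p)
}
```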

PiperOrigin-RevId: 679321884
2024-09-26 15:51:13 -07:00
Jamie Liu a50fb5ded0 Add memmap.File.DataFD().
This is used in cl/674746696 to ensure that users of MemoryFile data wait until
that data has been loaded.

PiperOrigin-RevId: 679255898
2024-09-26 12:51:20 -07:00
Jamie Liu 56521670ef state/wire: do not use sync.Pool for single-byte buffers
This package critically depends on reading/writing single bytes. Since
arguments to interface methods io.Reader.Read / io.Writer.Write escape, a naive
implementation would heap-allocate a one-byte array per read/write.

Prior to cl/625167495, the wire package provided custom Reader/Writer
interfaces, implementations of which were required to provide their own
ReadByte/WriteByte methods that did not take any escaping arguments.
cl/625167495 eliminated these interfaces and made the wire package use
sync.Pool to allocate one-byte arrays instead, simplifying the pipelining of
readers and writers but introducing non-trivial overhead. This CL re-introduces
wire.Reader/Writer, but as structs combining an io.Reader/Writer and a one-byte
array; this preserves the relative ease of using arbitrary io.Readers/Writers,
while eliminating sync.Pool overhead by essentially having callers of the wire
package provide the persistent buffer.

PiperOrigin-RevId: 666946511
2024-08-23 15:45:22 -07:00
Jamie Liu 26626ead8f pgalloc: log additional save/load stats
PiperOrigin-RevId: 666569217
2024-08-22 18:32:33 -07:00
Jamie Liu a5573312e0 Add explicit huge page and memory recycling support to pgalloc.MemoryFile.
This CL addresses the following major issues:

- When an application releases memory to the sentry, the sentry unconditionally
  releases that memory to the host, rather than allowing it to be reused for
  future allocations, in order to ensure that new allocations are uniformly
  decommitted (use no memory): cl/145016083. In most cases, this should have
  relatively little performance impact; since releasing memory from the
  application to the OS is expensive even outside of gVisor, application memory
  allocators optimizing for performance already limit the rate at which they
  release memory to the OS. However, in applications that involve frequent
  process creation and exit (e.g. build systems), this practice prevents reuse
  of memory deallocated by exiting processes for memory allocated by new
  processes, resulting in both performance degradation and a spike in memory
  usage (since the sentry may not have released all deallocated memory to the
  host by the time new allocations occur).

- gVisor's historical approach to application THP is based on THP being enabled
  on a per-memfd basis, using the MFD_HUGEPAGE flag not merged into the
  upstream Linux kernel
  (https://patchwork.kernel.org/project/linux-mm/patch/c140f56a-1aa3-f7ae-b7d1-93da7d5a3572@google.com/).
  Thus, on vanilla Linux kernels, gVisor cannot use THP for application memory
  without requiring the system to enable THP for all tmpfs files and memfds (by
  setting /sys/kernel/mm/transparent_hugepage/shmem_enabled to "always" or
  "force").

- Both MM and the application page allocator (pgalloc) are agnostic as to
  whether the underlying memory file will be THP-backed. Instead, both attempt
  to align hugepage-sized and larger allocations to hugepage boundaries, such
  that if the memory file happens to support THP then such allocations will be
  appropriately aligned to use THP. This is suboptimal since many allocations
  do not benefit from THP, resulting in memory underutilization.

These issues are especially relevant to platforms based on hardware
virtualization, where acquiring memory from the host is significantly more
expensive due to EPT/NPT fault overhead; when effective, THP reduces the
frequency with which said cost is incurred by a factor of 512, and page reuse
avoids incurring it at all.

Thus:

- Instead of inferring whether THP use is desired from allocation size,
  indicate this explicitly as AllocOpts.Huge, and only set it to true for
  allocations for non-stack private anonymous mappings.

- Add AllocateCallerIndirectCommit, a new possible value for AllocOpts.Mode
  that indicates that the caller will commit all pages in the allocation. In
  such cases, pgalloc can reuse deallocated pages without risking increased
  memory usage, internally referred to as "recycling".
  AllocateCallerIndirectCommit is used primarily for page faults on a
  THP-backed region. (It is also used for single-page allocations on non-THP
  backed regions, but due to expansion of faults to mm.privateAllocUnit-aligned
  ranges, this is relatively uncommon.)

- Allow different chunks in pgalloc.MemoryFile's backing file to have varying
  THP-ness, indicated to the host using MADV_HUGEPAGE/NOHUGEPAGE.

- Split pgalloc.MemoryFile's existing page metadata set into two sets tracking
  deallocated pages for small/huge-page-backed regions respectively; two sets
  tracking in-use pages for small/huge-page-backed regions respectively; and a
  fifth set tracking memory accounting state.

- Add MemoryFileOpts.DisableMemoryAccounting; this is primarily intended for
  pgalloc tests, but may also be applicable to disk-backed MemoryFiles.

Cleanup:

- Remove MemoryFile.usageSwapped; the UpdateUsage() optimization it enabled,
  described in updateUsageLocked(), was based on the condition that
  MemoryFile.mu would be locked throughout the call to updateUsageLocked(),
  which was invalidated by cl/337865250.

- Remove MemoryFileOpts.ManualZeroing, which is unused.

- Rename "reclaiming" to "releasing"; the former is confusing since "reclaim"
  in Linux has a significantly different meaning (essentially "eviction" in
  pgalloc), and the latter seems to be conventional in user-mode memory
  allocators.

Using THP for application memory requires setting
/sys/kernel/mm/transparent_hugepage/shmem_enabled to "advise", in order to
allow runsc to request THP from the kernel.

After this CL, pgalloc.MemoryFile still releases memory to the host as fast as
possible, limiting the effectiveness of page recycling. A following CL adds
optional memory release throttling to improve this.

Performance outcomes vary by workload and platform. (In all of the below,
"baseline" is without this CL, "expt" is with this CL, and "expt2" is with this
CL + reclaim throttling (cl/575046398).)

For systrap in GKE: As noted, this change is required to enable application THP
without forcing it on all host shmem users. In conjunction with recycling
(which has a relatively small effect on systrap since it does not use hardware
virtualization), THP use slightly improves performance, although whether this
is measurable is case-dependent. On an idle VM, with shmem_enabled = "advise":

```
goos: linux
goarch: amd64
cpu: Intel(R) Xeon(R) CPU @ 2.80GHz
                                                │  baseline  │               expt                │               expt2               │
                                                │   sec/op   │   sec/op    vs base               │   sec/op    vs base               │
BuildABSL/page_cache.clean/filesystem.bindfs-16   39.09 ± 4%   38.84 ± 5%       ~ (p=0.947 n=30)   38.84 ± 3%       ~ (p=0.854 n=30)
BuildABSL/page_cache.dirty/filesystem.bindfs-16   37.83 ± 3%   36.58 ± 4%       ~ (p=0.057 n=30)   36.83 ± 5%       ~ (p=0.314 n=30)
BuildABSL/page_cache.clean/filesystem.tmpfs-16    39.34 ± 3%   38.59 ± 4%       ~ (p=0.350 n=30)   38.58 ± 4%       ~ (p=0.300 n=30)
BuildABSL/page_cache.dirty/filesystem.tmpfs-16    37.83 ± 3%   36.08 ± 4%  -4.64% (p=0.026 n=30)   36.58 ± 4%       ~ (p=0.123 n=30)
BuildABSL/page_cache.clean/filesystem.rootfs-16   39.59 ± 4%   38.83 ± 3%       ~ (p=0.485 n=30)   40.09 ± 5%       ~ (p=0.971 n=30)
BuildABSL/page_cache.dirty/filesystem.rootfs-16   36.83 ± 3%   38.08 ± 5%       ~ (p=0.307 n=30)   38.08 ± 1%       ~ (p=0.242 n=30)
BuildABSL/page_cache.clean/filesystem.fusefs-16   38.34 ± 3%   37.59 ± 5%       ~ (p=0.752 n=30)   38.59 ± 3%       ~ (p=0.982 n=30)
BuildABSL/page_cache.dirty/filesystem.fusefs-16   37.58 ± 4%   38.08 ± 5%       ~ (p=0.708 n=30)   36.08 ± 6%       ~ (p=0.127 n=30)
BuildGRPC/page_cache.clean/filesystem.bindfs-16   212.7 ± 2%   211.0 ± 1%       ~ (p=0.138 n=30)   211.2 ± 1%       ~ (p=0.458 n=30)
BuildGRPC/page_cache.dirty/filesystem.bindfs-16   210.0 ± 1%   210.0 ± 1%       ~ (p=0.542 n=30)   209.7 ± 1%       ~ (p=0.665 n=30)
BuildGRPC/page_cache.clean/filesystem.rootfs-16   210.5 ± 1%   210.0 ± 1%       ~ (p=0.423 n=30)   210.0 ± 1%       ~ (p=0.142 n=30)
BuildGRPC/page_cache.dirty/filesystem.rootfs-16   210.2 ± 1%   209.0 ± 1%       ~ (p=0.219 n=30)   209.5 ± 1%       ~ (p=0.230 n=30)
geomean                                           67.62        66.97       -0.96%                  67.12       -0.74%
```

The KVM platform benefits significantly from reduced nested page faults due to
huge pages, and to a lesser extent due to recycling:

```
goos: linux
goarch: amd64
cpu: Intel(R) Xeon(R) W-2135 CPU @ 3.70GHz
                                                │  baseline  │                 expt                  │                 expt2                 │
                                                │   sec/op   │   sec/op    vs base                   │   sec/op    vs base                   │
BuildABSL/page_cache.clean/filesystem.bindfs-12   43.11 ± 2%   39.35 ± 3%   -8.71% (p=0.000 n=20)      38.10 ± 4%  -11.63% (p=0.000 n=20+19)
BuildABSL/page_cache.dirty/filesystem.bindfs-12   42.35 ± 3%   39.09 ± 4%   -7.69% (p=0.000 n=20+19)   39.09 ± 5%   -7.69% (p=0.000 n=20+19)
BuildABSL/page_cache.clean/filesystem.tmpfs-12    42.35 ± 3%   38.34 ± 5%   -9.46% (p=0.000 n=20)      38.59 ± 3%   -8.87% (p=0.000 n=20+19)
BuildABSL/page_cache.dirty/filesystem.tmpfs-12    42.09 ± 1%   37.59 ± 4%  -10.70% (p=0.000 n=20)      38.09 ± 4%   -9.51% (p=0.000 n=20+19)
BuildABSL/page_cache.clean/filesystem.rootfs-12   42.85 ± 3%   38.84 ± 3%   -9.35% (p=0.000 n=20)      39.09 ± 3%   -8.77% (p=0.000 n=20+17)
BuildABSL/page_cache.dirty/filesystem.rootfs-12   41.85 ± 2%   39.59 ± 6%   -5.40% (p=0.000 n=20+19)   38.09 ± 3%   -9.00% (p=0.000 n=20+19)
BuildABSL/page_cache.clean/filesystem.fusefs-12   42.60 ± 2%   38.34 ± 2%  -10.00% (p=0.000 n=20)      39.59 ± 3%   -7.06% (p=0.000 n=20+19)
BuildABSL/page_cache.dirty/filesystem.fusefs-12   42.09 ± 4%   39.09 ± 3%   -7.13% (p=0.000 n=20)      38.09 ± 3%   -9.52% (p=0.000 n=20+19)
BuildGRPC/page_cache.clean/filesystem.bindfs-12   207.7 ± 1%   206.4 ± 0%   -0.60% (p=0.018 n=20)      205.9 ± 1%   -0.85% (p=0.001 n=20+19)
BuildGRPC/page_cache.dirty/filesystem.bindfs-12   206.9 ± 1%   206.9 ± 1%        ~ (p=0.121 n=20)      204.4 ± 1%   -1.22% (p=0.004 n=20+19)
BuildGRPC/page_cache.clean/filesystem.rootfs-12   207.7 ± 1%   204.9 ± 1%   -1.33% (p=0.004 n=20)      203.9 ± 0%   -1.81% (p=0.000 n=20+19)
BuildGRPC/page_cache.dirty/filesystem.rootfs-12   206.9 ± 1%   204.9 ± 0%   -0.97% (p=0.004 n=20+19)   203.9 ± 0%   -1.45% (p=0.000 n=20+19)
geomean                                           71.97        67.63        -6.03%                     67.28        -6.52%
```
PiperOrigin-RevId: 647771821
2024-06-28 12:56:46 -07:00
Jamie Liu 56ab580ccb Automated rollback of changelist 633961720
PiperOrigin-RevId: 636602151
2024-05-23 10:50:10 -07:00
Konstantin Bogomolov 0bf4e9f6e5 Set limit on how big MemoryFile.Allocate calls can be.
Either TotalHostMem or TotalMem is a good candidate for the limit, because if
either of these is set we should not exceed it.

The motivation for this is to help catch syscalls that cause allocations with
blatantly bad size values.

PiperOrigin-RevId: 633961720
2024-05-15 08:23:50 -07:00
Jamie Liu e882488ed7 pgalloc: add SaveOpts.ExcludeCommittedZeroPages
This option, enabled via `runsc checkpoint --exclude-committed-zero-pages`,
instructs `pgalloc.MemoryFile.SaveTo()` to also exclude definitely-committed
zero pages from checkpointing (in addition to possibly-committed zero pages,
which are always scanned for and excluded). This is useful when the application
being checkpointed is known to have a large number of committed zero pages:
pages that (1) have been touched by application memory accesses, a syscall such
as read(), or page pinning by e.g. a driver, and (2) have not been subsequently
released by the application to the operating system by e.g. munmap() or
madvise(MADV_DONTNEED) (+ page unpinning if necessary), and (3) are filled with
zero bytes.
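
As a rough illustration of condition (3), the sketch below scans a buffer page
by page and reports which pages contain any non-zero byte. The 4 KiB page size
and the function names are assumptions for the example, not the scanning code
in MemoryFile.SaveTo().

```go
package main

import "fmt"

const pageSize = 4096 // assumed page size for this sketch

// isZeroPage reports whether every byte in p is zero.
func isZeroPage(p []byte) bool {
	for _, b := range p {
		if b != 0 {
			return false
		}
	}
	return true
}

// nonZeroPages returns the indices of pages in mem that contain at least one
// non-zero byte. Any trailing partial page is ignored.
func nonZeroPages(mem []byte) []int {
	var pages []int
	for off := 0; off+pageSize <= len(mem); off += pageSize {
		if !isZeroPage(mem[off : off+pageSize]) {
			pages = append(pages, off/pageSize)
		}
	}
	return pages
}

func main() {
	mem := make([]byte, 3*pageSize)
	mem[pageSize+5] = 1 // dirty the second page only
	fmt.Println(nonZeroPages(mem))
}
```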

Minor changes:

- In `MemoryFile.updateUsageLocked()`, pass file offset to `checkCommitted` so
  that `MemoryFile.SaveTo()`'s `checkCommitted` can use `FALLOC_FL_PUNCH_HOLE`
  to decommit pages rather than `MADV_REMOVE` (which translates addresses to
  file offsets and then invokes `FALLOC_FL_PUNCH_HOLE`).

- In `MemoryFile.SaveTo()`, buffer up to a hugepage worth of pages to decommit
  rather than decommitting one page per syscall.

- Increment `MemoryFile.usageExpected` in `MemoryFile.LoadFrom()`, such that
  the first following call to `MemoryFile.UpdateUsage()` might skip the call
  to `MemoryFile.updateUsageLocked()` (if memory usage hasn't changed since
  loading).
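
The decommit buffering in the second item can be sketched as coalescing
contiguous page offsets into ranges capped at one hugepage before issuing each
hole punch. The sizes (4 KiB pages, 2 MiB hugepages) and all names below are
assumptions for illustration; the punch callback stands in for the
FALLOC_FL_PUNCH_HOLE syscall.

```go
package main

import "fmt"

const (
	pageSize     = 4096    // assumed small page size
	hugePageSize = 2 << 20 // assumed hugepage size
)

// fileRange is a byte range within the backing file.
type fileRange struct{ off, length uint64 }

// batchDecommit groups sorted page offsets into contiguous ranges of at most
// hugePageSize bytes and calls punch once per range, rather than once per
// page.
func batchDecommit(zeroPages []uint64, punch func(fileRange)) {
	var cur fileRange
	for _, off := range zeroPages {
		if cur.length != 0 && off == cur.off+cur.length && cur.length < hugePageSize {
			cur.length += pageSize
			continue
		}
		if cur.length != 0 {
			punch(cur)
		}
		cur = fileRange{off, pageSize}
	}
	if cur.length != 0 {
		punch(cur)
	}
}

func main() {
	pages := []uint64{0, 4096, 8192, 40960}
	batchDecommit(pages, func(r fileRange) {
		fmt.Printf("punch hole at %d, length %d\n", r.off, r.length)
	})
}
```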

PiperOrigin-RevId: 632370455
2024-05-09 21:50:58 -07:00
Jamie Liu 31979a7187 mm: add fallback to buffered I/O when memmap.File.MapInternal() is unavailable
MapInternal() returns a coherent memory mapping of the host file descriptor
represented by a memmap.File, in the sentry's address space. This is
principally used when the sentry needs to access the contents of application
memory (for e.g. syscall arguments passed by pointer, or the source/destination
of a write()/read() syscall); it usually looks up the memmap.Files backing
application addresses and obtains mappings via MapInternal().

/dev/nvidia-uvm cannot generally be mapped into the sentry's address space, for
reasons described by
https://github.com/google/gvisor/blob/master/g3doc/proposals/nvidia_driver_proxy.md#unified-virtual-memory-uvm
(in short, nvidia-uvm requires that a given page at file offset X can only be
mapped at address X). To allow the sentry to access the contents of such
mappings, make it possible for memmap.File.MapInternal() to indicate that a
fallback to buffered I/O is required, add interface methods
memmap.File.Buffer{Read,Write}At() to perform this buffered I/O, and implement
this fallback in the MM I/O path.
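
The shape of the fallback can be sketched as follows. This is a simplified
illustration: the file interface, the errBufferedIOFallback sentinel, and the
readAt helper are hypothetical and use plain byte slices rather than the
memmap and safemem types.

```go
package main

import (
	"errors"
	"fmt"
)

// errBufferedIOFallback is a hypothetical sentinel meaning that the file
// cannot be mapped and that buffered I/O must be used instead.
var errBufferedIOFallback = errors.New("mapping unavailable; use buffered I/O")

// file is a pared-down stand-in for memmap.File.
type file interface {
	mapInternal(off, n int) ([]byte, error)
	bufferReadAt(off int, dst []byte) (int, error)
}

// mappableFile supports direct mappings.
type mappableFile struct{ data []byte }

func (f *mappableFile) mapInternal(off, n int) ([]byte, error) {
	return f.data[off : off+n], nil
}

func (f *mappableFile) bufferReadAt(off int, dst []byte) (int, error) {
	return copy(dst, f.data[off:]), nil
}

// unmappableFile only supports buffered I/O.
type unmappableFile struct{ data []byte }

func (f *unmappableFile) mapInternal(off, n int) ([]byte, error) {
	return nil, errBufferedIOFallback
}

func (f *unmappableFile) bufferReadAt(off int, dst []byte) (int, error) {
	return copy(dst, f.data[off:]), nil
}

// readAt prefers a mapping and falls back to buffered I/O when asked to.
func readAt(f file, off int, dst []byte) (int, error) {
	m, err := f.mapInternal(off, len(dst))
	if errors.Is(err, errBufferedIOFallback) {
		return f.bufferReadAt(off, dst)
	}
	if err != nil {
		return 0, err
	}
	return copy(dst, m), nil
}

func main() {
	dst := make([]byte, 3)
	n, err := readAt(&unmappableFile{data: []byte("abcdefg")}, 2, dst)
	fmt.Println(n, string(dst), err)
}
```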

This CL does not use the new buffered I/O fallback anywhere; a following CL
adds it to nvproxy's nvidia-uvm.

Updates #10331

PiperOrigin-RevId: 629830825
2024-05-01 14:05:55 -07:00
Ayush Ranjan 06c085fae5 Add AsyncReader implementation in statefile package.
This type allows reading asynchronously and provides a Wait() method as a
barrier operation.

PiperOrigin-RevId: 627520473
2024-04-23 15:20:56 -07:00
Ayush Ranjan 43c2c00c50 Delete wire.Reader and wire.Writer.
These interfaces only existed to add ReadByte() and WriteByte() methods. There
were only 4 implementors of these methods: compressio.{Simple}{Reader/Writer}.
And there were only 2 users of this.

Using io.{Reader/Writer} is more extensible. For instance, it allows using
*os.File with the `wire` package without any wrappers.

Updated the 2 users to implement their own {read/write}Byte(). These use
sync.Pool to avoid heap-allocating the [1]byte storage on each call to
io.Reader.Read or io.Writer.Write (the argument escapes through the interface
call). Earlier, calls to compressio.Simple{Reader/Writer}'s implementation of
{Read/Write}Byte would cause a heap allocation.

PiperOrigin-RevId: 625167495
2024-04-15 19:40:27 -07:00
Ayush Ranjan 5e9207a966 Create separate pages.img checkpoint file when compression=none.
PiperOrigin-RevId: 624210535
2024-04-12 09:57:06 -07:00
Ayush Ranjan ed9678b679 Delete pgalloc.MemoryFileProvider.
The work done in c087777e37 ("Plumb restore context to afterLoad()") makes
pgalloc.MemoryFileProvider redundant as structs can now easily restore
pgalloc.MemoryFile in stateify's afterLoad() method. This allows structs to
have a pgalloc.MemoryFile field and use that directly, instead of going through
the provided interface.

This cleans up a lot of code and also should be more performant (avoids an
interface method call on many hot paths).

PiperOrigin-RevId: 615258927
2024-03-12 20:06:58 -07:00
Ayush Ranjan faf07bade6 Reassociate pma.file to the correct pgalloc.MemoryFile on restore.
Earlier we were always restoring pma.file to mm.mfp.MemoryFile(). However,
d8eb29ed6f ("Add support for saving PMAs referencing tmpfs filestore files.")
added support for saving PMAs that reference "private" pgalloc.MemoryFiles that
are different from mm.mfp.MemoryFile().

We achieve the correct restore by:
- Adding a "RestoreID" field to pgalloc.MemoryFile. Private MemoryFiles set
  this to a vfs.RestoreID.String(). The non-private MemoryFile does not set it.
- The MemoryFile struct is not savable by itself, but the pma.file field is
  saved as a string. We store the RestoreID string there.
- On restore, if RestoreID is "", then restore using CtxMemoryFile. If it has a
  non-empty RestoreID, then restore using CtxMemoryFileMap.
- Cleanup: vfs.CtxFilesystemMemoryFileMap was moved to pgalloc.CtxMemoryFileMap
  so we can now provide a pgalloc.MemoryFileMapFromContext() method which
  cleans up some code. Also the key to this map (MemoryFileOpts.RestoreID)
  belongs to pgalloc, so it seems like the right place to have this context.
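
The restore-time lookup described in the list above can be sketched as follows.
The types and the restoreFile helper are hypothetical simplifications, and
returning an error for an unknown ID is an assumption of this sketch rather
than documented behavior.

```go
package main

import "fmt"

// memoryFile is a stand-in for pgalloc.MemoryFile.
type memoryFile struct{ name string }

// restoreFile picks the MemoryFile for a saved RestoreID string: the empty
// string selects the main file, anything else is looked up in the map of
// private files.
func restoreFile(id string, mainFile *memoryFile, private map[string]*memoryFile) (*memoryFile, error) {
	if id == "" {
		return mainFile, nil
	}
	mf, ok := private[id]
	if !ok {
		return nil, fmt.Errorf("no MemoryFile with RestoreID %q", id)
	}
	return mf, nil
}

func main() {
	mainFile := &memoryFile{name: "main"}
	private := map[string]*memoryFile{"rootfs": {name: "rootfs filestore"}}
	for _, id := range []string{"", "rootfs", "missing"} {
		mf, err := restoreFile(id, mainFile, private)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(mf.name)
	}
}
```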

PiperOrigin-RevId: 614903073
2024-03-11 21:44:03 -07:00
NymanRobin f481172b53 Convert atomic.Value to atomic.Pointer[T] 2024-03-05 11:09:23 +02:00
Ayush Ranjan 2759f79fce Plumb safemem.ReaderFunc through pgalloc.Allocate() to avoid heap allocations.
The safemem.Reader interface receiver was causing the implementation to escape
to the heap (at compile time, the compiler cannot prove anything about the
implementation).

Using generics for io.ReadFullToBlocks() does not help in avoiding the heap
allocation.

With this change, we avoid at least these allocations:
- rw variable in kcov.Kcov.TaskWork().
- safemem.BlockSeqReader struct in mm.MemoryManager.getPMAsInternalLocked().
- gr variable in fsutil.FileRangeSet.Fill().
- h variable in gofer.dentry.Translate().
- h variable in gofer.dentryReadWriter.ReadToBlocks().

Some of these are on hot paths for IO workloads.
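
The general pattern is a function type that also satisfies the interface, with
the allocation path accepting the concrete function type. The sketch below uses
plain byte slices and hypothetical names (reader, readerFunc, allocate); it
illustrates the shape of the API only, and whether a given call avoids a heap
allocation depends on the compiler's escape analysis.

```go
package main

import "fmt"

// reader is a stand-in for an interface such as safemem.Reader.
type reader interface {
	readTo(dst []byte) (int, error)
}

// readerFunc adapts a plain function to the reader interface.
type readerFunc func(dst []byte) (int, error)

func (f readerFunc) readTo(dst []byte) (int, error) { return f(dst) }

// allocate takes the concrete function type rather than the interface, so
// callers can pass a closure directly. A nil fill leaves the buffer zeroed.
func allocate(n int, fill readerFunc) ([]byte, error) {
	buf := make([]byte, n)
	if fill == nil {
		return buf, nil
	}
	if _, err := fill(buf); err != nil {
		return nil, err
	}
	return buf, nil
}

func main() {
	src := []byte("gvisor")
	buf, err := allocate(3, func(dst []byte) (int, error) {
		return copy(dst, src), nil
	})
	fmt.Println(string(buf), err)

	// The same function type still works wherever the interface is needed.
	var r reader = readerFunc(func(dst []byte) (int, error) { return len(dst), nil })
	n, _ := r.readTo(make([]byte, 2))
	fmt.Println(n)
}
```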

PiperOrigin-RevId: 609805848
2024-02-23 12:32:25 -08:00
Ayush Ranjan d8eb29ed6f Add support for saving PMAs referencing tmpfs filestore files.
The rootfs is such a mount (with default overlay2=root:self configuration).
It should be possible to checkpoint this mount when the application has active
VMAs that have allocated PMAs from the backing filestore file.

2d90b66af1 ("Add checkpoint/restore support for tmpfs with file backend.")
added support for saving such a tmpfs filestore file during checkpoint and
restoring it correctly. Hence, after restore, the VMAs and PMAs should be
valid.

PiperOrigin-RevId: 605766485
2024-02-09 16:50:24 -08:00
Jing Chen be48200c0e Re-order loads in BUILD files to make transformations reversible in Copybara.
PiperOrigin-RevId: 598898756
2024-01-16 11:21:40 -08:00
prof awk 4d30f2c9ef use new clear builtin to clear bufs 2023-11-27 19:43:25 +02:00
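
For context, the clear builtin (Go 1.21 and later) sets every element of a
slice to its zero value, replacing the explicit loop. The helper names below
are hypothetical.

```go
package main

import "fmt"

// zeroWithLoop is the pre-Go 1.21 idiom for zeroing a buffer.
func zeroWithLoop(b []byte) {
	for i := range b {
		b[i] = 0
	}
}

// zeroWithClear uses the clear builtin; the slice length is unchanged.
func zeroWithClear(b []byte) {
	clear(b)
}

func main() {
	a := []byte{1, 2, 3}
	b := []byte{4, 5, 6}
	zeroWithLoop(a)
	zeroWithClear(b)
	fmt.Println(a, b)
}
```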
Jamie Liu 88d35cd8f1 segment.Set API improvements.
- Replace Add with TryInsertRange; for symmetry with RemoveRange, to establish
  the convention that *Range methods perform an implicit search in the set, and
  so that we can fork InsertRange which has Insert-like semantics (panics on
  conflict), which the majority of callers want.

- Rename MergeRange and MergeAdjacent to MergeInsideRange and MergeOutsideRange
  respectively; for the same convention, and to more clearly describe the
  difference between these functions.

- Add MergePrev and MergeNext. These solve the longstanding problem of
  requiring a separate call to Merge{Inside,Outside}Range (which will perform
  additional searches) after mutating a set in a relatively simple manner.

- Add SplitBefore and SplitAfter, which are halves of Isolate. These are
  slightly preferable to Isolate in many use cases for the latter (when
  iterating segments within a range, only the first segment can include a key
  before the start, so this saves some useless comparisons in almost every
  iteration of such loops), and are useful in some more complex algorithms.
  Also add LowerBoundSegmentSplitBefore and UpperBoundSegmentSplitAfter as
  ergonomic aids for the former use case.

- Add {Visit,Mutate}[Full]Range, which are convenience wrappers around the
  iterator API (including new functions) for simple use cases (and hence also
  serve to demonstrate how the new iterator functions are used).
  MutateFullRange in particular replaces ApplyContiguous and adds merging
  during iteration.

- Add RemoveFullRange, which (analogous to {Visit,Mutate}FullRange) is a
  variant of RemoveRange that checks that the range is fully covered by
  segments.

- Add Unisolate, which combines MergePrev and MergeNext in the same way that
  Isolate combines SplitBefore and SplitAfter. This is useful for merging after
  mutation of a single segment.

- Add {First,Last,LowerBound,UpperBound}LargeEnoughGap, which are convenient
  loop starters when using gap tracking.

- Replace SegmentDataSlices with FlatSegment, which is easier to use when
  specifying "set literals" (as in tests).

- Make {prev,next}LargeEnoughGapHelper iterative rather than tail-recursive.

- Slightly optimize Iterator.{Prev,Next}NonEmpty: GapIterator.{Start,End} needs
  to find the corresponding Iterator, so call Iterator.{Prev,Next}Segment
  directly rather than doing so twice.
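
The TryInsertRange/InsertRange split in the first item can be illustrated with
a toy set backed by a sorted slice. This is a simplified sketch: the real
segment.Set is a generated B-tree with values and iterators, and the signatures
here (a bool result, no value argument) are assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// rng is a half-open [start, end) key range.
type rng struct{ start, end int }

// set holds sorted, non-overlapping ranges.
type set struct{ segs []rng }

// tryInsertRange inserts r if it overlaps no existing segment and reports
// whether the insertion happened.
func (s *set) tryInsertRange(r rng) bool {
	// Find the first segment that ends after r starts; it is the only
	// candidate for the lowest overlap.
	i := sort.Search(len(s.segs), func(i int) bool { return s.segs[i].end > r.start })
	if i < len(s.segs) && s.segs[i].start < r.end {
		return false
	}
	s.segs = append(s.segs, rng{})
	copy(s.segs[i+1:], s.segs[i:])
	s.segs[i] = r
	return true
}

// insertRange has Insert-like semantics: it panics on conflict.
func (s *set) insertRange(r rng) {
	if !s.tryInsertRange(r) {
		panic(fmt.Sprintf("range %v overlaps an existing segment", r))
	}
}

func main() {
	var s set
	s.insertRange(rng{10, 20})
	s.insertRange(rng{20, 30})
	fmt.Println(s.tryInsertRange(rng{15, 25}))
	fmt.Println(s.segs)
}
```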

PiperOrigin-RevId: 583506148
2023-11-17 15:51:12 -08:00