Commit Graph

480 Commits

Author SHA1 Message Date
Matt Borgerson
704ece9ac6 Merge QEMU v10.2.0 2026-01-18 16:36:55 -07:00
Philippe Mathieu-Daudé
0d42e48c73 Revert "migration/vmstate: remove VMSTATE_BUFFER_POINTER_UNSAFE macro"
Next commit will re-use VMSTATE_BUFFER_POINTER_UNSAFE().

This reverts commit 58341158d0.

Suggested-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2025-12-09 21:00:15 +01:00
Vladimir Sementsov-Ogievskiy
8c3843638c migration: vmsd errp handlers: return bool
No code actually depend on specific errno values returned by
vmstate_load_state. The only use of it is to check for success,
and sometimes inject numeric error values into error messages
in migration code. The latter is not a stopper for gradual
conversion to "errp + bool return value" APIs.

Big analysis of vmstate_load_state() callers, showing that
specific errno values are not actually used, is done by Peter
here:

https://lore.kernel.org/qemu-devel/aQDdRn8t0B8oE3gf@x1.local/

Converting of vmstate_load_state() itself will follow in
another series.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Link: https://lore.kernel.org/r/20251028170926.77219-2-vsementsov@yandex-team.ru
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-11-03 16:04:10 -05:00
Markus Armbruster
3ca0a0ab05 migration: Use bitset of MigMode instead of variable arguments
migrate_add_blocker_modes() and migration_add_notifier_modes use
variable arguments for a set of migration modes.  The variable
arguments get collected into a bitset for processsing.  Take a bitset
argument instead, it's simpler.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20251027064503.1074255-3-armbru@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-11-03 16:04:10 -05:00
Peter Xu
6a65fdee8a migration/cpr: Fix coverity report in cpr_exec_persist_state()
Per reported and analyzed by Peter:

https://lore.kernel.org/r/CAFEAcA_mUQ2NeoguR5efrhw7XYGofnriWEA=+Dg+Ocvyam1wAw@mail.gmail.com

mfd leak is a false positive, try to use a coverity annotation (which I
didn't find manual myself, but still give it a shot).

Fix the other one by capture error if setenv() failed.  When at it, pass
the error to the top (cpr_state_save()).  Along the way, changing all
retval to bool when errp is around.

Resolves: Coverity CID 1641391
Resolves: Coverity CID 1641392
Fixes: efc6587313 ("migration: cpr-exec save and load")
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20251021220407.2662288-3-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-11-03 16:04:09 -05:00
Philippe Mathieu-Daudé
346479c304 migration: Remove unused VMSTATE_UINTTL_EQUAL[_V]() macros
The last use of VMSTATE_UINTTL_EQUAL() was removed in commit
16a2497bd4 ("target-ppc: Fix CPU migration from qemu-2.6 <->
later versions"), 9 years ago; remove it.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20251016160313.25751-1-philmd@linaro.org
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-11-03 16:04:09 -05:00
Michael Tokarev
58341158d0 migration/vmstate: remove VMSTATE_BUFFER_POINTER_UNSAFE macro
The only user of this macro was VirtIONet.vlans, which has been
converted to regular VMSTATE_BUFFER.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Peter Xu <peterx@redhat.com>
Message-ID: <20251023135316.31128-3-mjt@tls.msk.ru>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2025-10-28 08:08:11 +01:00
Richard Henderson
bd6aa0d1e5 Merge tag 'staging-pull-request' of https://gitlab.com/peterx/qemu into staging
Migration/Memory Pull for 10.2

- PeterX's fix on tls warning for preempt channel when migratino completes
- Arun's series to enhance error reporting for vTPM and migration framework
- PeterX's patch to cleanup multifd send TLS BYE messages
- Juraj's fix on postcopy start state transition when switchover failed
- Yanfei's fix to migrate APIC before VFIO-PCI to avoid irq fallbacks
- Dan's cleanup to simplify error reporting in qemu_fill_buffer()
- PeterM's fix on address space leak when cpu hot plug / unplug
- Steve's cpr-exec wholeset

# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCaN/uIhIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wZ+mAEA1l2RS9sZS1W3vXQMCNb+Nu8Uo2p+e5Qj
# Uu6J0WVV+XsBANtzGZk2UM/frqlABywW3/ozJ4qBvIPKo758Mr6/lqUH
# =asUv
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 03 Oct 2025 08:39:14 AM PDT
# gpg:                using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg:                issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [unknown]
# gpg:                 aka "Peter Xu <peterx@redhat.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D  D1A9 3B5F CCCD F3AB D706

* tag 'staging-pull-request' of https://gitlab.com/peterx/qemu: (45 commits)
  migration-test: test cpr-exec
  vfio: cpr-exec mode
  migration: cpr-exec docs
  migration: cpr-exec mode
  migration: cpr-exec save and load
  migration: cpr-exec-command parameter
  oslib: qemu_clear_cloexec
  migration: add cpr_walk_fd
  migration: multi-mode notifier
  migration: simplify error reporting after channel read
  physmem: Destroy all CPU AddressSpaces on unrealize
  memory: New AS helper to serialize destroy+free
  include/system/memory.h: Clarify address_space_destroy() behaviour
  migration: ensure APIC is loaded prior to VFIO PCI devices
  migration: Fix state transition in postcopy_start() error handling
  migration/multifd/tls: Cleanup BYE message processing on sender side
  migration: HMP: Adjust the order of output fields
  migration: Make migration_has_failed() work even for CANCELLING
  io/crypto: Move tls premature termination handling into QIO layer
  backends/tpm: Propagate vTPM error on migration failure
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2025-10-04 09:10:58 -07:00
Steve Sistare
a3eae205c6 migration: cpr-exec mode
Add the cpr-exec migration mode.  Usage:
  qemu-system-$arch -machine aux-ram-share=on ...
  migrate_set_parameter mode cpr-exec
  migrate_set_parameter cpr-exec-command \
    <arg1> <arg2> ... -incoming <uri-1> \
  migrate -d <uri-1>

The migrate command stops the VM, saves state to uri-1,
directly exec's a new version of QEMU on the same host,
replacing the original process while retaining its PID, and
loads state from uri-1.  Guest RAM is preserved in place,
albeit with new virtual addresses.

The new QEMU process is started by exec'ing the command
specified by the @cpr-exec-command parameter.  The first word of
the command is the binary, and the remaining words are its
arguments.  The command may be a direct invocation of new QEMU,
or may be a non-QEMU command that exec's the new QEMU binary.

This mode creates a second migration channel that is not visible
to the user.  At the start of migration, old QEMU saves CPR state
to the second channel, and at the end of migration, it tells the
main loop to call cpr_exec.  New QEMU loads CPR state early, before
objects are created.

Because old QEMU terminates when new QEMU starts, one cannot
stream data between the two, so uri-1 must be a type,
such as a file, that accepts all data before old QEMU exits.
Otherwise, old QEMU may quietly block writing to the channel.

Memory-backend objects must have the share=on attribute, but
memory-backend-epc is not supported.  The VM must be started with
the '-machine aux-ram-share=on' option, which allows anonymous
memory to be transferred in place to the new process.  The memfds
are kept open across exec by clearing the close-on-exec flag, their
values are saved in CPR state, and they are mmap'd in new QEMU.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Link: https://lore.kernel.org/r/1759332851-370353-7-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Steve Sistare
efc6587313 migration: cpr-exec save and load
To preserve CPR state across exec, create a QEMUFile based on a memfd, and
keep the memfd open across exec.  Save the value of the memfd in an
environment variable so post-exec QEMU can find it.

These new functions are called in a subsequent patch.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1759332851-370353-6-git-send-email-steven.sistare@oracle.com
[peterx: fix build for Windows]
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Steve Sistare
a9f9eee58b migration: add cpr_walk_fd
Add a helper to walk all CPR fd's and run a callback for each.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1759332851-370353-3-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Steve Sistare
dc79c7d5e1 migration: multi-mode notifier
Allow a notifier to be added for multiple migration modes.
To allow a notifier to appear on multiple per-node lists, use
a generic list type.  We can no longer use NotifierWithReturnList,
because it shoe horns the notifier onto a single list.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/1759332851-370353-2-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Yanfei Xu
d943cef760 migration: ensure APIC is loaded prior to VFIO PCI devices
The load procedure of VFIO PCI devices involves setting up IRT
for each VFIO PCI devices. This requires determining whether an
interrupt is single-destination interrupt to decide between
Posted Interrupt(PI) or remapping mode for the IRTE. However,
determining this may require accessing the VM's APIC registers.

For example:
ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs)
  ...
    kvm_arch_irq_bypass_add_producer
      kvm_x86_call(pi_update_irte)
        vmx_pi_update_irte
          kvm_intr_is_single_vcpu

If the LAPIC has not been loaded yet, interrupts will use remapping
mode. To prevent the fallback of interrupt mode, keep APIC is always
loaded prior to VFIO PCI devices.

Signed-off-by: Yicong Shen <shenyicong.1023@bytedance.com>
Signed-off-by: Yanfei Xu <yanfei.xu@bytedance.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20250818131127.1021648-1-yanfei.xu@bytedance.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
40de712a89 migration: Add error-parameterized function variants in VMSD struct
- We need to have good error reporting in the callbacks in
  VMStateDescription struct. Specifically pre_save, pre_load
  and post_load callbacks.
- It is not possible to change these functions everywhere in one
  patch, therefore, we introduce a duplicate set of callbacks
  with Error object passed to them.
- So, in this commit, we implement 'errp' variants of these callbacks,
  introducing an explicit Error object parameter.
- This is a functional step towards transitioning the entire codebase
  to the new error-parameterized functions.
- Deliberately called in mutual exclusion from their counterparts,
  to prevent conflicts during the transition.
- New impls should preferentally use 'errp' variants of
  these methods, and existing impls incrementally converted.
  The variants without 'errp' are intended to be removed
  once all usage is converted.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-26-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
6f9fc6f501 migration: Remove error variant of vmstate_save_state() function
This commit removes the redundant vmstate_save_state_with_err()
function.

Previously, commit 969298f9d7 introduced vmstate_save_state_with_err()
to handle error propagation, while vmstate_save_state() existed for
non-error scenarios.
This is because there were code paths where vmstate_save_state_v()
(called internally by vmstate_save_state) did not explicitly set
errors on failure.

This change unifies error handling by
 - updating vmstate_save_state() to accept an Error **errp argument.
 - vmstate_save_state_v() ensures errors are set directly within the errp
   object, eliminating the need for two separate functions.

All calls to vmstate_save_state_with_err() are replaced with
vmstate_save_state(). This simplifies the API and improves code
maintainability.

vmstate_save_state() that only calls vmstate_save_state_v(),
by inference, also has errors set in errp in case of failure.
The errors are reported using error_report_err().
If we want the function to exit on error, then &error_fatal is
passed.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-24-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
d865e4aabd migration: push Error **errp into loadvm_process_enable_colo()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that loadvm_process_enable_colo() must report an error
in errp, in case of failure.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-21-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:02 -04:00
Arun Menon
c632ffbd74 migration: push Error **errp into vmstate_load_state()
This is an incremental step in converting vmstate loading
code to report error via Error objects instead of directly
printing it to console/monitor.
It is ensured that vmstate_load_state() must report an error
in errp, in case of failure.

The errors are temporarily reported using error_report_err().
This is removed in the subsequent patches in this series,
when we are actually able to propagate the error to the calling
function using errp. Whereas, if we want the function to exit on
error, then error_fatal is passed.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Arun Menon <armenon@redhat.com>
Tested-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250918-propagate_tpm_error-v14-2-36f11a6fb9d3@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2025-10-03 09:48:01 -04:00
TANG Tiancheng
30478e2ff7 migration: Add support for a variable-length array of UINT32 pointers
Add support for defining a vmstate field which is a variable-length array
of pointers, and use this to define a VMSTATE_TIMER_PTR_VARRAY() which allows
a variable-length array of QEMUTimer* to be used by devices.

Message-id: 20250909-timers-v1-0-7ee18a9d8f4b@linux.alibaba.com
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: TANG Tiancheng <lyndra@linux.alibaba.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20250911-timers-v3-2-60508f640050@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2025-10-03 13:15:14 +10:00
Juraj Marcin
beeac2df5f migration: Rename save_live_complete_precopy_thread to save_complete_precopy_thread
Recent patch [1] renames the save_live_complete_precopy handler to
save_complete, as the machine is not live in most cases when this
handler is executed. The same is true also for
save_live_complete_precopy_thread, therefore this patch removes the
"live" keyword from the handler itself and related types to keep the
naming unified.

In contrast to save_complete, this handler is only executed at the end
of precopy, therefore the "precopy" keyword is retained.

[1]: https://lore.kernel.org/all/20250613140801.474264-7-peterx@redhat.com/

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Juraj Marcin <jmarcin@redhat.com>
Link: https://lore.kernel.org/r/20250626085235.294690-1-jmarcin@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-07-11 10:37:39 -03:00
Peter Xu
57c43e52bd migration: Rename save_live_complete_precopy to save_complete
Now after merging the precopy and postcopy version of complete() hook,
rename the precopy version from save_live_complete_precopy() to
save_complete().

Dropping the "live" when at it, because it's in most cases not live when
happening (in precopy).

No functional change intended.

Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20250613140801.474264-7-peterx@redhat.com
[peterx: squash the fixup that covers a few more doc spots, per Juraj]
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-07-11 10:37:36 -03:00
Peter Xu
d7530a9682 migration: Drop save_live_complete_postcopy hook
The hook is only defined in two vmstate users ("ram" and "block dirty
bitmap"), meanwhile both of them define the hook exactly the same as the
precopy version.  Hence, this postcopy version isn't needed.

No functional change intended.

Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20250613140801.474264-6-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2025-07-11 10:37:35 -03:00
Steve Sistare
a6f2f9c42f migration: vfio cpr state hook
Define a list of vfio devices in CPR state, in a subsection so that
older QEMU can be live updated to this version.  However, new QEMU
will not be live updateable to old QEMU.  This is acceptable because
CPR is not yet commonly used, and updates to older versions are unusual.

The contents of each device object will be defined by the vfio subsystem
in a subsequent patch.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Link: https://lore.kernel.org/qemu-devel/1751493538-202042-14-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-07-03 13:42:28 +02:00
Steve Sistare
ccfc6715cf migration: cpr_get_fd_param helper
Add the helper function cpr_get_fd_param, to use when preserving
a file descriptor that is opened externally and passed to QEMU.
cpr_get_fd_param returns a descriptor number either from a QEMU
command-line parameter, from a getfd command, or from CPR state.

When a descriptor is passed to new QEMU via SCM_RIGHTS, its number
changes.  Hence, during CPR, the command-line parameter is ignored
in new QEMU, and over-ridden by the value found in CPR state.

Similarly, if the descriptor was originally specified by a getfd
command in old QEMU, the fd number is not known outside of QEMU,
and it changes when sent to new QEMU via SCM_RIGHTS.  Hence the
user cannot send getfd to new QEMU, but when the user sends a
hotplug command that references the fd, cpr_get_fd_param finds
its value in CPR state.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/qemu-devel/1751493538-202042-5-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-07-03 13:42:28 +02:00
Steve Sistare
081c09dc52 migration: lower handler priority
Define a vmstate priority that is lower than the default, so its handlers
run after all default priority handlers.  Since 0 is no longer the default
priority, translate an uninitialized priority of 0 to MIG_PRI_DEFAULT.

CPR for vfio will use this to install handlers for containers that run
after handlers for the devices that they contain.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/qemu-devel/1749569991-25171-3-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00
Steve Sistare
f95fd60ac1 migration: cpr helpers
Add the cpr_incoming_needed, cpr_open_fd, and cpr_resave_fd helpers,
for use when adding cpr support for vfio and iommufd.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/1749569991-25171-2-git-send-email-steven.sistare@oracle.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2025-06-11 14:01:58 +02:00