izzy/xemu

mirror of https://github.com/izzy2lost/xemu.git synced 2026-03-26 18:22:55 -07:00

Author	SHA1	Message	Date
izzy2lost	d2c6059df3	integrate libsamplerate and other audio fixes	2026-03-20 09:06:52 -04:00
izzy2lost	3a58ffe0c8	enable real NV2A transform-program. Fixes spy vs spy, metal slug 3 and other games	2026-03-20 05:20:28 -04:00
izzy2lost	ddc32eaea4	added opengl es option	2026-03-06 14:09:09 -05:00
izzy2lost	b75429b103	Video working slow bad audio	2026-02-06 19:51:23 -05:00
Matt Borgerson	ebe582768e	util/mstring: Use G_GNUC_PRINTF macro for format attribute specification	2026-01-18 19:06:37 -07:00
Matt Borgerson	704ece9ac6	Merge QEMU v10.2.0	2026-01-18 16:36:55 -07:00
Philippe Mathieu-Daudé	efd6b3d176	Revert "hw/net/virtio-net: make VirtIONet.vlans an array instead of a pointer" Per https://lore.kernel.org/qemu-devel/7798584d-e861-47b7-af52-2c2efb67a4de@proxmox.com/: Loading a VM state taken with v10.1.2 or older doesn't work anymore, using the script [*] we get: kvm: VQ 1 size 0x100 < last_avail_idx 0x9 - used_idx 0x3e30 kvm: load of migration failed: Operation not permitted: error while loading state for instance 0x0 of device '0000:00:13.0/virtio-net': Failed to load element of type virtio for virtio: -1 qemu-system-x86_64: Missing section footer for 0000:00:13.0/virtio-net qemu-system-x86_64: Section footer error, section_id: 41 [*]: #!/bin/bash rm /tmp/disk.qcow2 args=" -netdev type=tap,id=net1,ifname=tap104i1,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on -device virtio-net-pci,mac=BC:24:11:32:3C:69,netdev=net1,bus=pci.0,addr=0x13,id=net1 -machine type=pc-i440fx-10.1 " $1/qemu-img create -f qcow2 /tmp/disk.qcow2 1G $1/qemu-system-x86_64 --qmp stdio --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/disk.qcow2 $args <<EOF {"execute": "qmp_capabilities"} {"execute": "snapshot-save", "arguments": { "job-id": "save0", "tag": "snap", "vmstate": "node0", "devices": ["node0"] } } {"execute": "quit"} EOF $2/qemu-system-x86_64 --qmp stdio --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/disk.qcow2 $args -loadvm snap This reverts commit `3a9cd2a4a1`. Reported-by: Fiona Ebner <f.ebner@proxmox.com> Suggested-by: Fiona Ebner <f.ebner@proxmox.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2025-12-09 21:00:15 +01:00
Philippe Mathieu-Daudé	0d42e48c73	Revert "migration/vmstate: remove VMSTATE_BUFFER_POINTER_UNSAFE macro" Next commit will re-use VMSTATE_BUFFER_POINTER_UNSAFE(). This reverts commit `58341158d0`. Suggested-by: Fiona Ebner <f.ebner@proxmox.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2025-12-09 21:00:15 +01:00
Stefan Weil	4fdff25625	hw/pci: Fix typo in documentation Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Message-ID: <20251209125759.764296-1-sw@weilnetz.de> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2025-12-09 20:56:14 +01:00
Philippe Mathieu-Daudé	df3b304605	osdep: Undefine FSCALE definition to fix Solaris builds Solaris defines FSCALE in <sys/param.h>: 301 /* 302 * Scale factor for scaled integers used to count 303 * %cpu time and load averages. 304 */ 305 #define FSHIFT 8 /* bits to right of fixed binary point */ 306 #define FSCALE (1<<FSHIFT) When emulating the SVE FSCALE instruction, we define the same name in decodetree format in target/arm/tcg/sve.decode: 1129:FSCALE 01100101 .. 00 1001 100 ... ..... ..... @rdn_pg_rm This leads to a definition clash: In file included from ../target/arm/tcg/translate-sve.c:21: ../target/arm/tcg/translate.h:875:17: error: pasting "trans_" and "(" does not give a valid preprocessing token 875 \| static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ \| ^~~~~~ ../target/arm/tcg/translate-sve.c:4205:5: note: in expansion of macro 'TRANS_FEAT' 4205 \| TRANS_FEAT(NAME, FEAT, gen_gvec_fpst_arg_zpzz, name##_zpzz_fns[a->esz], a) \| ^~~~~~~~~~ ../target/arm/tcg/translate-sve.c:4249:1: note: in expansion of macro 'DO_ZPZZ_FP' 4249 \| DO_ZPZZ_FP(FSCALE, aa64_sve, sve_fscalbn) \| ^~~~~~~~~~ ../target/arm/tcg/translate-sve.c:4249:12: error: expected declaration specifiers or '...' before numeric constant 4249 \| DO_ZPZZ_FP(FSCALE, aa64_sve, sve_fscalbn) \| ^~~~~~ ../target/arm/tcg/translate.h:875:25: note: in definition of macro 'TRANS_FEAT' 875 \| static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ \| ^~~~ ../target/arm/tcg/translate-sve.c:4249:1: note: in expansion of macro 'DO_ZPZZ_FP' 4249 \| DO_ZPZZ_FP(FSCALE, aa64_sve, sve_fscalbn) \| ^~~~~~~~~~ ../target/arm/tcg/translate.h:875:47: error: pasting "arg_" and "(" does not give a valid preprocessing token 875 \| static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ \| ^~~~ ../target/arm/tcg/translate-sve.c:4205:5: note: in expansion of macro 'TRANS_FEAT' 4205 \| TRANS_FEAT(NAME, FEAT, gen_gvec_fpst_arg_zpzz, name##_zpzz_fns[a->esz], a) \| ^~~~~~~~~~ ../target/arm/tcg/translate-sve.c:4249:1: note: in expansion of macro 'DO_ZPZZ_FP' 4249 \| DO_ZPZZ_FP(FSCALE, aa64_sve, sve_fscalbn) \| ^~~~~~~~~~ In file included from ../target/arm/tcg/translate-sve.c:100: libqemu-aarch64-softmmu.a.p/decode-sve.c.inc:1227:13: warning: 'trans_FSCALE' used but never defined 1227 \| static bool trans_FSCALE(DisasContext *ctx, arg_FSCALE *a); \| ^~~~~~~~~~~~ ../target/arm/tcg/translate-sve.c:4249:30: warning: 'sve_fscalbn_zpzz_fns' defined but not used [-Wunused-const-variable=] 4249 \| DO_ZPZZ_FP(FSCALE, aa64_sve, sve_fscalbn) \| ^~~~~~~~~~~ ../target/arm/tcg/translate-sve.c:4201:42: note: in definition of macro 'DO_ZPZZ_FP' 4201 \| static gen_helper_gvec_4_ptr const name##_zpzz_fns[4] = { \ \| ^~~~ As a kludge, undefine it globally in <qemu/osdep.h>. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20251203120315.62889-1-philmd@linaro.org>	2025-12-09 20:42:53 +01:00
Richard Henderson	8c00f56fca	tcg/tci: Disable -Wundef FFI_GO_CLOSURES warning Since we build TCI with FFI (commit `22f15579fa` "tcg: Build ffi data structures for helpers") we get on Darwin: In file included from ../../tcg/tci.c:22: In file included from include/tcg/helper-info.h:13: /Library/Developer/CommandLineTools/SDKs/MacOSX15.sdk/usr/include/ffi/ffi.h:483:5: warning: 'FFI_GO_CLOSURES' is not defined, evaluates to 0 [-Wundef] 483 \| #if FFI_GO_CLOSURES \| ^ 1 warning generated. This was fixed in upstream libffi in 2023, but not backported to MacOSX. Simply disable the warning locally. Reported-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2025-12-05 07:50:15 -06:00
Peter Maydell	ef44cc0a76	hw/pci: Make msix_init take a uint32_t for nentries msix_init() and msix_init_exclusive_bar() take an "unsigned short" argument for the number of MSI-X vectors to try to use. This is big enough for the maximum permitted number of vectors, which is 2048. Unfortunately, we have several devices (most notably virtio) which allow the user to specify the desired number of vectors, and which use uint32_t properties for this. If the user sets the property to a value that is too big for a uint16_t, the value will be truncated when it is passed to msix_init(), and msix_init() may then return success if the truncated value is a valid one. The resulting mismatch between the number of vectors the msix code thinks the device has and the number of vectors the device itself thinks it has can cause assertions, such as the one in issue 2631, where "-device virtio-mouse-pci,vectors=19923041" is interpreted by msix as "97 vectors" and by the virtio-pci layer as "19923041 vectors"; a guest attempt to access vector 97 thus passes the virtio-pci bounds checking and hits an assertion in msix_vector_use(). Avoid this by making msix_init() and its wrapper function msix_init_exclusive_bar() take the number of vectors as a uint32_t. The erroneous command line will now produce the warning qemu-system-i386: -device virtio-mouse-pci,vectors=19923041: warning: unable to init msix vectors to 19923041 and proceed without crashing. (The virtio device warns and falls back to not using MSIX, rather than complaining that the option is not a valid value; this is the same as the existing behaviour for values that are beyond the MSI-X maximum possible value but fit into a 16-bit integer, like 2049.) To ensure this doesn't result in potential overflows in calculation of the BAR size in msix_init_exclusive_bar(), we duplicate the nentries error-check from msix_init() at the top of msix_init_exclusive_bar(), so we know nentries is sane before we start using it. 
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2631 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20251107131044.1321637-1-peter.maydell@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2025-11-25 22:41:40 +01:00
Richard Henderson	a8d023be62	Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into staging Block layer patches - Image creation: Honour pwrite_zeroes_alignment for zeroing first sector - block-backend: Fix race (causing a crash) when resuming queued requests # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCgAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmklvQMRHGt3b2xmQHJl # ZGhhdC5jb20ACgkQfwmycsiPL9byFA//d9VtU3wLZpJRL2mnYH2qJME3WeqJaSB+ # FzkG32gkCb0JtH5yr427oJYKhZsKpNkz20E7z4+1ZT4ovcjo7mddJYW7DwaMjUmO # G3UXWE33ayLNZFMDrsMRV5tfiQkSb7Y0ekYfwU7GjC3qhMhRIX9eCRBrCLD6jdUx # mg2h0ML0smE9AV5AEuunwSoqp+rD+OpRQ6EBkkCVF5iMlIHeiewP/TQbJtKBtxdK # AumiIcYgPbH7QFG8kDTmVCCGPDC0v2i1G6Owtptbt9RmWTEGp++Ngm8F+7u/kPMk # weRhlVhnxwDxVxmHzvysh0m+n08oVJyA2vB4QJrti6ZmgDcJYulxFfQgPCKxjvGd # 6va02q0DYrCbO3YiViaAtnudEuqqaB1to57jeQq6tP9KrpH8uzAddrFWeb3TY4gN # CvWr+p4V7bYvteNASJt/+VC5T3haR+U5eCRD5nOKPyXqCbMT+z6zZRuYxP2q1W6i # VwQLIjuWIx+bXVRUrHkf9VNy1clB4ga+ZDbTGFrl0NOLDcn6u3Vcr4GQ7VvQ31Pj # ulGA9F+DXjPRQpZC+WnCZsBSLwVBrNeYPyxsCSk2ORH930djgb7e1lxX5OawT7MT # lNzbQ+N7PXCd5Yt0UyJ3uCF6gqlpvmUV7IZMbyoYHceoCnz8+McqvGORYfzkLwk9 # HUDS3UTI8Ks= # =57x4 # -----END PGP SIGNATURE----- # gpg: Signature made Tue 25 Nov 2025 06:28:19 AM PST # gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6 # gpg: issuer "kwolf@redhat.com" # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [unknown] # gpg: WARNING: The key's User ID is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * tag 'for-upstream' of https://repo.or.cz/qemu/kevin: iotests: add Linux loop device image creation test block: use pwrite_zeroes_alignment when writing first sector file-posix: populate pwrite_zeroes_alignment block-backend: Fix race when resuming queued requests Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2025-11-25 10:25:16 -08:00
Stefan Hajnoczi	d704a13d2c	block: use pwrite_zeroes_alignment when writing first sector Since commit `5634622bcb` ("file-posix: allow BLKZEROOUT with -t writeback"), qemu-img create errors out on a Linux loop block device with a 4 KB sector size: # dd if=/dev/zero of=blockfile bs=1M count=1024 # losetup --sector-size 4096 /dev/loop0 blockfile # qemu-img create -f raw /dev/loop0 1G Formatting '/dev/loop0', fmt=raw size=1073741824 qemu-img: /dev/loop0: Failed to clear the new image's first sector: Invalid argument Use the pwrite_zeroes_alignment block limit to avoid misaligned fallocate(2) or ioctl(BLKZEROOUT) in the block/file-posix.c block driver. Cc: qemu-stable@nongnu.org Fixes: `5634622bcb` ("file-posix: allow BLKZEROOUT with -t writeback") Reported-by: Jean-Louis Dupond <jean-louis@dupond.be> Buglink: https://gitlab.com/qemu-project/qemu/-/issues/3127 Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20251007141700.71891-3-stefanha@redhat.com> Tested-by: Fiona Ebner <f.ebner@proxmox.com> Reviewed-by: Fiona Ebner <f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-25 15:26:22 +01:00
Klaus Jensen	3050b34921	hw/nvme: fix namespace atomic parameter setup Coverity complains about a possible copy-paste error in the verification of the namespace atomic parameters (CID 1642811). While the check is correct, the code (and the intention) is unclear. Fix this by reworking how the parameters are verified. Peter also identified that the realize function was not correctly erroring out if parameters were misconfigured, so fix that too. Lastly, change the error messages to be more descriptive. Coverity: CID 1642811 Fixes: `bce51b8370` ("hw/nvme: add atomic boundary support") Fixes: `3b41acc962` ("hw/nvme: enable ns atomic writes") Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>	2025-11-25 09:21:35 +01:00
Richard Henderson	5e0242e9a8	Merge tag 'hw-misc-20251118' of https://github.com/philmd/qemu into staging Misc HW patches - Re-enable xenpvh machine in qemu-system-arm/aarch64 binaries - Correct Xilinx Zynq DMA Devcfg registers range size - Correct ACCEL_KERNEL_GSI_IRQFD_POSSIBLE typo - Allow for multiple CHR_EVENT_CLOSED events in QTest framework - Fix ACMD41 state machine for SD cards in SPI mode - Avoid confusing address calculation around eMMC RPMB HMAC - Fix a pair of build failures on Solaris (guest-agent and RDMA migration) - Correct QOM parent of LASI south bridge - Clarify MIPS / PPC 32-bit hosts removal in documentation - Prevent further uses of DEVICE_NATIVE_ENDIAN definition - Fix Error uses in eBPF - Update David Hildenbrand's email address # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmkcwjAACgkQ4+MsLN6t # wN7Wxg//UMbpEgp92clPcGUX1RFHViEYu5DDM96nwjLpOR8nNAJvLZ5+qxDfyZRQ # qfVGaE0cm5a/rXRMgFAzeJw5ptcSwLJXsUvnRuNLEpKlIAfqInqqk+JTi/r7hJSq # W8m07IrdtADwoas0OYKur0XwF+k1hqVOENQWPxiLiyViEH2tR8MFA+nrqQhZzgwo # Emu3ICc01wX+hhY2R51mf+GdVcmr8RACc07lmG7MnMtvQW8vzCkA/VJ5jWWQv6Xj # ADKBTciYEK/PKD5vbbwMadZfxaWRiH1l+unfpw0qXC46YOAMvpe3+0mRqk7VeSRc # anqdXQk9dbqw7qwJ+L+RVdUjNf1bLc9LxOePeMOgsNzd8wxlsBia9PDNxvVTRFmh # /JxLYO9bM4vRojaGOCFoppoF++JSdZzI6WM56hY465L3VCx36V1p2YESX8x/5F1B # +w/JPV0dUGeq+MFUNKg/pBy9dgRYIGJfcbcp2jwMxyEB5d0np53zXbMaZmqX/cEO # AjE/haqtpu/yAqSK7oklx1gJEI9gRE0cJp2B/7l/3RwW3fcMsN3HJB3GH8f+3vg2 # VQMYDrAWBF5wA/5HQtsGNrfImlYGHa535KnLujTcNLVwS+2gZ6N6FwfwhM2fwXQh # +X7nQZbBsAVa0jDqck8zkIarVuISocC10DWfuP5k4hlKxeyg71M= # =K5DF # -----END PGP SIGNATURE----- # gpg: Signature made Tue 18 Nov 2025 08:00:00 PM CET # gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE # gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. 
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE * tag 'hw-misc-20251118' of https://github.com/philmd/qemu: ebpf: Make ebpf_rss_load() return value consistent with @errp ebpf: Clean up useless error check in ebpf_rss_set_all() ebpf: Fix stubs to set an error when they return failure scripts/checkpatch: Check DEVICE_NATIVE_ENDIAN docs: Mention 32-bit PPC host as removed docs: Correct release of MIPS deprecations / removals migration/rdma: Check ntohll() availability with meson buildsys: Remove dead 'mips' entry in supported_cpus[] array hw/southbridge/lasi: Correct LasiState parent qga/commands: Include proper Solaris header for getloadavg() hw/sd/sdcard: Avoid confusing address calculation in rpmb_calc_hmac hw/arm: Re-enable xenpvh machine in qemu-system-arm/aarch64 binaries hw/dma/zynq-devcfg: Fix register memory hw/sd: Fix ACMD41 state machine in SPI mode hw/sd: Fix incorrect idle state reporting in R1 response for SPI mode system/qtest.c: Allow for multiple CHR_EVENT_CLOSED events hw/intc/ioapic: Fix ACCEL_KERNEL_GSI_IRQFD_POSSIBLE typo MAINTAINERS: Update David Hildenbrand's email address Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2025-11-19 07:38:44 +01:00
Philippe Mathieu-Daudé	9c3b76a0d4	hw/southbridge/lasi: Correct LasiState parent TYPE_LASI_CHIP inherits from TYPE_SYS_BUS_DEVICE, not TYPE_PCI_HOST_BRIDGE, so its parent structure is of SysBusDevice type. Cc: qemu-stable@nongnu.org Fixes: `376b851909` ("hppa: Add support for LASI chip with i82596 NIC") Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Gustavo Romero <gustavo.romero@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20251117091804.56529-1-philmd@linaro.org>	2025-11-18 19:59:36 +01:00
Hanna Czenczek	d45b2c65f2	block: Note in which AioContext AIO CBs are called This doesn’t seem to be specified anywhere, but is something we probably want to be clear. I believe it is reasonable to implicitly assume that callbacks are run in the current thread (unless explicitly noted otherwise), so codify that assumption. Some implementations don’t actually fulfill this contract yet. The next patches should rectify that. Note: I don’t know of any user-visible bugs produced by not running AIO callbacks in the original context. AIO functionality is generally mapped to coroutines through the use of bdrv_co_io_em_complete(), which can run in any AioContext, and will always wake the yielding coroutine in its original context. The only benefit here is that running bdrv_co_io_em_complete() in the original context will make that aio_co_wake() most likely a simpler qemu_coroutine_enter() instead of scheduling the wakeup through AioContext.co_schedule_bh. Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-ID: <20251110154854.151484-17-hreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-18 18:01:55 +01:00
Hanna Czenczek	aed74d3d62	block: Note on aio_co_wake use if not yet yielding aio_co_wake() is generally safe to call regardless of whether the coroutine is already yielding or not. If it is not yet yielding, it will be scheduled to run when it does yield. Caveats: - The caller must be independent of the coroutine (to ensure the coroutine must be yielding if both are in the same AioContext), i.e. must not be the same coroutine - The coroutine must yield at some point Make note of this so callers can reason that their use is safe. Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-ID: <20251110154854.151484-2-hreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-18 18:01:39 +01:00
Eric Blake	de252f7993	qio: Add QIONetListener API for using AioContext The user calling himself "John Doe" reported a deadlock when attempting to use qemu-storage-daemon to serve both a base file over NBD, and a qcow2 file with that NBD export as its backing file, from the same process, even though it worked just fine when there were two q-s-d processes. The bulk of the NBD server code properly uses coroutines to make progress in an event-driven manner, but the code for spawning a new coroutine at the point when listen(2) detects a new client was hard-coded to use the global GMainContext; in other words, the callback that triggers nbd_client_new to let the server start the negotiation sequence with the client requires the main loop to be making progress. However, the code for bdrv_open of a qcow2 image with an NBD backing file uses an AIO_WAIT_WHILE nested event loop to ensure that the entire qcow2 backing chain is either fully loaded or rejected, without any side effects from the main loop causing unwanted changes to the disk being loaded (in short, an AioContext represents the set of actions that are known to be safe while handling block layer I/O, while excluding any other pending actions in the global main loop with potentially larger risk of unwanted side effects). This creates a classic case of deadlock: the server can't progress to the point of accept(2)ing the client to write to the NBD socket because the main loop is being starved until the AIO_WAIT_WHILE completes the bdrv_open, but the AIO_WAIT_WHILE can't progress because it is blocked on the client coroutine stuck in a read() of the expected magic number from the server side of the socket. This patch adds a new API to allow clients to opt in to listening via an AioContext rather than a GMainContext. This will allow NBD to fix the deadlock by performing all actions during bdrv_open in the main loop AioContext. 
Technical debt warning: I would have loved to utilize a notify function with AioContext to guarantee that we don't finalize listener due to an object_unref if there is any callback still running (the way GSource does), but wiring up notify functions into AioContext is a bigger task that will be deferred to a later QEMU release. But for solving the NBD deadlock, it is sufficient to note that the QMP commands for enabling and disabling the NBD server are really the only points where we want to change the listener's callback. Furthermore, those commands are serviced in the main loop, which is the same AioContext that is also listening for connections. Since a thread cannot interrupt itself, we are ensured that at the point where we are changing the watch, there are no callbacks active. This is NOT as powerful as the GSource cross-thread safety, but sufficient for the needs of today. An upcoming patch will then add a unit test (kept separate to make it easier to rearrange the series to demonstrate the deadlock without this patch). Fixes: https://gitlab.com/qemu-project/qemu/-/issues/3169 Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20251113011625.878876-26-eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2025-11-13 10:58:26 -06:00
Eric Blake	cc0faf8273	qio: Prepare NetListener to use AioContext For ease of review, this patch adds an AioContext pointer to the QIONetListener struct, the code to trace it, and refactors listener->io_source to instead be an array of utility structs; but the aio_context pointer is always NULL until the next patch adds an API to set it. There should be no semantic change in this patch. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-ID: <20251113011625.878876-25-eblake@redhat.com>	2025-11-13 10:54:50 -06:00
Eric Blake	ec59a65a4d	qio: Provide accessor around QIONetListener->sioc An upcoming patch needs to pass more than just sioc as the opaque pointer to an AioContext; but since our AioContext code in general (and its QIO Channel wrapper code) lacks a notify callback present with GSource, we do not have the trivial option of just g_malloc'ing a small struct to hold all that data coupled with a notify of g_free. Instead, the data pointer must outlive the registered handler; in fact, having the data pointer have the same lifetime as QIONetListener is adequate. But the cleanest way to stick such a helper struct in QIONetListener will be to rearrange internal struct members. And that in turn means that all existing code that currently directly accesses listener->nsioc and listener->sioc[] should instead go through accessor functions, to be immune to the upcoming struct layout changes. So this patch adds accessor methods qio_net_listener_nsioc() and qio_net_listener_sioc(), and puts them to use. While at it, notice that the pattern of grabbing an sioc from the listener only to turn around and call qio_channel_socket_get_local_address is common enough to also warrant the helper of qio_net_listener_get_local_address, and fix a copy-paste error in the corresponding documentation. Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20251113011625.878876-24-eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2025-11-13 10:54:44 -06:00
Eric Blake	9d86181874	qio: Protect NetListener callback with mutex Without a mutex, NetListener can run into this data race between a thread changing the async callback function to use when a client connects, and the thread servicing polling of the listening sockets: Thread 1: qio_net_listener_set_client_func(lstnr, f1, ...); => foreach sock: socket => object_ref(lstnr) => sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, object_unref); Thread 2: poll() => event POLLIN on socket => ref(GSourceCallback) => if (lstnr->io_func) // while lstnr->io_func is f1 ...interrupt.. Thread 1: qio_net_listener_set_client_func(lstnr, f2, ...); => foreach sock: socket => g_source_unref(sock_src) => foreach sock: socket => object_ref(lstnr) => sock_src = qio_channel_socket_add_watch_source(sock, ...., lstnr, object_unref); Thread 2: => call lstnr->io_func(lstnr->io_data) // now sees f2 => return dispatch(sock) => unref(GSourceCallback) => destroy-notify => object_unref Found by inspection; I did not spend the time trying to add sleeps or execute under gdb to try and actually trigger the race in practice. This is a SEGFAULT waiting to happen if f2 can become NULL because thread 1 deregisters the user's callback while thread 2 is trying to service the callback. Other messes are also theoretically possible, such as running callback f1 with an opaque pointer that should only be passed to f2 (if the client code were to use more than just a binary choice between a single async function or NULL). Mitigating factor: if the code that modifies the QIONetListener can only be reached by the same thread that is executing the polling and async callbacks, then we are not in a two-thread race documented above (even though poll can see two clients trying to connect in the same window of time, any changes made to the listener by the first async callback will be completed before the thread moves on to the second client). 
However, QEMU is complex enough that this is hard to generically analyze. If QMP commands (like nbd-server-stop) are run in the main loop and the listener uses the main loop, things should be okay. But when a client uses an alternative GMainContext, or if servicing a QMP command hands off to a coroutine to avoid blocking, I am unable to state with certainty whether a given net listener can be modified by a thread different from the polling thread running callbacks. At any rate, it is worth having the API be robust. To ensure that modifying a NetListener can be safely done from any thread, add a mutex that guarantees atomicity to all members of a listener object related to callbacks. This problem has been present since QIONetListener was introduced. Note that this does NOT prevent the case of a second round of the user's old async callback being invoked with the old opaque data, even when the user has already tried to change the async callback during the first async callback; it is only about ensuring that there is no sharding (the eventual io_func(io_data) call that does get made will correspond to a particular combination that the user had requested at some point in time, and not be sharded to a combination that never existed in practice). In other words, this patch maintains the status quo that a user's async callback function already needs to be robust to parallel clients landing in the same window of poll servicing, even when only one client is desired, if that particular listener can be amended in a thread other than the one doing the polling. CC: qemu-stable@nongnu.org Fixes: `53047392` ("io: introduce a network socket listener API", v2.12.0) Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20251113011625.878876-20-eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> [eblake: minor commit message wording improvements] Signed-off-by: Eric Blake <eblake@redhat.com>	2025-11-13 08:46:41 -06:00
Eric Blake	b5676493a0	qio: Remember context of qio_net_listener_set_client_func_full io/net-listener.c has two modes of use: asynchronous (the user calls qio_net_listener_set_client_func to wake up the callback via the global GMainContext, or qio_net_listener_set_client_func_full to wake up the callback via the caller's own alternative GMainContext), and synchronous (the user calls qio_net_listener_wait_client which creates its own GMainContext and waits for the first client connection before returning, with no need for a user's callback). But commit `938c8b79` has a latent logic flaw: when qio_net_listener_wait_client finishes on its temporary context, it reverts all of the siocs back to the global GMainContext rather than the potentially non-NULL context they might have been originally registered with. Similarly, if the user creates a net-listener, adds initial addresses, registers an async callback with a non-default context (which ties to all siocs for the initial addresses), then adds more addresses with qio_net_listener_add, the siocs for later addresses are blindly placed in the global context, rather than sharing the context of the earlier ones. In practice, I don't think this has caused issues. As pointed out by the original commit, all async callers prior to that commit were already okay with the NULL default context; and the typical usage pattern is to first add ALL the addresses the listener will pay attention to before ever setting the async callback. Likewise, if a file uses only qio_net_listener_set_client_func instead of qio_net_listener_set_client_func_full, then it is never using a custom context, so later assignments of async callbacks will still be to the same global context as earlier ones. Meanwhile, any callers that want to do the sync operation to grab the first client are unlikely to register an async callback; altogether bypassing the question of whether later assignments of a GSource are being tied to a different context over time. 
I do note that chardev/char-socket.c is the only file that calls both qio_net_listener_wait_client (sync for a single client in tcp_chr_accept_server_sync), and qio_net_listener_set_client_func_full (several places, all with chr->gcontext, but sometimes with a NULL callback function during teardown). But as far as I can tell, the two uses are mutually exclusive, based on the is_waitconnect parameter to qmp_chardev_open_socket_server. That said, it is more robust to remember when an async callback function is tied to a non-default context, and have both the sync wait and any late address additions honor that same context. That way, the code will be robust even if a later user performs a sync wait for a specific client in the middle of servicing a longer-lived QIONetListener that has an async callback for all other clients. CC: qemu-stable@nongnu.org Fixes: `938c8b79` ("qio: store gsources for net listeners", v2.12.0) Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-ID: <20251113011625.878876-19-eblake@redhat.com>	2025-11-13 08:29:46 -06:00
Eric Blake	1bd7bfbc2b	block: Allow drivers to control protocol prefix at creation This patch is pure refactoring: instead of hard-coding permission to use a protocol prefix when creating an image, the drivers can now pass in a parameter, comparable to what they could already do for opening a pre-existing image. This patch is purely mechanical (all drivers pass in true for now), but it will enable the next patch to cater to drivers that want to differ in behavior for the primary image vs. any secondary images that are opened at the same time as creating the primary image. Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20250915213919.3121401-5-eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:06:09 +01:00
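
The truncation described in the msix_init commit (ef44cc0a76) above can be checked by hand. The following is a minimal Python sketch, not QEMU code: it models the C conversion from uint32_t to a 16-bit unsigned short as a mask with 0xFFFF, and `as_unsigned_short` and `old_check_passes` are hypothetical helper names introduced only for illustration. The 2048 limit and the example values 19923041 and 2049 come from the commit message.

```python
# Maximum permitted number of MSI-X vectors, as stated in the commit message.
MSIX_MAX_VECTORS = 2048


def as_unsigned_short(value: int) -> int:
    """Model C's conversion of a uint32_t to a 16-bit unsigned short."""
    return value & 0xFFFF


def old_check_passes(requested: int) -> bool:
    """Hypothetical stand-in for a range check applied AFTER truncation.

    This only illustrates the failure mode; it is not the actual
    msix_init() code.
    """
    n = as_unsigned_short(requested)
    return 1 <= n <= MSIX_MAX_VECTORS


requested = 19923041  # value from "-device virtio-mouse-pci,vectors=19923041"

# The low 16 bits of 19923041 are 97, which looks like a valid vector count.
print(as_unsigned_short(requested))        # 97
print(old_check_passes(requested))         # True: truncated value is accepted

# Checking the full 32-bit value instead rejects the request.
print(1 <= requested <= MSIX_MAX_VECTORS)  # False

# 2049 fits in 16 bits, so it is rejected either way.
print(old_check_passes(2049))              # False
```

Since 19923041 = 304 * 65536 + 97, the msix layer saw 97 vectors while virtio-pci kept the full value, which is the mismatch the commit removes by widening the parameter.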


18508 Commits