izzy/xemu

mirror of https://github.com/izzy2lost/xemu.git synced 2026-03-26 18:22:55 -07:00

Author	SHA1	Message	Date
Matt Borgerson	704ece9ac6	Merge QEMU v10.2.0	2026-01-18 16:36:55 -07:00
Klaus Jensen	3050b34921	hw/nvme: fix namespace atomic parameter setup Coverity complains about a possible copy-paste error in the verification of the namespace atomic parameters (CID 1642811). While the check is correct, the code (and the intention) is unclear. Fix this by reworking how the parameters are verified. Peter also identified that the realize function was not correctly erroring out if parameters were misconfigured, so fix that too. Lastly, change the error messages to be more describing. Coverity: CID 1642811 Fixes: `bce51b8370` ("hw/nvme: add atomic boundary support") Fixes: `3b41acc962` ("hw/nvme: enable ns atomic writes") Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>	2025-11-25 09:21:35 +01:00
Hanna Czenczek	d45b2c65f2	block: Note in which AioContext AIO CBs are called This doesn’t seem to be specified anywhere, but is something we probably want to be clear. I believe it is reasonable to implicitly assume that callbacks are run in the current thread (unless explicitly noted otherwise), so codify that assumption. Some implementations don’t actually fulfill this contract yet. The next patches should rectify that. Note: I don’t know of any user-visible bugs produced by not running AIO callbacks in the original context. AIO functionality is generally mapped to coroutines through the use of bdrv_co_io_em_complete(), which can run in any AioContext, and will always wake the yielding coroutine in its original context. The only benefit here is that running bdrv_co_io_em_complete() in the original context will make that aio_co_wake() most likely a simpler qemu_coroutine_enter() instead of scheduling the wakeup through AioContext.co_schedule_bh. Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-ID: <20251110154854.151484-17-hreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-18 18:01:55 +01:00
Hanna Czenczek	aed74d3d62	block: Note on aio_co_wake use if not yet yielding aio_co_wake() is generally safe to call regardless of whether the coroutine is already yielding or not. If it is not yet yielding, it will be scheduled to run when it does yield. Caveats: - The caller must be independent of the coroutine (to ensure the coroutine must be yielding if both are in the same AioContext), i.e. must not be the same coroutine - The coroutine must yield at some point Make note of this so callers can reason that their use is safe. Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-ID: <20251110154854.151484-2-hreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-18 18:01:39 +01:00
Eric Blake	1bd7bfbc2b	block: Allow drivers to control protocol prefix at creation This patch is pure refactoring: instead of hard-coding permission to use a protocol prefix when creating an image, the drivers can now pass in a parameter, comparable to what they could already do for opening a pre-existing image. This patch is purely mechanical (all drivers pass in true for now), but it will enable the next patch to cater to drivers that want to differ in behavior for the primary image vs. any secondary images that are opened at the same time as creating the primary image. Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20250915213919.3121401-5-eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:06:09 +01:00
Yeqi Fu	9730b9974d	block: replace TABs with space Bring the block files in line with the QEMU coding style, with spaces for indentation. This patch partially resolves the issue 371. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/371 Signed-off-by: Yeqi Fu <fufuyqqqqqq@gmail.com> Message-ID: <20230325085224.23842-1-fufuyqqqqqq@gmail.com> [thuth: Rebased the patch to the current master branch] Signed-off-by: Thomas Huth <thuth@redhat.com> Message-ID: <20251007163511.334178-1-thuth@redhat.com> [kwolf: Fixed up vertical alignemnt] Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:06:09 +01:00
Stefan Hajnoczi	047dabef97	block/io_uring: use aio_add_sqe() AioContext has its own io_uring instance for file descriptor monitoring. The disk I/O io_uring code was developed separately. Originally I thought the characteristics of file descriptor monitoring and disk I/O were too different, requiring separate io_uring instances. Now it has become clear to me that it's feasible to share a single io_uring instance for file descriptor monitoring and disk I/O. We're not using io_uring's IOPOLL feature or anything else that would require a separate instance. Unify block/io_uring.c and util/fdmon-io_uring.c using the new aio_add_sqe() API that allows user-defined io_uring sqe submission. Now block/io_uring.c just needs to submit readv/writev/fsync and most of the io_uring-specific logic is handled by fdmon-io_uring.c. There are two immediate advantages: 1. Fewer system calls. There is no need to monitor the disk I/O io_uring ring fd from the file descriptor monitoring io_uring instance. Disk I/O completions are now picked up directly. Also, sqes are accumulated in the sq ring until the end of the event loop iteration and there are fewer io_uring_enter(2) syscalls. 2. Less code duplication. Note that error_setg() messages are not supposed to end with punctuation, so I removed a '.' for the non-io_uring build error message. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-ID: <20251104022933.618123-15-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:06:09 +01:00
Stefan Hajnoczi	1eebdab3c3	aio-posix: add aio_add_sqe() API for user-defined io_uring requests Introduce the aio_add_sqe() API for submitting io_uring requests in the current AioContext. This allows other components in QEMU, like the block layer, to take advantage of io_uring features without creating their own io_uring context. This API supports nested event loops just like file descriptor monitoring and BHs do. This comes at a complexity cost: CQE callbacks must be placed on a list so that nested event loops can invoke pending CQE callbacks from parent event loops. If you're wondering why CqeHandler exists instead of just a callback function pointer, this is why. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-ID: <20251104022933.618123-14-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:06:09 +01:00
Stefan Hajnoczi	87e7a0f423	aio-posix: add fdmon_ops->dispatch() The ppoll and epoll file descriptor monitoring implementations rely on the event loop's generic file descriptor, timer, and BH dispatch code to invoke user callbacks. The io_uring file descriptor monitoring implementation will need io_uring-specific dispatch logic for CQE handlers for custom SQEs. Introduce a new FDMonOps ->dispatch() callback that allows file descriptor monitoring implementations to invoke user callbacks. The next patch will use this new callback. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20251104022933.618123-13-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:06:09 +01:00
Stefan Hajnoczi	421dcc8023	aio: add errp argument to aio_context_setup() When aio_context_new() -> aio_context_setup() fails at startup it doesn't really matter whether errors are returned to the caller or the process terminates immediately. However, it is not acceptable to terminate when hotplugging --object iothread at runtime. Refactor aio_context_setup() so that errors can be propagated. The next commit will set errp when fdmon_io_uring_setup() fails. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20251104022933.618123-10-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:06:09 +01:00
Stefan Hajnoczi	3769b9abe9	aio: free AioContext when aio_context_new() fails g_source_destroy() only removes the GSource from the GMainContext it's attached to, if any. It does not free it. Use g_source_unref() instead so that the AioContext (which embeds a GSource) is freed. There is no need to call g_source_destroy() in aio_context_new() because the GSource isn't attached to a GMainContext yet. aio_ctx_finalize() expects everything to be set up already, so introduce the new ctx->initialized boolean and do nothing when called with !initialized. This also requires moving aio_context_setup() down after event_notifier_init() since aio_ctx_finalize() won't release any resources that aio_context_setup() acquired. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-ID: <20251104022933.618123-9-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:06:09 +01:00
Stefan Hajnoczi	d1f42b600a	aio: remove aio_context_use_g_source() There is no need for aio_context_use_g_source() now that epoll(7) and io_uring(7) file descriptor monitoring works with the glib event loop. AioContext doesn't need to be notified that GSource is being used. On hosts with io_uring support this now enables fdmon-io_uring.c by default, replacing fdmon-poll.c and fdmon-epoll.c. In other words, the event loop will use io_uring! Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20251104022933.618123-8-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:06:09 +01:00
Stefan Hajnoczi	ded29e64c6	aio-posix: integrate fdmon into glib event loop AioContext's glib integration only supports ppoll(2) file descriptor monitoring. epoll(7) and io_uring(7) disable themselves and switch back to ppoll(2) when the glib event loop is used. The main loop thread cannot use epoll(7) or io_uring(7) because it always uses the glib event loop. Future QEMU features may require io_uring(7). One example is uring_cmd support in FUSE exports. Each feature could create its own io_uring(7) context and integrate it into the event loop, but this is inefficient due to extra syscalls. It would be more efficient to reuse the AioContext's existing fdmon-io_uring.c io_uring(7) context because fdmon-io_uring.c will already be active on systems where Linux io_uring is available. In order to keep fdmon-io_uring.c's AioContext operational even when the glib event loop is used, extend FDMonOps with an API similar to GSourceFuncs so that file descriptor monitoring can integrate into the glib event loop. A quick summary of the GSourceFuncs API: - prepare() is called each event loop iteration before waiting for file descriptors and timers. - check() is called to determine whether events are ready to be dispatched after waiting. - dispatch() is called to process events. More details here: https://docs.gtk.org/glib/struct.SourceFuncs.html Move the ppoll(2)-specific code from aio-posix.c into fdmon-poll.c and also implement epoll(7)- and io_uring(7)-specific file descriptor monitoring code for glib event loops. Note that it's still faster to use aio_poll() rather than the glib event loop since glib waits for file descriptor activity with ppoll(2) and does not support adaptive polling. But at least epoll(7) and io_uring(7) now work in glib event loops. Splitting this into multiple commits without temporarily breaking AioContext proved difficult so this commit makes all the changes. The next commit will remove the aio_context_use_g_source() API because it is no longer needed. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-ID: <20251104022933.618123-7-stefanha@redhat.com> [kwolf: Build fixes; fix AioContext.list_lock use after destroy] Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-11-11 22:04:53 +01:00
Richard Henderson	c494afbb7d	Merge tag 'pull-nvme-20251030' of https://gitlab.com/birkelund/qemu into staging nvme queue # -----BEGIN PGP SIGNATURE----- # # iQEzBAABCgAdFiEEUigzqnXi3OaiR2bATeGvMW1PDekFAmkDE7gACgkQTeGvMW1P # DekCOwgAuOQKWWW/UA1MmZ4ZHs+djf4q5UDwqGDx8tra8d32mZWRHgpJ/OBBOY2z # CmuHqWLgooAqfx4hsrXELdNBEe7ccNE9nvsE3GjnYWxjoe51yl2Xc0RD5CZBVrN4 # RRMbBZRCewxGShyUaT31eedolWdr4zBuqkpLf9gcG8Yk7YD+xUkHUPeMXeAy+vkS # pxW59AkXdjJZgBktOdV5uVj9gaCPgTcGaQNH2FYSnzHwdu5VyV8BKiiZE/fXS6FU # xZvu+5p1Ro5vOdwG+iFBrbBwcGyjVOF1OfBZctyc83foyFxwzxqoqj9gy0ewuT2g # HsupUiJgbkZ1Ut9fzaS5pHx3dd3dKw== # =WDrH # -----END PGP SIGNATURE----- # gpg: Signature made Thu 30 Oct 2025 08:28:56 AM CET # gpg: using RSA key 522833AA75E2DCE6A24766C04DE1AF316D4F0DE9 # gpg: Good signature from "Klaus Jensen <its@irrelevant.dk>" [unknown] # gpg: aka "Klaus Jensen <k.jensen@samsung.com>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: DDCA 4D9C 9EF9 31CC 3468 4272 63D5 6FC5 E55D A838 # Subkey fingerprint: 5228 33AA 75E2 DCE6 A247 66C0 4DE1 AF31 6D4F 0DE9 * tag 'pull-nvme-20251030' of https://gitlab.com/birkelund/qemu: hw/nvme: add atomic boundary support hw/nvme: enable ns atomic writes hw/nvme: connect SPDM over NVMe Security Send/Recv spdm: define SPDM transport enum types hw/nvme: add NVMe Admin Security SPDM support spdm: add spdm storage transport virtual header spdm-socket: add seperate send/recv functions Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2025-10-31 12:56:05 +01:00
Wilfred Mallawa	e5534abeb4	hw/nvme: add NVMe Admin Security SPDM support Adds the NVMe Admin Security Send/Receive command support with support for DMTFs SPDM. The transport binding for SPDM is defined in the DMTF DSP0286. Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>	2025-10-30 07:07:14 +01:00
Kevin Wolf	5b4b3bfdfc	qemu-img info: Optionally show block limits Add a new --limits option to 'qemu-img info' that displays the block limits for the image and all of its children, making the information more accessible for human users than in QMP. This option is not enabled by default because it can be a lot of output that isn't usually relevant if you're not specifically trying to diagnose some I/O problem. This makes the same information automatically also available in HMP 'info block -v'. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Message-ID: <20251024123041.51254-4-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-10-29 12:10:10 +01:00
Kevin Wolf	46dd683d56	block: Improve comments in BlockLimits Patches to expose the limits in QAPI have made clear that the existing documentation of BlockLimits could be improved: The meaning of min_mem_alignment and opt_mem_alignment could be clearer, and talking about better alignment values isn't helpful when we only detect these values and never choose them. Make the changes in the BlockLimits documentation now, so that the patches exposing the fields in QAPI can use descriptions consistent with it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20251024123041.51254-2-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-10-29 12:10:09 +01:00
Fiona Ebner	cbadaf57a7	block: implement 'resize' callback for child_of_bds class If a filtered child is resized, the size of the parent node is now also refreshed (recursively for chains of filtered children). For filter block drivers that do not implement .bdrv_co_getlength(), this commit does not change the current behavior, because bdrv_co_refresh_total_sectors() will used the current size via the passed-in hint. This is the case for block drivers for (some) block jobs, as well as copy-before-write. Block jobs already set up a blocker preventing a QMP block_resize operation while the job is running. That does not directly cover an associated 'file' node of a 'raw' node, but resizing such a 'file' node is already prevented too (backup, commit, mirror and stream were checked). The other case is copy-before-write. This commit does not change the fact that the copy-before-write node still has the same size after its filtered child is resized. Block drivers that do implement .bdrv_co_getlength() and where .is_filter is true, already returned the length of the file child, so there is no change before and after this commit, with two exceptions: 1. preallocate can return an early data_end and otherwise queries the file child, but that special casing is not changed. 2. blkverify returns the length of the test file. This commit does not affect that behavior. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20250917115509.401015-4-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-10-29 12:10:09 +01:00
Fiona Ebner	08736e7584	block: make bdrv_co_parent_cb_resize() a proper IO API function In preparation for calling it via the bdrv_child_cb_resize() callback that will be added by the next commit. Rename it to include the "_co_" part while at it. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Message-ID: <20250917115509.401015-3-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-10-29 12:10:09 +01:00
Fiona Ebner	4120375420	include/block/block_int-common: document when resize callback is used The 'resize' callback is only called by bdrv_parent_cb_resize() which is only called by bdrv_co_write_req_finish() to notify the parent(s) that the child was resized. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Message-ID: <20250917115509.401015-2-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-10-29 12:10:09 +01:00
Chandan Somani	9f0c763e16	block: enable stats-intervals for storage devices This patch allows stats-intervals to be used for storage devices with the -device option. It accepts a list of interval lengths in JSON format. It configures and collects the stats in the BlockBackend layer through the storage device that consumes the BlockBackend. Signed-off-by: Chandan Somani <csomani@redhat.com> Message-ID: <20251003220039.1336663-1-csomani@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-10-29 12:10:09 +01:00
Fiona Ebner	a256a427b0	blockjob: mark block_job_remove_all_bdrv() as GRAPH_UNLOCKED The function block_job_remove_all_bdrv() calls bdrv_graph_wrlock_drained(), which must be called with the graph unlocked. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20250530151125.955508-49-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-07-14 15:42:28 +02:00
Fiona Ebner	2cf92b15cd	block: mark bdrv_open_child_common() and its callers GRAPH_UNLOCKED The function bdrv_open_child_common() calls bdrv_graph_wrlock_drained(), which must be called with the graph unlocked. Mark it and its two callers bdrv_open_file_child() and bdrv_open_child() as GRAPH_UNLOCKED. This requires temporarily unlocking in vmdk_parse_extents() and making the locked section shorter in vmdk_open(). Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20250530151125.955508-48-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-07-14 15:42:27 +02:00
Fiona Ebner	ede0859311	block: mark bdrv_close() as GRAPH_UNLOCKED The functions blk_log_writes_close(), blkverify_close(), quorum_close(), vmdk_close() via vmdk_free_extents(), and other bdrv_close() implementations call bdrv_graph_wrlock_drained(), which must be called with the graph unlocked. They are reached via the BlockDriver's bdrv_close() callback and the bdrv_close() wrapper, which are also marked as GRAPH_UNLOCKED_PTR and GRAPH_UNLOCKED. Furthermore, the function bdrv_close() also calls bdrv_drained_begin() and bdrv_graph_wrlock_drained(), so there are additional reasons for marking it GRAPH_UNLOCKED. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20250530151125.955508-47-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-07-14 15:42:26 +02:00
Fiona Ebner	6d7e3f8de0	block: mark bdrv_close_all() as GRAPH_UNLOCKED The function bdrv_close_all() calls bdrv_drain_all(), which must be called with the graph unlocked. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20250530151125.955508-46-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2025-07-14 15:42:25 +02:00

1 2 3 4 5 ...

1679 Commits