With CONFIG_DMA_API_DEBUG enabled, the following warning is observed:
DMA-API: snd_hda_intel 0000:03:00.1: device driver failed to check map error[device address=0x00000000ffff0000] [size=20480 bytes] [mapped as single]
WARNING: CPU: 28 PID: 2255 at kernel/dma/debug.c:1036 check_unmap+0x1408/0x2430
CPU: 28 UID: 42 PID: 2255 Comm: wireplumber Tainted: G W L 6.12.0-10-133577cad6bf48e5a7848c4338124081393bfe8a+ #759
debug_dma_unmap_page+0xe9/0xf0
snd_dma_wc_free+0x85/0x130 [snd_pcm]
snd_pcm_lib_free_pages+0x1e3/0x440 [snd_pcm]
snd_pcm_common_ioctl+0x1c9a/0x2960 [snd_pcm]
snd_pcm_ioctl+0x6a/0xc0 [snd_pcm]
...
Check for returned DMA addresses using specialized dma_mapping_error()
helper which is generally recommended for this purpose by
Documentation/core-api/dma-api.rst.
Fixes: c880a51466 ("ALSA: memalloc: Use proper DMA mapping API for x86 WC buffer allocations")
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Closes: https://lore.kernel.org/r/CABXGCsNB3RsMGvCucOy3byTEOxoc-Ys+zB_HQ=Opb_GhX1ioDA@mail.gmail.com/
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Link: https://patch.msgid.link/20241219203345.195898-1-pchelkin@ispras.ru
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The fallback S/G buffer allocation for x86 used the addresses deduced
from the page allocations blindly. It broke the allocations on IOMMU
and made us to work around with a hackish DMA ops check.
For cleaning up those messes, this patch switches to the proper DMA
mapping API usages with the standard sg-table instead.
By introducing the sg-table, the address table isn't needed, but for
keeping the original allocation sizes for freeing, replace it with the
array keeping the number of pages.
The get_addr callback is changed to use the existing one for
non-contiguous buffers. (Also it's the reason sg_table is put at the
beginning of struct snd_dma_sg_fallback.)
And finally, the hackish workaround that checks the DMA ops is
dropped now.
Link: https://patch.msgid.link/20240912155227.4078-3-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The x86 WC page allocation assumes incorrectly the DMA address
directly taken from the page. Also it checks the DMA ops
inappropriately for switching to the own method.
This patch rewrites the stuff to use the proper DMA mapping API
instead.
Link: https://patch.msgid.link/20240912155227.4078-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Since recently in the commit e469e2045f ("ALSA: memalloc: Let IOMMU
handle S/G primarily"), the SG buffer allocation code was modified to
use the standard DMA code primarily and the fallback is applied only
limitedly. This made the Xen PV specific workarounds we took in the
commit 53466ebdec ("ALSA: memalloc: Workaround for Xen PV") rather
superfluous.
It was a hackish workaround for the regression at that time, and it
seems that it's causing another issues (reportedly memory
corruptions). So it's better to clean it up, after all.
Link: https://lore.kernel.org/20240906184209.25423-1-ariadne@ariadne.space
Cc: Ariadne Conill <ariadne@ariadne.space>
Link: https://patch.msgid.link/20240910113100.32542-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The definition of struct snd_malloc_ops was moved out to
memalloc_local.h since there was another code for S/G buffer
allocation referring to the struct. But since the code change to use
non-contiguous allocators, it's solely referred in memalloc.c, hence
it makes little sense to have a separate header file.
Let's move it back to memalloc.c.
Link: https://patch.msgid.link/20240910113141.32618-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The recent changes in IOMMU made the non-contiguous page allocations
as default, hence we can simply use the standard DMA allocation for
the S/G pages as well. In this patch, we simplify the code by trying
the standard DMA allocation at first, instead of
dma_alloc_noncontiguous().
For the case without IOMMU, we still need to manage the S/G pages
manually, so we keep the same fallback routines like before.
The fallback types (SNDRV_DMA_TYPE_DEV_SG_FALLBACK & co) are dropped /
folded into SNDRV_DMA_TYPE_DEV_SG and co now. The allocation via the
standard DMA call overrides the type accordingly, hence we don't have
to have extra fallback types any longer. OTOH, SNDRV_DMA_TYPE_DEV_SG
is no longer an alias but became its own type back again.
Note that this patch requires another prerequisite fix for memmalloc
helper to use the DMA API for WC pages on x86.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219087
Link: https://patch.msgid.link/20240801064808.31205-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The memalloc helper used a house-made code for allocation of WC pages
on x86, since the standard DMA API doesn't cover it well. Meanwhile,
the manually allocated pages won't work together with IOMMU, resulting
in faults, so we should switch to the DMA API in that case, instead.
This patch tries to switch back to DMA API for WC pages on x86, but
with some additional tweaks that are missing.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219087
Link: https://patch.msgid.link/20240801064808.31205-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
We change recently the memalloc helper to use
dma_alloc_noncontiguous() and the fallback to get_pages(). Although
lots of issues with IOMMU (or non-IOMMU) have been addressed, but
there seems still a regression on Xen PV. Interestingly, the only
proper way to work is use dma_alloc_coherent(). The use of
dma_alloc_coherent() for SG buffer was dropped as it's problematic on
IOMMU systems. OTOH, Xen PV has a different way, and it's fine to use
the dma_alloc_coherent().
This patch is a workaround for Xen PV. It consists of the following
changes:
- For Xen PV, use only the fallback allocation without
dma_alloc_noncontiguous()
- In the fallback allocation, use dma_alloc_coherent();
the DMA address from dma_alloc_coherent() is returned in get_addr
ops
- The DMA addresses are stored in an array; the first entry stores the
number of allocated pages in lower bits, which are referred at
releasing pages again
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Fixes: a8d302a0b7 ("ALSA: memalloc: Revive x86-specific WC page allocations again")
Fixes: 9736a32513 ("ALSA: memalloc: Don't fall back for SG-buffer with IOMMU")
Link: https://lore.kernel.org/r/87tu256lqs.wl-tiwai@suse.de
Link: https://lore.kernel.org/r/20230125153104.5527-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
While not quite as bogus as for the dma-coherent allocations that were
fixed earlier, GFP_COMP for these allocations has no benefits for
the dma-direct case, and can't be supported at all by dma dma-iommu
backend which splits up allocations into smaller orders. Due to an
oversight in ffcb754584 that flag stopped being cleared for all
dma allocations, but only got rejected for coherent ones.
Start fixing this by not requesting __GFP_COMP in the sound code, which
is the only place that did this.
Fixes: ffcb754584 ("dma-mapping: reject __GFP_COMP in dma_alloc_attrs")
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reported-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Tested-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Pull sound updates from Takashi Iwai:
"This looks like a relatively calm development cycle; there have been
only few changes in ALSA and ASoC core sides while we get lots of
device-specific fixes and updates as usual. Most of commits are about
ASoC, including Intel SOF/AVS and many device tree updates.
Below are some highlights:
Core:
- Improvement in memalloc helper for fallback allocations
- More cleanups of ASoC DAPM code
ASoC:
- Factoring out of mapping hw_params onto SoundWire configuration
- The ever ongoing overhauls of the Intel DSP code continue,
including support for loading libraries and probes with IPC4 on
SOF.
- Support for more sample formats on JZ4740
- Lots of device tree conversions and fixups
- Support for Allwinner D1, a range of AMD and Intel systems,
Mediatek systems with multiple DMICs, Nuvoton NAU8318, NXP
fsl_rpmsg and i.MX93, Qualcomm AudioReach Enable, MFC and SAL,
RealTek RT1318 and Rockchip RK3588
ALSA:
- Addition of PCM kselftest; still minimalistic but can be extended
in future
- Fixes for corner-case XRUNs with USB-audio implicit feedback mode
- Usual device-specific quirk updates for USB- and HD-audio
- FireWire DICE updates
This also contains a few cross-tree updates:
- Some OMAP board file updates for removal of relevant OMAP platforms
- A new I2C API update for I2C probe API adaption
- A DRM update for the further hdmi-codec updates"
* tag 'sound-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (417 commits)
ALSA: mts64: fix possible null-ptr-defer in snd_mts64_interrupt
ALSA: patch_realtek: Fix Dell Inspiron Plus 16
ALSA: hda/cirrus: Add extra 10 ms delay to allow PLL settle and lock.
ASoC: dt-bindings: Correct Alexandre Belloni email
ASoC: dt-bindings: maxim,max98504: Convert to DT schema
ASoC: dt-bindings: maxim,max98357a: Convert to DT schema
ASoC: dt-bindings: Reference common DAI properties
ASoC: dt-bindings: Extend name-prefix.yaml into common DAI properties
ASoC: rt715: Make read-only arrays capture_reg_H and capture_reg_L static const
ASoC: uniphier: aio-core: Make some read-only arrays static const
ASoC: wcd938x: Make read-only array minCode_param static const
ASoC: qcom: lpass-sc7280: Add maybe_unused tag for system PM ops
ASoC : SOF: amd: Add support for IPC and DSP dumps
ASoC: SOF: amd: Use poll function instead to read ACP_SHA_DSP_FW_QUALIFIER
ALSA: usb-audio: Workaround for XRUN at prepare
ALSA: pcm: Handle XRUN at trigger START
ALSA: pcm: Set missing stop_operating flag at undoing trigger start
drm: tda99x: Don't advertise non-existent capture support
ASoC: hdmi-codec: Allow playback and capture to be disabled
kselftest/alsa: Add more coverage of sample rates and channel counts
...
Pull dma-mapping updates from Christoph Hellwig:
- reduce the swiotlb buffer size on allocation failure (Alexey
Kardashevskiy)
- clean up passing of bogus GFP flags to the dma-coherent allocator
(Christoph Hellwig)
* tag 'dma-mapping-6.2-2022-12-13' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: reject __GFP_COMP in dma_alloc_attrs
ALSA: memalloc: don't pass bogus GFP_ flags to dma_alloc_*
s390/ism: don't pass bogus GFP_ flags to dma_alloc_coherent
cnic: don't pass bogus GFP_ flags to dma_alloc_coherent
RDMA/qib: don't pass bogus GFP_ flags to dma_alloc_coherent
RDMA/hfi1: don't pass bogus GFP_ flags to dma_alloc_coherent
media: videobuf-dma-contig: use dma_mmap_coherent
swiotlb: reduce the swiotlb buffer size on allocation failure
dma_alloc_coherent/dma_alloc_wc is an opaque allocator that only uses
the GFP_ flags for allocation context control. Don't pass __GFP_COMP
which makes no sense for an allocation that can't in any way be
converted to a page pointer.
Note that for dma_alloc_noncoherent and dma_alloc_noncontigous in
combination with the DMA mmap helpers __GFP_COMP looks sketchy as well,
so I would suggest to drop that as well after a careful audit.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Currently the fallback SG allocation tries to allocate each single
page, and this tends to result in the reverse order of memory
addresses when large space is available at boot, as the kernel takes a
free page from the top to the bottom in the zone. The end result
looks as if non-contiguous (although it actually is). What's worse is
that it leads to an overflow of BDL entries for HD-audio.
For avoiding such a problem, this patch modifies the allocation code
slightly; now it tries to allocate the larger contiguous chunks as
much as possible, then reduces to the smaller chunks only if the
allocation failed -- a similar strategy as the existing
snd_dma_alloc_pages_fallback() function.
Along with the trick, drop the unused address array from
snd_dma_sg_fallback object. It was needed in the past when
dma_alloc_coherent() was used, but with the standard page allocator,
it became superfluous and never referred.
Fixes: a8d302a0b7 ("ALSA: memalloc: Revive x86-specific WC page allocations again")
Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Link: https://lore.kernel.org/r/20221114141658.29620-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The latest fix for the non-contiguous memalloc helper changed the
allocation method for a non-IOMMU system to use only the fallback
allocator. This should have worked, but it caused a problem sometimes
when too many non-contiguous pages are allocated that can't be treated
by HD-audio controller.
As a quirk workaround, go back to the original strategy: use
dma_alloc_noncontiguous() at first, and apply the fallback only when
it fails, but only for non-IOMMU case.
We'll need a better fix in the fallback code as well, but this
workaround should paper over most cases.
Fixes: 9736a32513 ("ALSA: memalloc: Don't fall back for SG-buffer with IOMMU")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=wgSH5ubdvt76gNwa004ooZAEJL_1Q-Fyw5M2FDdqL==dg@mail.gmail.com
Link: https://lore.kernel.org/r/20221112084718.3305-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
When the non-contiguous page allocation for SG buffer allocation
fails, the memalloc helper tries to fall back to the old page
allocation methods. This would, however, result in the bogus page
addresses when IOMMU is enabled. Usually in such a case, the fallback
allocation should fail as well, but occasionally it succeeds and
hitting a bad access.
The fallback was thought for non-IOMMU case, and as the error from
dma_alloc_noncontiguous() with IOMMU essentially implies a fatal
memory allocation error, we should return the error straightforwardly
without fallback. This avoids the corner case like the above.
The patch also renames the local variable "dma_ops" with snd_ prefix
for avoiding the name conflict.
Fixes: a8d302a0b7 ("ALSA: memalloc: Revive x86-specific WC page allocations again")
Reported-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Link: https://lore.kernel.org/r/alpine.DEB.2.22.394.2211041541090.3532114@eliteleevi.tm.intel.com
Link: https://lore.kernel.org/r/20221110132216.30605-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Use __GFP_RETRY_MAYFAIL instead of __GFP__NORETRY in
snd_dma_dev_alloc(), snd_dma_wc_alloc() and friends, to allocate pages
for device memory. The MAYFAIL flag retains the semantics of not
triggering the OOM killer, but lowers the risk of alloc failure.
MAYFAIL flag was added in commit dcda9b0471 ("mm, tree wide: replace
__GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful semantic").
This change addresses recurring failures with SOF audio driver in test
cases where a system suspend-resume stress test is run, combined with an
active high memory-load use-case. The failure typically shows up as:
[ 379.480229] sof-audio-pci-intel-tgl 0000:00:1f.3: booting DSP firmware
[ 379.484803] sof-audio-pci-intel-tgl 0000:00:1f.3: error: memory alloc failed: -12
[ 379.484810] sof-audio-pci-intel-tgl 0000:00:1f.3: error: dma prepare for ICCMAX stream failed
Multiple fixes to reduce the memory usage of DSP boot have been
identified in SOF driver, but even with those fixes, debug on affected
systems has shown that even a single page alloc may fail with
__GFP_NORETRY. When this occurs, system is under significant load on
physical memory, but a lot of reclaimable pages are available, so the
system has not run out of memory. With __GFP_RETRY_MAYFAIL, the errors
are not hit in these stress tests.
The alloc failure is severe as audio capability is completely lost if
alloc failure is hit at system resume.
An alternative solution was considered where the resources for DSP boot
would be kept allocated until driver is unbound. This would avoid the
allocation failure, but consume memory that is only needed temporarily
at probe and resume time. It seems better to not hang on to the memory,
but rather work a bit harder for allocating the pages at resume.
BugLink: https://github.com/thesofproject/linux/issues/3844
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20220923153501.3326041-1-kai.vehmanen@linux.intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The last fix for trying to recover the regression on AMD platforms,
unfortunately, leaded to yet another regression: it turned out that
IOMMUs don't like the usage of raw page allocations.
This is yet another attempt for addressing the log saga; at this time,
we re-use the existing buffer allocation mechanism with SG-pages
although we require only single pages. The SG buffer allocation
itself was confirmed to work for stream buffers, so it's relatively
easy to adapt for other places.
The only problem is: although the HD-audio code is accessing the
address directly via dmab->address field, SG-pages don't set up it.
For the ease of adaption, we now set up the dmab->addr field from the
address of the first page as default, so that it can run with the
HD-audio driver code as-is without the excessive call of
snd_sgbuf_get_addr() multiple times; that's the only change in the
memalloc helper side. The rest is nothing but a flip of the dma_type
field in the HD-audio side.
Fixes: a8d302a0b7 ("ALSA: memalloc: Revive x86-specific WC page allocations again")
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/CABXGCsO+kB2t5QyHY-rUe76npr1m0-5JOtt8g8SiHUo34ur7Ww@mail.gmail.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216112
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216363
Link: https://lore.kernel.org/r/20220906090319.23358-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Now that all users of snd_dma_continuous_data() is gone, let's drop
this ugly (and dangerous) way.
After this commit, SNDRV_DMA_TYPE_CONTINUOUS may take the standard
device pointer instead of the hacked pointer by the macro above, and
the memalloc core refers to the coherent_dma_mask of the given
device like other SNDRV_DMA_TYPE. It's still allowed to pass NULL
there, and in that case, the allocation is performed always in the
normal zone.
For SNDRV_DMA_TYPE_VMALLOC, the device pointer is simply ignored.
Link: https://lore.kernel.org/r/20220823115740.14123-5-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
We dropped the x86-specific hack for WC-page allocations with a hope
that the standard dma_alloc_wc() works nowadays. Alas, it doesn't,
and we need to take back some workaround again, but in a different
form, as the previous one was broken for some platforms.
This patch re-introduces the x86-specific WC-page allocations, but it
uses rather the manual page allocations instead of
dma_alloc_coherent(). The use of dma_alloc_coherent() was also a
potential problem in the recent addition of the fallback allocation
for noncontig pages, and this patch eliminates both at once.
Fixes: 9882d63bea ("ALSA: memalloc: Drop x86-specific hack for WC allocations")
Cc: <stable@vger.kernel.org>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=216363
Link: https://lore.kernel.org/r/20220821155911.10715-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The recent change for memory allocator replaced the SG-buffer handling
helper for x86 with the standard non-contiguous page handler. This
works for most cases, but there is a corner case I obviously
overlooked, namely, the fallback of non-contiguous handler without
IOMMU. When the system runs without IOMMU, the core handler tries to
use the continuous pages with a single SGL entry. It works nicely for
most cases, but when the system memory gets fragmented, the large
allocation may fail frequently.
Ideally the non-contig handler could deal with the proper SG pages,
it's cumbersome to extend for now. As a workaround, here we add new
types for (minimalistic) SG allocations, instead, so that the
allocator falls back to those types automatically when the allocation
with the standard API failed.
BTW, one better (but pretty minor) improvement from the previous
SG-buffer code is that this provides the proper mmap support without
the PCM's page fault handling.
Fixes: 2c95b92ecd ("ALSA: memalloc: Unify x86 SG-buffer handling (take#3)")
BugLink: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/2272
BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1198248
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220413054808.7547-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>