Pull dmaengine fixes from Vinod Koul:
- tegra210 div_u64 divison and max page fixes
- revert Qualcomm unavailable register workaround which is causing
regression, fixes have been proposed but still gaps are present so
revert this for now
* tag 'dmaengine-fix-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
dmaengine: Revert "dmaengine: qcom: bam_dma: Avoid writing unavailable register"
dmaengine: tegra210-adma: check for adma max page
dmaengine: tegra210-adma: Use div_u64 for 64 bit division
The Tegra210 Audio DMA controller driver did a plain divide:
page_no = (res_page->start - res_base->start) / cdata->ch_base_offset;
which causes problems on 32-bit x86 configurations that have 64-bit
resource sizes:
x86_64-linux-ld: drivers/dma/tegra210-adma.o: in function `tegra_adma_probe':
tegra210-adma.c:(.text+0x1322): undefined reference to `__udivdi3'
because gcc doesn't generate the trivial code for a 64-by-32 divide,
turning it into a function call to do a full 64-by-64 divide. And the
kernel intentionally doesn't provide that helper function, because 99%
of the time all you want is the narrower version.
Of course, tegra210 is a 64-bit architecture and the 32-bit x86 build is
purely for build testing, so this really is just about build coverage
failure.
But build coverage is good.
Side note: div_u64() would be suboptimal if you actually have a 32-bit
resource_t, so our "helper" for divides are admittedly making it harder
than it should be to generate good code for all the possible cases.
At some point, I'll consider 32-bit x86 so entirely legacy that I can't
find it in myself to care any more, and we'll just add the __udivdi3
library function.
But for now, the right thing to do is to use "div_u64()" to show that
you know that you are doing the simpler divide with a 32-bit number.
And the build error enforces that.
While fixing the build issue, also check for division-by-zero, and for
overflow. Which hopefully cannot happen on real production hardware,
but the value of 'ch_base_offset' can definitely be zero in other
places.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull dmaengine updates from Vinod Koul:
"A bunch of new device support and updates to few drivers, biggest of
them amd ones.
New support:
- TI J722S CSI BCDMA controller support
- Intel idxd Panther Lake family platforms
- Allwinner F1C100s suniv DMA
- Qualcomm QCS615, QCS8300, SM8750, SA8775P GPI dma controller support
- AMD ae4dma controller support and reorganisation of amd driver
Updates:
- Channel page support for Nvidia Tegra210 adma driver
- Freescale support for S32G based platforms
- Yamilfy atmel dma bindings"
* tag 'dmaengine-6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (45 commits)
dmaengine: idxd: Enable Function Level Reset (FLR) for halt
dmaengine: idxd: Refactor halt handler
dmaengine: idxd: Add idxd_device_config_save() and idxd_device_config_restore() helpers
dmaengine: idxd: Binding and unbinding IDXD device and driver
dmaengine: idxd: Add idxd_pci_probe_alloc() helper
dt-bindings: dma: atmel: Convert to json schema
dt-bindings: dma: st-stm32-dmamux: Add description for dma-cell values
dmaengine: qcom: gpi: Add GPI immediate DMA support for SPI protocol
dt-bindings: dma: adi,axi-dmac: deprecate adi,channels node
dt-bindings: dma: adi,axi-dmac: convert to yaml schema
dmaengine: mv_xor: switch to for_each_child_of_node_scoped()
dmaengine: bcm2835-dma: Prevent suspend if DMA channel is busy
dmaengine: tegra210-adma: Support channel page
dt-bindings: dma: Support channel page to nvidia,tegra210-adma
dmaengine: ti: k3-udma: Add support for J722S CSI BCDMA
dt-bindings: dma: ti: k3-bcdma: Add J722S CSI BCDMA
dmaengine: ti: edma: fix OF node reference leaks in edma_driver
dmaengine: ti: edma: make the loop condition simpler in edma_probe()
dmaengine: fsl-edma: read/write multiple registers in cyclic transactions
dmaengine: fsl-edma: add support for S32G based platforms
...
Pull x86 cpuid updates from Borislav Petkov:
- Remove the less generic CPU matching infra around struct x86_cpu_desc
and use the generic struct x86_cpu_id thing
- Remove magic naked numbers for CPUID functions and use proper defines
of the prefix CPUID_LEAF_*. Consolidate some of the crazy use around
the tree
- Smaller cleanups and improvements
* tag 'x86_cpu_for_v6.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Make all all CPUID leaf names consistent
x86/fpu: Remove unnecessary CPUID level check
x86/fpu: Move CPUID leaf definitions to common code
x86/tsc: Remove CPUID "frequency" leaf magic numbers.
x86/tsc: Move away from TSC leaf magic numbers
x86/cpu: Move TSC CPUID leaf definition
x86/cpu: Refresh DCA leaf reading code
x86/cpu: Remove unnecessary MwAIT leaf checks
x86/cpu: Use MWAIT leaf definition
x86/cpu: Move MWAIT leaf definition to common header
x86/cpu: Remove 'x86_cpu_desc' infrastructure
x86/cpu: Move AMD erratum 1386 table over to 'x86_cpu_id'
x86/cpu: Replace PEBS use of 'x86_cpu_desc' use with 'x86_cpu_id'
x86/cpu: Expose only stepping min/max interface
x86/cpu: Introduce new microcode matching helper
x86/cpufeature: Document cpu_feature_enabled() as the default to use
x86/paravirt: Remove the WBINVD callback
x86/cpufeatures: Free up unused feature bits
Pull dmaengine fixes from Vinod Koul:
"Bunch of minor driver fixes for drivers in this cycle:
- Kernel doc warning documentation fixes
- apple driver fix for register access
- amd driver dropping private dma_ops
- freescale cleanup path fix
- refcount fix for mv_xor driver
- null pointer deref fix for at_xdmac driver
- GENMASK to GENMASK_ULL fix for loongson2 apb driver
- Tegra driver fix for correcting dma status"
* tag 'dmaengine-fix-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
dmaengine: tegra: Return correct DMA status when paused
dmaengine: mv_xor: fix child node refcount handling in early exit
dmaengine: fsl-edma: implement the cleanup path of fsl_edma3_attach_pd()
dmaengine: amd: qdma: Remove using the private get and set dma_ops APIs
dmaengine: apple-admac: Avoid accessing registers in probe
linux/dmaengine.h: fix a few kernel-doc warnings
dmaengine: loongson2-apb: Change GENMASK to GENMASK_ULL
dmaengine: dw: Select only supported masters for ACPI devices
dmaengine: at_xdmac: avoid null_prt_deref in at_xdmac_prep_dma_memset
When DSA/IAA device hits a fatal error, the device enters a halt state.
The driver can reset the device depending on Reset Type required by
hardware to recover the device.
Supported Reset Types are:
0: Reset Device command
1: Function Level Reset (FLR)
2: Warm reset
3: Cold reset
Currently, the driver only supports Reset Type 0.
This patch adds support for FLR recovery Type 1. Before issuing a PCIe
FLR command, IDXD device and WQ states are saved. After the FLR command
execution, the device is recovered to its previous states, allowing
the user can continue using the device.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20241122233028.2762809-6-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Add the idxd_pci_probe_alloc() helper to probe IDXD PCI device with or
without allocating and setting idxd software values.
The idxd_pci_probe() function is refactored to call this helper and
always probe the IDXD device with allocating and setting the software
values.
This helper will be called later in the Function Level Reset (FLR)
process without modifying the idxd software data.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20241122233028.2762809-2-fenghua.yu@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The DMA TRE(Transfer ring element) buffer contains the DMA
buffer address. Accessing data from this address can cause
significant delays in SPI transfers, which can be mitigated to
some extent by utilizing immediate DMA support.
QCOM GPI DMA hardware supports an immediate DMA feature for data
up to 8 bytes, storing the data directly in the DMA TRE buffer
instead of the DMA buffer address. This enhancement enables faster
SPI data transfers.
This optimization reduces the average transfer time from 25 us to
16 us for a single SPI transfer of 8 bytes length, with a clock
frequency of 50 MHz.
Signed-off-by: Jyothi Kumar Seerapu <quic_jseerapu@quicinc.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20241209075033.16860-1-quic_jseerapu@quicinc.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
When i == ecc->num_tc, the edma_probe() calls
of_parse_phandle_with_fixed_args() and breaks from the loop regardless
of the return value. Since neither the returned value nor the output
argument tc_args is used, set i < ecc->num_tc as the loop condition.
Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20241219020507.1983124-2-joe@pf.is.s.u-tokyo.ac.jp
Signed-off-by: Vinod Koul <vkoul@kernel.org>