Calling ipa_cmd_pipeline_clear() after stopping the channel
underlying the AP<-modem RX endpoint can lead to a deadlock.
This occurs in the ->runtime_suspend device power operation for the
IPA driver. While this callback is in progress, any other requests
for power will block until the callback returns.
Stopping the AP<-modem RX channel does not prevent the modem from
sending another packet to this endpoint. If a packet arrives for an
RX channel when the channel is stopped, an SUSPEND IPA interrupt
condition will be pending. Handling an IPA interrupt requires
power, so ipa_isr_thread() calls pm_runtime_get_sync() first thing.
The problem occurs because a "pipeline clear" command will not
complete while such a SUSPEND interrupt condition exists. So the
SUSPEND IPA interrupt handler won't proceed until it gets power;
that won't happen until the ->runtime_suspend callback (and its
"pipeline clear" command) completes; and that can't happen while
the SUSPEND interrupt condition exists.
It turns out that in this case there is no need to use the "pipeline
clear" command. There are scenarios in which clearing the pipeline
is required while suspending, but those are not (yet) supported
upstream. So a simple fix, avoiding the potential deadlock, is to
stop calling ipa_cmd_pipeline_clear() in ipa_endpoint_suspend().
This removes the only user of ipa_cmd_pipeline_clear(), so get rid
of that function. It can be restored again whenever it's needed.
This is basically a manual revert along with an explanation for
commit 6cb63ea6a3 ("net: ipa: introduce ipa_cmd_tag_process()").
Fixes: 6cb63ea6a3 ("net: ipa: introduce ipa_cmd_tag_process()")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IPA setup_complete flag is set at the end of ipa_setup(), when
the setup phase of initialization has completed successfully. This
occurs as part of driver probe processing, or (if "modem-init" is
specified in the DTS file) it is triggered by the "ipa-setup-ready"
SMP2P interrupt generated by the modem.
In the latter case, it's possible for driver shutdown (or remove) to
begin while setup processing is underway, and this can't be allowed.
The problem is that the setup_complete flag is not adequate to signal
that setup is underway.
If setup_complete is set, it will never be un-set, so that case is
not a problem. But if setup_complete is false, there's a chance
setup is underway.
Because setup is triggered by an interrupt on a "modem-init" system,
there is a simple way to ensure the value of setup_complete is safe
to read. The threaded handler--if it is executing--will complete as
part of a request to disable the "ipa-modem-ready" interrupt. This
means that ipa_setup() (which is called from the handler) will run
to completion if it was underway, or will never be called otherwise.
The request to disable the "ipa-setup-ready" interrupt is currently
made within ipa_modem_stop(). Instead, disable the interrupt
outside that function in the two places it's called. In the case of
ipa_remove(), this ensures the setup_complete flag is safe to read
before we read it.
Rename ipa_smp2p_disable() to be ipa_smp2p_irq_disable_setup(), to be
more specific about its effect.
Fixes: 530f9216a9 ("soc: qcom: ipa: AP/modem communications")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently maintain a "disabled" Boolean flag to determine whether
the "ipa-setup-ready" SMP2P IRQ handler does anything. That flag
must be accessed under protection of a mutex.
Instead, disable the SMP2P interrupt when requested, which prevents
the interrupt handler from ever being called. More importantly, it
synchronizes a thread disabling the interrupt with the completion of
the interrupt handler in case they run concurrently.
Use the IPA setup_complete flag rather than the disabled flag in the
handler to determine whether to ignore any interrupts arriving after
the first.
Rename the "disabled" flag to be "setup_disabled", to be specific
about its purpose.
Fixes: 530f9216a9 ("soc: qcom: ipa: AP/modem communications")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The head-of-line blocking timer should only be modified when
head-of-line drop is disabled.
One of the steps in recovering from a modem crash is to enable
dropping of packets with timeout of 0 (immediate). We don't know
how the modem configured its endpoints, so before we program the
timer, we need to ensure HOL_BLOCK is disabled.
Fixes: 84f9bd12d4 ("soc: qcom: ipa: IPA endpoints")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Starting with IPA v4.5, the HOL_BLOCK_EN register must be written
twice when enabling head-of-line blocking avoidance.
Fixes: 84f9bd12d4 ("soc: qcom: ipa: IPA endpoints")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull asm-generic fixes from Arnd Bergmann:
"There is one build fix for Arm platforms that ended up impacting most
architectures because of the way the drivers/firmware Kconfig file is
wired up:
The CONFIG_QCOM_SCM dependency have caused a number of randconfig
regressions over time, and some still remain in v5.15-rc4. The fix we
agreed on in the end is to make this symbol selected by any driver
using it, and then building it even for non-Arm platforms with
CONFIG_COMPILE_TEST.
To make this work on all architectures, the drivers/firmware/Kconfig
file needs to be included for all architectures to make the symbol
itself visible.
In a separate discussion, we found that a sound driver patch that is
pending for v5.16 needs the same change to include this Kconfig file,
so the easiest solution seems to have my Kconfig rework included in
v5.15.
Finally, the branch also includes a small unrelated build fix for
NOMMU architectures"
Link: https://lore.kernel.org/all/20210928153508.101208f8@canb.auug.org.au/
Link: https://lore.kernel.org/all/20210928075216.4193128-1-arnd@kernel.org/
Link: https://lore.kernel.org/all/20211007151010.333516-1-arnd@kernel.org/
* tag 'asm-generic-fixes-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
asm-generic/io.h: give stub iounmap() on !MMU same prototype as elsewhere
qcom_scm: hide Kconfig symbol
firmware: include drivers/firmware/Kconfig unconditionally
Now that SCM can be a loadable module, we have to add another
dependency to avoid link failures when ipa or adreno-gpu are
built-in:
aarch64-linux-ld: drivers/net/ipa/ipa_main.o: in function `ipa_probe':
ipa_main.c:(.text+0xfc4): undefined reference to `qcom_scm_is_available'
ld.lld: error: undefined symbol: qcom_scm_is_available
>>> referenced by adreno_gpu.c
>>> gpu/drm/msm/adreno/adreno_gpu.o:(adreno_zap_shader_load) in archive drivers/built-in.a
This can happen when CONFIG_ARCH_QCOM is disabled and we don't select
QCOM_MDT_LOADER, but some other module selects QCOM_SCM. Ideally we'd
use a similar dependency here to what we have for QCOM_RPROC_COMMON,
but that causes dependency loops from other things selecting QCOM_SCM.
This appears to be an endless problem, so try something different this
time:
- CONFIG_QCOM_SCM becomes a hidden symbol that nothing 'depends on'
but that is simply selected by all of its users
- All the stubs in include/linux/qcom_scm.h can go away
- arm-smccc.h needs to provide a stub for __arm_smccc_smc() to
allow compile-testing QCOM_SCM on all architectures.
- To avoid a circular dependency chain involving RESET_CONTROLLER
and PINCTRL_SUNXI, drop the 'select RESET_CONTROLLER' statement.
According to my testing this still builds fine, and the QCOM
platform selects this symbol already.
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Acked-by: Alex Elder <elder@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
There is an off-by-one problem in ipa_table_init_add(), when
initializing filter tables.
In that function, the number of filter table entries is determined
based on the number of set bits in the filter map. However that
count does *not* include the extra "slot" in the filter table that
holds the filter map itself. Meanwhile, ipa_table_addr() *does*
include the filter map in the memory it returns, but because the
count it's provided doesn't include it, it includes one too few
table entries.
Fix this by including the extra slot for the filter map in the count
computed in ipa_table_init_add().
Note: ipa_filter_reset_table() does not have this problem; it resets
filter table entries one by one, but does not overwrite the filter
bitmap.
Fixes: 2b9feef2b6 ("soc: qcom: ipa: filter and routing tables")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Finally, rename "ipa_clock.c" to be "ipa_power.c" and "ipa_clock.h"
to be "ipa_power.h".
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename a number of functions to clarify that there is no longer a
notion of an "IPA clock," but rather that the functions are more
generally related to IPA power management.
ipa_clock_enable() -> ipa_power_enable()
ipa_clock_disable() -> ipa_power_disable()
ipa_clock_rate() -> ipa_core_clock_rate()
ipa_clock_init() -> ipa_power_init()
ipa_clock_exit() -> ipa_power_exit()
Rename the ipa_clock structure to be ipa_power. Rename all
variables and fields using that structure type "power" rather
than "clock".
Rename the ipa_clock_data structure to be ipa_power_data, and more
broadly, just substitute "power" for "clock" in places that
previously represented things related to the "IPA clock".
Update comments throughout.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use runtime power management autosuspend.
Up until this point, we only suspended the IPA hardware for system
suspend; now we'll suspend it aggressively using runtime power
management, setting the initial autosuspend delay to half a second
of inactivity.
Replace pm_runtime_put() calls with pm_runtime_put_autosuspend(),
call pm_runtime_mark_last_busy() before each of those. In places
where we're shutting things down, or decrementing power references
for errors, use pm_runtime_put_noidle() instead.
Finally, remove ipa_runtime_idle(), so the ->runtime_suspend
callback will occur if idle.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The only remaining user of the ipa_clock_{get,put}() interface is
ipa_isr_thread(). Replace calls to ipa_clock_get() there calling
pm_runtime_get_sync() instead. And call pm_runtime_put() there
rather than ipa_clock_put(). Warn if we ever get an error.
With that, we can get rid of ipa_clock_get() and ipa_clock_put().
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we open or close the modem network device we need to ensure the
hardware is powered. Replace the callers of ipa_clock_get() found
in ipa_open() and ipa_stop() with calls to pm_runtime_get_sync().
If an error is returned, simply return that error to the caller
(without any error or warning message). This could conceivably
occur if the function was called while the system was suspended,
but that really shouldn't happen. Replace corresponding calls to
ipa_clock_put() with pm_runtime_put() also.
If the modem crashes we also need to ensure the hardware is powered
to recover. If getting power returns an error there's not much we
can do, but at least report the error. (Ideally the remoteproc SSR
code would ensure the AP was not suspended when it sends the
notification, but that is not (yet) the case.)
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace the ipa_clock_get() call in ipa_uc_clock() when taking the
"proxy" clock reference for the microcontroller with a call to
pm_runtime_get_sync(). Replace calls of ipa_clock_put() for the
microcontroller with pm_runtime_put() calls instead.
There is a chance we get an error when taking the microcontroller
power reference. This is an unlikely scenario, where system suspend
is initiated just before we learn the modem is booting. For now
we'll just accept that this could occur, and report it if it does.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the "modem-init" Device Tree property is present for a platform,
the modem performs early IPA hardware initialization, and signals
this is complete with an "ipa-setup-ready" SMP2P interrupt. This
triggers a call to ipa_setup(), which requires the hardware to be
powered.
Replace the call to ipa_clock_get() in this case with a call to
pm_runtime_get_sync(). And replace the corresponding calls to
ipa_clock_put() with calls to pm_runtime_put() instead.
There is a chance we get an error when taking this power reference.
This is an unlikely scenario, where system suspend is initiated just
before the modem signals it has finished initializing the IPA
hardware. For now we'll just accept that this could occur, and
report it if it does.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need the hardware to be powered starting at the config stage of
initialization when the IPA driver probes. And we need it powered
when the driver is removed, at least until the deconfig stage has
completed.
Replace callers of ipa_clock_get() in ipa_probe() and ipa_exit(),
calling pm_runtime_get_sync() instead. Replace the corresponding
callers of ipa_clock_put(), calling pm_runtime_put() instead.
The only error we expect when getting power would occur when the
system is suspended. The ->probe and ->remove driver callbacks
won't be called when suspended, so issue a WARN() call if an error
is seen getting power.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski pointed out a race condition in ipa_start_xmit() in a
recently-accepted series of patches:
https://lore.kernel.org/netdev/20210812195035.2816276-1-elder@linaro.org/
We are stopping the modem TX queue in that function if the power
state is not active. We restart the TX queue again once hardware
resume is complete.
TX path Power Management
------- ----------------
pm_runtime_get(); no power Start resume
Stop TX queue ...
pm_runtime_put() Resume complete
return NETDEV_TX_BUSY Start TX queue
pm_runtime_get()
Power present, transmit
pm_runtime_put() (auto-suspend)
The issue is that the power management (resume) activity and the
network transmit activity can occur concurrently, and there's a
chance the queue will be stopped *after* it has been started again.
TX path Power Management
------- ----------------
Resume underway
pm_runtime_get(); no power ...
Resume complete
Start TX queue
Stop TX queue <-- No more transmits after this
pm_runtime_put()
return NETDEV_TX_BUSY
We address this using a STARTED flag to indicate when the TX queue
has been started from the resume path, and a spinlock to make the
flag and queue updates happen atomically.
TX path Power Management
------- ----------------
Resume underway
pm_runtime_get(); no power Resume complete
start TX queue \
If STARTED flag is *not* set: > atomic
Stop TX queue set STARTED flag /
pm_runtime_put()
return NETDEV_TX_BUSY
A second flag is used to address a different race that involves
another path requesting power.
TX path Other path Power Management
------- ---------- ----------------
pm_runtime_get_sync() Resume
Start TX queue \ atomic
Set STARTED flag /
(do its thing)
pm_runtime_put()
(auto-suspend)
pm_runtime_get() Mark delayed resume
STARTED *is* set, so
do *not* stop TX queue <-- Queue should be stopped here
pm_runtime_put()
return NETDEV_TX_BUSY Suspend done, resume
Resume complete
pm_runtime_get()
Stop TX queue
(STARTED is *not* set) Start TX queue \ atomic
pm_runtime_put() Set STARTED flag /
return NETDEV_TX_BUSY
So a STOPPED flag is set in the transmit path when it has stopped
the TX queue, and this pair of operations is also protected by the
spinlock. The resume path only restarts the TX queue if the STOPPED
flag is set. This case isn't a major problem, but it avoids the
"non-trivial amount of useless work" done by the networking stack
when NETDEV_TX_BUSY is returned.
Fixes: 6b51f802d6 ("net: ipa: ensure hardware has power in ipa_start_xmit()")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently a clock reference is taken whenever the ->ndo_open
callback for the modem netdev is called. That reference is dropped
when the device is closed, in ipa_stop().
We no longer need this, because ipa_start_xmit() now handles the
situation where the hardware power state is not active.
Drop the clock reference in ipa_open() when we're done, and take a
new reference in ipa_stop() before we begin closing the interface.
Finally (and unrelated, but trivial), change the return type of
ipa_start_xmit() to be netdev_tx_t instead of int.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently we stop the modem netdev transmit queue when suspending
the hardware. For system suspend this ensured we'd never attempt
to transmit while attempting to suspend the modem endpoints.
For runtime suspend, the IPA hardware might get suspended while the
system is operating. In that case we want an attempt to transmit a
packet to cause the hardware to resume if necessary. But if we
disable the queue this cannot happen.
So stop disabling the queue on suspend. In case we end up disabling
it in ipa_start_xmit() (see the previous commit), we still arrange
to start the TX queue on resume.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to ensure the hardware is powered when we transmit a packet.
But if it's not, we can't block to wait for it. So asynchronously
request power in ipa_start_xmit(), and only proceed if the return
value indicates the power state is active.
If the hardware is not active, a runtime resume request will have
been initiated. In that case, stop the network stack from further
transmit attempts until the resume completes. Return NETDEV_TX_BUSY,
to retry sending the packet once the queue is restarted.
If the power request returns an error (other than -EINPROGRESS,
which just means a resume requested elsewhere isn't complete), just
drop the packet.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create a new work structure in the modem private data, and use it to
re-enable the modem network device transmit queue when resuming.
This is needed by the next patch, which stops the TX queue if IPA
power isn't active when a transmit request arrives. Packets will
start arriving the instant the TX queue is enabled, but resuming
isn't complete until ipa_modem_resume() returns. This way we're
sure to be resumed before transmits are allowed again.
Cancel it before calling ipa_stop() in ipa_modem_stop() to ensure
the transmit queue restart completes before it gets stopped there.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new flag that is set when the hardware is suspended due to a
system suspend operation, distingishing it from runtime suspend.
Use it in the SUSPEND IPA interrupt handler to determine whether to
trigger a system resume because of the event. Define new suspend
and resume power management callback functions to set and clear the
new flag, respectively.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the call to enable the IPA interrupt as a wakeup interrupt into
ipa_power_setup(), disable it in ipa_power_teardown().
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
It isn't required, but all callers of ipa_aggr_granularity_val()
pass a constant value (IPA_AGGR_GRANULARITY) as the usec argument.
Two of those callers are in ipa_validate_build(), with the result
being passed to BUILD_BUG_ON().
Evidently the "sparc64-linux-gcc" compiler (at least) doesn't always
inline ipa_aggr_granularity_val(), so the result of the function is
not constant at compile time, and that leads to build errors.
Define the function with the __always_inline attribute to avoid the
errors. We can see by inspection that the value passed is never
zero, so we can just remove its WARN_ON() call.
Fixes: 5bc5588466 ("net: ipa: use WARN_ON() rather than assertions")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20210811135948.2634264-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>