Alex Barba <alex.barba@broadcom.com> discovered Broadcom NS2 GICv2m
implementation has an erratum where the MSI data needs to be the SPI
number subtracted by an offset of 32, for the correct MSI interrupt
to be triggered.
Here we are adding the workaround based on readings from the MSI_IIDR
register, which contains a value unique to Broadcom NS2 GICv2m
Reported-by: Alex Barba <alex.barba@broadcom.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
of_platform_device_create() returns NULL on error, it never returns
error pointers.
Fixes: ed2a1002d2 ('irqchip/mbigen: Handle multiple device nodes in a mbigen module')
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The GICv3 include file defines GICR_ISACTIVER and GICR_ICACTIVER
in the RD_base page. News flash, they do not exist (probably
a copy/paste brain fart). Just drop them.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We are not checking whether the requested device identifier fits into
the device table memory or not. The function its_create_device()
assumes that enough memory has been allocated for whole DevID space
(reported by ITS_TYPER.Devbits) during the ITS probe() and continues
to initialize ITS hardware.
This assumption is not perfect, sometimes we reduce memory size either
because of its size crossing MAX_ORDER-1 or BASERn max size limit. The
MAPD command fails if 'Device ID' is outside of device table range.
Add a simple validation check to avoid MAPD failures since we are
not handling ITS command errors. This change also helps to return an
error -ENOMEM instead of success to caller.
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The change adds improved support of NXP LPC32xx MIC, SIC1 and SIC2
interrupt controllers.
This is a list of new features in comparison to the legacy driver:
* irq types are taken from device tree settings, no more need to
hardcode them,
* old driver is based on irq_domain_add_legacy, which causes problems
with handling MIC hardware interrupt 0 produced by SIC1,
* there is one driver for MIC, SIC1 and SIC2, no more need to handle
them separately, e.g. have two separate handlers for SIC1 and SIC2,
* the driver does not have any dependencies on hardcoded register
offsets,
* the driver is much simpler for maintenance,
* SPARSE_IRQS option is supported.
Legacy LPC32xx interrupt controller driver was broken since commit
76ba59f836 ("genirq: Add irq_domain-aware core IRQ handler"), which
requires a private interrupt handler, otherwise any SIC1 generated
interrupt (mapped to MIC hwirq 0) breaks the kernel with the message
"unexpected IRQ trap at vector 00".
The change disables compilation of a legacy driver found at
arch/arm/mach-lpc32xx/irq.c, the file will be removed in a separate
commit.
Fixes: 76ba59f836 ("genirq: Add irq_domain-aware core IRQ handler")
Tested-by: Sylvain Lemieux <slemieux.tyco@gmail.com>
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
When an IPI is generated by a CPU, the pattern looks roughly like:
<write shared data>
smp_wmb();
<write to GIC to signal SGI>
On the receiving CPU we rely on the fact that, once we've taken the
interrupt, then the freshly written shared data must be visible to us.
Put another way, the CPU isn't going to speculate taking an interrupt.
Unfortunately, this assumption turns out to be broken.
Consider that CPUx wants to send an IPI to CPUy, which will cause CPUy
to read some shared_data. Before CPUx has done anything, a random
peripheral raises an IRQ to the GIC and the IRQ line on CPUy is raised.
CPUy then takes the IRQ and starts executing the entry code, heading
towards gic_handle_irq. Furthermore, let's assume that a bunch of the
previous interrupts handled by CPUy were SGIs, so the branch predictor
kicks in and speculates that irqnr will be <16 and we're likely to
head into handle_IPI. The prefetcher then grabs a speculative copy of
shared_data which contains a stale value.
Meanwhile, CPUx gets round to updating shared_data and asking the GIC
to send an SGI to CPUy. Internally, the GIC decides that the SGI is
more important than the peripheral interrupt (which hasn't yet been
ACKed) but doesn't need to do anything to CPUy, because the IRQ line
is already raised.
CPUy then reads the ACK register on the GIC, sees the SGI value which
confirms the branch prediction and we end up with a stale shared_data
value.
This patch fixes the problem by adding an smp_rmb() to the IPI entry
code in gic_handle_irq. As it turns out, the combination of a control
dependency and an ISB instruction from the EOI in the GICv3 driver is
enough to provide the ordering we need, so we add a comment there
justifying the absence of an explicit smp_rmb().
Cc: stable@vger.kernel.org
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Some kind of Freescale Layerscape SoC provides a MSI
implementation which uses two SCFG registers MSIIR and
MSIR to support 32 MSI interrupts for each PCIe controller.
The patch is to support it.
Signed-off-by: Minghuan Lian <Minghuan.Lian@nxp.com>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Some Layerscape SoCs use a simple MSI controller implementation.
It contains only two SCFG register to trigger and describe a
group 32 MSI interrupts. The patch adds bindings to describe
the controller.
Signed-off-by: Minghuan Lian <Minghuan.Lian@nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We've unfortunately started seeing a situation where percpu interrupts
are partitioned in the system: one arbitrary set of CPUs has an
interrupt connected to a type of device, while another disjoint
set of CPUs has the same interrupt connected to another type of device.
This makes it impossible to have a device driver requesting this interrupt
using the current percpu-interrupt abstraction, as the same interrupt number
is now potentially claimed by at least two drivers, and we forbid interrupt
sharing on per-cpu interrupt.
A solution to this is to turn things upside down. Let's assume that our
system describes all the possible partitions for a given interrupt, and
give each of them a unique identifier. It is then possible to create
a namespace where the affinity identifier itself is a form of interrupt
number. At this point, it becomes easy to implement a set of partitions
as a cascaded irqchip, each affinity identifier being the HW irq.
This allows us to keep a number of nice properties:
- Each partition results in a separate percpu-interrupt (with a restrictied
affinity), which keeps drivers happy.
- Because the underlying interrupt is still per-cpu, the overhead of
the indirection can be kept pretty minimal.
- The core code can ignore most of that crap.
For that purpose, we implement a small library that deals with some of
the boilerplate code, relying on platform-specific drivers to provide
a description of the affinity sets and a set of callbacks.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: devicetree@vger.kernel.org
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Rob Herring <robh+dt@kernel.org>
Link: http://lkml.kernel.org/r/1460365075-7316-4-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When iterating over the irq domain list, we try to match a domain
either by calling a match() function or by comparing a number
of fields passed as parameters.
Both approaches are a bit restrictive:
- match() is DT specific and only takes a device node
- the fallback case only deals with the fwnode_handle
It would be useful if we had a per-domain function that would
actually perform the matching check on the whole of the
irq_fwspec structure. This would allow for a domain to triage
matching attempts that need to extend beyond the fwnode.
Let's introduce irq_find_matching_fwspec(), which takes a full
blown irq_fwspec structure, and call into a select() function
implemented by the irqdomain. irq_find_matching_fwnode() is
made a wrapper around irq_find_matching_fwspec in order to
preserve compatibility.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: devicetree@vger.kernel.org
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Rob Herring <robh+dt@kernel.org>
Link: http://lkml.kernel.org/r/1460365075-7316-2-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Make these functions return appropriate error codes when something goes
wrong.
Previously irq_destroy_ipi returned void making it impossible to notify
the caller if the request could not be fulfilled. Patch 1 in the series
added another condition in which this could fail in addition to the
existing ones. irq_reserve_ipi returned an unsigned int meaning it could
only return 0 on failure and give the caller no indication as to why the
request failed.
As time goes on there are likely to be further conditions added in which
these functions can fail. These APIs and the IPI IRQ domain are new in
4.6 and the number of existing call sites are low, changing the API now
has little impact on the code, while making it easier for these
functions to grow over time.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: jason@lakedaemon.net
Cc: marc.zyngier@arm.com
Cc: ralf@linux-mips.org
Cc: Qais Yousef <qsyousef@gmail.com>
Cc: lisa.parratt@imgtec.com
Cc: jiang.liu@linux.intel.com
Link: http://lkml.kernel.org/r/1461568464-31701-2-git-send-email-matt.redfearn@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull workqueue fix from Tejun Heo:
"So, it turns out we had a silly bug in the most fundamental part of
workqueue for a very long time. AFAICS, this dates back to pre-git
era and has quite likely been there from the time workqueue was first
introduced.
A work item uses its PENDING bit to synchronize multiple queuers.
Anyone who wins the PENDING bit owns the pending state of the work
item. Whether a queuer wins or loses the race, one thing should be
guaranteed - there will soon be at least one execution of the work
item - where "after" means that the execution instance would be able
to see all the changes that the queuer has made prior to the queueing
attempt.
Unfortunately, we were missing a smp_mb() after clearing PENDING for
execution, so nothing guaranteed visibility of the changes that a
queueing loser has made, which manifested as a reproducible blk-mq
stall.
Lots of kudos to Roman for debugging the problem. The patch for
-stable is the minimal one. For v3.7, Peter is working on a patch to
make the code path slightly more efficient and less fragile"
* 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: fix ghost PENDING flag while doing MQ IO
Pull cgroup fixes from Tejun Heo:
"Two patches to fix a deadlock which can be easily triggered if memcg
charge moving is used.
This bug was introduced while converting threadgroup locking to a
global percpu_rwsem and is caused by cgroup controller task migration
path depending on the ability to create new kthreads. cpuset had a
similar issue which was fixed by performing heavy-lifting operations
asynchronous to task migration. The two patches fix the same issue in
memcg in a similar way. The first patch makes the mechanism generic
and the second relocates memcg charge moving outside the migration
path.
Given that we don't want to perform heavy operations while
writelocking threadgroup lock anyway, moving them out of the way is a
desirable solution. One thing to note is that the problem was
difficult to debug because lockdep couldn't figure out the deadlock
condition. Looking into how to improve that"
* 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
memcg: relocate charge moving from ->attach to ->post_attach
cgroup, cpuset: replace cpuset_post_attach_flush() with cgroup_subsys->post_attach callback
Pull i2c fixes from Wolfram Sang:
"I2C has one buildfix, one ABBA deadlock fix, and three simple 'add ID'
patches"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: exynos5: Fix possible ABBA deadlock by keeping I2C clock prepared
i2c: cpm: Fix build break due to incompatible pointer types
i2c: ismt: Add Intel DNV PCI ID
i2c: xlp9xx: add support for Broadcom Vulcan
i2c: rk3x: add support for rk3228
Pull ARC fixes from Vineet Gupta:
- lockdep now works for ARCv2 builds
- enable DT reserved-memory binding (for forthcoming HDMI driver)
* tag 'arc-4.6-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: add support for reserved memory defined by device tree
ARC: support generic per-device coherent dma mem
Documentation: dt: arc: fix spelling mistakes
ARCv2: Enable LOCKDEP
Pull arch/nios2 fix from Ley Foon Tan:
"memset: use the right constraint modifier for the %4 output operand"
* tag 'nios2-v4.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
nios2: memset: use the right constraint modifier for the %4 output operand
Pull x86 platform driver fix from Darren Hart:
"Fix regression caused by hotkey enabling value in toshiba_acpi"
* tag 'platform-drivers-x86-v4.6-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
toshiba_acpi: Fix regression caused by hotkey enabling value
Depending on the size of the area to be memset'ed, the nios2 memset implementation
either uses a naive loop (for buffers smaller or equal than 8 bytes) or a more optimized
implementation (for buffers larger than 8 bytes). This implementation does 4-byte stores
rather than 1-byte stores to speed up memset.
However, we discovered that on our nios2 platform, memset() was not properly setting the
buffer to the expected value. A memset of 0xff would not set the entire buffer to 0xff, but to:
0xff 0x00 0xff 0x00 0xff 0x00 0xff 0x00 ...
Which is obviously incorrect. Our investigation has revealed that the problem lies in the
incorrect constraints used in the inline assembly.
The following piece of assembly, from the nios2 memset implementation, is supposed to
create a 4-byte value that repeats 4 times the 1-byte pattern passed as memset argument:
/* fill8 %3, %5 (c & 0xff) */
" slli %4, %5, 8\n"
" or %4, %4, %5\n"
" slli %3, %4, 16\n"
" or %3, %3, %4\n"
However, depending on the compiler and optimization level, this code might be compiled as:
34: 280a923a slli r5,r5,8
38: 294ab03a or r5,r5,r5
3c: 2808943a slli r4,r5,16
40: 2148b03a or r4,r4,r5
This is wrong because r5 gets used both for %5 and %4, which leads to the final pattern
stored in r4 to be 0xff00ff00 rather than the expected 0xffffffff.
%4 is defined with the "=r" constraint, i.e as an output operand. However, as explained in
http://www.ethernut.de/en/documents/arm-inline-asm.html, this does not prevent gcc from
using the same register for an output operand (%4) and input operand (%5). By using the
constraint modifier '&', we indicate that the register should be used for output only. With this
change, we get the following assembly output:
34: 2810923a slli r8,r5,8
38: 4150b03a or r8,r8,r5
3c: 400e943a slli r7,r8,16
40: 3a0eb03a or r7,r7,r8
Which correctly produces the 0xffffffff pattern when 0xff is passed as the memset() pattern.
It is worth mentioning the observed consequence of this bug: we were hitting the kernel
BUG() in mm/bootmem.c:__free() that verifies when marking a page as free that it was
previously marked as occupied (i.e that the bit was set to 1). The entire bootmem bitmap is
set to 0xff bit via a memset() during the bootmem initialization. The bootmem_free() call right
after the initialization was finding some bits to be set to 0, which didn't make sense since the
bitmap has just been memset'ed to 0xff. Except that due to the bug explained above, the
bitmap was in fact initialized to 0xff00ff00.
Thanks to Marek Vasut for his help and feedback.
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Marek Vasut <marex@denx.de>
Acked-by: Ley Foon Tan <lftan@altera.com>