There is no need to have flags as unsigned long. Make it unsigned int
and put it together with irq so it does not waste space on 64 bit.
Make irq unsigned int while at it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Remove all #inclusions of asm/system.h preparatory to splitting and killing
it. Performed with the following command:
perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *`
Signed-off-by: David Howells <dhowells@redhat.com>
The __raise_softirq_irqoff() contains a tracepoint. As tracepoints in headers
can cause issues, and not to mention, bloats the kernel when they are
in a static inline, it is best to move the function that contains the
tracepoint out of the header and into softirq.c.
Link: http://lkml.kernel.org/r/20120118120711.GB14863@elte.hu
Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
As request_percpu_irq() doesn't allow for a percpu interrupt to have
its type configured (it is generally impossible to configure it on all
CPUs at once), add a 'type' argument to enable_percpu_irq().
This allows some low-level, board specific init code to be switched to
a generic API.
[ tglx: Added WARN_ON argument ]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The ARM GIC interrupt controller offers per CPU interrupts (PPIs),
which are usually used to connect local timers to each core. Each CPU
has its own private interface to the GIC, and only sees the PPIs that
are directly connect to it.
While these timers are separate devices and have a separate interrupt
line to a core, they all use the same IRQ number.
For these devices, request_irq() is not the right API as it assumes
that an IRQ number is visible by a number of CPUs (through the
affinity setting), but makes it very awkward to express that an IRQ
number can be handled by all CPUs, and yet be a different interrupt
line on each CPU, requiring a different dev_id cookie to be passed
back to the handler.
The *_percpu_irq() functions is designed to overcome these
limitations, by providing a per-cpu dev_id vector:
int request_percpu_irq(unsigned int irq, irq_handler_t handler,
const char *devname, void __percpu *percpu_dev_id);
void free_percpu_irq(unsigned int, void __percpu *);
int setup_percpu_irq(unsigned int irq, struct irqaction *new);
void remove_percpu_irq(unsigned int irq, struct irqaction *act);
void enable_percpu_irq(unsigned int irq);
void disable_percpu_irq(unsigned int irq);
The API has a number of limitations:
- no interrupt sharing
- no threading
- common handler across all the CPUs
Once the interrupt is requested using setup_percpu_irq() or
request_percpu_irq(), it must be enabled by each core that wishes its
local interrupt to be delivered.
Based on an initial patch by Thomas Gleixner.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1316793788-14500-2-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Commit a26ac2455ffcf3(rcu: move TREE_RCU from softirq to kthread)
introduced performance regression. In an AIM7 test, this commit degraded
performance by about 40%.
The commit runs rcu callbacks in a kthread instead of softirq. We observed
high rate of context switch which is caused by this. Out test system has
64 CPUs and HZ is 1000, so we saw more than 64k context switch per second
which is caused by RCU's per-CPU kthread. A trace showed that most of
the time the RCU per-CPU kthread doesn't actually handle any callbacks,
but instead just does a very small amount of work handling grace periods.
This means that RCU's per-CPU kthreads are making the scheduler do quite
a bit of work in order to allow a very small amount of RCU-related
processing to be done.
Alex Shi's analysis determined that this slowdown is due to lock
contention within the scheduler. Unfortunately, as Peter Zijlstra points
out, the scheduler's real-time semantics require global action, which
means that this contention is inherent in real-time scheduling. (Yes,
perhaps someone will come up with a workaround -- otherwise, -rt is not
going to do well on large SMP systems -- but this patch will work around
this issue in the meantime. And "the meantime" might well be forever.)
This patch therefore re-introduces softirq processing to RCU, but only
for core RCU work. RCU callbacks are still executed in kthread context,
so that only a small amount of RCU work runs in softirq context in the
common case. This should minimize ksoftirqd execution, allowing us to
skip boosting of ksoftirqd for CONFIG_RCU_BOOST=y kernels.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Tested-by: "Alex,Shi" <alex.shi@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
If RCU priority boosting is to be meaningful, callback invocation must
be boosted in addition to preempted RCU readers. Otherwise, in presence
of CPU real-time threads, the grace period ends, but the callbacks don't
get invoked. If the callbacks don't get invoked, the associated memory
doesn't get freed, so the system is still subject to OOM.
But it is not reasonable to priority-boost RCU_SOFTIRQ, so this commit
moves the callback invocations to a kthread, which can be boosted easily.
Also add comments and properly synchronized all accesses to
rcu_cpu_kthread_task, as suggested by Lai Jiangshan.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (116 commits)
x86: Enable forced interrupt threading support
x86: Mark low level interrupts IRQF_NO_THREAD
x86: Use generic show_interrupts
x86: ioapic: Avoid redundant lookup of irq_cfg
x86: ioapic: Use new move_irq functions
x86: Use the proper accessors in fixup_irqs()
x86: ioapic: Use irq_data->state
x86: ioapic: Simplify irq chip and handler setup
x86: Cleanup the genirq name space
genirq: Add chip flag to force mask on suspend
genirq: Add desc->irq_data accessor
genirq: Add comments to Kconfig switches
genirq: Fixup fasteoi handler for oneshot mode
genirq: Provide forced interrupt threading
sched: Switch wait_task_inactive to schedule_hrtimeout()
genirq: Add IRQF_NO_THREAD
genirq: Allow shared oneshot interrupts
genirq: Prepare the handling of shared oneshot interrupts
genirq: Make warning in handle_percpu_event useful
x86: ioapic: Move trigger defines to io_apic.h
...
Fix up trivial(?) conflicts in arch/x86/pci/xen.c due to genirq name
space changes clashing with the Xen cleanups. The set_irq_msi() had
moved to xen_bind_pirq_msi_to_irq().
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (26 commits)
sched: Resched proper CPU on yield_to()
sched: Allow users with sufficient RLIMIT_NICE to change from SCHED_IDLE policy
sched: Allow SCHED_BATCH to preempt SCHED_IDLE tasks
sched: Clean up the IRQ_TIME_ACCOUNTING code
sched: Add #ifdef around irq time accounting functions
sched, autogroup: Stop claiming ownership of the root task group
sched, autogroup: Stop going ahead if autogroup is disabled
sched, autogroup, sysctl: Use proc_dointvec_minmax() instead
sched: Fix the group_imb logic
sched: Clean up some f_b_g() comments
sched: Clean up remnants of sd_idle
sched: Wholesale removal of sd_idle logic
sched: Add yield_to(task, preempt) functionality
sched: Use a buddy to implement yield_task_fair()
sched: Limit the scope of clear_buddies
sched: Check the right ->nr_running in yield_task_fair()
sched: Avoid expensive initial update_cfs_load(), on UP too
sched: Fix switch_from_fair()
sched: Simplify the idle scheduling class
softirqs: Account ksoftirqd time as cpustat softirq
...
Add a commandline parameter "threadirqs" which forces all interrupts except
those marked IRQF_NO_THREAD to run threaded. That's mostly a debug option to
allow retrieving better debug data from crashing interrupt handlers. If
"threadirqs" is not enabled on the kernel command line, then there is no
impact in the interrupt hotpath.
Architecture code needs to select CONFIG_IRQ_FORCED_THREADING after
marking the interrupts which cant be threaded IRQF_NO_THREAD. All
interrupts which have IRQF_TIMER set are implict marked
IRQF_NO_THREAD. Also all PER_CPU interrupts are excluded.
Forced threading hard interrupts also forces all soft interrupt
handling into thread context.
When enabled it might slow down things a bit, but for debugging problems in
interrupt code it's a reasonable penalty as it does not immediately
crash and burn the machine when an interrupt handler is buggy.
Some test results on a Core2Duo machine:
Cache cold run of:
# time git grep irq_desc
non-threaded threaded
real 1m18.741s 1m19.061s
user 0m1.874s 0m1.757s
sys 0m5.843s 0m5.427s
# iperf -c server
non-threaded
[ 3] 0.0-10.0 sec 1.09 GBytes 933 Mbits/sec
[ 3] 0.0-10.0 sec 1.09 GBytes 934 Mbits/sec
[ 3] 0.0-10.0 sec 1.09 GBytes 933 Mbits/sec
threaded
[ 3] 0.0-10.0 sec 1.09 GBytes 939 Mbits/sec
[ 3] 0.0-10.0 sec 1.09 GBytes 934 Mbits/sec
[ 3] 0.0-10.0 sec 1.09 GBytes 937 Mbits/sec
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110223234956.772668648@linutronix.de>
Some low level interrupts cannot be threaded even when we force thread
all interrupt handlers. Add a flag to annotate such interrupts. Add
all timer interrupts to this category by default.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110223234956.578893460@linutronix.de>
For level type interrupts we need to track how many threads are on
flight to avoid useless interrupt storms when not all thread handlers
have finished yet. Keep track of the woken threads and only unmask
when there are no more threads in flight.
Yes, I'm lazy and using a bitfield. But not only because I'm lazy, the
main reason is that it's way simpler than using a refcount. A refcount
based solution would need to keep track of various things like
crashing the irq thread, spurious interrupts coming in,
disables/enables, free_irq() and some more. The bitfield keeps the
tracking simple and makes things just work. It's also nicely confined
to the thread code pathes and does not require additional checks all
over the place.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110223234956.388095876@linutronix.de>
All archs implement show_interrupts() in more or less the same
way. That's tons of duplicated code with different bugs with no
value. Implement a generic version and deprecate show_interrupts()
Unfortunately we need some ifdeffery for !GENERIC_HARDIRQ archs.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The irq namespace has become quite convoluted. My bad. Clean it up
and deprecate the old functions. All new functions follow the scheme:
irq number based:
irq_set/get/xxx/_xxx(unsigned int irq, ...)
irq_data based:
irq_data_set/get/xxx/_xxx(struct irq_data *d, ....)
irq_desc based:
irq_desc_get_xxx(struct irq_desc *desc)
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
irq/for-xen contains new functionality to avoid Xen private irq
hackery. That branch has a single irq commit and is pulled by Xen to
base their new features on.
Merge it into irq/core as other patches modify the same code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Xen needs to reenable interrupts which are marked IRQF_NO_SUSPEND in the
resume path. Add a flag to force the reenabling in the resume code.
Tested-and-acked-by: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When initiating I/O on a multiqueue and multi-IRQ device, we may want
to select a queue for which the response will be handled on the same
or a nearby CPU. This requires a reverse-map of IRQ affinity. Add a
notification mechanism to support this.
This is based closely on work by Thomas Gleixner <tglx@linutronix.de>.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Cc: linux-net-drivers@solarflare.com
Cc: Tom Herbert <therbert@google.com>
Cc: David Miller <davem@davemloft.net>
LKML-Reference: <1295470904.11126.84.camel@bwh-desktop>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We currently use kmalloc-96 slab for struct irqaction allocations on
64bit arches.
This is unfortunate because of possible false sharing and two cache
lines accesses.
Move 'name' and 'dir' fields at the end of the structure, and force a
suitable alignement.
Hot path fields now use one cache line on x86_64.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1288865628.2659.69.camel@edumazet-laptop>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>