Note: dom0 checking in v4 has been separated out into 2/2.
This patch enables P2P upstream forwarding in ACS capable PCIe switches.
It solves two potential problems in virtualization environment where a PCIe
device is assigned to a guest domain using a HW iommu such as VT-d:
1) Unintentional failure caused by guest physical address programmed
into the device's DMA that happens to match the memory address range
of other downstream ports in the same PCIe switch. This causes the PCI
transaction to go to the matching downstream port instead of go to the
root complex to get translated by VT-d as it should be.
2) Malicious guest software intentionally attacks another downstream
PCIe device by programming the DMA address into the assigned device
that matches memory address range of the downstream PCIe port.
We are in process of implementing device filtering software in KVM/XEN
management software to allow device assignment of PCIe devices behind a PCIe
switch only if it has ACS capability and with the P2P upstream forwarding bits
enabled. This patch is intended to work for both KVM and Xen environments.
Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Reviewed-by: Mathew Wilcox <willy@linux.intel.com>
Reviewed-by: Chris Wright <chris@sous-sol.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Move xen_domain and related tests out of asm-x86 to xen/xen.h so they
can be included whenever they are necessary.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
For non hotplug PCI devices, the system firmware usually configures
CLS correctly. For pccard devices system firmware can't do it and
Linux PCI layer doesn't do it either. Unfortunately this leads to
poor performance for certain devices (sata_sil). Unless MWI, which
requires separate configuration, is to be used, CLS doesn't affect
correctness, so the configuration should be harmless.
This patch makes pci_set_cacheline_size() always built and export it
and make pccard call it during attach.
Please note that some other PCI hotplug drivers (shpchp and pciehp)
also configure CLS on hotplug.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Daniel Ritz <daniel.ritz@gmx.ch>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Greg KH <greg@kroah.com>
Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Cc: Axel Birndt <towerlexa@gmx.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Till now, CLS has been determined either by arch code or as
L1_CACHE_BYTES. Only x86 and ia64 set CLS explicitly and x86 doesn't
always get it right. On most configurations, the chance is that
firmware configures the correct value during boot.
This patch makes pci_init() determine CLS by looking at what firmware
has configured. It scans all devices and if all non-zero values
agree, the value is used. If none is configured or there is a
disagreement, pci_dfl_cache_line_size is used. arch can set the dfl
value (via PCI_CACHE_LINE_BYTES or pci_dfl_cache_line_size) or
override the actual one.
ia64, x86 and sparc64 updated to set the default cls instead of the
actual one.
While at it, declare pci_cache_line_size and pci_dfl_cache_line_size
in pci.h and drop private declarations from arch code.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David Miller <davem@davemloft.net>
Acked-by: Greg KH <gregkh@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
PM: Remove some debug messages producing too much noise
PM: Fix warning on suspend errors
PM / Hibernate: Add newline to load_image() fail path
PM / Hibernate: Fix error handling in save_image()
PM / Hibernate: Fix blkdev refleaks
PM / yenta: Split resume into early and late parts (rev. 4)
Commit 0c570cdeb8
(PM / yenta: Fix cardbus suspend/resume regression) caused resume to
fail on systems with two CardBus bridges. While the exact nature
of the failure is not known at the moment, it can be worked around by
splitting the yenta resume into an early part, executed during the
early phase of resume, that will only resume the socket and power it
up if there was a card in it during suspend, and a late part,
executed during "regular" resume, that will carry out all of the
remaining yenta resume operations.
Fixes http://bugzilla.kernel.org/show_bug.cgi?id=14334, which is a
listed regression from 2.6.31.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reported-by: Stephen J. Gowdy <gowdy@cern.ch>
Tested-by: Jose Marino <braket@hotmail.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
9p: fix readdir corner cases
9p: fix readlink
9p: fix a small bug in readdir for long directories
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf tools: Remove -Wcast-align
perf tools: Fix compatibility with libelf 0.8 and autodetect
perf events: Don't generate events for the idle task when exclude_idle is set
perf events: Fix swevent hrtimer sampling by keeping track of remaining time when enabling/disabling swevent hrtimers
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing: Remove cpu arg from the rb_time_stamp() function
tracing: Fix comment typo and documentation example
tracing: Fix trace_seq_printf() return value
tracing: Update *ppos instead of filp->f_pos
The patch below also addresses a couple of other corner cases in readdir
seen with a large (e.g. 64k) msize. I'm not sure what people think of
my co-opting of fid->aux here. I'd be happy to rework if there's a better
way.
When the size of the user supplied buffer passed to readdir is smaller
than the data returned in one go by the 9P read request, v9fs_dir_readdir()
currently discards extra data so that, on the next call, a 9P read
request will be issued with offset < previous offset + bytes returned,
which voilates the constraint described in paragraph 3 of read(5) description.
This patch preseves the leftover data in fid->aux for use in the next call.
Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Platform drivers registered via platform_driver_probe() can be bound
to devices only once, upon registration, because discard their probe()
routines to save memory. Unbinding the driver through sysfs 'unbind'
leaves the device stranded and confuses users so let's not create
bind and unbind attributes for such drivers.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Cc: Éric Piel <eric.piel@tremplin-utc.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On UDP sockets, we must call skb_free_datagram() with socket locked,
or risk sk_forward_alloc corruption. This requirement is not respected
in SUNRPC.
Add a convenient helper, skb_free_datagram_locked() and use it in SUNRPC
Reported-by: Francis Moreau <francis.moro@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Policy routing is not looked up by mark on reverse path filtering.
This fixes it.
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct sk_buff kmemcheck annotations enlarged this structure by 8/16 bytes
Fix this by moving 'protocol' inside flags1 bitfield,
and queue_mapping inside flags2 bitfield.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-param-fixes:
param: fix setting arrays of bool
param: fix NULL comparison on oom
param: fix lots of bugs with writing charp params from sysfs, by leaking mem.
* 'hwpoison-2.6.32' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6:
HWPOISON: fix invalid page count in printk output
HWPOISON: Allow schedule_on_each_cpu() from keventd
HWPOISON: fix/proc/meminfo alignment
HWPOISON: fix oops on ksm pages
HWPOISON: Fix page count leak in hwpoison late kill in do_swap_page
HWPOISON: return early on non-LRU pages
HWPOISON: Add brief hwpoison description to Documentation
HWPOISON: Clean up PR_MCE_KILL interface
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
futex: Move drop_futex_key_refs out of spinlock'ed region
rcu: Fix TREE_PREEMPT_RCU CPU_HOTPLUG bad-luck hang
rcu: Stopgap fix for synchronize_rcu_expedited() for TREE_PREEMPT_RCU
rcu: Prevent RCU IPI storms in presence of high call_rcu() load
futex: Check for NULL keys in match_futex
futex: Handle spurious wake up
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Do less agressive buddy clearing
sched: Disable SD_PREFER_LOCAL for MC/CPU domains
The IBM Saturn serial card has only one port. Without that fixup,
the kernel thinks it has two, which confuses userland setup and
admin tools as well.
[akpm@linux-foundation.org: fix pci-ids.h layout]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Alan Cox <alan@linux.intel.com>
Cc: Michael Reed <mreed10@us.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
strstrip() can return a modified value of its input argument, when
removing elading whitesapce. So it is surely bug for this function's
return value to be ignored. The caller is probably going to use the
incorrect original pointer.
So mark it __must_check to prevent this frm happening (as it has before).
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When CONFIG_CPU_FREQ is disabled, cpufreq_get() needs a stub. Used by kvm
(although it looks like a bit of the kvm code could be omitted when
CONFIG_CPU_FREQ is disabled).
arch/x86/built-in.o: In function `kvm_arch_init':
(.text+0x10de7): undefined reference to `cpufreq_get'
(Needed in linux-next's KVM tree, but it's correct in 2.6.32).
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Tested-by: Eric Paris <eparis@redhat.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
e180a6b775 "param: fix charp parameters set via sysfs" fixed the case
where charp parameters written via sysfs were freed, leaving drivers
accessing random memory.
Unfortunately, storing a flag in the kparam struct was a bad idea: it's
rodata so setting it causes an oops on some archs. But that's not all:
1) module_param_array() on charp doesn't work reliably, since we use an
uninitialized temporary struct kernel_param.
2) there's a fundamental race if a module uses this parameter and then
it's changed: they will still access the old, freed, memory.
The simplest fix (ie. for 2.6.32) is to never free the memory. This
prevents all these problems, at cost of a memory leak. In practice, there
are only 18 places where a charp is writable via sysfs, and all are
root-only writable.
Reported-by: Takashi Iwai <tiwai@suse.de>
Cc: Sitsofe Wheeler <sitsofe@yahoo.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org