Get rid of these:
drivers/pci/pcie/portdrv_core.c: In function 'pcie_port_device_register':
drivers/pci/pcie/portdrv_core.c:275:16: warning: 'cap_mask' may be used
uninitialized in this function [-Wuninitialized]
drivers/pci/pcie/portdrv_core.c:240:6: note: 'cap_mask' was declared here
In some cases, 'cap_mask' may be not set in pcie_port_platform_notify,
holding a garbage value.
Signed-off-by: Chunhe Lan <Chunhe.Lan@freescale.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Since 3.2.12 and 3.3, some systems are failing to boot with a BUG_ON.
Some other systems using the pata_jmicron driver fail to boot because no
disks are detected. Passing pcie_aspm=force on the kernel command line
works around it.
The cause: commit 4949be1682 ("PCI: ignore pre-1.1 ASPM quirking when
ASPM is disabled") changed the behaviour of pcie_aspm_sanity_check() to
always return 0 if aspm is disabled, in order to avoid cases where we
changed ASPM state on pre-PCIe 1.1 devices.
This skipped the secondary function of pcie_aspm_sanity_check which was
to avoid us enabling ASPM on devices that had non-PCIe children, causing
trouble later on. Move the aspm_disabled check so we continue to honour
that scenario.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=42979 and
http://bugs.debian.org/665420
Reported-by: Romain Francoise <romain@orebokech.com> # kernel panic
Reported-by: Chris Holland <bandidoirlandes@gmail.com> # disk detection trouble
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: stable@vger.kernel.org
Tested-by: Hatem Masmoudi <hatem.masmoudi@gmail.com> # Dell Latitude E5520
Tested-by: janek <jan0x6c@gmail.com> # pata_jmicron with JMB362/JMB363
[jn: with more symptoms in log message]
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull PCI changes (including maintainer change) from Jesse Barnes:
"This pull has some good cleanups from Bjorn and Yinghai, as well as
some more code from Yinghai to better handle resource re-allocation
when enabled.
There's also a new initcall_debug feature from Arjan which will print
out quirk timing information to help identify slow quirks for fixing
or refinement (Yinghai sent in a few patches to do just that once the
new debug code landed).
Beyond that, I'm handing off PCI maintainership to Bjorn Helgaas.
He's been a core PCI and Linux contributor for some time now, and has
kindly volunteered to take over. I just don't feel I have the time
for PCI review and work that it deserves lately (I've taken on some
other projects), and haven't been as responsive lately as I'd like, so
I approached Bjorn asking if he'd like to manage things. He's going
to give it a try, and I'm confident he'll do at least as well as I
have in keeping the tree managed, patches flowing, and keeping things
stable."
Fix up some fairly trivial conflicts due to other cleanups (mips device
resource fixup cleanups clashing with list handling cleanup, ppc iseries
removal clashing with pci_probe_only cleanup etc)
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci: (112 commits)
PCI: Bjorn gets PCI hotplug too
PCI: hand PCI maintenance over to Bjorn Helgaas
unicore32/PCI: move <asm-generic/pci-bridge.h> include to asm/pci.h
sparc/PCI: convert devtree and arch-probed bus addresses to resource
powerpc/PCI: allow reallocation on PA Semi
powerpc/PCI: convert devtree bus addresses to resource
powerpc/PCI: compute I/O space bus-to-resource offset consistently
arm/PCI: don't export pci_flags
PCI: fix bridge I/O window bus-to-resource conversion
x86/PCI: add spinlock held check to 'pcibios_fwaddrmap_lookup()'
PCI / PCIe: Introduce command line option to disable ARI
PCI: make acpihp use __pci_remove_bus_device instead
PCI: export __pci_remove_bus_device
PCI: Rename pci_remove_behind_bridge to pci_stop_and_remove_behind_bridge
PCI: Rename pci_remove_bus_device to pci_stop_and_remove_bus_device
PCI: print out PCI device info along with duration
PCI: Move "pci reassigndev resource alignment" out of quirks.c
PCI: Use class for quirk for usb host controller fixup
PCI: Use class for quirk for ti816x class fixup
PCI: Use class for quirk for intel e100 interrupt fixup
...
Right now we won't touch ASPM state if ASPM is disabled, except in the case
where we find a device that appears to be too old to reliably support ASPM.
Right now we'll clear it in that case, which is almost certainly the wrong
thing to do. The easiest way around this is just to disable the blacklisting
when ASPM is disabled.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add a parameter to avoid using MSI/MSI-X for PCIe native hotplug; it's
known to be buggy on some platforms.
In my environment, while shutting down, following stack trace is shown
sometimes.
irq 16: nobody cared (try booting with the "irqpoll" option)
Pid: 1081, comm: reboot Not tainted 3.2.0 #1
Call Trace:
<IRQ> [<ffffffff810cec1d>] __report_bad_irq+0x3d/0xe0
[<ffffffff810cee1c>] note_interrupt+0x15c/0x210
[<ffffffff810cc485>] handle_irq_event_percpu+0xb5/0x210
[<ffffffff810cc621>] handle_irq_event+0x41/0x70
[<ffffffff810cf675>] handle_fasteoi_irq+0x55/0xc0
[<ffffffff81015356>] handle_irq+0x46/0xb0
[<ffffffff814fbe9d>] do_IRQ+0x5d/0xe0
[<ffffffff814f146e>] common_interrupt+0x6e/0x6e
[<ffffffff8106b040>] ? __do_softirq+0x60/0x210
[<ffffffff8108aeb1>] ? hrtimer_interrupt+0x151/0x240
[<ffffffff814fb5ec>] call_softirq+0x1c/0x30
[<ffffffff810152d5>] do_softirq+0x65/0xa0
[<ffffffff8106ae9d>] irq_exit+0xbd/0xe0
[<ffffffff814fbf8e>] smp_apic_timer_interrupt+0x6e/0x99
[<ffffffff814f9e5e>] apic_timer_interrupt+0x6e/0x80
<EOI> [<ffffffff814f0fb1>] ? _raw_spin_unlock_irqrestore+0x11/0x20
[<ffffffff812629fc>] pci_bus_write_config_word+0x6c/0x80
[<ffffffff81266fc2>] pci_intx+0x52/0xa0
[<ffffffff8127de3d>] pci_intx_for_msi+0x1d/0x30
[<ffffffff8127e4fb>] pci_msi_shutdown+0x7b/0x110
[<ffffffff81269d34>] pci_device_shutdown+0x34/0x50
[<ffffffff81326c4f>] device_shutdown+0x2f/0x140
[<ffffffff8107b981>] kernel_restart_prepare+0x31/0x40
[<ffffffff8107b9e6>] kernel_restart+0x16/0x60
[<ffffffff8107bbfd>] sys_reboot+0x1ad/0x220
[<ffffffff814f4b90>] ? do_page_fault+0x1e0/0x460
[<ffffffff811942d0>] ? __sync_filesystem+0x90/0x90
[<ffffffff8105c9aa>] ? __cond_resched+0x2a/0x40
[<ffffffff814ef090>] ? _cond_resched+0x30/0x40
[<ffffffff81169e17>] ? iterate_supers+0xb7/0xd0
[<ffffffff814f9382>] system_call_fastpath+0x16/0x1b
handlers:
[<ffffffff8138a0f0>] usb_hcd_irq
[<ffffffff8138a0f0>] usb_hcd_irq
[<ffffffff8138a0f0>] usb_hcd_irq
Disabling IRQ #16
An un-wanted interrupt is generated when PCI driver switches from
MSI/MSI-X to INTx while shutting down the device. The interrupt does
not happen if MSI/MSI-X is not used on the device.
I confirmed that this problem does not happen if pcie_hp=nomsi was
specified and hotplug operation worked fine as usual.
v2: Automatically disable MSI/MSI-X against following device:
PCI bridge: Integrated Device Technology, Inc. Device 807f (rev 02)
v3: Based on the review comment, combile the if statements.
v4: Removed module parameter.
Move some code to build pciehp as a module.
Move device specific code to driver/pci/quirks.c.
v5: Drop a device specific code until getting a vendor statement.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: MUNEDA Takahiro <muneda.takahiro@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Distributions may wish to provide different defaults for PCIE ASPM
depending on their target audience. Provide a configuration option for
choosing the default policy.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
module_param(bool) used to counter-intuitively take an int. In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.
It's time to remove the int/unsigned int option. For this version
it'll simply give a warning, but it'll break next kernel version.
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Right now we forcibly clear ASPM state on all devices if the BIOS indicates
that the feature isn't supported. Based on the Microsoft presentation
"PCI Express In Depth for Windows Vista and Beyond", I'm starting to think
that this may be an error. The implication is that unless the platform
grants full control via _OSC, Windows will not touch any PCIe features -
including ASPM. In that case clearing ASPM state would be an error unless
the platform has granted us that control.
This patch reworks the ASPM disabling code such that the actual clearing
of state is triggered by a successful handoff of PCIe control to the OS.
The general ASPM code undergoes some changes in order to ensure that the
ability to clear the bits isn't overridden by ASPM having already been
disabled. Further, this theoretically now allows for situations where
only a subset of PCIe roots hand over control, leaving the others in the
BIOS state.
It's difficult to know for sure that this is the right thing to do -
there's zero public documentation on the interaction between all of these
components. But enough vendors enable ASPM on platforms and then set this
bit that it seems likely that they're expecting the OS to leave them alone.
Measured to save around 5W on an idle Thinkpad X220.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The land of PCI power management is a land of sorrow and ugliness,
especially in the area of signaling events by devices. There are
devices that set their PME Status bits, but don't really bother
to send a PME message or assert PME#. There are hardware vendors
who don't connect PME# lines to the system core logic (they know
who they are). There are PCI Express Root Ports that don't bother
to trigger interrupts when they receive PME messages from the devices
below. There are ACPI BIOSes that forget to provide _PRW methods for
devices capable of signaling wakeup. Finally, there are BIOSes that
do provide _PRW methods for such devices, but then don't bother to
call Notify() for those devices from the corresponding _Lxx/_Exx
GPE-handling methods. In all of these cases the kernel doesn't have
a chance to receive a proper notification that it should wake up a
device, so devices stay in low-power states forever. Worse yet, in
some cases they continuously send PME Messages that are silently
ignored, because the kernel simply doesn't know that it should clear
the device's PME Status bit.
This problem was first observed for "parallel" (non-Express) PCI
devices on add-on cards and Matthew Garrett addressed it by adding
code that polls PME Status bits of such devices, if they are enabled
to signal PME, to the kernel. Recently, however, it has turned out
that PCI Express devices are also affected by this issue and that it
is not limited to add-on devices, so it seems necessary to extend
the PME polling to all PCI devices, including PCI Express and planar
ones. Still, it would be wasteful to poll the PME Status bits of
devices that are known to receive proper PME notifications, so make
the kernel (1) poll the PME Status bits of all PCI and PCIe devices
enabled to signal PME and (2) disable the PME Status polling for
devices for which correct PME notifications are received.
Tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: remove printks about disabled bridge windows
PCI: fold pci_calc_resource_flags() into decode_bar()
PCI: treat mem BAR type "11" (reserved) as 32-bit, not 64-bit, BAR
PCI: correct pcie_set_readrq write size
PCI: pciehp: change wait time for valid configuration access
x86/PCI: Preserve existing pci=bfsort whitelist for Dell systems
PCI: ARI is a PCIe v2 feature
x86/PCI: quirks: Use pci_dev->revision
PCI: Make the struct pci_dev * argument of pci_fixup_irqs const.
PCI hotplug: cpqphp: use pci_dev->vendor
PCI hotplug: cpqphp: use pci_dev->subsystem_{vendor|device}
x86/PCI: config space accessor functions should not ignore the segment argument
PCI: Assign values to 'pci_obff_signal_type' enumeration constants
x86/PCI: reduce severity of host bridge window conflict warnings
PCI: enumerate the PCI device only removed out PCI hieratchy of OS when re-scanning PCI
PCI: PCIe AER: add aer_recover_queue
x86/PCI: select direct access mode for mmconfig option
PCI hotplug: Rename is_ejectable which also exists in dock.c
In addition to native PCIe AER, now APEI (ACPI Platform Error
Interface) GHES (Generic Hardware Error Source) can be used to report
PCIe AER errors too. To add support to APEI GHES PCIe AER recovery,
aer_recover_queue is added to export the recovery function in native
PCIe AER driver.
Recoverable PCIe AER errors are reported via NMI in APEI GHES. Then
APEI GHES uses irq_work to delay the error processing into an IRQ
handler. But PCIe AER recovery can be very time-consuming, so
aer_recover_queue, which can be used in IRQ handler, delays the real
recovery action into the process context, that is, work queue.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Merriam-Webster tells us that the word exists. However ...
* Google suggests `forcibly' because it doesn't recognize `forcedly'.
* Google lists 494 thousand results for `forcedly'.
* Google lists 13.7 million results for `forcibly'.
* Linus's repo contains 1 occurrence of `forcedly' ( 0 after my change).
* Linus's repo contains 60 occurrences of `forcibly' (61 after my change).
Signed-off-by: Michael Witten <mfwitten@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
If it was preempted, and the variable aer_mask_override is changed
after the spin_unlock_irqrestore it will write an uninitialized
variable by the pci_write_config_dword() function.
Signed-off-by: Wanlong Gao <wanlong.gao@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Commit 2f671e2d allowed us to clear ASPM state when the FADT
tells us it isn't supported, but we don't put this into effect
if the aspm_policy is set to POLICY_POWERSAVE. Enable the
state to be cleared regardless of policy.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: Disable ASPM when _OSC control is not granted for PCIe services
PCI: Changing ASPM policy, via /sys, to POWERSAVE could cause NMIs
PCI: PCIe links may not get configured for ASPM under POWERSAVE mode
PCI/ACPI: Report ASPM support to BIOS if not disabled from command line
The AER error information printing support is implemented in
drivers/pci/pcie/aer/aer_print.c. So some string constants, functions
and macros definitions can be re-used without being exported.
The original PCIe AER error information printing function is not
re-used directly because the overall format is quite different. And
changing the original printing format may make some original users'
scripts broken.
Signed-off-by: Huang Ying <ying.huang@intel.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
When printing PCIe AER error information, each line is prefixed with
PCIe device and driver information. In original implementation, the
prefix is generated when each line is printed. In fact, all lines
share the same prefix. So this patch pre-generated the prefix, and
use that one when each line is printed.
In addition to common prefix can be pre-generated, the trailing white
spaces in string constants and NULLs in char * array constants can be
removed too. These can reduce the object file size further.
The size of object file before and after changing is as follow:
text data bss dec
before: 3038 0 0 3038
after: 2118 0 0 2118
Signed-off-by: Huang Ying <ying.huang@intel.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
v3 -> v2: Added text to describe the problem
v2 -> v1: Split this patch from v1
v1 : Part of: http://marc.info/?l=linux-pci&m=130042212003242&w=2
Disable ASPM when no _OSC control for PCIe services is granted
by the BIOS. This is to protect systems with a buggy BIOS that
did not set the ACPI FADT "ASPM Controls" bit even though the
underlying HW can't do ASPM.
To turn "on" ASPM the minimum the BIOS needs to do:
1. Clear the ACPI FADT "ASPM Controls" bit.
2. Support _OSC appropriately
There is no _OSC Control bit for ASPM. However, we expect the BIOS to
support _OSC for a Root Bridge that originates a PCIe hierarchy. If this
is not the case - we are better off not enabling ASPM on that server.
Commit 852972acff (ACPI: Disable ASPM if the
Platform won't provide _OSC control for PCIe) describes the above scenario.
To quote verbatim from there:
[The PCI SIG documentation for the _OSC OS/firmware handshaking interface
states:
"If the _OSC control method is absent from the scope of a host bridge
device, then the operating system must not enable or attempt to use any
features defined in this section for the hierarchy originated by the host
bridge."
The obvious interpretation of this is that the OS should not attempt to use
PCIe hotplug, PME or AER - however, the specification also notes that an
_OSC method is *required* for PCIe hierarchies, and experimental validation
with An Alternative OS indicates that it doesn't use any PCIe functionality
if the _OSC method is missing. That arguably means we shouldn't be using
MSI or extended config space, but right now our problems seem to be limited
to vendors being surprised when ASPM gets enabled on machines when other
OSs refuse to do so. So, for now, let's just disable ASPM if the _OSC
method doesn't exist or refuses to hand over PCIe capability control.]
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
v3 -> v2: Modified the text that describes the problem
v2 -> v1: Returned -EPERM
v1 : http://marc.info/?l=linux-pci&m=130013194803727&w=2
For servers whose hardware cannot handle ASPM the BIOS ought to set the
FADT bit shown below:
In Sec 5.2.9.3 (IA-PC Boot Arch. Flags) of ACPI4.0a Specification, please
see Table 5-11:
PCIe ASPM Controls: If set, indicates to OSPM that it must not enable
OPSM ASPM control on this platform.
However there are shipping servers whose BIOS did not set this bit. (An
example is the HP ProLiant DL385 G6. A Maintenance BIOS will fix that).
For such servers even if a call is made via pci_no_aspm(), based on _OSC
support in the BIOS, it may be too late because the ASPM code may have
already allocated and filled its "link_list".
So if a user sets the ASPM "policy" to "powersave" via /sys then
pcie_aspm_set_policy() will run through the "link_list" and re-configure
ASPM policy on devices that advertise ASPM L0s/L1 capability:
# echo powersave > /sys/module/pcie_aspm/parameters/policy
# cat /sys/module/pcie_aspm/parameters/policy
default performance [powersave]
That can cause NMIs since the hardware doesn't play well with ASPM:
[ 1651.906015] NMI: PCI system error (SERR) for reason b1 on CPU 0.
[ 1651.906015] Dazed and confused, but trying to continue
Ideally, the BIOS should have set that FADT bit in the first place but we
could be more robust - especially given the fact that Windows doesn't
cause NMIs in the above scenario.
There should be a sanity check to not allow a user to modify ASPM policy
when aspm_disabled is set.
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
v3 -> v2: Moved ASPM enabling logic to pci_set_power_state()
v2 -> v1: Preserved the logic in pci_raw_set_power_state()
: Added ASPM enabling logic after scanning Root Bridge
: http://marc.info/?l=linux-pci&m=130046996216391&w=2
v1 : http://marc.info/?l=linux-pci&m=130013164703283&w=2
The assumption made in commit 41cd766b06
(PCI: Don't enable aspm before drivers have had a chance to veto it) that
pci_enable_device() will result in re-configuring ASPM when aspm_policy is
POWERSAVE is no longer valid. This is due to commit
97c145f7c8 (PCI: read current power state
at enable time) which resets dev->current_state to D0. Due to this the
call to pcie_aspm_pm_state_change() is never made. Note the equality check
(below) that returns early:
./drivers/pci/pci.c: pci_raw_set_pci_power_state()
546 /* Check if we're already there */
547 if (dev->current_state == state)
548 return 0;
Therefore OSPM never configures the PCIe links for ASPM to turn them "on".
Fix it by configuring ASPM from the pci_enable_device() code path. This
also allows a driver such as the e1000e networking driver a chance to
disable ASPM (L0s, L1), if need be, prior to enabling the device. A
driver may perform this action if the device is known to mis-behave
wrt ASPM.
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We need to distinguish the situation in which ASPM support is
disabled from the command line or through .config from the situation
in which it is disabled, because the hardware or BIOS can't handle
it. In the former case we should not report ASPM support to the BIOS
through ACPI _OSC, but in the latter case we should do that.
Introduce pcie_aspm_support_enabled() that can be used by
acpi_pci_root_add() to determine whether or not it should report ASPM
support to the BIOS through _OSC.
Cc: stable@kernel.org
References: https://bugzilla.kernel.org/show_bug.cgi?id=29722
References: https://bugzilla.kernel.org/show_bug.cgi?id=20232
Reported-and-tested-by: Ortwin Glück <odi@odi.ch>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>