Pull PCI changes from Bjorn Helgaas:
"Host bridge hotplug:
- Add MMCONFIG support for hot-added host bridges (Jiang Liu)
Device hotplug:
- Move fixups from __init to __devinit (Sebastian Andrzej Siewior)
- Call FINAL fixups for hot-added devices, too (Myron Stowe)
- Factor out generic code for P2P bridge hot-add (Yinghai Lu)
- Remove all functions in a slot, not just those with _EJx (Amos
Kong)
Dynamic resource management:
- Track bus number allocation (struct resource tree per domain)
(Yinghai Lu)
- Make P2P bridge 1K I/O windows work with resource reassignment
(Bjorn Helgaas, Yinghai Lu)
- Disable decoding while updating 64-bit BARs (Bjorn Helgaas)
Power management:
- Add PCIe runtime D3cold support (Huang Ying)
Virtualization:
- Add VFIO infrastructure (ACS, DMA source ID quirks) (Alex
Williamson)
- Add quirks for devices with broken INTx masking (Jan Kiszka)
Miscellaneous:
- Fix some PCI Express capability version issues (Myron Stowe)
- Factor out some arch code with a weak, generic, pcibios_setup()
(Myron Stowe)"
* tag 'for-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (122 commits)
PCI: hotplug: ensure a consistent return value in error case
PCI: fix undefined reference to 'pci_fixup_final_inited'
PCI: build resource code for M68K architecture
PCI: pciehp: remove unused pciehp_get_max_lnk_width(), pciehp_get_cur_lnk_width()
PCI: reorder __pci_assign_resource() (no change)
PCI: fix truncation of resource size to 32 bits
PCI: acpiphp: merge acpiphp_debug and debug
PCI: acpiphp: remove unused res_lock
sparc/PCI: replace pci_cfg_fake_ranges() with pci_read_bridge_bases()
PCI: call final fixups hot-added devices
PCI: move final fixups from __init to __devinit
x86/PCI: move final fixups from __init to __devinit
MIPS/PCI: move final fixups from __init to __devinit
PCI: support sizing P2P bridge I/O windows with 1K granularity
PCI: reimplement P2P bridge 1K I/O windows (Intel P64H2)
PCI: disable MEM decoding while updating 64-bit MEM BARs
PCI: leave MEM and IO decoding disabled during 64-bit BAR sizing, too
PCI: never discard enable/suspend/resume_early/resume fixups
PCI: release temporary reference in __nv_msi_ht_cap_quirk()
PCI: restructure 'pci_do_fixups()'
...
Quite a few ASUS computers experience a nasty problem, related to the
EHCI controllers, when going into system suspend. It was observed
that the problem didn't occur if the controllers were not put into the
D3 power state before starting the suspend, and commit
151b612847 (USB: EHCI: fix crash during
suspend on ASUS computers) was created to do this.
It turned out this approach messed up other computers that didn't have
the problem -- it prevented USB wakeup from working. Consequently
commit c2fb8a3fa2 (USB: add
NO_D3_DURING_SLEEP flag and revert 151b612847) was merged; it
reverted the earlier commit and added a whitelist of known good board
names.
Now we know the actual cause of the problem. Thanks to AceLan Kao for
tracking it down.
According to him, an engineer at ASUS explained that some of their
BIOSes contain a bug that was added in an attempt to work around a
problem in early versions of Windows. When the computer goes into S3
suspend, the BIOS tries to verify that the EHCI controllers were first
quiesced by the OS. Nothing's wrong with this, but the BIOS does it
by checking that the PCI COMMAND registers contain 0 without checking
the controllers' power state. If the register isn't 0, the BIOS
assumes the controller needs to be quiesced and tries to do so. This
involves making various MMIO accesses to the controller, which don't
work very well if the controller is already in D3. The end result is
a system hang or memory corruption.
Since the value in the PCI COMMAND register doesn't matter once the
controller has been suspended, and since the value will be restored
anyway when the controller is resumed, we can work around the BIOS bug
simply by setting the register to 0 during system suspend. This patch
(as1590) does so and also reverts the second commit mentioned above,
which is now unnecessary.
In theory we could do this for every PCI device. However to avoid
introducing new problems, the patch restricts itself to EHCI host
controllers.
Finally the affected systems can suspend with USB wakeup working
properly.
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=37632
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=42728
Based-on-patch-by: AceLan Kao <acelan.kao@canonical.com>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Dâniel Fraga <fragabr@gmail.com>
Tested-by: Javier Marcet <jmarcet@gmail.com>
Tested-by: Andrey Rahmatullin <wrar@wrar.name>
Tested-by: Oleksij Rempel <bug-track@fisher-privat.net>
Tested-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Cc: stable <stable@vger.kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit cc2893b6 (PCI: Ensure we re-enable devices on resume)
addressed the problem with USB not being powered after resume on
recent Lenovo machines, but it did that in a suboptimal way.
Namely, it should have changed the relevant code paths only,
which are pci_pm_resume_noirq() and pci_pm_restore_noirq() supposed
to restore the device's power and standard configuration registers
after system resume from suspend or hibernation. Instead, however,
it modified pci_set_power_state() which is executed in several
other situations too. That resulted in some undesirable effects,
like attempting to change a device's power state in the same way
multiple times in a row (up to as many as 4 times in a row in the
snd_hda_intel driver).
Fix the bug addressed by commit cc2893b6 in an alternative way,
by forcibly powering up all devices in pci_pm_default_resume_early(),
which is called by pci_pm_resume_noirq() and pci_pm_restore_noirq()
to restore the device's power and standard configuration registers,
and modifying pci_pm_runtime_resume() to avoid the forcible power-up
if not necessary. Then, revert the changes made by commit cc2893b6
to make the confusion introduced by it go away.
Acked-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This patch adds runtime D3cold support and corresponding ACPI platform
support. This patch only enables runtime D3cold support; it does not
enable D3cold support during system suspend/hibernate.
D3cold is the deepest power saving state for a PCIe device, where its main
power is removed. While it is in D3cold, you can't access the device at
all, not even its configuration space (which is still accessible in D3hot).
Therefore the PCI PM registers can not be used to transition into/out of
the D3cold state; that must be done by platform logic such as ACPI _PR3.
To support wakeup from D3cold, a system may provide auxiliary power, which
allows a device to request wakeup using a Beacon or the sideband WAKE#
signal. WAKE# is usually connected to platform logic such as ACPI GPE.
This is quite different from other power saving states, where devices
request wakeup via a PME message on the PCIe link.
Some devices, such as those in plug-in slots, have no direct platform
logic. For example, there is usually no ACPI _PR3 for them. D3cold
support for these devices can be done via the PCIe Downstream Port leading
to the device. When the PCIe port is powered on/off, the device is powered
on/off too. Wakeup events from the device will be notified to the
corresponding PCIe port.
For more information about PCIe D3cold and corresponding ACPI support,
please refer to:
- PCI Express Base Specification Revision 2.0
- Advanced Configuration and Power Interface Specification Revision 5.0
[bhelgaas: changelog]
Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
Originally-by: Zheng Yan <zheng.z.yan@intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Disable Bus Master bit on the device in pci_device_shutdown() to ensure PCI
devices do not continue to DMA data after shutdown. This can cause memory
corruption in case of a kexec where the current kernel shuts down and
transfers control to a new kernel while a PCI device continues to DMA to
memory that does not belong to it any more in the new kernel.
I have tested this code on two laptops, two workstations and a 16-socket
server. kexec worked correctly on all of them.
Signed-off-by: Khalid Aziz <khalid.aziz@hp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Pull PCI changes (including maintainer change) from Jesse Barnes:
"This pull has some good cleanups from Bjorn and Yinghai, as well as
some more code from Yinghai to better handle resource re-allocation
when enabled.
There's also a new initcall_debug feature from Arjan which will print
out quirk timing information to help identify slow quirks for fixing
or refinement (Yinghai sent in a few patches to do just that once the
new debug code landed).
Beyond that, I'm handing off PCI maintainership to Bjorn Helgaas.
He's been a core PCI and Linux contributor for some time now, and has
kindly volunteered to take over. I just don't feel I have the time
for PCI review and work that it deserves lately (I've taken on some
other projects), and haven't been as responsive lately as I'd like, so
I approached Bjorn asking if he'd like to manage things. He's going
to give it a try, and I'm confident he'll do at least as well as I
have in keeping the tree managed, patches flowing, and keeping things
stable."
Fix up some fairly trivial conflicts due to other cleanups (mips device
resource fixup cleanups clashing with list handling cleanup, ppc iseries
removal clashing with pci_probe_only cleanup etc)
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci: (112 commits)
PCI: Bjorn gets PCI hotplug too
PCI: hand PCI maintenance over to Bjorn Helgaas
unicore32/PCI: move <asm-generic/pci-bridge.h> include to asm/pci.h
sparc/PCI: convert devtree and arch-probed bus addresses to resource
powerpc/PCI: allow reallocation on PA Semi
powerpc/PCI: convert devtree bus addresses to resource
powerpc/PCI: compute I/O space bus-to-resource offset consistently
arm/PCI: don't export pci_flags
PCI: fix bridge I/O window bus-to-resource conversion
x86/PCI: add spinlock held check to 'pcibios_fwaddrmap_lookup()'
PCI / PCIe: Introduce command line option to disable ARI
PCI: make acpihp use __pci_remove_bus_device instead
PCI: export __pci_remove_bus_device
PCI: Rename pci_remove_behind_bridge to pci_stop_and_remove_behind_bridge
PCI: Rename pci_remove_bus_device to pci_stop_and_remove_bus_device
PCI: print out PCI device info along with duration
PCI: Move "pci reassigndev resource alignment" out of quirks.c
PCI: Use class for quirk for usb host controller fixup
PCI: Use class for quirk for ti816x class fixup
PCI: Use class for quirk for intel e100 interrupt fixup
...
If a PCI device is enabled to generate wakeup signals (PME) when put
into a low-power state by runtime PM, it will be still enabled to
generate those signals after the system shutdown, unless its driver's
.shutdown() callback takes care of the wakeup signals generation
setting. Moreover, there are devices that are not enabled to wake
up the system and that are configured by runtime PM to generate
wakeup signals so that (runtime) remote wakeup works with them.
Those devices should be reconfigured during system shutdown so that
they don't generate wakeup signals, but at least some drivers don't
do that. However, that very well may be done by the PCI core so
that drivers don't have to worry about it. For this reason, modify
pci_device_shutdown() to disable the generation of wakeup events for
devices not supposed to wake up the system.
References: https://bugzilla.kernel.org/show_bug.cgi?id=37952
Reported-and-tested-by: Kamil Iskra <kamil.54002@iskra.name>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch (as1514) cleans up some places where new_id and remove_id
sysfs attributes are created and deleted. Handling both attributes in
a single routine rather than a pair of routines makes the code
smaller. It also prevents certain kinds of errors, like one we
currently have in the USB subsystem: The removeid attribute is often
created even when newid isn't (because the driver's no_dynamid_id flag
is set).
In the case of the PCMCIA subsystem, the newid attribute is created
but never explicitly deleted. The patch adds a deletion routine.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
As part of the removal of get_driver()/put_driver(), this patch
(as1511) changes all the places that add dynamic IDs for drivers.
Since these additions are done by writing to the drivers' sysfs
attribute files, and the attributes are removed when the drivers are
unregistered, there is no reason to take an extra reference to the
drivers.
The one exception is the pci-stub driver, which calls pci_add_dynid()
as part of its registration. But again, there's no reason to take an
extra reference here, because the driver can't be unloaded while it is
being registered.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Dmitry Torokhov <dmitry.torokhov@gmail.com>
CC: Jiri Kosina <jkosina@suse.cz>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Include the driver name and device in warning when a pci driver
supports both legacy pm and new framework as just the stack trace
gives no way to identify the driver.
Signed-off-by: David Fries <David@Fries.net>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
A subsequent patch is going to move the invocation of
pm_runtime_barrier() from dpm_prepare() to __device_suspend().
Consequently, early wakeup events resulting from runtime resume
requests for wakeup devices queued up right before system suspend
will only be detected after all of the subsystem-level .prepare()
callbacks have run. However, the PCI bus type calls
pm_runtime_get_sync() from its pci_pm_prepare() callback routine,
so it would destroy the early wakeup events information regarding PCI
devices. To prevent this from happening add an early wakeup
detection mechanism, analogous to the one currently in dpm_prepare(),
to pci_pm_prepare().
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
After commit e866500247
(PM: Allow pm_runtime_suspend() to succeed during system suspend) it
is possible that a device resumed by the pm_runtime_resume(dev) in
pci_pm_prepare() will be suspended immediately from a work item,
timer function or otherwise, defeating the very purpose of calling
pm_runtime_resume(dev) from there. To prevent that from happening
it is necessary to increment the runtime PM usage counter of the
device by replacing pm_runtime_resume() with pm_runtime_get_sync().
Moreover, the incremented runtime PM usage counter has to be
decremented by the corresponding pci_pm_complete(), via
pm_runtime_put_sync().
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: stable@kernel.org
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Xen save/restore is going to use hibernate device callbacks for
quiescing devices and putting them back to normal operations and it
would need to select CONFIG_HIBERNATION for this purpose. However,
that also would cause the hibernate interfaces for user space to be
enabled, which might confuse user space, because the Xen kernels
don't support hibernation. Moreover, it would be wasteful, as it
would make the Xen kernels include a substantial amount of code that
they would never use.
To address this issue introduce new power management Kconfig option
CONFIG_HIBERNATE_CALLBACKS, such that it will only select the code
that is necessary for the hibernate device callbacks to work and make
CONFIG_HIBERNATION select it. Then, Xen save/restore will be able to
select CONFIG_HIBERNATE_CALLBACKS without dragging the entire
hibernate code along with it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Tested-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>
After redefining CONFIG_PM to depend on (CONFIG_PM_SLEEP ||
CONFIG_PM_RUNTIME) the CONFIG_PM_OPS option is redundant and can be
replaced with CONFIG_PM.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
pci_restore_state only ever returns 0, thus there is no benefit in
having it return any value. Also, a large majority of the callers do
not check the return code of pci_restore_state. Make the
pci_restore_state a void return and avoid the overhead.
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch (as1388) changes the way the PCI core handles runtime PM
settings when probing or unbinding drivers. Now the core will make
sure the device is enabled for runtime PM, with a usage count >= 1,
when a driver is probed. It does the same when calling a driver's
remove method.
If the driver wants to use runtime PM, all it has to do is call
pm_runtime_pu_noidle() near the end of its probe routine (to cancel
the core's usage increment) and pm_runtime_get_noresume() near the
start of its remove routine (to restore the usage count). It does not
need to mess around with setting the runtime state to enabled,
disabled, active, or suspended.
The patch updates e1000e and r8169, the only PCI drivers that already
use the existing runtime PM interface.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Introduce run-time PM callbacks for the PCI bus type. Make the new
callbacks work in analogy with the existing system sleep PM
callbacks, so that the drivers already converted to struct dev_pm_ops
can use their suspend and resume routines for run-time PM without
modifications.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (75 commits)
PCI hotplug: clean up acpi_run_hpp()
PCI hotplug: acpiphp: use generic pci_configure_slot()
PCI hotplug: shpchp: use generic pci_configure_slot()
PCI hotplug: pciehp: use generic pci_configure_slot()
PCI hotplug: add pci_configure_slot()
PCI hotplug: clean up acpi_get_hp_params_from_firmware() interface
PCI hotplug: acpiphp: don't cache hotplug_params in acpiphp_bridge
PCI hotplug: acpiphp: remove superfluous _HPP/_HPX evaluation
PCI: Clear saved_state after the state has been restored
PCI PM: Return error codes from pci_pm_resume()
PCI: use dev_printk in quirk messages
PCI / PCIe portdrv: Fix pcie_portdrv_slot_reset()
PCI Hotplug: convert acpi_pci_detect_ejectable() to take an acpi_handle
PCI Hotplug: acpiphp: find bridges the easy way
PCI: pcie portdrv: remove unused variable
PCI / ACPI PM: Propagate wake-up enable for devices w/o ACPI support
ACPI PM: Replace wakeup.prepared with reference counter
PCI PM: Introduce device flag wakeup_prepared
PCI / ACPI PM: Rework some debug messages
PCI PM: Simplify PCI wake-up code
...
Fixed up conflict in arch/powerpc/kernel/pci_64.c due to OF device tree
scanning having been moved and merged for the 32- and 64-bit cases. The
'needs_freset' initialization added in 6e19314cc ("PCI/powerpc: support
PCIe fundamental reset") is now in arch/powerpc/kernel/pci_of_scan.c.
Some PCI devices fail if their standard configuration registers are
restored twice in a row. Prevent this from happening by making
pci_restore_state() clear the saved_state flag of the device right
after the device's standard configuration registers have been
populated with the previously saved values.
Simplify PCI PM callbacks by removing the direct clearing of
state_saved from them, as it shouldn't be necessary any more (except
in pci_pm_thaw(), where it has to be cleared, so that the values saved
during the "freeze" phase of hibernation are not used later by mistake).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Currently pci_pm_resume() always returns 0, which makes the error
variable defined in there a bit pointless. Make pci_pm_resume()
return error codes obtained from drivers' callbacks.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Separate out pci_add_dynid() from store_new_id() and export it so that
in-kernel code can add PCI IDs dynamically. As the function will be
available regardless of HOTPLUG, put it and pull pci_free_dynids()
outside of CONFIG_HOTPLUG.
This will be used by pci-stub to initialize initial IDs via module
param.
While at it, remove bogus get_driver() failure check.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Without the check, the config space may be filled with zeros. Though
the driver should try to avoid call restoring before saving, but the
pci layer also should check this.
Also removes the existing check in pci_restore_standard_config, since
it's superfluous with the new check in restore_state.
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Alek Du <alek.du@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
They are not supposed to be modified by drivers, so make them const.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>