Add a new config option, PCI_REALLOC_ENABLE_AUTO, which will
automatically try to re-allocate PCI resources if PCI_IOV support is
enabled and the SR-IOV resources are unassigned. Behavior can still be
controlled using the pci=realloc= parameter.
-v2: According to Jesse, adding one CONFIG option for distribution to
disable it or enable it.
-v3: update Kconfig text (jbarnes)
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Let the user could enable and disable with pci=realloc=on or pci=realloc=off
Also
1. move variable and functions near the place they are used.
2. change macro to function
3. change related functions and variable to static and _init
4. update parameter description accordingly.
This will let us add a config option to control default behavior, and
still allow the user to turn off automatic reallocation if it fails on
their platform until a permanent solution is found.
-v2: still honor pci=realloc, and treat it as pci=realloc=on
also use enum instead of ...
-v3: update kernel-paramenters.txt according to Jesse.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When enabling pci reallocation for a pci bridge, we clear the small size
in in bridge and re-assign with requested + optional size for first
several tries, but Ram mention could have problem with one case:
https://bugzilla.kernel.org/show_bug.cgi?id=15960
After checking the booting log in
https://lkml.org/lkml/2010/4/19/44
[regression, bisected] Xonar DX invalid PCI I/O range since 977d17bb17
We should not stop too early for io ports.
Apr 19 10:19:38 [kernel] pci 0000:04:00.0: BAR 7: can't assign io (size 0x4000)
Apr 19 10:19:38 [kernel] pci 0000:05:01.0: BAR 8: assigned [mem 0x80400000-0x805fffff]
Apr 19 10:19:38 [kernel] pci 0000:05:01.0: BAR 7: can't assign io (size 0x2000)
Apr 19 10:19:38 [kernel] pci 0000:05:02.0: BAR 7: can't assign io (size 0x1000)
Apr 19 10:19:38 [kernel] pci 0000:05:03.0: BAR 7: can't assign io (size 0x1000)
Apr 19 10:19:38 [kernel] pci 0000:08:00.0: BAR 7: can't assign io (size 0x1000)
Apr 19 10:19:38 [kernel] pci 0000:09:04.0: BAR 0: can't assign io (size 0x100)
and clear 00:1c.0 to retry again.
This patch removes IORESOUCE_IO checking, and tries one more time. It
gives us a chance to get an allocation for the 00:1c.0 io port range
because the range from 0x4000 to 0x8000 will be freed and we can use it.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
If we move resource assignment functions into the core, we'll still
need a way for architectures to prevent reassignment, e.g., the
"pci_probe_only" functionality, and we'll need a generic, always
available way the core can test for that. The "pci_flags"
arrangement used by several architectures seems like a convenient
way to do this.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We should not set the requested size to -2; that will confuse the
resource list sorting with align when SIZEALIGN is used.
Change to STARTALIGN and pass align from start; we are safe to do that
just as we do that regular pci bridge. In the long run, we should just
treat cardbus like a regular pci bridge.
Also fix the case when realloc_head is not passed: we should keep the
requested size.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
After merging struct pci_dev_resource_x and pci_dev_resource,
We can use a function instead of macro now.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Linus says don't use dev_res_x because it doesn't communicate anything
about usage. Rename them to add_res or fail_res etc according to
context.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pci_dev_resource_x is a superset of pci_dev_resource and they're just
temp structs used during resource reallocation.
pci_dev_resource usage is quite limted.
So just use pci_dev_resource_x, and rename it as new pci_dev_resource.
-v2: According to Linus, Separate free_list change to another patch
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
So we can use helper functions for generic list. This makes the
resource re-allocation code much more readable.
-v2: Use list_add_tail instead of adding list_insert_before, Pointed out
by Linus.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This allows us to move the definition of struct resource_list to
setup_bus.c and later convert resource_list to a regular list.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
On a system with devices that support SRIOV connected to a pcie switch
to pcie root port:
+-[0000:80]-+-00.0-[81-8f]--
| +-01.0-[90-9f]--
| +-02.0-[a0-af]----00.0-[a1-a3]--+-02.0-[a2]--+-00.0 Oracle Corporation Device 207a
| | \-03.0-[a3]--+-00.0 Oracle Corporation Device 207a
| +-02.2-[b0-bf]----00.0-[b1-b3]--+-02.0-[b2]--+-00.0 Oracle Corporation Device 207a
| | \-03.0-[b3]--+-00.0 Oracle Corporation Device 207a
When the BIOS does not assign resources for SRIOV BARs, kernel pci
reallocation only goes up one bridge and then gives up, failing to to
get resources for all sSRIOV BARs, even though the range is large enough
in the peer root bus.
Specifically, only the bridge at the a1:02.0 level has its resources
cleared and reallocated. The kernel does not go up to clear the bridge
at the 80:02.0 level.
To make it go to upper levels, during retry, we need to treat "good to have"
resources as "must have".
Only on the last try will we treat good to have resources as optional.
At that time, parent bridge resources will already have been released so
we'll have a chance to get everything assigned with must_have plus
good_to_have for all child devices.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This allows us to allocate resources to hotplug bridges during
remove/rescan.
We need to move the function to setup-bus.c so it can use
__pci_bus_size_bridges and __pci_bus_assign_resources directly to take
the add_list resource tracking list.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We need add size for hot plug path when pluging in hotplug chassis
without cards.
-v2: change descriptions. make it applicable after "pci: Check bridge
resources after resource allocation."
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Need to call it from __assign_resources_sorted() later and we'd like to
avoid a forward declaraion.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Will be used for resource_list_x duplication when trying
requested+optional at first.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
During debug of one SRIOV enabled hotplug device, we found found that
add_size is not passed properly.
The device has devices under two level bridges:
+-[0000:80]-+-00.0-[81-8f]--
| +-01.0-[90-9f]--
| +-02.0-[a0-af]----00.0-[a1-a3]--+-02.0-[a2]--+-00.0 Oracle Corporation Device
| | \-03.0-[a3]--+-00.0 Oracle Corporation Device
Which means later the parent bridge will not try to add a big enough range:
[ 557.455077] pci 0000:a0:00.0: BAR 14: assigned [mem 0xf9000000-0xf93fffff]
[ 557.461974] pci 0000:a0:00.0: BAR 15: assigned [mem 0xf6000000-0xf61fffff pref]
[ 557.469340] pci 0000:a1:02.0: BAR 14: assigned [mem 0xf9000000-0xf91fffff]
[ 557.476231] pci 0000:a1:02.0: BAR 15: assigned [mem 0xf6000000-0xf60fffff pref]
[ 557.483582] pci 0000:a1:03.0: BAR 14: assigned [mem 0xf9200000-0xf93fffff]
[ 557.490468] pci 0000:a1:03.0: BAR 15: assigned [mem 0xf6100000-0xf61fffff pref]
[ 557.497833] pci 0000:a1:03.0: BAR 14: can't assign mem (size 0x200000)
[ 557.504378] pci 0000:a1:03.0: failed to add optional resources res=[mem 0xf9200000-0xf93fffff]
[ 557.513026] pci 0000:a1:02.0: BAR 14: can't assign mem (size 0x200000)
[ 557.519578] pci 0000:a1:02.0: failed to add optional resources res=[mem 0xf9000000-0xf91fffff]
It turns out we did not calculate size1 properly.
static resource_size_t calculate_memsize(resource_size_t size,
resource_size_t min_size,
resource_size_t size1,
resource_size_t old_size,
resource_size_t align)
{
if (size < min_size)
size = min_size;
if (old_size == 1 )
old_size = 0;
if (size < old_size)
size = old_size;
size = ALIGN(size + size1, align);
return size;
}
We should not pass add_size with min_size in calculate_memsize since
that will make add_size not contribute final add_size.
So just pass add_size with size1 to calculate_memsize().
With this change, we should have chance to remove extra addon in
pci_reassign_resource.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>