Commit Graph

278767 Commits

Author SHA1 Message Date
Eric Dumazet fa0f5aa743 net_sched: qdisc_alloc_handle() can be too slow
When trying to allocate ~32768 qdiscs using autohandle mechanism, we can
fill the space managed by kernel (handles in [8000-FFFF]:0000 range)

But O(N^2) qdisc_alloc_handle() loops 0x10000 times instead of 0x8000

time tc add qdisc add dev eth0 parent 10:7fff pfifo limit 10
RTNETLINK answers: Cannot allocate memory
real    1m54.826s
user    0m0.000s
sys     0m0.004s

INFO: rcu_sched_state detected stall on CPU 0 (t=60000 jiffies)

Half number of loops, and add a cond_resched() call.
We hold rtnl at this point.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Dave Taht <dave.taht@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-03 13:03:20 -05:00
Eric Dumazet d32ae76f2b sch_qfq: accurate wsum handling
We can underestimate q->wsum in case of "tc class replace ... qfq"
and/or qdisc_create_dflt() error.

wsum is not really used in fast path, only at qfq qdisc/class setup,
to catch user error.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-03 13:02:19 -05:00
Eric Dumazet d47a0ac7b6 sch_sfq: dont put new flow at the end of flows
SFQ enqueue algo puts a new flow _behind_ all pre-existing flows in the
circular list. In fact this is probably an old SFQ implementation bug.

100 Mbits = ~8333 full frames per second, or ~8 frames per ms.

With 50 flows, it means your "new flow" will have to wait 50 packets
being sent before its own packet. Thats the ~6ms.

We certainly can change SFQ to give a priority advantage to new flows,
so that next dequeued packet is taken from a new flow, not an old one.

Reported-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-03 12:52:09 -05:00
Yevgeny Petrilin 1c015b3b82 mlx4_core: Elaborating limitation on VF port options
Showing which capabilities are not passed to VF
when executing QUERY_PORT

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-03 12:49:16 -05:00
Marcel Apfelbaum 1e27ca6944 mlx4_core: fix mtt range deallocation
The mtt range was allocated in mtt units but deallocated
in segments. Among the rest, this caused crash during hotplug removal

Reported-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Marcel Apfelbaum <marcela@mellanox.co.il>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-03 12:49:16 -05:00
stephen hemminger f7d9821a6a bonding: fix error handling if slave is busy (v2)
If slave device already has a receive handler registered, then the
error unwind of bonding device enslave function is broken.

The following will leave a pointer to freed memory in the slave
device list, causing a later kernel panic.
# modprobe dummy
# ip li add dummy0-1 link dummy0 type macvlan
# modprobe bonding
# echo +dummy0 >/sys/class/net/bond0/bonding/slaves

The fix is to detach the slave (which removes it from the list)
in the unwind path.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-03 12:49:16 -05:00
Jason Wang f872b237c1 8139cp: properly config rx mode after resuming
Rx mode should be reset after resming, so unconditionally updating rx
mode rather than conditionally updating based on the value we
remembered, otherwise unexpected value may be used by the nic after
resuming.

btw. I find and test this when debugging guest hibernation in qemu, as
I did not have a 8139cp card in hand, this patch is untested in a
physical 8139cp card, plase review it carefully.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-03 12:47:50 -05:00
Jason Wang 7d03f5a48e 8139cp/8139too: do not read into reserved registers
delay_eeprom() use long read for Cfg9346 register(offset 0x50) which may read
into the area of reserved register(offset 0x53). Use byte read instead.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-03 12:47:50 -05:00
Don Skidmore 0e22d0437e ixgbe: add support for new 82599 device.
This device uses an already existing DevID but since it supports
WoL we need to add the Sub DevID.  It's support of WoL is limited
to the first port.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-01-02 17:44:34 -08:00
Emil Tantilov 9e791e4a04 ixgbe: add support for new 82599 device id
Support for new 82599 based quad port adapter.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-01-02 17:44:05 -08:00
Emil Tantilov 176f950d31 ixgbe: add write flush in ixgbe_clock_out_i2c_byte()
I2C access is timing critical. Always do a write flush after writing
to the I2CCTL register.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-01-02 17:43:44 -08:00
Stephen Hemminger 52f33af8ac ixgbe: fix typo's
Saw typo in one message, so decided to run spell checker.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-01-02 17:43:17 -08:00
Emil Tantilov c1085b1092 ixgbe: fix incorrect PHY register reads
Fix some register reads that had the opcode and register parameters swapped.
Also use define instead of a magic (0x3) number.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-01-02 17:42:46 -08:00
Carolyn Wyborny f83396ad83 igb: Add flow control advertising to ethtool setting.
Added pause flag for bi-directional flow control advertising to ethtool
settings.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-01-02 17:42:02 -08:00
Alexander Duyck f131a6c07e ixgbevf: Fix register defines to correctly handle complex expressions
This patch is meant to address possible issues with the IXGBEVF register
defines generating incorrect values when given a complex expression for the
register offset.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-01-02 17:41:34 -08:00
David S. Miller 455ffa607f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-01-02 18:56:49 -05:00
Linus Torvalds 115e8e705e Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6
* 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
  dt/device: Fix auxdata matching to handle entries without a name override
2012-01-02 12:34:03 -08:00
Linus Torvalds 733bbb7e1c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  netfilter: ctnetlink: fix timeout calculation
  ipvs: try also real server with port 0 in backup server
  skge: restore rx multicast filter on resume and after config changes
  mlx4_en: nullify cq->vector field when closing completion queue
2012-01-01 19:36:08 -08:00
Pablo Neira Ayuso 3ab0b245aa netfilter: nfnetlink_acct: fix nfnl_acct_get operation
The get operation was not sending the message that was built to
user-space. This patch also includes the appropriate handling for
the return value of netlink_unicast().

Moreover, fix error codes on error (for example, for non-existing
entry was uncorrect).

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-01-01 16:36:08 +01:00
Linus Torvalds c7f46b7aa4 Merge branch 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
* 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ASoC: wm8776: add missing break in sample size switch
2011-12-31 11:55:06 -08:00
Mauro Carvalho Chehab ac97ecc886 gspca: Fix falling back to lower isoc alt settings
The current gspca core code has a regression where it no longer properly
falls back to lower alt settings when there is not enough bandwidth.

This causes many iso based usb-1 cameras to not work when plugged into a
usb2 hub or a sandybridge chipset motherboard!

This patch fixes this.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-31 11:53:58 -08:00
Hugh Dickins e6780f7243 futex: Fix uninterruptible loop due to gate_area
It was found (by Sasha) that if you use a futex located in the gate
area we get stuck in an uninterruptible infinite loop, much like the
ZERO_PAGE issue.

While looking at this problem, PeterZ realized you'll get into similar
trouble when hitting any install_special_pages() mapping.  And are there
still drivers setting up their own special mmaps without page->mapping,
and without special VM or pte flags to make get_user_pages fail?

In most cases, if page->mapping is NULL, we do not need to retry at all:
Linus points out that even /proc/sys/vm/drop_caches poses no problem,
because it ends up using remove_mapping(), which takes care not to
interfere when the page reference count is raised.

But there is still one case which does need a retry: if memory pressure
called shmem_writepage in between get_user_pages_fast dropping page
table lock and our acquiring page lock, then the page gets switched from
filecache to swapcache (and ->mapping set to NULL) whatever the refcount.
Fault it back in to get the page->mapping needed for key->shared.inode.

Reported-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-31 11:48:28 -08:00
Xi Wang c121638277 netfilter: ctnetlink: fix timeout calculation
The sanity check (timeout < 0) never works; the dividend is unsigned
and so is the division, which should have been a signed division.

	long timeout = (ct->timeout.expires - jiffies) / HZ;
	if (timeout < 0)
		timeout = 0;

This patch converts the time values to signed for the division.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-12-31 16:59:04 +01:00
Julian Anastasov 52793dbe3d ipvs: try also real server with port 0 in backup server
We should not forget to try for real server with port 0
in the backup server when processing the sync message. We should
do it in all cases because the backup server can use different
forwarding method.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-12-31 16:06:29 +01:00
Florian Zumbiehl fe3c8cc922 skge: restore rx multicast filter on resume and after config changes
Restore skge hardware registers for multicast filtering to their
appropriate values after system resume and after hardware restarts
that are done when changing certain settings.

Signed-off-by: Florian Zumbiehl <florz@florz.de>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-30 23:32:45 -05:00