When two systems using bonding devices in adaptive load
balancing (ALB) communicates with each other, an endless
ping-pong of ARP replies starts between these two systems.
What happens? In the ALB mode, bonding driver keeps track
of each client connected in a hash table, so it can do the
receive load balancing (RLB). This hash table is updated
when an ARP reply is received, then it scans for the client
entry, updates its MAC address and flag it to be announced
later. Therefore, two seconds later, the alb monitor runs
and send for each updated client entry two ARP replies
updating this specific client. The same process happens on
the receiving system, causing the endless ping-pong of arp
replies.
See more information including the relevant functions below:
System 1 System 2
bond0 bond0
ping <system2>
ARP request --------->
<--------- ARP reply
+->rlb_arp_recv <---------------------+ <--- loop begins
| rlb_update_entry_from_arp |
| client_info->ntt = 1; |
| bond_info->rx_ntt = 1; |
| |
| <communication succeed> |
| |
| bond_alb_monitor |
| rlb_update_rx_clients |
| rlb_update_client |
| arp_create(ARPOP_REPLY) |
| send ARP reply --------------> V
| send ARP reply -------------->
| rlb_arp_recv
| rlb_update_entry_from_arp
| client_info->ntt = 1;
| bond_info->rx_ntt = 1;
| < snipped, same as in system 1>
+------- <-------------- send ARP reply
<-------------- send ARP reply
Besides the unneeded networking traffic, this loop breaks
a cluster because a backup system can't take over the IP
address. There is always one system sending an ARP reply
poisoning the network.
This patch fixes the problem adding a check for the MAC
address before updating it. Thus, if the MAC address didn't
change, there is no need to update neither to announce it later.
Signed-off-by: Flavio Leitner <fleitner@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix abuse of the preincrement operator as detected when building with gcc
4.6.0:
CC [M] drivers/bluetooth/hci_bcsp.o
drivers/bluetooth/hci_bcsp.c: In function 'bcsp_prepare_pkt':
drivers/bluetooth/hci_bcsp.c:247:20: warning: operation on 'bcsp->msgq_txseq' may be undefined
Reported-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some controllers (KW, Dove) limits the TX IP/layer4 checksum offloading to a max size.
Signed-off-by: Saeed Bishara <saeed@marvell.com>
Acked-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Disabling the tx laser while receiving DMA requests
can hang the device. After this occurs the device
is in a bad state. The GPIO bit never clears when
PCI master access is disabled and a reboot is required
to get the device in a good state again.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch added to 2.6.34:
commit 5f6c018199
Author: Jesse Brandeburg <jesse.brandeburg@intel.com>
Date: Wed Apr 14 16:04:23 2010 -0700
ixgbe: fix bug with vlan strip in promsic mode
among other things added a function called ixgbe_vlan_filter_enable.
This new function wants to access and set some rx_ring parameters, but
adapter->rx_ring has already been freed. This simply moves the free
until after the access and makes __ixgbe_shutdown look more like
ixgbe_remove.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Support for netpoll over bonded interfaces was added here:
commit f6dc31a85c
Author: WANG Cong <amwang@redhat.com>
Date: Thu May 6 00:48:51 2010 -0700
bonding: make bonding support netpoll
but it is bad enough that we should probably just disable netpoll over
bonding until some of the locking logic in the bonding driver is changed
or converted completely to RCU. Simple actions like changing the active
slave in active-backup mode will hang the box if a high enough printk
debugging level is enabled.
Keeping the old code around will be good for anyone that wants to work
on it (and for after the RCU conversion), so I propose this small patch
rather than ripping it all out.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit e13647c1 (phylib: Add support for the LXT973 phy.) added a new ID
but neglected to also add it to the MODULE_DEVICE_TABLE.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When pci_enable_msix() returned ret<0, entries and vxge_entries were leaked.
While at it, use the centralized exit idiom in the function.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Ram Vepa <ram.vepa@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CAPI applications can handle several connections in parallel,
so one connection state per application isn't sufficient.
Store the connection state in the channel structure instead.
Impact: bugfix
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adapt to buggy device firmware which accepts setting HLC only in the
same command line as BC, by encoding HLC and BC in a single command
if both are specified, and rejecting HLC without BC.
Impact: bugfix
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Gigaset CAPI driver handled all DATA_B3_REQ messages as if the
Delivery Confirmation flag bit was set, delaying the emission of the
DATA_B3_CONF reply until the data was actually transmitted. Some
CAPI applications (notably Asterisk) aren't happy with that
behaviour. Change it to actually evaluate the Delivery Confirmation
flag as described the CAPI specification.
Impact: bugfix
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the Gigaset CAPI driver select L2_VOICE (AT^SBPR=2) as the
layer 2 encoding for transparent connections, like the ISDN4Linux
variant. L2_BITSYNC (AT^SBPR=0) mutes internal connections and
distorts external ones.
Impact: bugfix
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the Gigaset CAPI driver to limit the length of a connection's
payload data receive buffers to the corresponding CAPI application's
data buffer size, as some real-life CAPI applications tend to be
rather unhappy if they receive bigger data blocks than requested.
Impact: bugfix
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the call to phy_connect fails, we will return directly instead of freeing
the previously allocated struct net_device.
Signed-off-by: Florian Fainelli <florian@openwrt.org>
CC: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
smc91c92_cs:
Fix the problem that lan & modem does not work simultaneously
in the Megahertz multi-function card.
We need to write MEGAHERTZ_ISR to retrigger interrupt.
Signed-off-by: Ken Kawasaki <ken_kawasaki@spring.nifty.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
When building tx command, always set TX_CMD_FLAG_PROT_REQUIRE_MSK
for 5000 series and up.
Without setting this bit the firmware will not examine the RTS/CTS setting
and thus not send traffic with the appropriate protection. RTS/CTS is is
required for HT traffic in a noisy environment where, without this setting,
connections will stall on some hardware as documented in the patch that
initially attempted to address this:
commit 1152dcc28c
Author: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Date: Fri Jan 15 13:42:58 2010 -0800
iwlwifi: Fix throughput stall issue in HT mode for 5000
Similar to 6000 and 1000 series, RTS/CTS is the recommended
protection mechanism for 5000 series in HT mode based on the HW design.
Using RTS/CTS will better protect the inner exchange from interference,
especially in highly-congested environment, it also prevent uCode encounter
TX FIFO underrun and other HT mode related performance issues.
For 3945 and 4965, different flags are used for RTS/CTS or CTS-to-Self
protection.
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
commit 3474ad635d
Author: Johannes Berg <johannes.berg@intel.com>
Date: Thu Apr 29 04:43:05 2010 -0700
iwlwifi: apply filter flags directly
broke multicast. The reason, it turns out, is that
the code previously checked if ALLMULTI _changed_,
which the new code no longer did, and normally it
_never_ changes. Had somebody changed it manually,
the code prior to my patch there would have been
broken already.
The reason is that we always, unconditionally, ask
the device to pass up all multicast frames, but the
new code made it depend on ALLMULTI which broke it
since now we'd pass up multicast frames depending
on the default filter in the device, which isn't
necessarily what we want (since we don't program it
right now).
Fix this by simply not checking allmulti as we have
allmulti behaviour enabled already anyway.
Reported-by: Maxim Levitsky <maximlevitsky@gmail.com>
Tested-by: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
sky2_phy_reinit is called by the ethtool helpers sky2_set_settings,
sky2_nway_reset and sky2_set_pauseparam when netif_running.
However, at the end of sky2_phy_init GM_GP_CTRL has GM_GPCR_RX_ENA and
GM_GPCR_TX_ENA cleared. So, doing these commands causes the device to
stop working:
$ ethtool -r eth0
$ ethtool -A eth0 autoneg off
Fix this issue by enabling Rx/Tx after running sky2_phy_init in
sky2_phy_reinit.
Signed-off-by: Brandon Philips <bphilips@suse.de>
Tested-by: Brandon Philips <bphilips@suse.de>
Cc: stable@kernel.org
Tested-by: Mike McCormack <mikem@ring3k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are few places where ANI is started without checking
if it is right to start. This might lead to a case where ani
timer would be left undeleted and cause improper memory acccess
during module unload. This bug is clearly exposed with
paprd support where the driver detects tx hang and does a
chip reset. During this reset ani is (re)started without checking
if it needs to be started. This would leave a timer scheduled
even after all the resources are freed and cause a panic.
This patch introduces a bit in sc_flags to indicate if ani
needs to be started in sw_scan_start() and ath_reset().
This would fix the following panic. This issue is easily seen
with ar9003 + paprd.
BUG: unable to handle kernel paging request at 0000000000003f38
[<ffffffff81075391>] ? __queue_work+0x41/0x50
[<ffffffff8106afaa>] run_timer_softirq+0x17a/0x370
[<ffffffff81088be8>] ? tick_dev_program_event+0x48/0x110
[<ffffffff81061f69>] __do_softirq+0xb9/0x1f0
[<ffffffff810ba060>] ? handle_IRQ_event+0x50/0x160
[<ffffffff8100af5c>] call_softirq+0x1c/0x30
[<ffffffff8100c9f5>] do_softirq+0x65/0xa0
[<ffffffff81061e25>] irq_exit+0x85/0x90
[<ffffffff8155e095>] do_IRQ+0x75/0xf0
[<ffffffff815570d3>] ret_from_intr+0x0/0x11
<EOI>
[<ffffffff812fd67b>] ? acpi_idle_enter_simple+0xe4/0x119
[<ffffffff812fd674>] ? acpi_idle_enter_simple+0xdd/0x119
[<ffffffff81441c87>] cpuidle_idle_call+0xa7/0x140
[<ffffffff81008da3>] cpu_idle+0xb3/0x110
[<ffffffff81550722>] start_secondary+0x1ee/0x1f5
Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Disable statistics initialization for eth clients that do not support
statistics. This prevents memory corruption on bnx2x hw.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>