Kerin Millar reported hardlockups while running `conntrackd -c'
in a busy firewall. That system (with several processors) was
acting as backup in a primary-backup setup.
After several tries, I found a race condition between the deletion
operation of ctnetlink and timeout expiration. This patch fixes
this problem.
Tested-by: Kerin Millar <kerframil@gmail.com>
Reported-by: Kerin Millar <kerframil@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
ip6_mc_find_dev_rcu() is called with rcu_read_lock(), so don't
need to dev_hold().
With dev_hold(), not corresponding dev_put(), will lead to leak.
[ bug introduced in 96b52e61be (ipv6: mcast: RCU conversions) ]
Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A driver start_xmit() method cannot free skb and return NETDEV_TX_BUSY,
since caller is going to reuse freed skb.
In fact netif_tx_stop_queue() / netif_stop_queue() is needed before
returning NETDEV_TX_BUSY or you can trigger a ksoftirqd fatal loop.
In case of memory allocation error, only safe way is to drop the packet
and return NETDEV_TX_OK
Also increments tx_dropped counter
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A driver start_xmit() method cannot free skb and return NETDEV_TX_BUSY,
since caller is going to reuse freed skb.
This is mostly a revert of commit bf769375c (staging: hv: fix the return
status of netvsc_start_xmit())
In fact netif_tx_stop_queue() / netif_stop_queue() is needed before
returning NETDEV_TX_BUSY or you can trigger a ksoftirqd fatal loop.
In case of memory allocation error, only safe way is to drop the packet
and return NETDEV_TX_OK
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
network drivers should reserve some headroom on incoming skbs so that we
dont need expensive reallocations, eg forwarding packets in tunnels.
This NET_SKB_PAD padding is done in various helpers, like
__netdev_alloc_skb_ip_align() in this patch, combining NET_SKB_PAD and
NET_IP_ALIGN magic.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Oliver Neukum <oneukum@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
When cycling the interface down and up, bnx2x_init_firmware() knows that
the firmware is already loaded, but nevertheless it allocates certain
arrays anew (init_data, init_ops, init_ops_offsets, iro_arr). The old
arrays are leaked.
Fix the leaks by returning early if the firmware was already loaded.
Because if the firmware is loaded, so are the arrays.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the requested firmware is deemed corrupt and then released, reset the
pointer to NULL in order to avoid double-freeing it in
bnx2x_release_firmware() or dereferencing it in bnx2x_init_firmware().
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit d47a0ac7b6 (sch_sfq: dont put new flow at the end of
flows)
As Jesper found out, patch sounded great but has bad side effects.
In stress situation, pushing new flows in front of the queue can prevent
old flows doing any progress. Packets can stay in SFQ queue for
unlimited amount of time.
It's possible to add heuristics to limit this problem, but this would
add complexity outside of SFQ scope.
A more sensible answer to Dave Taht concerns (who reported the issued I
tried to solve in original commit) is probably to use a qdisc hierarchy
so that high prio packets dont enter a potentially crowded SFQ qdisc.
Reported-by: Jesper Dangaard Brouer <jdb@comx.dk>
Cc: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 87a115783 ( ipv6: Move xfrm_lookup() call down into
icmp6_dst_alloc().) forgot to convert one error path, leading
to crashes in mld_sendpack()
Many thanks to Dave Jones for providing a very complete bug report.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull sparc updates from David Miller:
"Please pull to get this fix for the sparc32 build when using a more
recent binutils."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc32: Add -Av8 to assembler command line.
Newer version of binutils are more strict about specifying the
correct options to enable certain classes of instructions.
The sparc32 build is done for v7 in order to support sun4c systems
which lack hardware integer multiply and divide instructions.
So we have to pass -Av8 when building the assembler routines that
use these instructions and get patched into the kernel when we find
out that we have a v8 capable cpu.
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking changes from David Miller:
"The most important bit here is the TCP syncookies issue, which seems
to have been busted for some time. That fix has been verified in
production by the reporter.
1) Persistent TUN devices erroneously hold on to the network namespace
in such a way that it cannot be shutdown. Fix from Stanislav
Kinsbursky with help from Eric Dumazet.
2) TCP SYN cookies have been broken for a while due to how the route
lookup flow key is managed, connections can be delayed by as much
as 20 seconds due to this bug. Fix from Eric Dumazet.
3) Missing jiffies.h include in lib/dynamic_queue_limits.c can break
the build, from Tom Herbert.
4) Add USB device ID for Sitecom LN-031, from Joerg Neikes.
5) Fix OOPS in delayed workqueue in iwlegacy, from Stanislaw Gruszka.
6) rt2x00 TX queue can be disabled forever due to races, fix by
synchronizing pause/unpause with a lock. Also from Stanislaw
Gruszka.
7) Statistics and endian fix in bnx2x driver from Yuval Mintz, Eilon
Greenstein, and Ariel Elior."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
tun: don't hold network namespace by tun sockets
bnx2x: FCoE statistics id fixed
bnx2x: dcb bit indices flags used as bits
bnx2x: added cpu_to_le16 when preparing ramrod's data
bnx2x: pfc statistics counts pfc events twice
rt2x00: fix random stalls
iwl3945: fix possible il->txq NULL pointer dereference in delayed works
dql: Fix undefined jiffies
tcp: fix syncookie regression
usb: asix: Patch for Sitecom LN-031
Pull arch/tile update from Chris Metcalf
"These include a couple of queued-up minor bug fixes from the
community, a fix to unbreak the sysfs hooks in tile, and syncing up
the defconfigs."
Ugh. defconfigs updates without "make minconfig". Tons of ugly
pointless lines there, I suspect.
* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
tile: Use set_current_blocked() and block_sigmask()
arch/tile: misplaced parens near likely
arch/tile: sync up the defconfig files to the tip
arch/tile: Fix up from commit 8a25a2fd12
Pull perf fixes from Ingo Molnar.
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf record: Fix buffer overrun bug in tracepoint_id_to_path()
perf/x86: Fix local vs remote memory events for NHM/WSM
Pull CIFS fixes from Steve French.
* git://git.samba.org/sfrench/cifs-2.6:
CIFS: Do not kmalloc under the flocks spinlock
cifs: possible memory leak in xattr.
Pull vfs fixes from Al Viro:
"A bunch of assorted fixes; Jan's freezing stuff still _not_ in there
and neither is mm fun ;-/"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
restore smp_mb() in unlock_new_inode()
vfs: fix return value from do_last()
vfs: fix double put after complete_walk()
udf: Fix deadlock in udf_release_file()
vfs: Correctly set the dir i_mutex lockdep class
As described in e6fa16ab ("signal: sigprocmask() should do
retarget_shared_pending()") the modification of current->blocked is
incorrect as we need to check whether the signal we're about to block
is pending in the shared queue.
Also, use the new helper function introduced in commit 5e6292c0f2
("signal: add block_sigmask() for adding sigmask to current->blocked")
which centralises the code for updating current->blocked after
successfully delivering a signal and reduces the amount of duplicate
code across architectures. In the past some architectures got this
code wrong, so using this helper function should stop that from
happening again.
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This patch fixes a buffer overrun bug in
tracepoint_id_to_path(). The bug manisfested itself as a memory
error reported by perf record. I ran into it with perf sched:
$ perf sched rec noploop 2 noploop for 2 seconds
[ perf record: Woken up 14 times to write data ]
[ perf record: Captured and wrote 42.701 MB perf.data (~1865622 samples) ]
Fatal: No memory to alloc tracepoints list
It turned out that tracepoint_id_to_path() was reading the
tracepoint id using read() but the buffer was not large enough
to include the \n terminator for id with 4 digits or more.
The patch fixes the problem by extending the buffer to a more
reasonable size covering all possible id length include \n
terminator. Note that atoll() stops at the first non digit
character, thus it is not necessary to clear the buffer between
each read.
Signed-off-by: Stephane Eranian <eranian@google.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: fweisbec@gmail.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/20120313155102.GA6465@quad
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Pull x86 platfrm driver fixes from Matthew Garrett:
"Some trivial patches that fix wifi on some Lenovos and avoid a
potential memory corruption issue on some Panasonics, plus two
straightforward new drivers that touch no existing code."
* 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86:
panasonic-laptop: avoid overflow in acpi_pcc_hotkey_add()
acer-wmi: No wifi rfkill on Lenovo machines
Fujitsu tablet extras driver
x86: Add amilo-rfkill driver for some Fujitsu-Siemens Amilo laptops
Pull PCI changes from Jesse Barnes:
"A single fix for a regression that affects some people who try to
disable ASPM for whatever reason."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci:
PCI: ignore pre-1.1 ASPM quirking when ASPM is disabled
Pull SuperH fixes from Paul Mundt.
* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh:
sh-sci / PM: Avoid deadlocking runtime PM
sh: fix up the ubc clock definition for sh7785.
sh: add parameter for RSPI in clock-sh7757
sh: Fix sh2a vbr table for more than 255 irqs
v3: added previously removed sock_put() to the tun_release() callback, because
sk_release_kernel() doesn't drop the socket reference.
v2: sk_release_kernel() used for socket release. Dummy tun_release() is
required for sk_release_kernel() ---> sock_release() ---> sock->ops->release()
call.
TUN was designed to destroy it's socket on network namesapce shutdown. But this
will never happen for persistent device, because it's socket holds network
namespace.
This patch removes of holding network namespace by TUN socket and replaces it
by creating socket in init_net and then changing it's net it to desired one. On
shutdown socket is moved back to init_net prior to final put.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FCoE statistics ids were distinguished from the L2's statistics ids.
However, not all of the change was committed. This causes a possible
collision of indices when FCoE is present.
This patch fixes the issue.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>