Commit Graph

362187 Commits

Author SHA1 Message Date
Patrick McHardy 124dff01af netfilter: don't reset nf_trace in nf_reset()
Commit 130549fe ("netfilter: reset nf_trace in nf_reset") added code
to reset nf_trace in nf_reset(). This is wrong and unnecessary.

nf_reset() is used in the following cases:

- when passing packets up to the socket layer, at which point we want to
  release all netfilter references that might keep modules pinned while
  the packet is queued. nf_trace doesn't matter anymore at this point.

- when encapsulating or decapsulating IPsec packets. We want to continue
  tracing these packets after IPsec processing.

- when passing packets through virtual network devices. Only devices
  that encapsulate in IPv4/v6 matter since otherwise nf_trace is not
  used anymore. It's not entirely clear whether those packets should
  be traced after that, however we've always done that.

- when passing packets through virtual network devices that make the
  packet cross network namespace boundaries. This is the only case
  where we clearly want to reset nf_trace and is also what the
  original patch intended to fix.

Add a new function nf_reset_trace() and use it in dev_forward_skb() to
fix this properly.
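
As a rough userspace illustration of the split described above (not the
kernel code itself), the sketch below keeps the trace flag out of the
general reset and clears it only on the forwarding path. The struct and
all mock_* names are hypothetical stand-ins; the assumption that the
forwarding path calls both helpers is made only for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few packet fields that matter here. */
struct mock_skb {
	void *nfct;     /* a netfilter reference that could pin a module */
	bool nf_trace;  /* set when the packet is being traced */
};

/* General reset: drop netfilter references, leave the trace flag alone. */
static void mock_nf_reset(struct mock_skb *skb)
{
	skb->nfct = NULL;
}

/* Separate helper that clears only the trace flag. */
static void mock_nf_reset_trace(struct mock_skb *skb)
{
	skb->nf_trace = false;
}

/* Forwarding path (assumed here to cross a namespace boundary):
 * the one place in this sketch where tracing state is cleared too. */
static void mock_dev_forward_skb(struct mock_skb *skb)
{
	mock_nf_reset(skb);
	mock_nf_reset_trace(skb);
}
```

With this shape, callers such as the socket layer or IPsec paths can keep
using the general reset without losing the trace flag.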

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-05 15:38:10 -04:00
Jiri Pirko 34e2ed34a0 net: ipv4: notify when address lifetime changes
If userspace changes the lifetime of an address, send a netlink
notification and call the notifier.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-05 00:51:12 -04:00
Jakub Kicinski f01fc1a82c ixgbe: fix registration order of driver and DCA notification
ixgbe_notify_dca cannot be called before driver registration
because it expects driver's klist_devices to be allocated and
initialized. While at it, make sure debugfs files are removed
when registration fails.

Cc: stable <stable@vger.kernel.org>
Signed-off-by: Jakub Kicinski <jakub.kicinski@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-05 00:49:13 -04:00
Eric W. Biederman 0e82e7f6df af_unix: If we don't care about credentials coalesce all messages
It was reported that the following LSB test case failed
https://lsbbugs.linuxfoundation.org/attachment.cgi?id=2144 because we
were not coalescing unix stream messages when the application was
expecting us to.

The problem was that the first send was before the socket was accepted
and thus sock->sk_socket was NULL in maybe_add_creds, and the second
send after the socket was accepted had a non-NULL value for sk->socket
and thus we could tell the credentials were not needed so we did not
bother.

The unnecessary credentials on the first message cause
unix_stream_recvmsg to start verifying that all messages had the same
credentials before coalescing and then the coalescing failed because
the second message had no credentials.

Ignoring credentials when we don't care in unix_stream_recvmsg fixes a
long standing pessimization which would fail to coalesce messages when
reading from a unix stream socket if the senders were different even if
we did not care about their credentials.

I have tested this and verified that in the LSB test case mentioned
above the messages do coalesce now, while they were failing to
coalesce without this change.

Reported-by: Karel Srot <ksrot@redhat.com>
Reported-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-05 00:49:13 -04:00
Eric W. Biederman 25da0e3e9d Revert "af_unix: dont send SCM_CREDENTIAL when dest socket is NULL"
This reverts commit 14134f6584.

The problem that the above patch was meant to address is that af_unix
messages are not being coalesced because we are sending unnecessary
credentials.  Not sending credentials in maybe_add_creds totally
breaks unconnected unix domain sockets that wish to send credentials
to other sockets.

In practice this breaks some versions of udev because they receive a
message and the sending uid is bogus so they drop the message.

Reported-by: Sven Joachim <svenjoac@gmx.de>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-05 00:49:03 -04:00
Veaceslav Falico 4de79c737b bonding: remove sysfs before removing devices
We have a race condition if we try to rmmod bonding and simultaneously add
a bond master through sysfs. In bonding_exit() we first remove the devices
(through rtnl_link_unregister()) and only after that we remove the sysfs.
If we manage to add a device through sysfs after the devices were
removed, we'll end up with that device/sysfs structure and with the module
unloaded.

Fix this by first removing the sysfs and only after that calling
rtnl_link_unregister().

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-05 00:46:13 -04:00
Hannes Frederic Sowa 31d1670e73 atl1e: limit gso segment size to prevent generation of wrong ip length fields
The limit of 0x3c00 is taken from the windows driver.

Suggested-by: Huang, Xiong <xiong@qca.qualcomm.com>
Cc: Huang, Xiong <xiong@qca.qualcomm.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-05 00:46:13 -04:00
Vlad Yasevich 4543fbefe6 net: count hw_addr syncs so that unsync works properly.
A few drivers use dev_uc_sync/unsync to synchronize the
address lists from master down to slave/lower devices.  In
some cases (bond/team) a single address list is synched down
to multiple devices.  At the time of unsync, we have a leak
in these lower devices, because "synced" is treated as a
boolean and the address will not be unsynced for anything after
the first device/call.

Treat "synced" as a count (same as refcount) and allow all
unsync calls to work.
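
A minimal userspace sketch of the counting idea, assuming a simplified
address entry; struct mock_hw_addr and the mock_* helpers are
hypothetical and are not the kernel's dev_uc_sync/unsync
implementation.

```c
/* Hypothetical address entry: "synced" counts lower devices that hold
 * a copy of this address, instead of being a yes/no flag. */
struct mock_hw_addr {
	int refcount;
	int synced;
};

/* Record one more lower device holding this address. */
static void mock_sync(struct mock_hw_addr *ha)
{
	ha->synced++;
	ha->refcount++;
}

/* Drop one lower device; returns -1 if nothing was synced.
 * With a boolean flag only the first call could succeed, which would
 * leave the remaining lower devices with a leaked address. */
static int mock_unsync(struct mock_hw_addr *ha)
{
	if (ha->synced == 0)
		return -1;
	ha->synced--;
	ha->refcount--;
	return 0;
}
```

Syncing one entry to two lower devices and then unsyncing twice brings
both counters back to their starting values.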

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-05 00:18:46 -04:00
hayeswang e2409d8343 r8169: fix auto speed down issue
Auto speed down would cause no link after suspending or shutting down
when the NIC changes the speed to 10M and connects to a link partner
which forces the speed to 100M.

Check the link partner ability to determine which speed to set.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-04 17:46:02 -04:00
David S. Miller 4f4ecd5f2a Merge branch 'master' of git://1984.lsi.us.es/nf
Pablo Neira Ayuso says:

====================
The following patchset contains netfilter updates for your net tree,
they are:

* Fix missing the skb->trace reset in nf_reset, noticed by Gao Feng
  while using the TRACE target with several net namespaces.

* Fix prefix translation in IPv6 NPT if non-multiple of 32 prefixes
  are used, from Matthias Schiffer.

* Fix invalid nfacct objects with empty name, they are now rejected
  with -EINVAL, spotted by Michael Zintakis, patch from myself.

* A couple of fixes for wrong return values in the error path of
  nfnetlink_queue and nf_conntrack, from Wei Yongjun.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-04 17:41:53 -04:00
David S. Miller 518314ffe4 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into wireless
John W. Linville says:

====================
Here are some more fixes intended for the 3.9 stream...

Regarding the mac80211 bits, Johannes says:

"I had changed the idle handling to simplify it, but broken the
sequencing of commands, at least for ath9k-htc, one patch restores the
sequence. The other patch fixes a crash Jouni found while stress-testing
the remain-on-channel code, when an item is deleted the work struct can
run twice and crash the second time."

As for the iwlwifi bits, Johannes says:

"The only fix here is to the passive-no-RX firmware regulatory
enforcement driver support code to not drop auth frames in quick
succession, leading to not being able to connect to APs on passive
channels in certain circumstances."

Don't forget the NFC bits, about which Samuel says:

"This time we have:

- A crash fix for when a DGRAM LLCP socket is listening while the NFC adapter
  is physically removed.
- A potential double skb free when the LLCP socket receive queue is full.
- A fix for properly handling multiple and consecutive LLCP connections, and
  not trash the socket ack log.
- A build failure for the MEI microread physical layer, now that the MEI bus
  APIs have been merged into char-misc-next."

On top of that, Stone Piao provides an mwifiex fix to avoid accessing
beyond the end of a buffer.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-04 17:39:06 -04:00
John W. Linville 407ad2b7ef Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2013-04-03 13:50:34 -04:00
Matthias Schiffer 906b1c394d netfilter: ip6t_NPT: Fix translation for non-multiple of 32 prefix lengths
The bitmask used for the prefix mangling was being calculated
incorrectly, leading to the wrong part of the address being replaced
when the prefix length wasn't a multiple of 32.
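
To illustrate the kind of per-word mask involved, here is a hedged
sketch that works on 32-bit words in host byte order (the kernel
operates on network-order words, so this is not the patched code). The
function names are hypothetical. The mask selects the bits of a word
that lie after the prefix and must therefore be preserved.

```c
#include <stdint.h>

/* Bits of 32-bit word `word_idx` that lie outside a prefix of
 * `pfx_len` bits (host byte order, most significant bit first). */
static uint32_t mock_host_mask(unsigned int pfx_len, unsigned int word_idx)
{
	unsigned int first_bit = word_idx * 32;
	unsigned int pfx_bits;

	if (pfx_len <= first_bit)
		return 0xffffffffu;  /* word lies entirely after the prefix */
	pfx_bits = pfx_len - first_bit;
	if (pfx_bits >= 32)
		return 0;            /* word lies entirely inside the prefix */
	/* partial word: keep the low (32 - pfx_bits) bits */
	return (UINT32_C(1) << (32 - pfx_bits)) - 1;
}

/* Replace the prefix part of one word, keep the rest of the address. */
static uint32_t mock_translate_word(uint32_t addr_word, uint32_t pfx_word,
				    unsigned int pfx_len, unsigned int word_idx)
{
	uint32_t mask = mock_host_mask(pfx_len, word_idx);

	return (addr_word & mask) | (pfx_word & ~mask);
}
```

For a /48 prefix, only the upper 16 bits of the second word are replaced;
the lower 16 bits of that word and all later words are left untouched.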

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-04-03 12:24:56 +02:00
Linus Torvalds da241efcd9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix VSOCK layer handling of context ID changes, from Reilly Grant.

 2) Now that we have a synchronize_net() in netdev_rx_handler_unregister(),
    we can't let any call sites hold locks.  Unfortunately bonding does,
    so we have to drop the rwlock there a little bit earlier, fix from
    Veaceslav Falico.

 3) MAC address setting loop exits one iteration too early in mlx4
    driver, from Yan Burman.

 4) Restore ipv6 routes properly upon ifdown/ifup of loopback, from
    Balakumaran Kannan.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  VSOCK: Handle changes to the VMCI context ID.
  net IPv6 : Fix broken IPv6 routing table after loopback down-up
  cbq: incorrect processing of high limits
  net/mlx4_en: Fix setting initial MAC address
  bonding: get netdev_rx_handler_unregister out of locks
2013-04-02 18:58:01 -07:00
Linus Torvalds 6e8517a90b Merge tag 'regmap-v3.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
 "A small collection of fixes.  The most important ones are those from
  Stephen and Lars-Peter both of which fix cache issues that have been
  lurking for a while but not manifesting noticeably enough for anyone to
  report them."

* tag 'regmap-v3.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: async: Add missing return
  regmap: don't corrupt work buffer in _regmap_raw_write()
  regmap: cache Fix regcache-rbtree sync
  regmap: Initialize `map->debugfs' before regcache
2013-04-02 18:53:43 -07:00
Linus Torvalds bd709bd027 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull DRM fixes from Dave Airlie:
 "Two core fixes, both regressions, along with some intel and some
  nouveau fixes for regressions and oopses"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm: correctly restore mappings if drm_open fails
  drm/nouveau: fix NULL ptr dereference from nv50_disp_intr()
  drm/nouveau: fix handling empty channel list in ioctl's
  drm: don't unlock in the addfb error paths
  drm/i915: Fix build failure
  drm/i915: Be sure to turn hsync/vsync back on at crt enable (v2)
  drm/i915: duct-tape locking when eDP init fails
2013-04-02 18:52:24 -07:00
Linus Torvalds aea7fab8ba Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
 "A collection of fixes pretty much across the MIPS code.  Even the
  change to include/linux/signal.h by David Howells' 2a1486981c ("Fix
  breakage in MIPS siginfo handling") should be considered MIPS-specific
  as it touches an ifdefed segment that is only relevant to MIPS and
  which unfortunately can't be made to go away entirely."

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  Fix breakage in MIPS siginfo handling
  Revert "MIPS: BCM63XX: Call board_register_device from device_initcall()"
  MIPS: BCM63XX: Make nvram checksum failure non fatal
  MIPS: Fix code generation for non-DSP capable CPUs
  MIPS: Fix inconsistent formatting inside /proc/cpuinfo
  MIPS: SEAD3: Enable LL/SC.
  MIPS: Get rid of CONFIG_CPU_HAS_LLSC again
  MIPS: Add dependencies for HAVE_ARCH_TRANSPARENT_HUGEPAGE
  MIPS: VR4133: Fix probe for LL/SC.
  MIPS: Fix logic errors in bitops.c
  MIPS: Use CONFIG_CPU_MIPSR2 in csum_partial.S
  MIPS: compat: Return same error ENOSYS as native for invalid operation.
2013-04-02 18:47:23 -07:00
Ilija Hadzic a8ec3a6629 drm: correctly restore mappings if drm_open fails
If the first drm_open fails, the error-handling path will
incorrectly restore inode's mapping to NULL. This can
cause a crash later on. Fix by separately storing
away mapping pointers that drm_open can touch and
restore each from its own respective variable if the
call fails.

Fixes: https://bugzilla.novell.com/show_bug.cgi?id=807850
(thanks to Michal Hocko for investigating and
finding the root cause of the bug)

Reference:
http://lists.freedesktop.org/archives/dri-devel/2013-March/036564.html

v2: Use one variable to store file and inode mapping
    since they are the same at the function entry.
    Fix spelling mistakes in commit message.

v3: Add reference to the original bug report.

Reported-by: Marco Munderloh <munderl@tnt.uni-hannover.de>
Tested-by: Marco Munderloh <munderl@tnt.uni-hannover.de>
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-04-03 06:44:38 +10:00
Dave Airlie 7cebefe6cc Merge branch 'drm-nouveau-fixes-3.9' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-next
Oops fixers.
* 'drm-nouveau-fixes-3.9' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
  drm/nouveau: fix NULL ptr dereference from nv50_disp_intr()
  drm/nouveau: fix handling empty channel list in ioctl's
2013-04-03 06:44:02 +10:00
Dave Airlie 1caa590075 Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-next
One locking regression fix, and a couple of other i915 ones.

* 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
  drm: don't unlock in the addfb error paths
  drm/i915: Fix build failure
  drm/i915: Be sure to turn hsync/vsync back on at crt enable (v2)
  drm/i915: duct-tape locking when eDP init fails
2013-04-03 06:41:15 +10:00
Reilly Grant 990454b5a4 VSOCK: Handle changes to the VMCI context ID.
The VMCI context ID of a virtual machine may change at any time. There
is a VMCI event which signals this but datagrams may be processed before
this is handled. It is therefore necessary to be flexible about the
destination context ID of any datagrams received. (It can be assumed to
be correct because it is provided by the hypervisor.) The context ID on
existing sockets should be updated to reflect how the hypervisor is
currently referring to the system.

Signed-off-by: Reilly Grant <grantr@vmware.com>
Acked-by: Andy King <acking@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-02 14:39:17 -04:00
Balakumaran Kannan 25fb6ca4ed net IPv6 : Fix broken IPv6 routing table after loopback down-up
The IPv6 routing table becomes broken once we do ifdown, ifup of the loopback (lo)
interface. After down-up, routes of other interfaces' IPv6 addresses through
'lo' are lost.

IPv6 addresses assigned to all interfaces are routed through 'lo' for internal
communication. Once 'lo' is down, those routing entries are removed from the routing
table. But those removed entries are not being re-created properly when 'lo' is
brought up. So IPv6 addresses of other interfaces become unreachable from the
same machine. Also this breaks communication with other machines because of
NDISC packet processing failure.

This patch fixes this issue by reading all interfaces' IPv6 addresses and adding
them to the IPv6 routing table while bringing up 'lo'.

==Testing==
Before applying the patch:
$ route -A inet6
Kernel IPv6 routing table
Destination                    Next Hop                   Flag Met Ref Use If
2000::20/128                   ::                         U    256 0     0 eth0
fe80::/64                      ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
::1/128                        ::                         Un   0   1     0 lo
2000::20/128                   ::                         Un   0   1     0 lo
fe80::xxxx:xxxx:xxxx:xxxx/128  ::                         Un   0   1     0 lo
ff00::/8                       ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
$ sudo ifdown lo
$ sudo ifup lo
$ route -A inet6
Kernel IPv6 routing table
Destination                    Next Hop                   Flag Met Ref Use If
2000::20/128                   ::                         U    256 0     0 eth0
fe80::/64                      ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
::1/128                        ::                         Un   0   1     0 lo
ff00::/8                       ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
$

After applying the patch:
$ route -A inet6
Kernel IPv6 routing table
Destination                    Next Hop                   Flag Met Ref Use If
2000::20/128                   ::                         U    256 0     0 eth0
fe80::/64                      ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
::1/128                        ::                         Un   0   1     0 lo
2000::20/128                   ::                         Un   0   1     0 lo
fe80::xxxx:xxxx:xxxx:xxxx/128  ::                         Un   0   1     0 lo
ff00::/8                       ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
$ sudo ifdown lo
$ sudo ifup lo
$ route -A inet6
Kernel IPv6 routing table
Destination                    Next Hop                   Flag Met Ref Use If
2000::20/128                   ::                         U    256 0     0 eth0
fe80::/64                      ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
::1/128                        ::                         Un   0   1     0 lo
2000::20/128                   ::                         Un   0   1     0 lo
fe80::xxxx:xxxx:xxxx:xxxx/128  ::                         Un   0   1     0 lo
ff00::/8                       ::                         U    256 0     0 eth0
::/0                           ::                         !n   -1  1     1 lo
$

Signed-off-by: Balakumaran Kannan <Balakumaran.Kannan@ap.sony.com>
Signed-off-by: Maruthi Thotad <Maruthi.Thotad@ap.sony.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-02 14:37:19 -04:00
Vasily Averin f0f6ee1f70 cbq: incorrect processing of high limits
Currently cbq works incorrectly for limits > 10% real link bandwidth,
and practically does not work for limits > 50% real link bandwidth.
Below are results of experiments taken on a 1 Gbit link:

 In shaper | Actual Result
-----------+---------------
  100M     | 108 Mbps
  200M     | 244 Mbps
  300M     | 412 Mbps
  500M     | 893 Mbps

This happens because q->now changes incorrectly in cbq_dequeue():
when it is called before the real end of packet transmission,
L2T is greater than the real time delay, so q->now gets an extra boost
that is never compensated.

To fix this problem we prevent change of q->now until its synchronization
with real time.

Signed-off-by: Vasily Averin <vvs@openvz.org>
Reviewed-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-02 14:29:20 -04:00
Stanislav Kinsbursky 2dc958fa2f ipc: set msg back to -EAGAIN if copy wasn't performed
Make sure that the msg pointer is set back to an error value in case the
MSG_COPY flag is set and the desired message to copy wasn't found.  This
guarantees that msg is either an error pointer or a copy address.

Otherwise the last message in queue will be freed without unlinking from
the queue (which leads to memory corruption) and the dummy allocated
copy won't be released.
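
The invariant can be sketched in userspace as follows. This is only an
illustration under stated assumptions: the mock_* helpers imitate the
kernel's error-pointer convention, and the queue is a plain array
rather than the real message queue.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_MAX_ERRNO 4095

/* Userspace imitation of an error pointer: a small negative errno
 * value stored in the pointer itself. */
static void *mock_err_ptr(long err)
{
	return (void *)err;
}

static int mock_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MOCK_MAX_ERRNO;
}

struct mock_msg {
	long type;
};

/* Return the message at copy_index, or an -EAGAIN error pointer.
 * The pointer is reset after every candidate that is not the one
 * wanted, so a stale queue entry can never be returned by mistake. */
static struct mock_msg *mock_find_copy(struct mock_msg *queue, size_t len,
				       size_t copy_index)
{
	struct mock_msg *msg = mock_err_ptr(-EAGAIN);
	size_t i;

	for (i = 0; i < len; i++) {
		msg = &queue[i];
		if (i == copy_index)
			return msg;
		msg = mock_err_ptr(-EAGAIN);
	}
	return msg;
}
```

A caller can then test the result with mock_is_err() before touching it,
which is the property the commit message describes.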

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-02 10:09:01 -07:00
Yan Burman bab6a9eac0 net/mlx4_en: Fix setting initial MAC address
Commit 6bbb6d9 "net/mlx4_en: Optimize Rx fast path filter checks" introduced a regression
under which the MAC address read from the card was not converted correctly
(the most significant byte was not handled), fix that.
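
A minimal sketch of converting a MAC address held in a 64-bit integer
into a byte array, assuming the most significant address byte goes to
index 0. The helper name and constant are hypothetical and this is not
the driver's code. One way the most significant byte can be missed is a
loop condition that stops before index 0.

```c
#include <stdint.h>

#define MOCK_ETH_ALEN 6

/* Fill dst[0..5] from the low 48 bits of src, most significant byte
 * first. The loop must run down to and including index 0; a condition
 * such as `i` instead of `i >= 0` would leave dst[0] unset. */
static void mock_u64_to_mac(uint8_t dst[MOCK_ETH_ALEN], uint64_t src)
{
	int i;

	for (i = MOCK_ETH_ALEN - 1; i >= 0; --i) {
		dst[i] = src & 0xff;
		src >>= 8;
	}
}
```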

Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-02 12:07:56 -04:00