Commit Graph

413134 Commits

Author SHA1 Message Date
Fan Du
fffc15a501 vxlan: release rt when found circular route
Otherwise causing dst memory leakage.
Have Checked all other type tunnel device transmit implementation,
no such things happens anymore.

Signed-off-by: Fan Du <fan.du@windriver.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10 22:01:54 -05:00
Sasha Levin
12663bfc97 net: unix: allow set_peek_off to fail
unix_dgram_recvmsg() will hold the readlock of the socket until recv
is complete.

In the same time, we may try to setsockopt(SO_PEEK_OFF) which will hang until
unix_dgram_recvmsg() will complete (which can take a while) without allowing
us to break out of it, triggering a hung task spew.

Instead, allow set_peek_off to fail, this way userspace will not hang.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10 21:45:15 -05:00
David S. Miller
88b07b3660 Merge branch 'sfc-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc
Ben Hutchings says:

====================
Several fixes for the PTP hardware support added in 3.7:
1. Fix filtering of PTP packets on the TX path to be robust against bad
header lengths.
2. Limit logging on the RX path in case of a PTP packet flood, partly
from Laurence Evans.
3. Disable PTP hardware when the interface is down so that we don't
receive RX timestamp events, from Alexandre Rames.
4. Maintain clock frequency adjustment when a time offset is applied.

Also fixes for the SFC9100 family support added in 3.12:
5. Take the RX prefix length into account when applying NET_IP_ALIGN,
from Andrew Rybchenko.
6. Work around a bug that breaks communication between the driver and
firmware, from Robert Stonehouse.

Please also queue these up for the appropriate stable branches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10 21:19:42 -05:00
Maxime Ripard
e9c56f8d2f net: allwinner: emac: Add missing free_irq
The sun4i-emac driver uses devm_request_irq at .ndo_open time, but relies on
the managed device mechanism to actually free it. This causes an issue whenever
someone wants to restart the interface, the interrupt still being held, and not
yet released.

Fall back to using the regular request_irq at .ndo_open time, and introduce a
free_irq during .ndo_stop.

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10 18:01:10 -05:00
Stefan Tomanek
673498b8ed inet: fix NULL pointer Oops in fib(6)_rule_suppress
This changes ensures that the routing entry investigated by the suppress
function actually does point to a device struct before following that pointer,
fixing a possible kernel oops situation when verifying the interface group
associated with a routing table entry.

According to Daniel Golle, this Oops can be triggered by a user process trying
to establish an outgoing IPv6 connection while having no real IPv6 connectivity
set up (only autoassigned link-local addresses).

Fixes: 6ef94cfafb ("fib_rules: add route suppression based on ifgroup")

Reported-by: Daniel Golle <daniel.golle@gmail.com>
Tested-by: Daniel Golle <daniel.golle@gmail.com>
Signed-off-by: Stefan Tomanek <stefan.tomanek@wertarbyte.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-10 17:54:23 -05:00
Changli Gao
d323e92cc3 net: drop_monitor: fix the value of maxattr
maxattr in genl_family should be used to save the max attribute
type, but not the max command type. Drop monitor doesn't support
any attributes, so we should leave it as zero.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-09 21:10:38 -05:00
Srikanth Thokala
ec21b6b404 net: emaclite: add barriers to support Xilinx Zynq platform
This patch adds barriers at appropriate places to ensure the driver
works on Xilinx Zynq ARM-based SoC platform.

Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-09 21:02:25 -05:00
Srikanth Thokala
243fedd5fa net: emaclite: Remove unnecessary code that enables/disables interrupts on PONG buffers
There are no specific interrupts for the PONG buffer on both
transmit and receive side, same interrupt is valid for both
buffers. So, this patch removes this code.

Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Reviewed-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-09 21:02:25 -05:00
Hannes Frederic Sowa
a3300ef4bb ipv6: don't count addrconf generated routes against gc limit
Brett Ciphery reported that new ipv6 addresses failed to get installed
because the addrconf generated dsts where counted against the dst gc
limit. We don't need to count those routes like we currently don't count
administratively added routes.

Because the max_addresses check enforces a limit on unbounded address
generation first in case someone plays with router advertisments, we
are still safe here.

Reported-by: Brett Ciphery <brett.ciphery@windriver.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-09 21:00:39 -05:00
David S. Miller
6a46ff87d4 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
The following patchset contains three Netfilter fixes for your net tree,
they are:

* fix incorrect comparison in the new netnet hash ipset type, from
  Dave Jones.

* fix splat in hashlimit due to missing removal of the content of its
  proc entry in netnamespaces, from Sergey Popovich.

* fix missing rule flushing operation by table in nf_tables. Table
  flushing was already discussed back in October but this got lost and
  no patch has hit the tree to address this issue so far, from me.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-09 20:43:21 -05:00
Daniel Borkmann
66e56cd46b packet: fix send path when running with proto == 0
Commit e40526cb20 introduced a cached dev pointer, that gets
hooked into register_prot_hook(), __unregister_prot_hook() to
update the device used for the send path.

We need to fix this up, as otherwise this will not work with
sockets created with protocol = 0, plus with sll_protocol = 0
passed via sockaddr_ll when doing the bind.

So instead, assign the pointer directly. The compiler can inline
these helper functions automagically.

While at it, also assume the cached dev fast-path as likely(),
and document this variant of socket creation as it seems it is
not widely used (seems not even the author of TX_RING was aware
of that in his reference example [1]). Tested with reproducer
from e40526cb20.

 [1] http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap#Example

Fixes: e40526cb20 ("packet: fix use after free race in send path when dev is released")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Tested-by: Salam Noureddine <noureddine@aristanetworks.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-09 20:09:20 -05:00
Pablo Neira Ayuso
cf9dc09d09 netfilter: nf_tables: fix missing rules flushing per table
This patch allows you to atomically remove all rules stored in
a table via the NFT_MSG_DELRULE command. You only need to indicate
the specific table and no chain to flush all rules stored in that
table.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-12-07 22:55:48 +01:00
Sergey Popovich
b4ef4ce093 netfilter: xt_hashlimit: fix proc entry leak in netns destroy path
In (32263dd1b netfilter: xt_hashlimit: fix namespace destroy path)
the hashlimit_net_exit() function is always called right before
hashlimit_mt_destroy() to release netns data. If you use xt_hashlimit
with IPv4 and IPv6 together, this produces the following splat via
netconsole in the netns destroy path:

 Pid: 9499, comm: kworker/u:0 Tainted: G        WC O 3.2.0-5-netctl-amd64-core2
 Call Trace:
  [<ffffffff8104708d>] ? warn_slowpath_common+0x78/0x8c
  [<ffffffff81047139>] ? warn_slowpath_fmt+0x45/0x4a
  [<ffffffff81144a99>] ? remove_proc_entry+0xd8/0x22e
  [<ffffffff810ebbaa>] ? kfree+0x5b/0x6c
  [<ffffffffa043c501>] ? hashlimit_net_exit+0x45/0x8d [xt_hashlimit]
  [<ffffffff8128ab30>] ? ops_exit_list+0x1c/0x44
  [<ffffffff8128b28e>] ? cleanup_net+0xf1/0x180
  [<ffffffff810369fc>] ? should_resched+0x5/0x23
  [<ffffffff8105b8f9>] ? process_one_work+0x161/0x269
  [<ffffffff8105aea5>] ? cwq_activate_delayed_work+0x3c/0x48
  [<ffffffff8105c8c2>] ? worker_thread+0xc2/0x145
  [<ffffffff8105c800>] ? manage_workers.isra.25+0x15b/0x15b
  [<ffffffff8105fa01>] ? kthread+0x76/0x7e
  [<ffffffff813581f4>] ? kernel_thread_helper+0x4/0x10
  [<ffffffff8105f98b>] ? kthread_worker_fn+0x139/0x139
  [<ffffffff813581f0>] ? gs_change+0x13/0x13
 ---[ end trace d8c3cc0ad163ef79 ]---
 ------------[ cut here ]------------
 WARNING: at /usr/src/linux-3.2.52/debian/build/source_netctl/fs/proc/generic.c:849
 remove_proc_entry+0x217/0x22e()
 Hardware name:
 remove_proc_entry: removing non-empty directory 'net/ip6t_hashlimit', leaking at least 'IN-REJECT'

This is due to lack of removal net/ip6t_hashlimit/* entries in
hashlimit_proc_net_exit(), since only IPv4 entries are deleted. Fix
it by always removing the IPv4 and IPv6 entries and their parent
directories in the netns destroy path.

Signed-off-by: Sergey Popovich <popovich_sergei@mail.ru>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-12-07 22:46:51 +01:00
Robert Stonehouse
6b294b8efe sfc: Poll for MCDI completion once before timeout occurs
There is an as-yet unexplained bug that sometimes prevents (or delays)
the driver seeing the completion event for a completed MCDI request on
the SFC9120.  The requested configuration change will have happened
but the driver assumes it to have failed, and this can result in
further failures.  We can mitigate this by polling for completion
after unsuccessfully waiting for an event.

Fixes: 8127d661e7 ('sfc: Add support for Solarflare SFC9100 family')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-12-06 22:27:55 +00:00
Robert Stonehouse
5731d7b35e sfc: Refactor efx_mcdi_poll() by introducing efx_mcdi_poll_once()
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-12-06 22:27:53 +00:00
Andrew Rybchenko
2ec030144f sfc: RX buffer allocation takes prefix size into account in IP header alignment
rx_prefix_size is 4-bytes aligned on Falcon/Siena (16 bytes), but it is equal
to 14 on EF10. So, it should be taken into account if arch requires IP header
to be 4-bytes aligned (via NET_IP_ALIGN).

Fixes: 8127d661e7 ('sfc: Add support for Solarflare SFC9100 family')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-12-06 22:27:52 +00:00
Ben Hutchings
cd6fe65e92 sfc: Maintain current frequency adjustment when applying a time offset
There is a single MCDI PTP operation for setting the frequency
adjustment and applying a time offset to the hardware clock.  When
applying a time offset we should not change the frequency adjustment.

These two operations can now be requested separately but this requires
a flash firmware update.  Keep using the single operation, but
remember and repeat the previous frequency adjustment.

Fixes: 7c236c43b8 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-12-06 22:27:51 +00:00
Alexandre Rames
2ea4dc28a5 sfc: Stop/re-start PTP when stopping/starting the datapath.
This disables PTP when we bring the interface down to avoid getting
unmatched RX timestamp events, and tries to re-enable it when bringing
the interface up.

[bwh: Make efx_ptp_stop() safe on Falcon. Introduce
 efx_ptp_{start,stop}_datapath() functions; we'll expand them later.]

Fixes: 7c236c43b8 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-12-06 22:27:41 +00:00
Ben Hutchings
35f9a7a380 sfc: Rate-limit log message for PTP packets without a matching timestamp event
In case of a flood of PTP packets, the timestamp peripheral and MC
firmware on the SFN[56]322F boards may not be able to provide
timestamp events for all packets.  Don't complain too much about this.

Fixes: 7c236c43b8 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-12-06 22:22:34 +00:00
Laurence Evans
f32116003c sfc: PTP: Moderate log message on event queue overflow
Limit syslog flood if a PTP packet storm occurs.

Fixes: 7c236c43b8 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-12-06 22:15:55 +00:00
Michael Dalton
98bfd23cdb virtio-net: free bufs correctly on invalid packet length
When a packet with invalid length arrives, ensure that the packet
is freed correctly if mergeable packet buffers and big packets
(GUEST_TSO4) are both enabled.

Signed-off-by: Michael Dalton <mwdalton@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06 16:31:43 -05:00
Ezequiel Garcia
a328f3a059 net: mvneta: Fix incorrect DMA unmapping size
The current code unmaps the DMA mapping created for rx skb_buff's by
using the data_size as the the mapping size. This is wrong since the
correct size to specify should match the size used to create the mapping.

This commit removes the following DMA_API_DEBUG warning:

------------[ cut here ]------------
WARNING: at lib/dma-debug.c:887 check_unmap+0x3a8/0x860()
mvneta d0070000.ethernet: DMA-API: device driver frees DMA memory with different size [device address=0x000000002eb80000] [map size=1600 bytes] [unmap size=66 bytes]
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.21-01444-ga88ae13-dirty #92
[<c0013600>] (unwind_backtrace+0x0/0xf8) from [<c0010fb8>] (show_stack+0x10/0x14)
[<c0010fb8>] (show_stack+0x10/0x14) from [<c001afa0>] (warn_slowpath_common+0x48/0x68)
[<c001afa0>] (warn_slowpath_common+0x48/0x68) from [<c001b01c>] (warn_slowpath_fmt+0x30/0x40)
[<c001b01c>] (warn_slowpath_fmt+0x30/0x40) from [<c018d0fc>] (check_unmap+0x3a8/0x860)
[<c018d0fc>] (check_unmap+0x3a8/0x860) from [<c018d734>] (debug_dma_unmap_page+0x64/0x70)
[<c018d734>] (debug_dma_unmap_page+0x64/0x70) from [<c0233f78>] (mvneta_rx+0xec/0x468)
[<c0233f78>] (mvneta_rx+0xec/0x468) from [<c023436c>] (mvneta_poll+0x78/0x16c)
[<c023436c>] (mvneta_poll+0x78/0x16c) from [<c02db468>] (net_rx_action+0x94/0x160)
[<c02db468>] (net_rx_action+0x94/0x160) from [<c0021e68>] (__do_softirq+0xe8/0x1d0)
[<c0021e68>] (__do_softirq+0xe8/0x1d0) from [<c0021ff8>] (do_softirq+0x4c/0x58)
[<c0021ff8>] (do_softirq+0x4c/0x58) from [<c0022228>] (irq_exit+0x58/0x90)
[<c0022228>] (irq_exit+0x58/0x90) from [<c000e7c8>] (handle_IRQ+0x3c/0x94)
[<c000e7c8>] (handle_IRQ+0x3c/0x94) from [<c0008548>] (armada_370_xp_handle_irq+0x4c/0xb4)
[<c0008548>] (armada_370_xp_handle_irq+0x4c/0xb4) from [<c000dc20>] (__irq_svc+0x40/0x50)
Exception stack(0xc04f1f70 to 0xc04f1fb8)
1f60:                                     c1fe46f8 00000000 00001d92 00001d92
1f80: c04f0000 c04f0000 c04f84a4 c03e081c c05220e7 00000001 c05220e7 c04f0000
1fa0: 00000000 c04f1fb8 c000eaf8 c004c048 60000113 ffffffff
[<c000dc20>] (__irq_svc+0x40/0x50) from [<c004c048>] (cpu_startup_entry+0x54/0x128)
[<c004c048>] (cpu_startup_entry+0x54/0x128) from [<c04c1a14>] (start_kernel+0x29c/0x2f0)
[<c04c1a14>] (start_kernel+0x29c/0x2f0) from [<00008074>] (0x8074)
---[ end trace d4955f6acd178110 ]---
Mapped at:
 [<c018d600>] debug_dma_map_page+0x4c/0x11c
 [<c0235d6c>] mvneta_setup_rxqs+0x398/0x598
 [<c0236084>] mvneta_open+0x40/0x17c
 [<c02dbbd4>] __dev_open+0x9c/0x100
 [<c02dbe58>] __dev_change_flags+0x7c/0x134

Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06 15:43:44 -05:00
Jiri Pirko
859828c0ea br: fix use of ->rx_handler_data in code executed on non-rx_handler path
br_stp_rcv() is reached by non-rx_handler path. That means there is no
guarantee that dev is bridge port and therefore simple NULL check of
->rx_handler_data is not enough. There is need to check if dev is really
bridge port and since only rcu read lock is held here, do it by checking
->rx_handler pointer.

Note that synchronize_net() in netdev_rx_handler_unregister() ensures
this approach as valid.

Introduced originally by:
commit f350a0a873
  "bridge: use rx_handler_data pointer to store net_bridge_port pointer"

Fixed but not in the best way by:
commit b5ed54e94d
  "bridge: fix RCU races with bridge port"

Reintroduced by:
commit 716ec052d2
  "bridge: fix NULL pointer deref of br_port_get_rcu"

Please apply to stable trees as well. Thanks.

RH bugzilla reference: https://bugzilla.redhat.com/show_bug.cgi?id=1025770

Reported-by: Laine Stump <laine@redhat.com>
Debugged-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06 15:41:40 -05:00
Ben Hutchings
e5a498e943 sfc: Add length checks to efx_xmit_with_hwtstamp() and efx_ptp_is_ptp_tx()
efx_ptp_is_ptp_tx() must be robust against skbs from raw sockets that
have invalid IPv4 and UDP headers.

Add checks that:
- the transport header has been found
- there is enough space between network and transport header offset
  for an IPv4 header
- there is enough space after the transport header offset for a
  UDP header

Fixes: 7c236c43b8 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-12-06 20:41:22 +00:00
Andrey Vagin
d4fb84eefe virtio: delete napi structures from netdev before releasing memory
free_netdev calls netif_napi_del too, but it's too late, because napi
structures are placed on vi->rq. netif_napi_add() is called from
virtnet_alloc_queues.

general protection fault: 0000 [#1] SMP
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables virtio_balloon pcspkr virtio_net(-) i2c_pii
CPU: 1 PID: 347 Comm: rmmod Not tainted 3.13.0-rc2+ #171
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
task: ffff8800b779c420 ti: ffff8800379e0000 task.ti: ffff8800379e0000
RIP: 0010:[<ffffffff81322e19>]  [<ffffffff81322e19>] __list_del_entry+0x29/0xd0
RSP: 0018:ffff8800379e1dd0  EFLAGS: 00010a83
RAX: 6b6b6b6b6b6b6b6b RBX: ffff8800379c2fd0 RCX: dead000000200200
RDX: 6b6b6b6b6b6b6b6b RSI: 0000000000000001 RDI: ffff8800379c2fd0
RBP: ffff8800379e1dd0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800379c2f90
R13: ffff880037839160 R14: 0000000000000000 R15: 00000000013352f0
FS:  00007f1400e34740(0000) GS:ffff8800bfb00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f464124c763 CR3: 00000000b68cf000 CR4: 00000000000006e0
Stack:
 ffff8800379e1df0 ffffffff8155beab 6b6b6b6b6b6b6b2b ffff8800378391c0
 ffff8800379e1e18 ffffffff8156499b ffff880037839be0 ffff880037839d20
 ffff88003779d3f0 ffff8800379e1e38 ffffffffa003477c ffff88003779d388
Call Trace:
 [<ffffffff8155beab>] netif_napi_del+0x1b/0x80
 [<ffffffff8156499b>] free_netdev+0x8b/0x110
 [<ffffffffa003477c>] virtnet_remove+0x7c/0x90 [virtio_net]
 [<ffffffff813ae323>] virtio_dev_remove+0x23/0x80
 [<ffffffff813f62ef>] __device_release_driver+0x7f/0xf0
 [<ffffffff813f6ca0>] driver_detach+0xc0/0xd0
 [<ffffffff813f5f28>] bus_remove_driver+0x58/0xd0
 [<ffffffff813f72ec>] driver_unregister+0x2c/0x50
 [<ffffffff813ae65e>] unregister_virtio_driver+0xe/0x10
 [<ffffffffa0036942>] virtio_net_driver_exit+0x10/0x6ce [virtio_net]
 [<ffffffff810d7cf2>] SyS_delete_module+0x172/0x220
 [<ffffffff810a732d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff810f5d4c>] ? __audit_syscall_entry+0x9c/0xf0
 [<ffffffff81677f69>] system_call_fastpath+0x16/0x1b
Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00
RIP  [<ffffffff81322e19>] __list_del_entry+0x29/0xd0
 RSP <ffff8800379e1dd0>
---[ end trace d5931cd3f87c9763 ]---

Fixes: 986a4f4d45 (virtio_net: multiqueue support)
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06 15:28:29 -05:00