Otherwise causing dst memory leakage.
Have Checked all other type tunnel device transmit implementation,
no such things happens anymore.
Signed-off-by: Fan Du <fan.du@windriver.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
unix_dgram_recvmsg() will hold the readlock of the socket until recv
is complete.
In the same time, we may try to setsockopt(SO_PEEK_OFF) which will hang until
unix_dgram_recvmsg() will complete (which can take a while) without allowing
us to break out of it, triggering a hung task spew.
Instead, allow set_peek_off to fail, this way userspace will not hang.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings says:
====================
Several fixes for the PTP hardware support added in 3.7:
1. Fix filtering of PTP packets on the TX path to be robust against bad
header lengths.
2. Limit logging on the RX path in case of a PTP packet flood, partly
from Laurence Evans.
3. Disable PTP hardware when the interface is down so that we don't
receive RX timestamp events, from Alexandre Rames.
4. Maintain clock frequency adjustment when a time offset is applied.
Also fixes for the SFC9100 family support added in 3.12:
5. Take the RX prefix length into account when applying NET_IP_ALIGN,
from Andrew Rybchenko.
6. Work around a bug that breaks communication between the driver and
firmware, from Robert Stonehouse.
Please also queue these up for the appropriate stable branches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The sun4i-emac driver uses devm_request_irq at .ndo_open time, but relies on
the managed device mechanism to actually free it. This causes an issue whenever
someone wants to restart the interface, the interrupt still being held, and not
yet released.
Fall back to using the regular request_irq at .ndo_open time, and introduce a
free_irq during .ndo_stop.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: David S. Miller <davem@davemloft.net>
This changes ensures that the routing entry investigated by the suppress
function actually does point to a device struct before following that pointer,
fixing a possible kernel oops situation when verifying the interface group
associated with a routing table entry.
According to Daniel Golle, this Oops can be triggered by a user process trying
to establish an outgoing IPv6 connection while having no real IPv6 connectivity
set up (only autoassigned link-local addresses).
Fixes: 6ef94cfafb ("fib_rules: add route suppression based on ifgroup")
Reported-by: Daniel Golle <daniel.golle@gmail.com>
Tested-by: Daniel Golle <daniel.golle@gmail.com>
Signed-off-by: Stefan Tomanek <stefan.tomanek@wertarbyte.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
maxattr in genl_family should be used to save the max attribute
type, but not the max command type. Drop monitor doesn't support
any attributes, so we should leave it as zero.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds barriers at appropriate places to ensure the driver
works on Xilinx Zynq ARM-based SoC platform.
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are no specific interrupts for the PONG buffer on both
transmit and receive side, same interrupt is valid for both
buffers. So, this patch removes this code.
Signed-off-by: Srikanth Thokala <sthokal@xilinx.com>
Reviewed-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Brett Ciphery reported that new ipv6 addresses failed to get installed
because the addrconf generated dsts where counted against the dst gc
limit. We don't need to count those routes like we currently don't count
administratively added routes.
Because the max_addresses check enforces a limit on unbounded address
generation first in case someone plays with router advertisments, we
are still safe here.
Reported-by: Brett Ciphery <brett.ciphery@windriver.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
The following patchset contains three Netfilter fixes for your net tree,
they are:
* fix incorrect comparison in the new netnet hash ipset type, from
Dave Jones.
* fix splat in hashlimit due to missing removal of the content of its
proc entry in netnamespaces, from Sergey Popovich.
* fix missing rule flushing operation by table in nf_tables. Table
flushing was already discussed back in October but this got lost and
no patch has hit the tree to address this issue so far, from me.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit e40526cb20 introduced a cached dev pointer, that gets
hooked into register_prot_hook(), __unregister_prot_hook() to
update the device used for the send path.
We need to fix this up, as otherwise this will not work with
sockets created with protocol = 0, plus with sll_protocol = 0
passed via sockaddr_ll when doing the bind.
So instead, assign the pointer directly. The compiler can inline
these helper functions automagically.
While at it, also assume the cached dev fast-path as likely(),
and document this variant of socket creation as it seems it is
not widely used (seems not even the author of TX_RING was aware
of that in his reference example [1]). Tested with reproducer
from e40526cb20.
[1] http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap#Example
Fixes: e40526cb20 ("packet: fix use after free race in send path when dev is released")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Tested-by: Salam Noureddine <noureddine@aristanetworks.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch allows you to atomically remove all rules stored in
a table via the NFT_MSG_DELRULE command. You only need to indicate
the specific table and no chain to flush all rules stored in that
table.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
In (32263dd1b netfilter: xt_hashlimit: fix namespace destroy path)
the hashlimit_net_exit() function is always called right before
hashlimit_mt_destroy() to release netns data. If you use xt_hashlimit
with IPv4 and IPv6 together, this produces the following splat via
netconsole in the netns destroy path:
Pid: 9499, comm: kworker/u:0 Tainted: G WC O 3.2.0-5-netctl-amd64-core2
Call Trace:
[<ffffffff8104708d>] ? warn_slowpath_common+0x78/0x8c
[<ffffffff81047139>] ? warn_slowpath_fmt+0x45/0x4a
[<ffffffff81144a99>] ? remove_proc_entry+0xd8/0x22e
[<ffffffff810ebbaa>] ? kfree+0x5b/0x6c
[<ffffffffa043c501>] ? hashlimit_net_exit+0x45/0x8d [xt_hashlimit]
[<ffffffff8128ab30>] ? ops_exit_list+0x1c/0x44
[<ffffffff8128b28e>] ? cleanup_net+0xf1/0x180
[<ffffffff810369fc>] ? should_resched+0x5/0x23
[<ffffffff8105b8f9>] ? process_one_work+0x161/0x269
[<ffffffff8105aea5>] ? cwq_activate_delayed_work+0x3c/0x48
[<ffffffff8105c8c2>] ? worker_thread+0xc2/0x145
[<ffffffff8105c800>] ? manage_workers.isra.25+0x15b/0x15b
[<ffffffff8105fa01>] ? kthread+0x76/0x7e
[<ffffffff813581f4>] ? kernel_thread_helper+0x4/0x10
[<ffffffff8105f98b>] ? kthread_worker_fn+0x139/0x139
[<ffffffff813581f0>] ? gs_change+0x13/0x13
---[ end trace d8c3cc0ad163ef79 ]---
------------[ cut here ]------------
WARNING: at /usr/src/linux-3.2.52/debian/build/source_netctl/fs/proc/generic.c:849
remove_proc_entry+0x217/0x22e()
Hardware name:
remove_proc_entry: removing non-empty directory 'net/ip6t_hashlimit', leaking at least 'IN-REJECT'
This is due to lack of removal net/ip6t_hashlimit/* entries in
hashlimit_proc_net_exit(), since only IPv4 entries are deleted. Fix
it by always removing the IPv4 and IPv6 entries and their parent
directories in the netns destroy path.
Signed-off-by: Sergey Popovich <popovich_sergei@mail.ru>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
There is an as-yet unexplained bug that sometimes prevents (or delays)
the driver seeing the completion event for a completed MCDI request on
the SFC9120. The requested configuration change will have happened
but the driver assumes it to have failed, and this can result in
further failures. We can mitigate this by polling for completion
after unsuccessfully waiting for an event.
Fixes: 8127d661e7 ('sfc: Add support for Solarflare SFC9100 family')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
rx_prefix_size is 4-bytes aligned on Falcon/Siena (16 bytes), but it is equal
to 14 on EF10. So, it should be taken into account if arch requires IP header
to be 4-bytes aligned (via NET_IP_ALIGN).
Fixes: 8127d661e7 ('sfc: Add support for Solarflare SFC9100 family')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
There is a single MCDI PTP operation for setting the frequency
adjustment and applying a time offset to the hardware clock. When
applying a time offset we should not change the frequency adjustment.
These two operations can now be requested separately but this requires
a flash firmware update. Keep using the single operation, but
remember and repeat the previous frequency adjustment.
Fixes: 7c236c43b8 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
This disables PTP when we bring the interface down to avoid getting
unmatched RX timestamp events, and tries to re-enable it when bringing
the interface up.
[bwh: Make efx_ptp_stop() safe on Falcon. Introduce
efx_ptp_{start,stop}_datapath() functions; we'll expand them later.]
Fixes: 7c236c43b8 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
In case of a flood of PTP packets, the timestamp peripheral and MC
firmware on the SFN[56]322F boards may not be able to provide
timestamp events for all packets. Don't complain too much about this.
Fixes: 7c236c43b8 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Limit syslog flood if a PTP packet storm occurs.
Fixes: 7c236c43b8 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
When a packet with invalid length arrives, ensure that the packet
is freed correctly if mergeable packet buffers and big packets
(GUEST_TSO4) are both enabled.
Signed-off-by: Michael Dalton <mwdalton@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code unmaps the DMA mapping created for rx skb_buff's by
using the data_size as the the mapping size. This is wrong since the
correct size to specify should match the size used to create the mapping.
This commit removes the following DMA_API_DEBUG warning:
------------[ cut here ]------------
WARNING: at lib/dma-debug.c:887 check_unmap+0x3a8/0x860()
mvneta d0070000.ethernet: DMA-API: device driver frees DMA memory with different size [device address=0x000000002eb80000] [map size=1600 bytes] [unmap size=66 bytes]
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.21-01444-ga88ae13-dirty #92
[<c0013600>] (unwind_backtrace+0x0/0xf8) from [<c0010fb8>] (show_stack+0x10/0x14)
[<c0010fb8>] (show_stack+0x10/0x14) from [<c001afa0>] (warn_slowpath_common+0x48/0x68)
[<c001afa0>] (warn_slowpath_common+0x48/0x68) from [<c001b01c>] (warn_slowpath_fmt+0x30/0x40)
[<c001b01c>] (warn_slowpath_fmt+0x30/0x40) from [<c018d0fc>] (check_unmap+0x3a8/0x860)
[<c018d0fc>] (check_unmap+0x3a8/0x860) from [<c018d734>] (debug_dma_unmap_page+0x64/0x70)
[<c018d734>] (debug_dma_unmap_page+0x64/0x70) from [<c0233f78>] (mvneta_rx+0xec/0x468)
[<c0233f78>] (mvneta_rx+0xec/0x468) from [<c023436c>] (mvneta_poll+0x78/0x16c)
[<c023436c>] (mvneta_poll+0x78/0x16c) from [<c02db468>] (net_rx_action+0x94/0x160)
[<c02db468>] (net_rx_action+0x94/0x160) from [<c0021e68>] (__do_softirq+0xe8/0x1d0)
[<c0021e68>] (__do_softirq+0xe8/0x1d0) from [<c0021ff8>] (do_softirq+0x4c/0x58)
[<c0021ff8>] (do_softirq+0x4c/0x58) from [<c0022228>] (irq_exit+0x58/0x90)
[<c0022228>] (irq_exit+0x58/0x90) from [<c000e7c8>] (handle_IRQ+0x3c/0x94)
[<c000e7c8>] (handle_IRQ+0x3c/0x94) from [<c0008548>] (armada_370_xp_handle_irq+0x4c/0xb4)
[<c0008548>] (armada_370_xp_handle_irq+0x4c/0xb4) from [<c000dc20>] (__irq_svc+0x40/0x50)
Exception stack(0xc04f1f70 to 0xc04f1fb8)
1f60: c1fe46f8 00000000 00001d92 00001d92
1f80: c04f0000 c04f0000 c04f84a4 c03e081c c05220e7 00000001 c05220e7 c04f0000
1fa0: 00000000 c04f1fb8 c000eaf8 c004c048 60000113 ffffffff
[<c000dc20>] (__irq_svc+0x40/0x50) from [<c004c048>] (cpu_startup_entry+0x54/0x128)
[<c004c048>] (cpu_startup_entry+0x54/0x128) from [<c04c1a14>] (start_kernel+0x29c/0x2f0)
[<c04c1a14>] (start_kernel+0x29c/0x2f0) from [<00008074>] (0x8074)
---[ end trace d4955f6acd178110 ]---
Mapped at:
[<c018d600>] debug_dma_map_page+0x4c/0x11c
[<c0235d6c>] mvneta_setup_rxqs+0x398/0x598
[<c0236084>] mvneta_open+0x40/0x17c
[<c02dbbd4>] __dev_open+0x9c/0x100
[<c02dbe58>] __dev_change_flags+0x7c/0x134
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
br_stp_rcv() is reached by non-rx_handler path. That means there is no
guarantee that dev is bridge port and therefore simple NULL check of
->rx_handler_data is not enough. There is need to check if dev is really
bridge port and since only rcu read lock is held here, do it by checking
->rx_handler pointer.
Note that synchronize_net() in netdev_rx_handler_unregister() ensures
this approach as valid.
Introduced originally by:
commit f350a0a873
"bridge: use rx_handler_data pointer to store net_bridge_port pointer"
Fixed but not in the best way by:
commit b5ed54e94d
"bridge: fix RCU races with bridge port"
Reintroduced by:
commit 716ec052d2
"bridge: fix NULL pointer deref of br_port_get_rcu"
Please apply to stable trees as well. Thanks.
RH bugzilla reference: https://bugzilla.redhat.com/show_bug.cgi?id=1025770
Reported-by: Laine Stump <laine@redhat.com>
Debugged-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
efx_ptp_is_ptp_tx() must be robust against skbs from raw sockets that
have invalid IPv4 and UDP headers.
Add checks that:
- the transport header has been found
- there is enough space between network and transport header offset
for an IPv4 header
- there is enough space after the transport header offset for a
UDP header
Fixes: 7c236c43b8 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>