Commit Graph

85 Commits

Author SHA1 Message Date
stephen hemminger
fe5c3561e6 vxlan: add necessary locking on device removal
Socket management is now done in a workqueue (outside of RTNL)
and is protected by vn->sock_lock. There were two possible bugs. First,
the vxlan device was removed from the per-socket VNI hash table without
holding the lock. Second, there was a race where the workqueue could run
after a newly created device had already been deleted.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-17 12:51:19 -07:00
Pravin B Shelar
f89e57c4f5 vxlan: Fix kernel crash on rmmod.
The vxlan module exit path unregisters vxlan net and then unregisters
the rtnl ops, which triggers vxlan_dellink() from __rtnl_kill_links().
vxlan_dellink() deletes the vxlan-dev from vxlan_list, whose
list-head lives in the vxlan-net-struct, but that struct is already gone
because of the net-unregister. This causes the crash shown below.

This commit fixes the crash by fixing the module exit path.

BUG: unable to handle kernel paging request at ffff8804102c8000
IP: [<ffffffff812cc5e9>] __list_del_entry+0x29/0xd0
PGD 2972067 PUD 83e019067 PMD 83df97067 PTE 80000004102c8060
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
Modules linked in: ---
CPU: 19 PID: 6712 Comm: rmmod Tainted: GF            3.10.0+ #95
Hardware name: Dell Inc. PowerEdge R620/0KCKR5, BIOS 1.4.8 10/25/2012
task: ffff88080c47c580 ti: ffff88080ac50000 task.ti: ffff88080ac50000
RIP: 0010:[<ffffffff812cc5e9>]  [<ffffffff812cc5e9>]
__list_del_entry+0x29/0xd0
RSP: 0018:ffff88080ac51e08  EFLAGS: 00010206
RAX: ffff8804102c8000 RBX: ffff88040f0d4b10 RCX: dead000000200200
RDX: ffff8804102c8000 RSI: ffff88080ac51e58 RDI: ffff88040f0d4b10
RBP: ffff88080ac51e08 R08: 0000000000000001 R09: 2222222222222222
R10: 2222222222222222 R11: 2222222222222222 R12: ffff88080ac51e58
R13: ffffffffa07b8840 R14: ffffffff81ae48c0 R15: ffff88080ac51e58
FS:  00007f9ef105c700(0000) GS:ffff88082a800000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff8804102c8000 CR3: 00000008227e5000 CR4: 00000000000407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffff88080ac51e28 ffffffff812cc6a1 2222222222222222 ffff88040f0d4000
 ffff88080ac51e48 ffffffffa07b3311 ffff88040f0d4000 ffffffff81ae49c8
 ffff88080ac51e98 ffffffff81492fc2 ffff88080ac51e58 ffff88080ac51e58
Call Trace:
 [<ffffffff812cc6a1>] list_del+0x11/0x40
 [<ffffffffa07b3311>] vxlan_dellink+0x51/0x70 [vxlan]
 [<ffffffff81492fc2>] __rtnl_link_unregister+0xa2/0xb0
 [<ffffffff8149448e>] rtnl_link_unregister+0x1e/0x30
 [<ffffffffa07b7b7c>] vxlan_cleanup_module+0x1c/0x2f [vxlan]
 [<ffffffff810c9b31>] SyS_delete_module+0x1d1/0x2c0
 [<ffffffff812b8a0e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff81582f42>] system_call_fastpath+0x16/0x1b
Code: eb 9f 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89
e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48 39 c8 74 7a <4c> 8b
00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89 42 08
RIP  [<ffffffff812cc5e9>] __list_del_entry+0x29/0xd0
 RSP <ffff88080ac51e08>
CR2: ffff8804102c8000

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-11 11:45:36 -07:00
Stephen Hemminger
ba609e9bf1 vxlan: fix function name spelling
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-25 17:06:01 -07:00
Mike Rapoport
58e4c76704 vxlan: fdb: allow specifying multiple destinations for zero MAC
The zero MAC entry in the fdb is used as default destination. With
multiple default destinations it is possible to use vxlan in
environments that disable multicast on the infrastructure level, e.g.
public clouds.

Signed-off-by: Mike Rapoport <mike.rapoport@ravellosystems.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-25 09:31:40 -07:00
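The default-destination behaviour described in 58e4c76704 can be pictured with a small user-space sketch. It is not the code from drivers/net/vxlan.c: the struct, the flat table, and every `demo_` name are assumptions made for illustration. The point it shows is that a lookup that misses falls back to the all-zero MAC entry, which may carry several remote destinations.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_REMOTES 4

/* Hypothetical forwarding entry: one MAC, one or more remote IPv4 addresses. */
struct demo_entry {
	uint8_t  mac[6];
	size_t   n_remotes;
	uint32_t remotes[DEMO_MAX_REMOTES];
};

/* Exact-match lookup over a flat table (the real driver uses a hash table). */
static const struct demo_entry *demo_find(const struct demo_entry *tbl,
					  size_t n, const uint8_t mac[6])
{
	for (size_t i = 0; i < n; i++)
		if (memcmp(tbl[i].mac, mac, 6) == 0)
			return &tbl[i];
	return NULL;
}

/* Lookup with fallback: an unknown MAC resolves to the all-zero MAC entry. */
static const struct demo_entry *demo_lookup(const struct demo_entry *tbl,
					    size_t n, const uint8_t mac[6])
{
	static const uint8_t all_zeros[6];
	const struct demo_entry *e = demo_find(tbl, n, mac);

	if (e == NULL)
		e = demo_find(tbl, n, all_zeros);
	return e;
}
```

A caller would send one copy of the frame to each address in the returned entry's `remotes`, which is how several unicast destinations can stand in for a multicast group.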
Mike Rapoport
bc7892ba39 vxlan: allow removal of single destination from fdb entry
When the last item is deleted from the remote destinations list, the
fdb entry is destroyed.

Signed-off-by: Mike Rapoport <mike.rapoport@ravellosystems.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-25 09:31:38 -07:00
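The rule stated in bc7892ba39 (removing the last remote destroys the fdb entry) can be sketched in user space as follows. The array-backed entry and all `demo_` names are hypothetical; the kernel keeps remotes on a linked list and frees the entry, which this sketch only models with an `in_table` flag.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_REMOTES 4

/* Hypothetical fdb entry holding several remote destinations. */
struct demo_multi_fdb {
	bool     in_table;	/* false once the entry has been destroyed */
	size_t   n_remotes;
	uint32_t remotes[DEMO_MAX_REMOTES];
};

/*
 * Remove one remote destination. When the last one goes away the entry
 * itself is marked destroyed. Returns 0 on success, -ENOENT if the
 * address is not among the entry's remotes.
 */
static int demo_remove_remote(struct demo_multi_fdb *f, uint32_t ip)
{
	for (size_t i = 0; i < f->n_remotes; i++) {
		if (f->remotes[i] != ip)
			continue;
		/* Close the gap, keeping the remaining remotes in order. */
		for (size_t j = i + 1; j < f->n_remotes; j++)
			f->remotes[j - 1] = f->remotes[j];
		f->n_remotes--;
		if (f->n_remotes == 0)
			f->in_table = false;
		return 0;
	}
	return -ENOENT;
}
```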
Mike Rapoport
f0b074be7b vxlan: introduce vxlan_fdb_parse
which will be reused by vxlan_fdb_delete

Signed-off-by: Mike Rapoport <mike.rapoport@ravellosystems.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-25 09:31:37 -07:00
Mike Rapoport
a5e7c10a7e vxlan: introduce vxlan_fdb_find_rdst
which will be reused by vxlan_fdb_delete

Signed-off-by: Mike Rapoport <mike.rapoport@ravellosystems.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-25 09:31:36 -07:00
Mike Rapoport
afbd8bae9c vxlan: add implicit fdb entry for default destination
Signed-off-by: Mike Rapoport <mike.rapoport@ravellosystems.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-25 09:31:35 -07:00
Pravin B Shelar
60d9d4c6db vxlan: Fix sparse warnings.
Fix the following sparse warnings:
drivers/net/vxlan.c:238:44: warning: incorrect type in argument 3 (different base types)
drivers/net/vxlan.c:238:44:    expected restricted __be32 [usertype] value
drivers/net/vxlan.c:238:44:    got unsigned int const [unsigned] [usertype] remote_vni
drivers/net/vxlan.c:1735:18: warning: incorrect type in initializer (different signedness)
drivers/net/vxlan.c:1735:18:    expected int *id
drivers/net/vxlan.c:1735:18:    got unsigned int static [toplevel] *<noident>

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-25 09:30:42 -07:00
Stephen Hemminger
234f5b7379 vxlan: cosmetic cleanup's
Fix whitespace and spelling

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
2013-06-24 08:40:33 -07:00
Stephen Hemminger
bb3fd6878a vxlan: Use initializer for dummy structures
For the notification code, a couple of places build fdb entries on
the stack, use structure initialization instead and fix formatting.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-24 08:40:33 -07:00
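As an illustration of the technique named in bb3fd6878a, the sketch below builds a temporary entry on the stack with a C99 designated initializer. The struct layout and field names are assumptions, not the driver's actual definitions. Members that are not named in the initializer are zero-initialized by the language, so no separate memset is needed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an fdb entry; the fields are illustrative only. */
struct demo_notify_fdb {
	uint8_t  eth_addr[6];
	uint16_t state;
	uint8_t  flags;
	uint32_t remote_ip;
};

/* Build a throwaway entry for a notification using structure initialization. */
static struct demo_notify_fdb demo_make_entry(const uint8_t mac[6], uint32_t ip)
{
	struct demo_notify_fdb f = {
		.state = 0x02,		/* arbitrary example value */
		.remote_ip = ip,
		/* .flags and .eth_addr are not named, so they start as zero */
	};

	memcpy(f.eth_addr, mac, sizeof(f.eth_addr));
	return f;
}
```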
Stephen Hemminger
9daaa397b3 vxlan: port module param should be ushort
UDP ports are limited to 16 bits.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-24 08:40:33 -07:00
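The 16-bit limit behind 9daaa397b3 can be shown with a user-space sketch. `parse_udp_port` is a hypothetical helper, not a kernel function: it stores the port in a `uint16_t` and rejects anything that does not fit, which is roughly the guarantee a 16-bit parameter type provides.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a decimal UDP port into a 16-bit value.
 * Returns 0 on success, -EINVAL for malformed input, and -ERANGE for
 * values that do not fit in 16 bits.
 */
static int parse_udp_port(const char *text, uint16_t *port)
{
	char *end;
	unsigned long value;

	errno = 0;
	value = strtoul(text, &end, 10);
	if (errno != 0 || end == text || *end != '\0')
		return -EINVAL;
	if (value > UINT16_MAX)
		return -ERANGE;

	*port = (uint16_t)value;
	return 0;
}
```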
Stephen Hemminger
3e61aa8f0a vxlan: convert remotes list to list_rcu
Based on initial work by Mike Rapoport <mike.rapoport@ravellosystems.com>
Use list macros and RCU for tracking multiple remotes.

Note: this code assumes list always has at least one entry,
because delete is not supported.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-24 08:40:32 -07:00
Stephen Hemminger
4ad169300a vxlan: make vxlan_xmit_one void
The function vxlan_xmit_one always returns NETDEV_TX_OK, so there
is no point in keeping track of return values etc.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
2013-06-24 08:40:32 -07:00
Stephen Hemminger
ebf4063e86 vxlan: move cleanup to uninit
Free the per-cpu statistics in ndo_uninit, since they are
created by ndo_init.
This also avoids any problems that might be caused by the destructor
being called after the module is removed.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-24 08:40:32 -07:00
Stephen Hemminger
1c51a9159d vxlan: fix race caused by dropping rtnl_unlock
It is possible for two CPUs to race when creating a vxlan device.
In most cases this is harmless, but the ability to assign the "next
available vxlan device" relies on the rtnl lock being held across the
whole operation. Therefore two instances of calling:
  ip li add vxlan%d vxlan ...
could collide and create two devices with the same name.

To fix this, defer creation of the socket to a work queue, and
handle possible races there. Introduce a lock to ensure that
changes to the vxlan socket hash list are SMP safe.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-24 08:40:32 -07:00
Stephen Hemminger
8385f50a03 vxlan: send notification when MAC migrates
When a learned entry migrates to another IP, send a notification
that the entry has changed.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-24 08:40:32 -07:00
Stephen Hemminger
7c47cedf43 vxlan: move IGMP join/leave to work queue
Do join/leave from work queue to avoid lock inversion problems
between normal socket and RTNL. The code comes out cleaner
as well.

Uses Cong Wang's suggestion to turn refcnt into a real atomic,
since we now need to handle the case where the last use of the socket
is the IGMP worker.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-24 08:40:32 -07:00
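The atomic reference count mentioned in 7c47cedf43 can be approximated in user space with C11 atomics. This is a sketch under assumptions: the kernel uses its own `atomic_t` API rather than `<stdatomic.h>`, and `demo_sock` and its helpers are hypothetical names. It shows why the count must be atomic: whichever holder drops the last reference, possibly a deferred worker, is the one that releases the object.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared socket object protected by a reference count. */
struct demo_sock {
	atomic_int refcnt;
	bool       released;	/* stands in for actually freeing the socket */
};

static void demo_sock_init(struct demo_sock *s)
{
	atomic_init(&s->refcnt, 1);	/* the creator holds the first reference */
	s->released = false;
}

static void demo_sock_hold(struct demo_sock *s)
{
	atomic_fetch_add(&s->refcnt, 1);
}

/* Drop one reference; returns true if this call released the object. */
static bool demo_sock_put(struct demo_sock *s)
{
	/* atomic_fetch_sub returns the value held before the subtraction. */
	if (atomic_fetch_sub(&s->refcnt, 1) == 1) {
		s->released = true;
		return true;
	}
	return false;
}
```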
Stephen Hemminger
758c57d16a vxlan: fix crash from work pending on module removal
Switch to using a per module work queue so that all the socket
deletion callbacks are done when module is removed.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-24 08:40:32 -07:00
Stephen Hemminger
b715398407 vxlan: fix out of order operation on module removal
If the vxlan module was removed while vxlan devices were active, it would
crash because rtnl_link_unregister (which calls vxlan_dellink) was invoked
before unregister_pernet_device (which calls vxlan_stop).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-24 08:40:32 -07:00
Pravin B Shelar
0e6fbc5b6c ip_tunnels: extend iptunnel_xmit()
Refactor various ip tunnels xmit functions and extend iptunnel_xmit()
so that there is more code sharing.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 18:07:41 -07:00
David S. Miller
d98cae64e4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/wireless/ath/ath9k/Kconfig
	drivers/net/xen-netback/netback.c
	net/batman-adv/bat_iv_ogm.c
	net/wireless/nl80211.c

The ath9k Kconfig conflict was a change of a Kconfig option name right
next to the deletion of another option.

The xen-netback conflict was overlapping changes involving the
handling of the notify list in xen_netbk_rx_action().

Batman conflict resolution provided by Antonio Quartulli, basically
keep everything in both conflict hunks.

The nl80211 conflict is a little more involved.  In 'net' we added a
dynamic memory allocation to nl80211_dump_wiphy() to fix a race that
Linus reported.  Meanwhile in 'net-next' the handlers were converted
to use pre and post doit handlers which use a flag to determine
whether to hold the RTNL mutex around the operation.

However, the dump handlers do not use this logic.  Instead they have
to explicitly do the locking.  There were apparent bugs in the
conversion of nl80211_dump_wiphy() in that we were not dropping the
RTNL mutex in all the return paths, and it seems we very much should
be doing so.  So I fixed that whilst handling the overlapping changes.

To simplify the initial returns, I take the RTNL mutex after we try
to allocate 'tb'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 16:49:39 -07:00
stephen hemminger
eb064c3b49 vxlan: fix check for migration of static entry
The check introduced by:
	commit 26a41ae604
	Author: stephen hemminger <stephen@networkplumber.org>
	Date:   Mon Jun 17 12:09:58 2013 -0700

	    vxlan: only migrate dynamic FDB entries

was not correct because it checks a flag describing the type of FDB
entry, rather than the state (dynamic versus static). The confusion
arises because vxlan reuses values from bridge, and bridge
reuses values from the neighbour table, so it is easy to get lost in translation.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 00:50:58 -07:00
stephen hemminger
7aa2723841 vxlan: handle skb_clone failure
If skb_clone fails because the system is out of memory, just skip the fanout.

Problem was introduced in 3.10 with:
  commit 6681712d67
  Author: David Stevens <dlstevens@us.ibm.com>
  Date:   Fri Mar 15 04:35:51 2013 +0000

    vxlan: generalize forwarding tables

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:55:47 -07:00
stephen hemminger
26a41ae604 vxlan: only migrate dynamic FDB entries
Only migrate dynamic forwarding table entries; don't modify
static entries. If a packet is received from an incorrect source IP address,
assume it is an imposter and drop it.

This patch applies only to -net, a different patch would be needed for earlier
kernels since the NTF_SELF flag was introduced with 3.10.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:55:46 -07:00
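The policy described in 26a41ae604 can be summarized with a user-space sketch. The enum, struct, and `demo_snoop` function are hypothetical and do not mirror the driver's flags or function signatures. A learned (dynamic) entry follows the source address, while a static entry is left untouched and the mismatching packet is dropped.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical entry state; the driver encodes this differently. */
enum demo_state { DEMO_STATIC, DEMO_DYNAMIC };

struct demo_learned_fdb {
	uint32_t        remote_ip;
	enum demo_state state;
};

/*
 * Inspect the source IP of a received packet for a known MAC.
 * Returns true if the packet should be dropped.
 */
static bool demo_snoop(struct demo_learned_fdb *f, uint32_t src_ip)
{
	if (f->remote_ip == src_ip)
		return false;		/* address matches, nothing to do */

	if (f->state == DEMO_STATIC)
		return true;		/* treat as an imposter: drop, keep entry */

	f->remote_ip = src_ip;		/* dynamic entry migrates to the new IP */
	return false;
}
```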