Commit Graph

119 Commits

Author SHA1 Message Date
David S. Miller c0b458a946 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor conflicts in drivers/net/ethernet/mellanox/mlx5/core/en_rep.c,
we had some overlapping changes:

1) In 'net' MLX5E_PARAMS_LOG_{SQ,RQ}_SIZE -->
   MLX5E_REP_PARAMS_LOG_{SQ,RQ}_SIZE

2) In 'net-next' params->log_rq_size is renamed to be
   params->log_rq_mtu_frames.

3) In 'net-next' params->hard_mtu is added.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01 19:49:34 -04:00
David Ahern 82dd0d2a9a vrf: Fix use after free and double free in vrf_finish_output
Miguel reported an skb use after free / double free in vrf_finish_output
when neigh_output returns an error. The vrf driver should return after
the call to neigh_output as it takes over the skb on error path as well.

Patch is a simplified version of Miguel's patch which was written for 4.9,
and updated to top of tree.

Fixes: 8f58336d3f ("net: Add ethernet header for pass through VRF device")
Signed-off-by: Miguel Fadon Perlines <mfadon@teldat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-30 14:20:23 -04:00
Kirill Tkhai 2f635ceeb2 net: Drop pernet_operations::async
Synchronous pernet_operations are not allowed anymore.
All are asynchronous. So, drop the structure member.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27 13:18:09 -04:00
David Ahern b75cc8f90f net/ipv6: Pass skb to route lookup
IPv6 does path selection for multipath routes deep in the lookup
functions. The next patch adds L4 hash option and needs the skb
for the forward path. To get the skb to the relevant FIB lookup
functions it needs to go through the fib rules layer, so add a
lookup_data argument to the fib_lookup_arg struct.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-04 13:04:22 -05:00
Kirill Tkhai 02df428ca2 net: Convert simple pernet_operations
These pernet_operations make pretty simple actions
like variable initialization on init, debug checks
on exit, and so on, and they obviously are able
to be executed in parallel with any others:

vrf_net_ops
lockd_net_ops
grace_net_ops
xfrm6_tunnel_net_ops
kcm_net_ops
tcf_net_ops

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-27 11:01:35 -05:00
Donald Sharp 1b71af6053 net: fib_rules: Add new attribute to set protocol
For ages iproute2 has used `struct rtmsg` as the ancillary header for
FIB rules and in the process set the protocol value to RTPROT_BOOT.
Until ca56209a66 ("net: Allow a rule to track originating protocol")
the kernel rules code ignored the protocol value sent from userspace
and always returned 0 in notifications. To avoid incompatibility with
existing iproute2, send the protocol as a new attribute.

Fixes: cac56209a6 ("net: Allow a rule to track originating protocol")
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-23 15:47:20 -05:00
Donald Sharp cac56209a6 net: Allow a rule to track originating protocol
Allow a rule that is being added/deleted/modified or
dumped to contain the originating protocol's id.

The protocol is handled just like a routes originating
protocol is.  This is especially useful because there
is starting to be a plethora of different user space
programs adding rules.

Allow the vrf device to specify that the kernel is the originator
of the rule created for this device.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21 17:49:24 -05:00
David Ahern 68e813aa43 net/ipv4: Remove fib table id from rtable
Remove rt_table_id from rtable. It was added for getroute to return the
table id that was hit in the lookup. With the changes for fibmatch the
table id can be extracted from the fib_info returned in the fib_result
so it no longer needs to be in rtable directly.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-15 15:41:42 -05:00
David Ahern 1e19c4d689 net: vrf: Add support for sends to local broadcast address
Sukumar reported that sends to the local broadcast address
(255.255.255.255) are broken. Check for the address in vrf driver
and do not redirect to the VRF device - similar to multicast
packets.

With this change sockets can use SO_BINDTODEVICE to specify an
egress interface and receive responses. Note: the egress interface
can not be a VRF device but needs to be the enslaved device.

https://bugzilla.kernel.org/show_bug.cgi?id=198521

Reported-by: Sukumar Gopalakrishnan <sukumarg1973@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 21:51:03 -05:00
David S. Miller 2a171788ba Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Files removed in 'net-next' had their license header updated
in 'net'.  We take the remove from 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-04 09:26:51 +09:00
Jeff Barnhill 18129a2498 net: vrf: correct FRA_L3MDEV encode type
FRA_L3MDEV is defined as U8, but is being added as a U32 attribute. On
big endian architecture, this results in the l3mdev entry not being
added to the FIB rules.

Fixes: 1aa6c4f6b8 ("net: vrf: Add l3mdev rules on first device create")
Signed-off-by: Jeff Barnhill <0xeffeff@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-02 16:20:53 +09:00
David Ahern de3baa3ed7 net: vrf: Add extack messages for enslave errors
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04 21:39:33 -07:00
David Ahern 42ab19ee90 net: Add extack to upper device linking
Add extack arg to netdev_upper_dev_link and netdev_master_upper_dev_link

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04 21:39:33 -07:00
David Ahern 33eaf2a6eb net: Add extack to ndo_add_slave
Pass extack to do_set_master and down to ndo_add_slave

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04 21:39:33 -07:00
Eric Dumazet a424120765 net: vrf: remove skb_dst_force() after skb_dst_set()
skb_dst_set(skb, dst) installs a normal (refcounted) dst, there is no
point using skb_dst_force(skb)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-21 20:36:32 -07:00
Arnd Bergmann ecf091171b net: vrf: avoid gcc-4.6 warning
When building an allmodconfig kernel with gcc-4.6, we get a rather
odd warning:

drivers/net/vrf.c: In function ‘vrf_ip6_input_dst’:
drivers/net/vrf.c:964:3: error: initialized field with side-effects overwritten [-Werror]
drivers/net/vrf.c:964:3: error: (near initialization for ‘fl6’) [-Werror]

I have no idea what this warning is even trying to say, but it does
seem like a false positive. Reordering the initialization in to match
the structure definition gets rid of the warning, and might also avoid
whatever gcc thinks is wrong here.

Fixes: 9ff7438460 ("net: vrf: Handle ipv6 multicast and link-local addresses")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-15 14:22:21 -07:00
David Ahern 4f04256c98 net: vrf: Drop local rtable and rt6_info
The VRF cached rtable and rt6_info for local traffic are no longer
needed and actually prevent local traffic through enslaved devices.
Remove them.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-13 20:05:12 -07:00
David Ahern 53b9483565 net: vrf: Add extack messages for newlink failures
Add extack error messages for failure paths creating vrf devices. Once
extack support is added to iproute2, we go from the unhelpful:
    $  ip li add foobar type vrf
    RTNETLINK answers: Invalid argument

to:
    $ ip li add foobar type vrf
    Error: VRF table id is missing

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 15:16:33 -07:00
Nikolay Aleksandrov f630c38ef0 vrf: fix bug_on triggered by rx when destroying a vrf
When destroying a VRF device we cleanup the slaves in its ndo_uninit()
function, but that causes packets to be switched (skb->dev == vrf being
destroyed) even though we're pass the point where the VRF should be
receiving any packets while it is being dismantled. This causes a BUG_ON
to trigger if we have raw sockets (trace below).
The reason is that the inetdev of the VRF has been destroyed but we're
still sending packets up the stack with it, so let's free the slaves in
the dellink callback as David Ahern suggested.

Note that this fix doesn't prevent packets from going up when the VRF
device is admin down.

[   35.631371] ------------[ cut here ]------------
[   35.631603] kernel BUG at net/ipv4/fib_frontend.c:285!
[   35.631854] invalid opcode: 0000 [#1] SMP
[   35.631977] Modules linked in:
[   35.632081] CPU: 2 PID: 22 Comm: ksoftirqd/2 Not tainted 4.12.0-rc7+ #45
[   35.632247] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[   35.632477] task: ffff88005ad68000 task.stack: ffff88005ad64000
[   35.632632] RIP: 0010:fib_compute_spec_dst+0xfc/0x1ee
[   35.632769] RSP: 0018:ffff88005ad67978 EFLAGS: 00010202
[   35.632910] RAX: 0000000000000001 RBX: ffff880059a7f200 RCX: 0000000000000000
[   35.633084] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff82274af0
[   35.633256] RBP: ffff88005ad679f8 R08: 000000000001ef70 R09: 0000000000000046
[   35.633430] R10: ffff88005ad679f8 R11: ffff880037731cb0 R12: 0000000000000001
[   35.633603] R13: ffff8800599e3000 R14: 0000000000000000 R15: ffff8800599cb852
[   35.634114] FS:  0000000000000000(0000) GS:ffff88005d900000(0000) knlGS:0000000000000000
[   35.634306] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   35.634456] CR2: 00007f3563227095 CR3: 000000000201d000 CR4: 00000000000406e0
[   35.634632] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   35.634865] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   35.635055] Call Trace:
[   35.635271]  ? __lock_acquire+0xf0d/0x1117
[   35.635522]  ipv4_pktinfo_prepare+0x82/0x151
[   35.635831]  raw_rcv_skb+0x17/0x3c
[   35.636062]  raw_rcv+0xe5/0xf7
[   35.636287]  raw_local_deliver+0x169/0x1d9
[   35.636534]  ip_local_deliver_finish+0x87/0x1c4
[   35.636820]  ip_local_deliver+0x63/0x7f
[   35.637058]  ip_rcv_finish+0x340/0x3a1
[   35.637295]  ip_rcv+0x314/0x34a
[   35.637525]  __netif_receive_skb_core+0x49f/0x7c5
[   35.637780]  ? lock_acquire+0x13f/0x1d7
[   35.638018]  ? lock_acquire+0x15e/0x1d7
[   35.638259]  __netif_receive_skb+0x1e/0x94
[   35.638502]  ? __netif_receive_skb+0x1e/0x94
[   35.638748]  netif_receive_skb_internal+0x74/0x300
[   35.639002]  ? dev_gro_receive+0x2ed/0x411
[   35.639246]  ? lock_is_held_type+0xc4/0xd2
[   35.639491]  napi_gro_receive+0x105/0x1a0
[   35.639736]  receive_buf+0xc32/0xc74
[   35.639965]  ? detach_buf+0x67/0x153
[   35.640201]  ? virtqueue_get_buf_ctx+0x120/0x176
[   35.640453]  virtnet_poll+0x128/0x1c5
[   35.640690]  net_rx_action+0x103/0x343
[   35.640932]  __do_softirq+0x1c7/0x4b7
[   35.641171]  run_ksoftirqd+0x23/0x5c
[   35.641403]  smpboot_thread_fn+0x24f/0x26d
[   35.641646]  ? sort_range+0x22/0x22
[   35.641878]  kthread+0x129/0x131
[   35.642104]  ? __list_add+0x31/0x31
[   35.642335]  ? __list_add+0x31/0x31
[   35.642568]  ret_from_fork+0x2a/0x40
[   35.642804] Code: 05 bd 87 a3 00 01 e8 1f ef 98 ff 4d 85 f6 48 c7 c7 f0 4a 27 82 41 0f 94 c4 31 c9 31 d2 41 0f b6 f4 e8 04 71 a1 ff 45 84 e4 74 02 <0f> 0b 0f b7 93 c4 00 00 00 4d 8b a5 80 05 00 00 48 03 93 d0 00
[   35.644342] RIP: fib_compute_spec_dst+0xfc/0x1ee RSP: ffff88005ad67978

Fixes: 193125dbd8 ("net: Introduce VRF device driver")
Reported-by: Chris Cormier <chriscormier@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-06 16:46:07 +01:00
Matthias Schiffer a8b8a889e3 net: add netlink_ext_ack argument to rtnl_link_ops.validate
Add support for extended error reporting.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26 23:13:22 -04:00
Matthias Schiffer 7a3f4a1851 net: add netlink_ext_ack argument to rtnl_link_ops.newlink
Add support for extended error reporting.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26 23:13:21 -04:00
Wei Wang a4c2fd7f78 net: remove DST_NOCACHE flag
DST_NOCACHE flag check has been removed from dst_release() and
dst_hold_safe() in a previous patch because all the dst are now ref
counted properly and can be released based on refcnt only.
Looking at the rest of the DST_NOCACHE use, all of them can now be
removed or replaced with other checks.
So this patch gets rid of all the DST_NOCACHE usage and remove this flag
completely.

Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17 22:54:01 -04:00
Wei Wang 1cfb71eeb1 ipv6: take dst->__refcnt for insertion into fib6 tree
In IPv6 routing code, struct rt6_info is created for each static route
and RTF_CACHE route and inserted into fib6 tree. In both cases, dst
ref count is not taken.
As explained in the previous patch, this leads to the need of the dst
garbage collector.

This patch holds ref count of dst before inserting the route into fib6
tree and properly releases the dst when deleting it from the fib6 tree
as a preparation in order to fully get rid of dst gc later.

Also, correct fib6_age() logic to check dst->__refcnt to be 1 to indicate
no user is referencing the dst.

And remove dst_hold() in vrf_rt6_create() as ip6_dst_alloc() already puts
dst->__refcnt to 1.

Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17 22:54:00 -04:00
Johannes Berg d58ff35122 networking: make skb_push & __skb_push return void pointers
It seems like a historic accident that these return unsigned char *,
and in many places that means casts are required, more often than not.

Make these functions return void * and remove all the casts across
the tree, adding a (u8 *) cast only where the unsigned char pointer
was used directly, all done with the following spatch:

    @@
    expression SKB, LEN;
    typedef u8;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    @@
    - *(fn(SKB, LEN))
    + *(u8 *)fn(SKB, LEN)

    @@
    expression E, SKB, LEN;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    type T;
    @@
    - E = ((T *)(fn(SKB, LEN)))
    + E = fn(SKB, LEN)

    @@
    expression SKB, LEN;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    @@
    - fn(SKB, LEN)[0]
    + *(u8 *)fn(SKB, LEN)

Note that the last part there converts from push(...)[0] to the
more idiomatic *(u8 *)push(...).

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16 11:48:40 -04:00
David Ahern 097d3c9508 net: vrf: Make add_fib_rules per network namespace flag
Commit 1aa6c4f6b8 ("net: vrf: Add l3mdev rules on first device create")
adds the l3mdev FIB rule the first time a VRF device is created. However,
it only creates the rule once and only in the namespace the first device
is created - which may not be init_net. Fix by using the net_generic
capability to make the add_fib_rules flag per network namespace.

Fixes: 1aa6c4f6b8 ("net: vrf: Add l3mdev rules on first device create")
Reported-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08 19:27:42 -04:00