synchronize_rcu() is very slow in various situations (HZ=100,
CONFIG_NO_HZ=y, CONFIG_PREEMPT=n)
Extract from my (mostly idle) 8 core machine :
synchronize_rcu() in 99985 us
synchronize_rcu() in 79982 us
synchronize_rcu() in 87612 us
synchronize_rcu() in 79827 us
synchronize_rcu() in 109860 us
synchronize_rcu() in 98039 us
synchronize_rcu() in 89841 us
synchronize_rcu() in 79842 us
synchronize_rcu() in 80151 us
synchronize_rcu() in 119833 us
synchronize_rcu() in 99858 us
synchronize_rcu() in 73999 us
synchronize_rcu() in 79855 us
synchronize_rcu() in 79853 us
When we hold RTNL mutex, we would like to spend some cpu cycles but not
block too long other processes waiting for this mutex.
We also want to setup/dismantle network features as fast as possible at
boot/shutdown time.
This patch makes synchronize_net() call the expedited version if RTNL is
locked.
synchronize_rcu_expedited() typical delay is about 20 us on my machine.
synchronize_rcu_expedited() in 18 us
synchronize_rcu_expedited() in 18 us
synchronize_rcu_expedited() in 18 us
synchronize_rcu_expedited() in 18 us
synchronize_rcu_expedited() in 20 us
synchronize_rcu_expedited() in 16 us
synchronize_rcu_expedited() in 20 us
synchronize_rcu_expedited() in 18 us
synchronize_rcu_expedited() in 18 us
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Ben Greear <greearb@candelatech.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the old days, we used to access dev->master in __netif_receive_skb()
in a rcu_read_lock section.
So one synchronize_net() call was needed in netdev_set_master() to make
sure another cpu could not use old master while/after we release it.
We now use netdev_rx_handler infrastructure and added one
synchronize_net() call in bond_release()/bond_release_all()
Remove the obsolete synchronize_net() from netdev_set_master() and add
one in bridge del_nbp() after its netdev_rx_handler_unregister() call.
This makes enslave -d a bit faster.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Jiri Pirko <jpirko@redhat.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When one macvlan device is dismantled, we can avoid one
synchronize_rcu() call done after deletion from hash list, since caller
will perform a synchronize_net() call after its ndo_stop() call.
Add a new netdev->dismantle field to signal this dismantle intent.
Reduces RTNL hold time.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
CC: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cool, how about we make 'Features changed' debug as well?
This way userspace can't fill up the log just by tweaking tun features
with an ioctl.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using plain hlist_del() in dev_change_name() is wrong since a
concurrent reader can crash trying to dereference LIST_POISON1.
Bug introduced in commit 72c9528bab (net: Introduce
dev_get_by_name_rcu())
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Those reduced to DEBUG can possibly be triggered by unprivileged processes
and are nothing exceptional. Illegal checksum combinations can only be
caused by driver bug, so promote those messages to WARN.
Since GSO without SG will now only cause DEBUG message from
netdev_fix_features(), remove the workaround from register_netdevice().
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
It will be needed by bonding and other drivers changing vlan_features
after ndo_init callback.
As a bonus, this includes kernel-doc for netdev_update_features().
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 443457242b (factorize sync-rcu call in
unregister_netdevice_many) mistakenly removed one test from dev_close()
Following actions trigger a BUG :
modprobe bonding
modprobe dummy
ifconfig bond0 up
ifenslave bond0 dummy0
rmmod dummy
dev_close() must not close a non IFF_UP device.
With help from Frank Blaschka and Einar EL Lueck
Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Reported-by: Einar EL Lueck <ELELUECK@de.ibm.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Force dev_alloc_name() to be called from register_netdevice() by
dev_get_valid_name(). That allows to remove multiple explicit
dev_alloc_name() calls.
The possibility to call dev_alloc_name in advance remains.
This also fixes veth creation regresion caused by
84c49d8c3e
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This makes sure that when a driver calls the ethtool's
get/set_settings() callback of another driver, the data passed to it
is clean. This guarantees that speed_hi will be zeroed correctly if
the called callback doesn't explicitely set it: we are sure we don't
get a corrupted speed from the underlying driver. We also take care of
setting the cmd field appropriately (ETHTOOL_GSET/SSET).
This applies to dev_ethtool_get_settings(), which now makes sure it
sets up that ethtool command parameter correctly before passing it to
drivers. This also means that whoever calls dev_ethtool_get_settings()
does not have to clean the ethtool command parameter. This function
also becomes an exported symbol instead of an inline.
All drivers visible to make allyesconfig under x86_64 have been
updated.
Signed-off-by: David Decotigny <decot@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simplify and fix netdev_increment_features() to conform to what is
stated in netdevice.h comments about NETIF_F_ONE_FOR_ALL.
Include FCoE segmentation and VLAN-challedged flags in computation.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since now when bonding uses rx_handler, all traffic going into bond
device goes thru bond_handle_frame. So there's no need to go back into
bonding code later via ptype handlers. This patch converts
original ptype handlers into "bonding receive probes". These functions
are called from bond_handle_frame and they are registered per-mode.
Note that vlan packets are also handled because they are always untagged
thanks to vlan_untag()
Note that this also allows arpmon for eth-bond-bridge-vlan topology.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers
where possible, to make code intention more obvious.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NETIF_F_TSO_ECN has no effect when TSO is disabled; this just means
that feature state will be accurately reported to user-space.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The feature flags NETIF_F_TSO and NETIF_F_TSO6 independently enable
TSO for IPv4 and IPv6 respectively. However, the test in
netdev_fix_features() and its predecessor functions was never updated
to check for NETIF_F_TSO6, possibly because it was originally proposed
that TSO for IPv6 would be dependent on both feature flags.
Now that these feature flags can be changed independently from
user-space and we depend on netdev_fix_features() to fix invalid
feature combinations, it's important to disable them both if
scatter-gather is disabled. Also disable NETIF_F_TSO_ECN so
user-space sees all TSO features as disabled.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now there are 2 paths for rx vlan frames. When rx-vlan-hw-accel is
enabled, skb is untagged by NIC, vlan_tci is set and the skb gets into
vlan code in __netif_receive_skb - vlan_hwaccel_do_receive.
For non-rx-vlan-hw-accel however, tagged skb goes thru whole
__netif_receive_skb, it's untagged in ptype_base hander and reinjected
This incosistency is fixed by this patch. Vlan untagging happens early in
__netif_receive_skb so the rest of code (ptype_all handlers, rx_handlers)
see the skb like it was untagged by hw.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
v1->v2:
remove "inline" from vlan_core.c functions
Signed-off-by: David S. Miller <davem@davemloft.net>