Commit Graph

71 Commits

Author SHA1 Message Date
Gao feng 5c70ef85a2 veth: allow to setup multicast address for veth device
We can only setup multicast address for network device when
net_device_ops->ndo_set_rx_mode is not null.

Some configurations need to add multicast address for net
device, such as netfilter cluster match module.

Add a fake ndo_set_rx_mode function to allow this operation.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-10 00:08:07 -04:00
David S. Miller b343ca84b4 Revert "veth: Showing peer of veth type dev in ip link (kernel side)"
This reverts commit 612c337306.

As per Stephen Hemminger, the layout of the netlink attribute
is not implemented correctly so revert this for now.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-08 21:52:03 -04:00
Masatake YAMATO 612c337306 veth: Showing peer of veth type dev in ip link (kernel side)
ip link has ability to show extra information of net work device if
kernel provides sunh information. With this patch veth driver can
provide its peer ifindex information to ip command via netlink
interface.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-08 15:23:25 -04:00
Flavio Leitner b69bbddfa1 veth: add vlan features
The veth device doesn't provide the vlan features,
so TSO for example is disabled and that causes
performance issues when using tagged traffic.

The test topology looks like this:

    br0                     br1
  /   \                  /     \
vnet  veth0.10 ----- veth1.10   vnet
VM                               VM

The netperf results with current veth driver:
MIGRATED TCP STREAM TEST from 192.168.1.1 ()
port 0 AF_INET to 192.168.1.2 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.01    2210.22

Now after applying the proposed patch:
MIGRATED TCP STREAM TEST from 192.168.1.1 ()
port 0 AF_INET to 192.168.1.2 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    13067.47

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-19 17:36:03 -07:00
Hong zhi guo 3f8b96379a veth: remove redundant call of dev_alloc_name
it's called in the following register_netdevice. No need to call it
here.
Tested with "ip link add type veth" and "ip link add xxx%d type veth".

Signed-off-by: Hong Zhiguo <honkiko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12 01:21:20 -07:00
Patrick McHardy 28d2b136ca net: vlan: announce STAG offload capability in some drivers
- macvlan: propagate STAG filtering capabilities from underlying device
- ifb: announce STAG tagging support in addition to CTAG tagging support
- veth: announce STAG tagging/stripping support in addition to CTAG support

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19 14:46:06 -04:00
Patrick McHardy f646968f8f net: vlan: rename NETIF_F_HW_VLAN_* feature flags to NETIF_F_HW_VLAN_CTAG_*
Rename the hardware VLAN acceleration features to include "CTAG" to indicate
that they only support CTAGs. Follow up patches will introduce 802.1ad
server provider tagging (STAGs) and require the distinction for hardware not
supporting acclerating both.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19 14:45:26 -04:00
Eric Dumazet f45a5c267d veth: fix NULL dereference in veth_dellink()
commit d0e2c55e7c (veth: avoid a NULL deref in veth_stats_one)
added another NULL deref in veth_dellink().

# ip link add name veth1 type veth peer name veth0
# rmmod veth

We crash because veth_dellink() is called twice, so we must
take care of NULL peer.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-10 20:41:43 -05:00
Eric Dumazet 2efd32ee1b veth: fix a NULL deref in netif_carrier_off
In commit d0e2c55e7c (veth: avoid a NULL deref in veth_stats_one)
we now clear the peer pointers in veth_dellink()

veth_close() must therefore make sure the peer pointer is set.

Reported-by: Tom Parkin <tom.parkin@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-10 14:11:46 -08:00
Eric Dumazet d0e2c55e7c veth: avoid a NULL deref in veth_stats_one
commit 2681128f0c (veth: extend device features) added a NULL deref
in veth_stats_one(), as veth_get_stats64() was not testing if the peer
device was setup or not.

At init time, we call dev_get_stats() before veth pair is fully setup.

[  178.854758]  [<ffffffffa00f5677>] veth_get_stats64+0x47/0x70 [veth]
[  178.861013]  [<ffffffff814f0a2d>] dev_get_stats+0x6d/0x130
[  178.866486]  [<ffffffff81504efc>] rtnl_fill_ifinfo+0x47c/0x930
[  178.872299]  [<ffffffff81505b93>] rtmsg_ifinfo+0x83/0x100
[  178.877678]  [<ffffffff81505cc6>] rtnl_configure_link+0x76/0xa0
[  178.883580]  [<ffffffffa00f52fa>] veth_newlink+0x16a/0x350 [veth]
[  178.889654]  [<ffffffff815061cc>] rtnl_newlink+0x4dc/0x5e0
[  178.895128]  [<ffffffff81505e1e>] ? rtnl_newlink+0x12e/0x5e0
[  178.900769]  [<ffffffff8150587d>] rtnetlink_rcv_msg+0x11d/0x310
[  178.906669]  [<ffffffff81505760>] ? __rtnl_unlock+0x20/0x20
[  178.912225]  [<ffffffff81521f89>] netlink_rcv_skb+0xa9/0xd0
[  178.917779]  [<ffffffff81502d55>] rtnetlink_rcv+0x25/0x40
[  178.923159]  [<ffffffff815218d1>] netlink_unicast+0x1b1/0x230
[  178.928887]  [<ffffffff81521c4e>] netlink_sendmsg+0x2fe/0x3b0
[  178.934615]  [<ffffffff814dbe22>] sock_sendmsg+0xd2/0xf0

So we must check if peer was setup in veth_get_stats64()

As pointed out by Ben Hutchings, priv->peer is missing proper
synchronization. Adding RCU protection is a safe and well documented
way to make sure we don't access about to be freed or already
freed data.

Reported-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-07 19:42:50 -08:00
Eric Dumazet 8093315a91 veth: extend device features
veth is lacking most modern facilities, like SG, checksums, TSO.

It makes sense to extend dev->features to get them, or GRO aggregation
is defeated by a forced segmentation.

Reported-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-30 02:31:59 -08:00
Eric Dumazet 2681128f0c veth: reduce stat overhead
veth stats are a bit bloated. There is no need to account transmit
and receive stats, since they are absolutely symmetric.

Also use a per device atomic64_t for the dropped counter, as it
should never be used in fast path.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-30 02:31:58 -08:00
Rami Rosen c07135633b rtnelink: remove unused parameter from rtnl_create_link().
This patch removes an unused parameter (src_net) from rtnl_create_link()
method and from the method single invocation, in veth.
This parameter was used in the past when calling
ops->get_tx_queues(src_net, tb) in rtnl_create_link().
The get_tx_queues() member of rtnl_link_ops was replaced by two methods,
get_num_tx_queues() and get_num_rx_queues(), which do not get any
parameter. This was done in commit d40156aa5e by
Jiri Pirko ("rtnl: allow to specify different num for rx and tx queue count").

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-30 12:24:40 -05:00
Hannes Frederic Sowa 23ea5a9637 veth: allow changing the mac address while interface is up
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-01 12:26:10 -04:00
Pavel Emelyanov e6f8f1a739 veth: Allow to create peer link with given ifindex
The ifinfomsg is in there (thanks kaber@ for foreseeing this long time ago),
so take the given ifidex and register netdev with it.

Ben noticed, that this code path previously ignored ifmp->ifi_index and
userland could be passing in garbage. Thus it may now fail occasionally
because the value clashes with an existing interface.

To address this it's assumed that if the caller specifies the ifindex for
the veth master device, then it's aware of this possibility and should
explicitly specify (or set to 0 for auto-assignment) the peer's ifindex as
well. With this the compatibility with old tools not setting ifindex is
preserved.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-09 16:18:07 -07:00
David S. Miller 32efe08d77 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c

Small minor conflict in bnx2x, wherein one commit changed how
statistics were stored in software, and another commit
fixed endianness bugs wrt. reading the values provided by
the chip in memory.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-19 16:03:15 -05:00
Danny Kukawka f2cedb63df net: replace random_ether_addr() with eth_hw_addr_random()
Replace usage of random_ether_addr() with eth_hw_addr_random()
to set addr_assign_type correctly to NET_ADDR_RANDOM.

Change the trivial cases.

v2: adapt to renamed eth_hw_addr_random()

Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-15 15:34:16 -05:00
Thomas Graf 237114384a veth: Enforce minimum size of VETH_INFO_PEER
VETH_INFO_PEER carries struct ifinfomsg plus optional IFLA
attributes. A minimal size of sizeof(struct ifinfomsg) must be
enforced or we may risk accessing that struct beyond the limits
of the netlink message.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-15 14:59:20 -05:00
Rick Jones 84b4050111 Sweep away N/A fw_version dustbunnies from the .get_drvinfo routine of a number of drivers
Per discussion with Ben Hutchings and David Miller, go through and
remove assignments of "N/A" to fw_version in various drivers'
.get_drvinfo routines.  While there clean-up some use of bare
constants and such.

Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-22 16:43:32 -05:00
Michał Mirosław 34324dc2bf net: remove NETIF_F_NO_CSUM feature bit
Only distinct use is checking if NETIF_F_NOCACHE_COPY should be
enabled by default. The check heuristics is altered a bit here,
so it hits other people than before. The default shouldn't be
trusted for performance-critical cases anyway.

For all other uses NETIF_F_NO_CSUM is equivalent to NETIF_F_HW_CSUM.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-16 17:43:12 -05:00
Rick Jones 33a5ba144e net: sweep-up some straglers in strlcpy conversion of .get_drvinfo routines
Convert some remaining straglers' .get_drvinfo routines to use strlcpy
rather than strcpy/strncpy.

Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-16 17:38:55 -05:00
Eric Dumazet 8ce120f118 net: better pcpu data alignment
Tunnels can force an alignment of their percpu data to reduce number of
cache lines used in fast path, or read in .ndo_get_stats()

percpu_alloc() is a very fine grained allocator, so any small hole will
be used anyway.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-08 15:10:59 -05:00
Paul Gortmaker 9d9779e723 drivers/net: Add module.h to drivers who were implicitly using it
The device.h header was including module.h, making it present for
most of these drivers.  But we want to clean that up.  Call out the
include of module.h in the modular network drivers.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:31:07 -04:00
Neil Horman 550fd08c2c net: Audit drivers to identify those needing IFF_TX_SKB_SHARING cleared
After the last patch, We are left in a state in which only drivers calling
ether_setup have IFF_TX_SKB_SHARING set (we assume that drivers touching real
hardware call ether_setup for their net_devices and don't hold any state in
their skbs.  There are a handful of drivers that violate this assumption of
course, and need to be fixed up.  This patch identifies those drivers, and marks
them as not being able to support the safe transmission of skbs by clearning the
IFF_TX_SKB_SHARING flag in priv_flags

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Karsten Keil <isdn@linux-pingi.de>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Patrick McHardy <kaber@trash.net>
CC: Krzysztof Halasa <khc@pm.waw.pl>
CC: "John W. Linville" <linville@tuxdriver.com>
CC: Greg Kroah-Hartman <gregkh@suse.de>
CC: Marcel Holtmann <marcel@holtmann.org>
CC: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-27 22:39:30 -07:00
Eric Dumazet 81b16ba2f1 veth: Kill unused tx_dropped
Followup to commit f82528bc13 (Exclude duplicated checking for
iface-up) : We no longer need percpu tx_dropped field.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-06 01:51:04 -07:00