Commit Graph

293 Commits

Author SHA1 Message Date
Patrick Mullaney
fc4a748966 netdevice: provide common routine for macvlan and vlan operstate management
Provide common routine for the transition of operational state for a leaf
device during a root device transition.

Signed-off-by: Patrick Mullaney <pmullaney@novell.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03 15:59:22 -08:00
Eric W. Biederman
dcbccbd4f1 net: Implement for_each_netdev_reverse.
I will need this shortly to implement network namespace shutdown
batching.  For sanity sake network devices should be removed in
the reverse order they were created in.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-01 16:15:50 -08:00
Arnd Bergmann
445409602c veth: move loopback logic to common location
The veth driver contains code to forward an skb
from the start_xmit function of one network
device into the receive path of another device.

Moving that code into a common location lets us
reuse the code for direct forwarding of data
between macvlan ports, and possibly in other
drivers.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-26 15:52:58 -08:00
Eric Dumazet
e014debecd linkwatch: linkwatch_forget_dev() to speedup device dismantle
Herbert Xu a écrit :
> On Tue, Nov 17, 2009 at 04:26:04AM -0800, David Miller wrote:
>> Really, the link watch stuff is just due for a redesign.  I don't
>> think a simple hack is going to cut it this time, sorry Eric :-)
>
> I have no objections against any redesigns, but since the only
> caller of linkwatch_forget_dev runs in process context with the
> RTNL, it could also legally emit those events.

Thanks guys, here an updated version then, before linkwatch surgery ?

In this version, I force the event to be sent synchronously.

[PATCH net-next-2.6] linkwatch: linkwatch_forget_dev() to speedup device dismantle

time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105

real	0m0.266s
user	0m0.000s
sys	0m0.001s

real	0m0.770s
user	0m0.000s
sys	0m0.000s

real	0m1.022s
user	0m0.000s
sys	0m0.000s

One problem of current schem in vlan dismantle phase is the
holding of device done by following chain :

vlan_dev_stop() ->
	netif_carrier_off(dev) ->
		linkwatch_fire_event(dev) ->
			dev_hold() ...

And __linkwatch_run_queue() runs up to one second later...

A generic fix to this problem is to add a linkwatch_forget_dev() method
to unlink the device from the list of watched devices.

dev->link_watch_next becomes dev->link_watch_list (and use a bit more memory),
to be able to unlink device in O(1).

After patch :
time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105

real    0m0.024s
user    0m0.000s
sys     0m0.000s

real    0m0.032s
user    0m0.000s
sys     0m0.001s

real    0m0.033s
user    0m0.000s
sys     0m0.000s

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-18 05:03:11 -08:00
Eric Dumazet
d83345adf9 net: add dev_txq_stats_fold() helper
Some drivers ndo_get_stats() method need to perform txqueue stats folding.

Move folding from dev_get_stats() to a new dev_txq_stats_fold() function

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-17 23:51:52 -08:00
Jarek Poplawski
9a1654ba0b net: Optimize hard_start_xmit() return checking
Recent changes in the TX error propagation require additional checking
and masking of values returned from hard_start_xmit(), mainly to
separate cases where skb was consumed. This aim can be simplified by
changing the order of NETDEV_TX and NET_XMIT codes, because the latter
are treated similarly to negative (ERRNO) values.

After this change much simpler dev_xmit_complete() is also used in
sch_direct_xmit(), so it is moved to netdevice.h.

Additionally NET_RX definitions in netdevice.h are moved up from
between TX codes to avoid confusion while reading the TX comment.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-15 22:08:33 -08:00
Eric Dumazet
ce81b76a39 ipv6: use RCU to walk list of network devices
No longer need read_lock(&dev_base_lock), use RCU instead.
We also can avoid taking references on inet6_dev structs.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 20:38:49 -08:00
Patrick McHardy
572a9d7b6f net: allow to propagate errors through ->ndo_hard_start_xmit()
Currently the ->ndo_hard_start_xmit() callbacks are only permitted to return
one of the NETDEV_TX codes. This prevents any kind of error propagation for
virtual devices, like queue congestion of the underlying device in case of
layered devices, or unreachability in case of tunnels.

This patches changes the NET_XMIT codes to avoid clashes with the NETDEV_TX
codes and changes the two callers of dev_hard_start_xmit() to expect either
errno codes, NET_XMIT codes or NETDEV_TX codes as return value.

In case of qdisc_restart(), all non NETDEV_TX codes are mapped to NETDEV_TX_OK
since no error propagation is possible when using qdiscs. In case of
dev_queue_xmit(), the error is propagated upwards.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-13 14:07:32 -08:00
stephen hemminger
254245d233 netdev: add netdev_continue_rcu
This adds an RCU macro for continuing search, useful for some
network devices like vlan.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-10 22:26:29 -08:00
Eric Dumazet
d94d9fee9f net: cleanup include/linux
This cleanup patch puts struct/union/enum opening braces,
in first line to ease grep games.

struct something
{

becomes :

struct something {

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-04 09:50:58 -08:00
Eric Dumazet
c6d14c8456 net: Introduce for_each_netdev_rcu() iterator
Adds RCU management to the list of netdevices.

Convert some for_each_netdev() users to RCU version, if
it can avoid read_lock-ing dev_base_lock

Ie:
	read_lock(&dev_base_loack);
	for_each_netdev(net, dev)
		some_action();
	read_unlock(&dev_base_lock);

becomes :

	rcu_read_lock();
	for_each_netdev_rcu(net, dev)
		some_action();
	rcu_read_unlock();


Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-04 05:43:23 -08:00
Eric Dumazet
72c9528bab net: Introduce dev_get_by_name_rcu()
Some workloads hit dev_base_lock rwlock pretty hard.
We can use RCU lookups to avoid touching this rwlock
(and avoid touching netdevice refcount)

netdevices are already freed after a RCU grace period, so this patch
adds no penalty at device dismantle time.

However, it adds a synchronize_rcu() call in dev_change_name()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-01 23:55:08 -08:00
Eric W. Biederman
0c509a6c93 net: Allow devices to specify a device specific sysfs group.
This isn't beautifully abstracted, but it is simple,
simplifies uses and so far is only needed for the bonding driver.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-30 12:41:18 -07:00
Ben Hutchings
c7c4b3b6e9 gro: Change all receive functions to return GRO result codes
This will allow drivers to adjust their receive path dynamically
based on whether GRO is being applied successfully.

Currently all in-tree callers ignore the return values of these
functions and do not need to be changed.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 21:36:53 -07:00
Ben Hutchings
5b252f0c2f gro: Name the GRO result enumeration type
This clarifies which return and parameter types are GRO result codes
and not RX result codes.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 21:33:55 -07:00
Eric Dumazet
fb699dfd42 net: Introduce dev_get_by_index_rcu()
Some workloads hit dev_base_lock rwlock pretty hard.
We can use RCU lookups to avoid touching this rwlock.

netdevices are already freed after a RCU grace period, so this patch
adds no penalty at device dismantle time.

dev_ifname() converted to dev_get_by_index_rcu()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 01:42:55 -07:00
Yi Zou
df5c79452f net: Add ndo_fcoe_get_wwn to net_device_ops
Add ndo_fcoe_get_wwn so Fiber Channel over Ethernet (FCoE) can make use of
the provided World Wide Port Name (WWPN) and World Wide Node Name (WWNN)
from the underlying network interface driver.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-29 01:04:03 -07:00
Eric Dumazet
9b5e383c11 net: Introduce unregister_netdevice_many()
Introduce rollback_registered_many() and unregister_netdevice_many()

rollback_registered_many() is able to perform necessary steps at device dismantle
time, factorizing two expensive synchronize_net() calls.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-28 02:22:06 -07:00
Eric Dumazet
44a0873d52 net: Introduce unregister_netdevice_queue()
This patchs adds an unreg_list anchor to struct net_device, and
introduces an unregister_netdevice_queue() function, able to queue
a net_device to a list instead of immediately unregister it.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-28 02:22:06 -07:00
David S. Miller
7fe13c5733 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2009-10-11 23:15:47 -07:00
Wolfram Sang
d308e38fa5 include/linux/netdevice.h: fix nanodoc mismatch
nanodoc was missing an ndo_-prefix.

Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07 13:53:11 -07:00
Stephen Hemminger
3295354322 dcb: data center bridging ops should be r/o
The data center bridging ops structure can be const

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-07 01:10:47 -07:00
Linus Torvalds
f205ce83a7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (66 commits)
  be2net: fix some cmds to use mccq instead of mbox
  atl1e: fix 2.6.31-git4 -- ATL1E 0000:03:00.0: DMA-API: device driver frees DMA
  pkt_sched: Fix qstats.qlen updating in dump_stats
  ipv6: Log the affected address when DAD failure occurs
  wl12xx: Fix print_mac() conversion.
  af_iucv: fix race when queueing skbs on the backlog queue
  af_iucv: do not call iucv_sock_kill() twice
  af_iucv: handle non-accepted sockets after resuming from suspend
  af_iucv: fix race in __iucv_sock_wait()
  iucv: use correct output register in iucv_query_maxconn()
  iucv: fix iucv_buffer_cpumask check when calling IUCV functions
  iucv: suspend/resume error msg for left over pathes
  wl12xx: switch to %pM to print the mac address
  b44: the poll handler b44_poll must not enable IRQ unconditionally
  ipv6: Ignore route option with ROUTER_PREF_INVALID
  bonding: make ab_arp select active slaves as other modes
  cfg80211: fix SME connect
  rc80211_minstrel: fix contention window calculation
  ssb/sdio: fix printk format warnings
  p54usb: add Zcomax XG-705A usbid
  ...
2009-09-17 20:53:52 -07:00
David Brownell
a4dbd6740d driver model: constify attribute groups
Let attribute group vectors be declared "const".  We'd
like to let most attribute metadata live in read-only
sections... this is a start.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-09-15 09:50:47 -07:00
Moni Shoua
75c78500dd bonding: remap muticast addresses without using dev_close() and dev_open()
This patch fixes commit e36b9d16c6. The approach
there is to call dev_close()/dev_open() whenever the device type is changed in
order to remap the device IP multicast addresses to HW multicast addresses.
This approach suffers from 2 drawbacks:

*. It assumes tha the device is UP when calling dev_close(), or otherwise
   dev_close() has no affect. It is worth to mention that initscripts (Redhat)
   and sysconfig (Suse) doesn't act the same in this matter. 
*. dev_close() has other side affects, like deleting entries from the routing
   table, which might be unnecessary.

The fix here is to directly remap the IP multicast addresses to HW multicast
addresses for a bonding device that changes its type, and nothing else.
   
Reported-by:   Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-15 02:37:40 -07:00