Commit Graph

65508 Commits

Author SHA1 Message Date
Eric W. Biederman ce286d3273 [NET]: Implement network device movement between namespaces
This patch introduces NETIF_F_NETNS_LOCAL a flag to indicate
a network device is local to a single network namespace and
should never be moved.  Useful for pseudo devices that we
need an instance in each network namespace (like the loopback
device) and for any device we find that cannot handle multiple
network namespaces so we may trap them in the initial network
namespace.

This patch introduces the function dev_change_net_namespace
a function used to move a network device from one network
namespace to another.  To the network device nothing
special appears to happen, to the components of the network
stack it appears as if the network device was unregistered
in the network namespace it is in, and a new device
was registered in the network namespace the device
was moved to.

This patch sets up a namespace device destructor that
upon the exit of a network namespace moves all of the
movable network devices  to the initial network namespace
so they are not lost.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:12 -07:00
Eric W. Biederman b267b17964 [NET]: Factor out __dev_alloc_name from dev_alloc_name
When forcibly changing the network namespace of a device
I need something that can generate a name for the device
in the new namespace without overwriting the old name.

__dev_alloc_name provides me that functionality.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:11 -07:00
Eric W. Biederman 881d966b48 [NET]: Make the device list and device lookups per namespace.
This patch makes most of the generic device layer network
namespace safe.  This patch makes dev_base_head a
network namespace variable, and then it picks up
a few associated variables.  The functions:
dev_getbyhwaddr
dev_getfirsthwbytype
dev_get_by_flags
dev_get_by_name
__dev_get_by_name
dev_get_by_index
__dev_get_by_index
dev_ioctl
dev_ethtool
dev_load
wireless_process_ioctl

were modified to take a network namespace argument, and
deal with it.

vlan_ioctl_set and brioctl_set were modified so their
hooks will receive a network namespace argument.

So basically anthing in the core of the network stack that was
affected to by the change of dev_base was modified to handle
multiple network namespaces.  The rest of the network stack was
simply modified to explicitly use &init_net the initial network
namespace.  This can be fixed when those components of the network
stack are modified to handle multiple network namespaces.

For now the ifindex generator is left global.

Fundametally ifindex numbers are per namespace, or else
we will have corner case problems with migration when
we get that far.

At the same time there are assumptions in the network stack
that the ifindex of a network device won't change.  Making
the ifindex number global seems a good compromise until
the network stack can cope with ifindex changes when
you change namespaces, and the like.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:10 -07:00
Eric W. Biederman b4b510290b [NET]: Support multiple network namespaces with netlink
Each netlink socket will live in exactly one network namespace,
this includes the controlling kernel sockets.

This patch updates all of the existing netlink protocols
to only support the initial network namespace.  Request
by clients in other namespaces will get -ECONREFUSED.
As they would if the kernel did not have the support for
that netlink protocol compiled in.

As each netlink protocol is updated to be multiple network
namespace safe it can register multiple kernel sockets
to acquire a presence in the rest of the network namespaces.

The implementation in af_netlink is a simple filter implementation
at hash table insertion and hash table look up time.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:09 -07:00
Eric W. Biederman e9dc865340 [NET]: Make device event notification network namespace safe
Every user of the network device notifiers is either a protocol
stack or a pseudo device.  If a protocol stack that does not have
support for multiple network namespaces receives an event for a
device that is not in the initial network namespace it quite possibly
can get confused and do the wrong thing.

To avoid problems until all of the protocol stacks are converted
this patch modifies all netdev event handlers to ignore events on
devices that are not in the initial network namespace.

As the rest of the code is made network namespace aware these
checks can be removed.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:09 -07:00
Eric W. Biederman e730c15519 [NET]: Make packet reception network namespace safe
This patch modifies every packet receive function
registered with dev_add_pack() to drop packets if they
are not from the initial network namespace.

This should ensure that the various network stacks do
not receive packets in a anything but the initial network
namespace until the code has been converted and is ready
for them.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:08 -07:00
Eric W. Biederman 6d34b1c27a [NET]: Initialize the network namespace of network devices.
Except for carefully selected pseudo devices all network
interfaces should start out in the initial network namespace.
Ultimately it will be register_netdev that examines what
dev->nd_net is set to and places a device in a network namespace.

This patch modifies alloc_netdev to initialize the network
namespace a device is in with the initial network namespace.
This gets it right for the vast majority of devices so their
drivers need not be modified and for those few pseudo devices
that need something different they can change this parameter
before calling register_netdevice.

The network namespace parameter on a network device is not
reference counted as the devices are inside of a network namespace
and cannot remain in that namespace past the lifetime of the
network namespace.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:07 -07:00
Eric W. Biederman 1b8d7ae42d [NET]: Make socket creation namespace safe.
This patch passes in the namespace a new socket should be created in
and has the socket code do the appropriate reference counting.  By
virtue of this all socket create methods are touched.  In addition
the socket create methods are modified so that they will fail if
you attempt to create a socket in a non-default network namespace.

Failing if we attempt to create a socket outside of the default
network namespace ensures that as we incrementally make the network stack
network namespace aware we will not export functionality that someone
has not audited and made certain is network namespace safe.
Allowing us to partially enable network namespaces before all of the
exotic protocols are supported.

Any protocol layers I have missed will fail to compile because I now
pass an extra parameter into the socket creation code.

[ Integrated AF_IUCV build fixes from Andrew Morton... -DaveM ]

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:07 -07:00
Eric W. Biederman 457c4cbc5a [NET]: Make /proc/net per network namespace
This patch makes /proc/net per network namespace.  It modifies the global
variables proc_net and proc_net_stat to be per network namespace.
The proc_net file helpers are modified to take a network namespace argument,
and all of their callers are fixed to pass &init_net for that argument.
This ensures that all of the /proc/net files are only visible and
usable in the initial network namespace until the code behind them
has been updated to be handle multiple network namespaces.

Making /proc/net per namespace is necessary as at least some files
in /proc/net depend upon the set of network devices which is per
network namespace, and even more files in /proc/net have contents
that are relevant to a single network namespace.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:06 -07:00
Eric W. Biederman 07feaebfcc [NET]: Add a network namespace parameter to struct sock
Sockets need to get a reference to their network namespace,
or possibly a simple hold if someone registers on the network
namespace notifier and will free the sockets when the namespace
is going to be destroyed.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:05 -07:00
Eric W. Biederman 4a1c537113 [NET]: Add a network namespace tag to struct net_device
Please note that network devices do not increase the count
count on the network namespace.  The are inside the network
namespace and so the network namespace tag is in the nature
of a back pointer and so getting and putting the network namespace
is unnecessary.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:04 -07:00
Eric W. Biederman 772698f636 [NET]: Add a network namespace parameter to tasks
This is the network namespace from which all which all sockets
and anything else under user control ultimately get their network
namespace parameters.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:04 -07:00
Eric W. Biederman 5f256becd8 [NET]: Basic network namespace infrastructure.
This is the basic infrastructure needed to support network
namespaces.  This infrastructure is:
- Registration functions to support initializing per network
  namespace data when a network namespaces is created or destroyed.

- struct net.  The network namespace data structure.
  This structure will grow as variables are made per network
  namespace but this is the minimal starting point.

- Functions to grab a reference to the network namespace.
  I provide both get/put functions that keep a network namespace
  from being freed.  And hold/release functions serve as weak references
  and will warn if their count is not zero when the data structure
  is freed.  Useful for dealing with more complicated data structures
  like the ipv4 route cache.

- A list of all of the network namespaces so we can iterate over them.

- A slab for the network namespace data structure allowing leaks
  to be spotted.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:03 -07:00
Eric W. Biederman 32da477a5b [NET]: Don't implement dev_ifname32 inline
The current implementation of dev_ifname makes maintenance difficult
because updates to the implementation of the ioctl have to made in two
places.  So this patch updates dev_ifname32 to do a classic 32/64
structure conversion and call sys_ioctl like the rest of the
compat calls do.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:03 -07:00
Eric W. Biederman 890d52d3f1 [ATALK]: In notifier handlers convert the void pointer to a netdevice
This slightly improves code safety and clarity.

Later network namespace patches touch this code so this is a
preliminary cleanup.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:02 -07:00
Joy Latten ab5f5e8b14 [XFRM]: xfrm audit calls
This patch modifies the current ipsec audit layer
by breaking it up into purpose driven audit calls.

So far, the only audit calls made are when add/delete
an SA/policy. It had been discussed to give each
key manager it's own calls to do this, but I found
there to be much redundnacy since they did the exact
same things, except for how they got auid and sid, so I
combined them. The below audit calls can be made by any
key manager. Hopefully, this is ok.

Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:02 -07:00
John Heffner d2e9117c7a [NET]: Change type of owner in sock_lock_t to int, rename
The type of owner in sock_lock_t is currently (struct sock_iocb *),
presumably for historical reasons.  It is never used as this type, only
tested as NULL or set to (void *)1.  For clarity, this changes it to type
int, and renames to owned, to avoid any possible type casting errors.

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:01 -07:00
John Heffner 02b3d34631 [NET] Cleanup: Use sock_owned_by_user() macro
Changes asserts in sunrpc to use sock_owned_by_user() macro instead of
referencing sock_lock.owner directly.

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:00 -07:00
Andy Gospodarek ab0049b4a2 [TG3]: remove sparse warnings
Removed sparse warnings from tg3 driver.  The new logic seems fine (I
don't immediately see where we are running over values for any of the
variables that need to be saved).

This patch compiles fine and I'm currently using a tg3 with the patched
driver to post this patch as a basic proof of concept.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:49:00 -07:00
Stephen Hemminger 50f17787e9 [AF_PACKET]: Don't enable global timestamps.
Andi mentioned he did something like this already, but never submitted
it.

The dhcp client application uses AF_PACKET with a packet filter to
receive data. The application doesn't even use timestamps, but because
the AF_PACKET API has timestamps, they get turned on globally which
causes an expensive time of day lookup for every packet received on
any system that uses the standard DHCP client.

The fix is to not enable the timestamp (but use if if available).
This causes the time lookup to only occur on those packets that are
destined for the AF_PACKET socket.  The timestamping occurs after
packet filtering so all packets dropped by filtering to not cause a
clock call.

The one downside of this a a few microseconds additional delay added
from the normal timestamping location (netif_rx) until the receive
callback in AF_PACKET. But since the offset is fairly consistent it
should not upset applications that do want really use timestamps, like
wireshark.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:59 -07:00
Micah Gruber c726187225 [DCCP]: Remove unneeded pointer newdp from dccp_v4_request_recv_sock()
This trivial patch removes the unneeded pointer newdp, which is never used.

Signed-off-by: Micah Gruber <micah.gruber@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:59 -07:00
Micah Gruber 1dfcae7765 [IPV6]: Remove unneeded pointer iph from ipcomp6_input() in net/ipv6/ipcomp6.c
This trivial patch removes the unneeded pointer iph, which is never used.

Signed-off-by: Micah Gruber <micah.gruber@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:58 -07:00
Johannes Berg c9ee23dfac [MAC80211]: make assoc_ap a flag
The sta_info.assoc_ap value is used as a flag, move it
into flags.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:58 -07:00
Johannes Berg 52aa944a18 [MAC80211]: remove hostapd interface stuff
This removes some definitions that are used only within ioctls
that will never make it into mainline.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:57 -07:00
Johannes Berg 8dc06a1c61 [MAC80211]: improve key selection comment
When I changed the code there I forgot to mention what happens
with multicast frames in a regular BSS and keep wondering myself
if the code is correct. Add appropriate comments.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:48:56 -07:00