14833 Commits

Author SHA1 Message Date
David S. Miller 80bb3a00fa Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2010-03-25 11:48:58 -07:00
Eric Dumazet 8f59922914 netfilter: xt_hashlimit: IPV6 bugfix
A missing break statement in hashlimit_ipv6_mask(), and masks
between /64 and /95 are not working at all...

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-25 17:25:11 +01:00
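
To illustrate the kind of bug described in 8f59922914, here is a minimal user-space sketch of prefix masking with a switch statement. It is not the kernel's hashlimit_ipv6_mask(): the names mask_word and ipv6_apply_prefix are hypothetical, and the address is assumed to be four host-order 32-bit words rather than big-endian kernel types. The point is only that each case needs its own break; without the one marked below, a prefix in the third word would fall through into the code for the fourth word.

```c
#include <assert.h>
#include <stdint.h>

/* Keep the top `bits` bits of a 32-bit word (0 <= bits <= 31). */
static uint32_t mask_word(uint32_t word, unsigned int bits)
{
	assert(bits < 32);
	return bits ? word & ~(UINT32_MAX >> bits) : 0;
}

/*
 * Apply a prefix length (0..128) to an IPv6 address held as four
 * host-order words, most significant word first. Hypothetical
 * user-space stand-in, not the kernel function.
 */
static void ipv6_apply_prefix(uint32_t addr[4], unsigned int prefix)
{
	assert(prefix <= 128);

	switch (prefix / 32) {
	case 0:
		addr[0] = mask_word(addr[0], prefix);
		addr[1] = addr[2] = addr[3] = 0;
		break;
	case 1:
		addr[1] = mask_word(addr[1], prefix - 32);
		addr[2] = addr[3] = 0;
		break;
	case 2:
		addr[2] = mask_word(addr[2], prefix - 64);
		addr[3] = 0;
		break;	/* the kind of break the fix restores */
	case 3:
		addr[3] = mask_word(addr[3], prefix - 96);
		break;
	default:	/* prefix == 128: keep every bit */
		break;
	}
}
```

In this sketch, dropping the marked break would send a /72 prefix into case 3 with prefix - 96 wrapping around as an unsigned value, which the assertion in mask_word would catch.
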
Jozsef Kadlecsik 9c13886665 netfilter: ip6table_raw: fix table priority
The order of the IPv6 raw table is currently reversed, which makes it impossible
to use the NOTRACK target in IPv6: for example, if someone enters

ip6tables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK

and if we receive fragmented packets then the first fragment will be
untracked and thus skip nf_ct_frag6_gather (and conntrack), while all
subsequent fragments enter nf_ct_frag6_gather and reassembly will never
successfully be finished.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-25 11:17:26 +01:00
Eric Dumazet 55e0d7cf27 netfilter: xt_hashlimit: dl_seq_stop() fix
If dl_seq_start() memory allocation fails, we crash later in
dl_seq_stop(), trying to kfree(ERR_PTR(-ENOMEM))

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-25 11:00:22 +01:00
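
The crash described in 55e0d7cf27 comes from freeing a pointer that actually encodes an error code. A rough user-space imitation of that convention is sketched below. The helpers err_ptr, is_err and seq_stop_sketch are hypothetical names that mimic the kernel's ERR_PTR()/IS_ERR() idea; the MAX_ERRNO value of 4095 is assumed to match the kernel's, and the real fix may be structured differently.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095	/* assumption: same bound the kernel uses */

/* Encode a negative errno value as a pointer (imitation of ERR_PTR). */
static void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* True if the pointer lies in the top MAX_ERRNO addresses (imitation of IS_ERR). */
static bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical stop handler: only hand the pointer to free() when it
 * is a real allocation. Returns true if free() was called.
 */
static bool seq_stop_sketch(void *bucket)
{
	if (is_err(bucket))
		return false;	/* start failed; nothing to release */
	free(bucket);
	return true;
}
```

The design point is that the stop handler runs even when the start handler failed, so it has to tolerate whatever the start handler returned.
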
Dan Carpenter 9a127aad4d af_key: return error if pfkey_xfrm_policy2msg_prep() fails
The original code saved the error value but just returned 0 in the end.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Jamal Hadi Salim <hadi@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-24 13:28:27 -07:00
Vasu Dev f6b9f4b263 vlan: updates vlan real_num_tx_queues
Updates real_num_tx_queues in case underlying real device
has changed real_num_tx_queues.

-v2
 As per Eric Dumazet<eric.dumazet@gmail.com> comment:-
   -- adds BUG_ON to catch case of real_num_tx_queues exceeding num_tx_queues.
   -- created this self contained patch to just update real_num_tx_queues.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-24 11:11:38 -07:00
Vasu Dev 669d3e0bab vlan: adds vlan_dev_select_queue
This is required to correctly select the vlan tx queue for a driver
supporting multiple tx queues with ndo_select_queue implemented, since
the currently selected vlan tx queue is not aligned with the queue
selected by the real net_device's ndo_select_queue.

Unaligned vlan tx queue selection causes thrashing with higher vlan
tx lock contention, at least for FCoE traffic, and a wrong socket tx
queue_mapping for ixgbe, which has ndo_select_queue implemented.

-v2

As per comments from Eric Dumazet <eric.dumazet@gmail.com>, mirrored the
vlan net_device_ops so that they exist with and without
vlan_dev_select_queue, and select between them for a vlan net_device
according to whether the real dev has ndo_select_queue. This completely
skips calling vlan_dev_select_queue for a real net_device that does not
support ndo_select_queue.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-24 11:11:38 -07:00
Ben Blum 8e039d84b3 cgroups: net_cls as module
Allows the net_cls cgroup subsystem to be compiled as a module

This patch modifies net/sched/cls_cgroup.c to allow the net_cls subsystem
to be optionally compiled as a module instead of builtin.  The
cgroup_subsys struct is moved around a bit to allow the subsys_id to be
either declared as a compile-time constant by the cgroup_subsys.h include
in cgroup.h, or, if it's a module, initialized within the struct by
cgroup_load_subsys.

Signed-off-by: Ben Blum <bblum@andrew.cmu.edu>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-23 13:06:14 -07:00
Amerigo Wang 5fc05f8764 netpoll: warn when there are spaces in parameters
v2: update according to Frans' comments.

Currently, if we leave spaces before dst port,
netconsole will silently accept it as 0. Warn about this.

Also, when spaces appear in other places, make them
visible in error messages.

Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: David Miller <davem@davemloft.net>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-22 20:05:45 -07:00
Patrick McHardy ef1691504c netfilter: xt_recent: fix regression in rules using a zero hit_count
Commit 8ccb92ad (netfilter: xt_recent: fix false match) fixed supposedly
false matches in rules using a zero hit_count. As it turns out there is
nothing false about these matches and people are actually using entries
with a hit_count of zero to make rules dependent on addresses inserted
manually through /proc.

Since this slipped past the eyes of three reviewers, instead of
reverting the commit in question, this patch explicitly checks
for a hit_count of zero to make the intentions more clear.

Reported-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Tested-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Cc: stable@kernel.org
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-22 18:25:20 +01:00
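
The explicit zero check described in ef1691504c can be sketched as follows. This is a simplified user-space model, not the xt_recent code: recent_match, its parameters and the plain integer timestamps are all hypothetical, and the comparison against cutoff ignores the wraparound handling a kernel time comparison would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical match helper: `stamps` holds the times at which an
 * address was seen, `cutoff` is the oldest time that still counts,
 * and `hit_count` is the rule's threshold (0 means "no threshold").
 */
static bool recent_match(const unsigned long *stamps, size_t nstamps,
			 unsigned long cutoff, unsigned int hit_count)
{
	unsigned int hits = 0;
	size_t i;

	assert(stamps != NULL || nstamps == 0);

	for (i = 0; i < nstamps; i++) {
		if (stamps[i] < cutoff)
			continue;	/* too old to count */
		/* hit_count == 0: a single recent stamp is enough */
		if (hit_count == 0 || ++hits >= hit_count)
			return true;
	}
	return false;
}
```

Writing the zero case out explicitly, rather than relying on how the counter comparison happens to behave, is what makes the intent visible to a reviewer.
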
Linus Torvalds 258152acc0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (38 commits)
  ip_gre: include route header_len in max_headroom calculation
  if_tunnel.h: add missing asm/byteorder.h include
  ipv4: Don't drop redirected route cache entry unless PMTU actually expired
  net: suppress lockdep-RCU false positive in FIB trie.
  Bluetooth: Fix kernel crash on L2CAP stress tests
  Bluetooth: Convert debug files to actually use debugfs instead of sysfs
  Bluetooth: Fix potential bad memory access with sysfs files
  netfilter: ctnetlink: fix reliable event delivery if message building fails
  netlink: fix NETLINK_RECV_NO_ENOBUFS in netlink_set_err()
  NET_DMA: free skbs periodically
  netlink: fix unaligned access in nla_get_be64()
  tcp: Fix tcp_mark_head_lost() with packets == 0
  net: ipmr/ip6mr: fix potential out-of-bounds vif_table access
  KS8695: update ksp->next_rx_desc_read at the end of rx loop
  igb: Add support for 82576 ET2 Quad Port Server Adapter
  ixgbevf: Message formatting cleanups
  ixgbevf: Shorten up delay timer for watchdog task
  ixgbevf: Fix VF Stats accounting after reset
  ixgbe: Set IXGBE_RSC_CB(skb)->DMA field to zero after unmapping the address
  ixgbe: fix for real_num_tx_queues update issue
  ...
2010-03-22 10:01:58 -07:00
Tetsuo Handa c3824d21eb rxrpc: Check allocation failure.
alloc_skb() can return NULL.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-22 09:57:19 -07:00
Timo Teräs 243aad830e ip_gre: include route header_len in max_headroom calculation
Taking route's header_len into account, and updating gre device
needed_headroom will give better hints on upper bound of required
headroom. This is useful if the gre traffic is xfrm'ed.

Signed-off-by: Timo Teras <timo.teras@iki.fi>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-21 21:23:28 -07:00
Guenter Roeck 5e016cbf6c ipv4: Don't drop redirected route cache entry unless PMTU actually expired
TCP sessions over IPv4 can get stuck if routers between endpoints
do not fragment packets but implement PMTU instead, and we are using
those routers because of an ICMP redirect.

Setup is as follows

       MTU1    MTU2   MTU1
    A--------B------C------D

with MTU1 > MTU2. A and D are endpoints, B and C are routers. B and C
implement PMTU and drop packets larger than MTU2 (for example because
DF is set on all packets). TCP sessions are initiated between A and D.
There is packet loss between A and D, causing frequent TCP
retransmits.

After the number of retransmits on a TCP session reaches tcp_retries1,
tcp calls dst_negative_advice() prior to each retransmit. This results
in route cache entries for the peer to be deleted in
ipv4_negative_advice() if the Path MTU is set.

If the outstanding data on an affected TCP session is larger than
MTU2, packets sent from the endpoints will be dropped by B or C, and
ICMP NEEDFRAG will be returned. A and D receive NEEDFRAG messages and
update PMTU.

Before the next retransmit, tcp will again call dst_negative_advice(),
causing the route cache entry (with correct PMTU) to be deleted. The
retransmitted packet will be larger than MTU2, causing it to be
dropped again.

This sequence repeats until the TCP session aborts or is terminated.

The problem is fixed by removing redirected route cache entries in
ipv4_negative_advice() only if the PMTU is expired.

Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-21 20:55:13 -07:00
David S. Miller e3a61d47cc Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6 2010-03-21 18:03:11 -07:00
Paul E. McKenney 634a4b2038 net: suppress lockdep-RCU false positive in FIB trie.
Allow fib_find_node() to be called either under rcu_read_lock()
protection or with RTNL held.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-21 18:01:05 -07:00
Andrei Emeltchenko c2c77ec83b Bluetooth: Fix kernel crash on L2CAP stress tests
Add a very simple check that the req buffer has enough space to
fit the configuration parameters. This should be enough to reject
packets whose configuration size exceeds the req buffer.

Crash trace below

[ 6069.659393] Unable to handle kernel paging request at virtual address 02000205
[ 6069.673034] Internal error: Oops: 805 [#1] PREEMPT
...
[ 6069.727172] PC is at l2cap_add_conf_opt+0x70/0xf0 [l2cap]
[ 6069.732604] LR is at l2cap_recv_frame+0x1350/0x2e78 [l2cap]
...
[ 6070.030303] Backtrace:
[ 6070.032806] [<bf1c2880>] (l2cap_add_conf_opt+0x0/0xf0 [l2cap]) from
[<bf1c6624>] (l2cap_recv_frame+0x1350/0x2e78 [l2cap])
[ 6070.043823]  r8:dc5d3100 r7:df2a91d6 r6:00000001 r5:df2a8000 r4:00000200
[ 6070.050659] [<bf1c52d4>] (l2cap_recv_frame+0x0/0x2e78 [l2cap]) from
[<bf1c8408>] (l2cap_recv_acldata+0x2bc/0x350 [l2cap])
[ 6070.061798] [<bf1c814c>] (l2cap_recv_acldata+0x0/0x350 [l2cap]) from
[<bf0037a4>] (hci_rx_task+0x244/0x478 [bluetooth])
[ 6070.072631]  r6:dc647700 r5:00000001 r4:df2ab740
[ 6070.077362] [<bf003560>] (hci_rx_task+0x0/0x478 [bluetooth]) from
[<c006b9fc>] (tasklet_action+0x78/0xd8)
[ 6070.087005] [<c006b984>] (tasklet_action+0x0/0xd8) from [<c006c160>]

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Acked-by: Gustavo F. Padovan <gustavo@padovan.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-03-21 05:49:36 +01:00
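
The space check described in c2c77ec83b follows a common pattern: verify that an option fits before writing it. The sketch below shows that pattern in user space. The struct conf_buf, its 64-byte size, the type/length/value layout and the function conf_buf_add_opt are all hypothetical and are not taken from the L2CAP code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request buffer; not the kernel's L2CAP structures. */
struct conf_buf {
	uint8_t data[64];
	size_t used;
};

/*
 * Append one type/length/value option. Returns 0 on success, or -1
 * if the option would not fit, leaving the buffer untouched.
 */
static int conf_buf_add_opt(struct conf_buf *buf, uint8_t type,
			    const void *val, uint8_t len)
{
	size_t need = 2 + (size_t)len;	/* type byte + length byte + value */

	assert(buf->used <= sizeof(buf->data));
	if (need > sizeof(buf->data) - buf->used)
		return -1;	/* reject instead of writing past the end */

	buf->data[buf->used++] = type;
	buf->data[buf->used++] = len;
	memcpy(buf->data + buf->used, val, len);
	buf->used += len;
	return 0;
}
```

Comparing against the remaining space (size minus used) rather than adding to the used count avoids any overflow in the check itself.
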
Marcel Holtmann aef7d97cc6 Bluetooth: Convert debug files to actually use debugfs instead of sysfs
Some of the debug files wrongly ended up in sysfs because, at that point
in time, debugfs didn't exist. Convert these files to use debugfs and
also seq_file. This patch converts all of these files at once and then
removes the exported symbol for the Bluetooth sysfs class.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-03-21 05:49:35 +01:00
Marcel Holtmann 101545f6fe Bluetooth: Fix potential bad memory access with sysfs files
When creating a high number of Bluetooth sockets (L2CAP, SCO
and RFCOMM) it is possible to scribble repeatedly on arbitrary
pages of memory. Ensure that the content of these sysfs files is
always less than one page, even if this means truncating. The
files in question are scheduled to be moved over to debugfs in
the future anyway.

Based on initial patches from Neil Brown and Linus Torvalds

Reported-by: Neil Brown <neilb@suse.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2010-03-21 05:49:32 +01:00
Pablo Neira Ayuso 37b7ef7203 netfilter: ctnetlink: fix reliable event delivery if message building fails
This patch fixes a bug that can cause events to be lost when reliable
event delivery mode is used, i.e. if NETLINK_BROADCAST_SEND_ERROR
and NETLINK_RECV_NO_ENOBUFS socket options are set.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-20 14:29:03 -07:00
Pablo Neira Ayuso 1a50307ba1 netlink: fix NETLINK_RECV_NO_ENOBUFS in netlink_set_err()
Currently, ENOBUFS errors are reported to the socket via
netlink_set_err() even if NETLINK_RECV_NO_ENOBUFS is set. However,
that should not happen. This fixes this problem and it changes the
prototype of netlink_set_err() to return the number of sockets that
have set the NETLINK_RECV_NO_ENOBUFS socket option. This return
value is used in the next patch in this bugfix series.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-20 14:29:03 -07:00
Steven J. Magnani 73852e8151 NET_DMA: free skbs periodically
Under NET_DMA, data transfer can grind to a halt when userland issues a
large read on a socket with a high RCVLOWAT (i.e., 512 KB for both).
This appears to be because the NET_DMA design queues up lots of memcpy
operations, but doesn't issue or wait for them (and thus free the
associated skbs) until it is time for tcp_recvmsg() to return.
The socket hangs when its TCP window goes to zero before enough data is
available to satisfy the read.

Periodically issue asynchronous memcpy operations, and free skbs for ones
that have completed, to prevent sockets from going into zero-window mode.

Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-20 14:29:02 -07:00
Lennart Schulte 6830c25b7d tcp: Fix tcp_mark_head_lost() with packets == 0
A packet is marked as lost when packets == 0, although nothing should be done.
This results in a packet being retransmitted too early during recovery in some cases.
This small patch fixes this issue by returning immediately.

Signed-off-by: Lennart Schulte <lennart.schulte@nets.rwth-aachen.de>
Signed-off-by: Arnd Hannemann <hannemann@nets.rwth-aachen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-19 22:47:22 -07:00
Patrick McHardy a50436f2cd net: ipmr/ip6mr: fix potential out-of-bounds vif_table access
mfc_parent of cache entries is used to index into the vif_table and is
initialised from mfcctl->mfcc_parent. This can take values of up to 2^16-1,
while the vif_table has only MAXVIFS (32) entries. The same problem
affects ip6mr.

Refuse invalid values to fix a potential out-of-bounds access. Unlike
the other validity checks, this is checked in ipmr_mfc_add() instead of
the setsockopt handler since it's unused in the delete path and might be
uninitialized.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-19 22:47:22 -07:00
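
The range check described in a50436f2cd can be sketched like this. MAXVIFS is 32 as stated in the commit message; everything else (struct vif, mfc_add_sketch, passing the table as a parameter, and the -EINVAL return value) is a hypothetical simplification and may not match what the kernel actually does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAXVIFS 32	/* table size quoted in the commit message */

/* Hypothetical table entry; the real vif structure is much larger. */
struct vif {
	int in_use;
};

/*
 * Hypothetical add path: validate the 16-bit parent index before it
 * is ever used to index the table. The -EINVAL return is an assumption.
 */
static int mfc_add_sketch(struct vif *table, uint16_t parent)
{
	assert(table != NULL);

	if (parent >= MAXVIFS)
		return -EINVAL;	/* index would be out of bounds */

	table[parent].in_use = 1;
	return 0;
}
```

Placing the check in the add path, as the commit message explains, keeps the delete path from rejecting requests over a field it never reads.
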
stephen hemminger 97e3ecd112 TCP: check min TTL on received ICMP packets
This adds RFC5082 checks for TTL on received ICMP packets.
It adds some security against spoofed ICMP packets
disrupting GTSM protected sessions.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-19 21:00:42 -07:00