Simply replace proc_create and further data assigned with proc_create_data.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 0794935e "[NETFILTER]: nf_conntrack: optimize hash_conntrack()"
results in ARM platforms hashing uninitialised padding. This padding
doesn't exist on other architectures.
Fix this by replacing NF_CT_TUPLE_U_BLANK() with memset() to ensure
everything is initialised. There were only 4 bytes that
NF_CT_TUPLE_U_BLANK() wasn't clearing anyway (or 12 bytes on ARM).
Signed-off-by: Philip Craig <philipc@snapgear.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While reinjecting *bigger* modified versions of IPv6 packets using
libnetfilter_queue, things work fine on a 2.6.24 kernel (2.6.22 too)
but I get the following on recents kernels (2.6.25, trace below is
against today's net-2.6 git tree):
skb_over_panic: text:c04fddb0 len:696 put:632 head:f7592c00 data:f7592c00 tail:0xf7592eb8 end:0xf7592e80 dev:eth0
------------[ cut here ]------------
invalid opcode: 0000 [#1] PREEMPT
Process sendd (pid: 3657, ti=f6014000 task=f77c31d0 task.ti=f6014000)
Stack: c071e638 c04fddb0 000002b8 00000278 f7592c00 f7592c00 f7592eb8 f7592e80
f763c000 f6bc5200 f7592c40 f6015c34 c04cdbfc f6bc5200 00000278 f6015c60
c04fddb0 00000020 f72a10c0 f751b420 00000001 0000000a 000002b8 c065582c
Call Trace:
[<c04fddb0>] ? nfqnl_recv_verdict+0x1c0/0x2e0
[<c04cdbfc>] ? skb_put+0x3c/0x40
[<c04fddb0>] ? nfqnl_recv_verdict+0x1c0/0x2e0
[<c04fd115>] ? nfnetlink_rcv_msg+0xf5/0x160
[<c04fd03e>] ? nfnetlink_rcv_msg+0x1e/0x160
[<c04fd020>] ? nfnetlink_rcv_msg+0x0/0x160
[<c04f8ed7>] ? netlink_rcv_skb+0x77/0xa0
[<c04fcefc>] ? nfnetlink_rcv+0x1c/0x30
[<c04f8c73>] ? netlink_unicast+0x243/0x2b0
[<c04cfaba>] ? memcpy_fromiovec+0x4a/0x70
[<c04f9406>] ? netlink_sendmsg+0x1c6/0x270
[<c04c8244>] ? sock_sendmsg+0xc4/0xf0
[<c011970d>] ? set_next_entity+0x1d/0x50
[<c0133a80>] ? autoremove_wake_function+0x0/0x40
[<c0118f9e>] ? __wake_up_common+0x3e/0x70
[<c0342fbf>] ? n_tty_receive_buf+0x34f/0x1280
[<c011d308>] ? __wake_up+0x68/0x70
[<c02cea47>] ? copy_from_user+0x37/0x70
[<c04cfd7c>] ? verify_iovec+0x2c/0x90
[<c04c837a>] ? sys_sendmsg+0x10a/0x230
[<c011967a>] ? __dequeue_entity+0x2a/0xa0
[<c011970d>] ? set_next_entity+0x1d/0x50
[<c0345397>] ? pty_write+0x47/0x60
[<c033d59b>] ? tty_default_put_char+0x1b/0x20
[<c011d2e9>] ? __wake_up+0x49/0x70
[<c033df99>] ? tty_ldisc_deref+0x39/0x90
[<c033ff20>] ? tty_write+0x1a0/0x1b0
[<c04c93af>] ? sys_socketcall+0x7f/0x260
[<c0102ff9>] ? sysenter_past_esp+0x6a/0x91
[<c05f0000>] ? snd_intel8x0m_probe+0x270/0x6e0
=======================
Code: 00 00 89 5c 24 14 8b 98 9c 00 00 00 89 54 24 0c 89 5c 24 10 8b 40 50 89 4c 24 04 c7 04 24 38 e6 71 c0 89 44 24 08 e8 c4 46 c5 ff <0f> 0b eb fe 55 89 e5 56 89 d6 53 89 c3 83 ec 0c 8b 40 50 39 d0
EIP: [<c04ccdfc>] skb_over_panic+0x5c/0x60 SS:ESP 0068:f6015bf8
Looking at the code, I ended up in nfq_mangle() function (called by
nfqnl_recv_verdict()) which performs a call to skb_copy_expand() due to
the increased size of data passed to the function. AFAICT, it should ask
for 'diff' instead of 'diff - skb_tailroom(e->skb)'. Because the
resulting sk_buff has not enough space to support the skb_put(skb, diff)
call a few lines later, this results in the call to skb_over_panic().
The patch below asks for allocation of a copy with enough space for
mangled packet and the same amount of headroom as old sk_buff. While
looking at how the regression appeared (e2b58a67), I noticed the same
pattern in ipq_mangle_ipv6() and ipq_mangle_ipv4(). The patch corrects
those locations too.
Tested with bigger reinjected IPv6 packets (nfqnl_mangle() path), things
are ok (2.6.25 and today's net-2.6 git tree).
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Directly call IPv4 and IPv6 variants where the address family is
easily known.
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Connection tracking helpers (specifically FTP) need to be called
before NAT sequence numbers adjustments are performed to be able
to compare them against previously seen ones. We've introduced
two new hooks around 2.6.11 to maintain this ordering when NAT
modules were changed to get called from conntrack helpers directly.
The cost of netfilter hooks is quite high and sequence number
adjustments are only rarely needed however. Add a RCU-protected
sequence number adjustment function pointer and call it from
IPv4 conntrack after calling the helper.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Adding extensions to confirmed conntracks is not allowed to avoid races
on reallocation. Don't setup NAT for confirmed conntracks in case NAT
module is loaded late.
The has one side-effect, the connections existing before the NAT module
was loaded won't enter the bysource hash. The only case where this actually
makes a difference is in case of SNAT to a multirange where the IP before
NAT is also part of the range. Since old connections don't enter the
bysource hash the first new connection from the IP will have a new address
selected. This shouldn't matter at all.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Locally generated ICMP packets have a reference to the conntrack entry
of the original packet manually attached by icmp_send(). Therefore the
check for locally originated untracked ICMP redirects can never be
true.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Move responsibility for setting the IP_NAT_RANGE_PROTO_SPECIFIED flag
to the NAT protocol, properly propagate errors and get rid of ugly
return value convention.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Move to nf_nat_proto_common and rename to nf_nat_proto_... since they're
also used by protocols that don't have port numbers.
Signed-off-by: Patrick McHardy <kaber@trash.net>
The port rover should not get overwritten when using random mode,
otherwise other rules will also use more or less random ports.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Rule dumping is performed in two steps: first userspace gets the
ruleset size using getsockopt(SO_GET_INFO) and allocates memory,
then it calls getsockopt(SO_GET_ENTRIES) to actually dump the
ruleset. When another process changes the ruleset in between the
sizes from the first getsockopt call doesn't match anymore and
the kernel aborts. Unfortunately it returns EAGAIN, as for multiple
other possible errors, so userspace can't distinguish this case
from real errors.
Return EAGAIN so userspace can retry the operation.
Fixes (with current iptables SVN version) netfilter bugzilla #104.
Signed-off-by: Patrick McHardy <kaber@trash.net>