Add __percpu sparse annotations to net.
These annotations are to make sparse consider percpu variables to be
in a different address space and warn if accessed without going
through percpu accessors. This patch doesn't affect normal builds.
The macro and type tricks around snmp stats make things a bit
interesting. DEFINE/DECLARE_SNMP_STAT() macros mark the target field
as __percpu and SNMP_UPD_PO_STATS() macro is updated accordingly. All
snmp_mib_*() users which used to cast the argument to (void **) are
updated to cast it to (void __percpu **).
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
DCCP is datagram-oriented but lacks UDP's support for MSG_TRUNC as defined in
recvmsg(2)/recv(2). Hence the following 'Hello world\0' receiver
len = recv(fd, buf, 10, MSG_PEEK | MSG_TRUNC);
wrongly (always) returns 10, while in UDP it returns 12 as expected.
This patch adds the missing MSG_TRUNC support to recvmsg().
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a problem in the DCCP getsockopt() API: currently there is no way
for a user to a priori know the number of built-in CCIDs, other than trying
DCCP_SOCKOPT_AVAILABLE_CCIDS in a loop, incrementing the option length until
EINVAL is no longer returned.
This patch truncates the array to the user-provided length. No copy is made
when the length is <= 0.
Due to the length restriction in do_dccp_getsockopt() to sizeof(int), the
minimum array length remains 4, which is a reasonable default (only 3
CCIDs, CCID-2..4, are currently defined).
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes commit (38ff3e6bb9) ("dccp_probe:
Fix module load dependencies between dccp and dccp_probe", from 15 Jan).
It fixes the construction of the first argument of try_then_request_module(),
where only valid return codes from the first argument should be returned.
What we do now is assign the result of register_jprobe() to ret, without
the side effect of the comparison.
Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a bug introduced in commit de4ef86cfc
("dccp: fix dccp rmmod when kernel configured to use slub", 17 Jan): the
vsnprintf used sizeof(slab_name_fmt), which became truncated to 4 bytes, since
slab_name_fmt is now a 4-byte pointer and no longer a 32-character array.
This lead to error messages such as
FATAL: Error inserting dccp: No buffer space available
>> kernel: [ 1456.341501] kmem_cache_create: duplicate cache cci
generated due to the truncation after the 3rd character.
Fixed for the moment by introducing a symbolic constant. Tested to fix the bug.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hey all-
I was tinkering with dccp recently and noticed that I BUG halted the
kernel when I rmmod-ed the dccp module. The bug halt occured because the page
that I passed to kfree failed the PageCompound and PageSlab test in the slub
implementation of kfree. I tracked the problem down to the following set of
events:
1) dccp, unlike all other uses of kmem_cache_create, allocates a string
dynamically when registering a slab cache. This allocated string is freed when
the cache is destroyed.
2) Normally, (1) is not an issue, but when Slub is in use, it is possible that
caches are 'merged'. This process causes multiple caches of simmilar
configuration to use the same cache data structure. When this happens, the new
name of the cache is effectively dropped.
3) (2) results in kmem_cache_name returning an ambigous value (i.e.
ccid_kmem_cache_destroy, which uses this fuction to retrieve the name pointer
for freeing), is no longer guaranteed that the string it assigned is what is
returned.
4) If such merge event occurs, ccid_kmem_cache_destroy frees the wrong pointer,
which trips over the BUG in the slub implementation of kfree (since its likely
not a slab allocation, but rather a pointer into the static string table
section.
So, what to do about this. At first blush this is pretty clearly a leak in the
information that slub owns, and as such a slub bug. Unfortunately, theres no
really good way to fix it, without exposing slub specific implementation details
to the generic slab interface. Also, even if we could fix this in slub cleanly,
I think the RCU free option would force us to do lots of string duplication, not
only in slub, but in every slab allocator. As such, I'd like to propose this
solution. Basically, I just move the storage for the kmem cache name to the
ccid_operations structure. In so doing, we don't have to do the kstrdup or
kfree when we allocate/free the various caches for dccp, and so we avoid the
problem, by storing names with static memory, rather than heap, the way all
other calls to kmem_cache_create do.
I've tested this out myself here, and it solves the problem quite well.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
__net_init/__net_exit are apparently not going away, so use them
to full extent.
In some cases __net_init was removed, because it was called from
__net_exit code.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This was just recently reported to me. When built as modules, the
dccp_probe module has a silent dependency on the dccp module. This
stems from the fact that the module_init routine of dccp_probe
registers a jprobe on the dccp_sendmsg symbol. Since the symbol is
only referenced as a text string (the .symbol_name field in the jprobe
struct) rather than the address of the symbol itself, depmod never
picks this dependency up, and so if you load the dccp_probe module
without the dccp module loaded, the register_jprobe call fails with an
-EINVAL, and the whole module load fails.
The fix is pretty easy, we can just wrap the register_jprobe call in a
try_then_request_module call, which forces the dependency to get
satisfied prior to the probe registration.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rename kfifo_put... into kfifo_in... to prevent miss use of old non in
kernel-tree drivers
ditto for kfifo_get... -> kfifo_out...
Improve the prototypes of kfifo_in and kfifo_out to make the kerneldoc
annotations more readable.
Add mini "howto porting to the new API" in kfifo.h
Signed-off-by: Stefani Seibold <stefani@seibold.net>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move the pointer to the spinlock out of struct kfifo. Most users in
tree do not actually use a spinlock, so the few exceptions now have to
call kfifo_{get,put}_locked, which takes an extra argument to a
spinlock.
Signed-off-by: Stefani Seibold <stefani@seibold.net>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a new generic kernel FIFO implementation.
The current kernel fifo API is not very widely used, because it has to
many constrains. Only 17 files in the current 2.6.31-rc5 used it.
FIFO's are like list's a very basic thing and a kfifo API which handles
the most use case would save a lot of development time and memory
resources.
I think this are the reasons why kfifo is not in use:
- The API is to simple, important functions are missing
- A fifo can be only allocated dynamically
- There is a requirement of a spinlock whether you need it or not
- There is no support for data records inside a fifo
So I decided to extend the kfifo in a more generic way without blowing up
the API to much. The new API has the following benefits:
- Generic usage: For kernel internal use and/or device driver.
- Provide an API for the most use case.
- Slim API: The whole API provides 25 functions.
- Linux style habit.
- DECLARE_KFIFO, DEFINE_KFIFO and INIT_KFIFO Macros
- Direct copy_to_user from the fifo and copy_from_user into the fifo.
- The kfifo itself is an in place member of the using data structure, this save an
indirection access and does not waste the kernel allocator.
- Lockless access: if only one reader and one writer is active on the fifo,
which is the common use case, no additional locking is necessary.
- Remove spinlock - give the user the freedom of choice what kind of locking to use if
one is required.
- Ability to handle records. Three type of records are supported:
- Variable length records between 0-255 bytes, with a record size
field of 1 bytes.
- Variable length records between 0-65535 bytes, with a record size
field of 2 bytes.
- Fixed size records, which no record size field.
- Preserve memory resource.
- Performance!
- Easy to use!
This patch:
Since most users want to have the kfifo as part of another object,
reorganize the code to allow including struct kfifo in another data
structure. This requires changing the kfifo_alloc and kfifo_init
prototypes so that we pass an existing kfifo pointer into them. This
patch changes the implementation and all existing users.
[akpm@linux-foundation.org: fix warning]
Signed-off-by: Stefani Seibold <stefani@seibold.net>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
First patch changes __inet_hash_nolisten() and __inet6_hash()
to get a timewait parameter to be able to unhash it from ehash
at same time the new socket is inserted in hash.
This makes sure timewait socket wont be found by a concurrent
writer in __inet_check_established()
Reported-by: kapil dakhane <kdakhane@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1815 commits)
mac80211: fix reorder buffer release
iwmc3200wifi: Enable wimax core through module parameter
iwmc3200wifi: Add wifi-wimax coexistence mode as a module parameter
iwmc3200wifi: Coex table command does not expect a response
iwmc3200wifi: Update wiwi priority table
iwlwifi: driver version track kernel version
iwlwifi: indicate uCode type when fail dump error/event log
iwl3945: remove duplicated event logging code
b43: fix two warnings
ipw2100: fix rebooting hang with driver loaded
cfg80211: indent regulatory messages with spaces
iwmc3200wifi: fix NULL pointer dereference in pmkid update
mac80211: Fix TX status reporting for injected data frames
ath9k: enable 2GHz band only if the device supports it
airo: Fix integer overflow warning
rt2x00: Fix padding bug on L2PAD devices.
WE: Fix set events not propagated
b43legacy: avoid PPC fault during resume
b43: avoid PPC fault during resume
tcp: fix a timewait refcnt race
...
Fix up conflicts due to sysctl cleanups (dead sysctl_check code and
CTL_UNNUMBERED removed) in
kernel/sysctl_check.c
net/ipv4/sysctl_net_ipv4.c
net/ipv6/addrconf.c
net/sctp/sysctl.c
Add optional function parameters associated with sending SYNACK.
These parameters are not needed after sending SYNACK, and are not
used for retransmission. Avoids extending struct tcp_request_sock,
and avoids allocating kernel memory.
Also affects DCCP as it uses common struct request_sock_ops,
but this parameter is currently reserved for future use.
Signed-off-by: William.Allen.Simpson@gmail.com
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that sys_sysctl is a compatiblity wrapper around /proc/sys
all sysctl strategy routines, and all ctl_name and strategy
entries in the sysctl tables are unused, and can be
revmoed.
In addition neigh_sysctl_register has been modified to no longer
take a strategy argument and it's callers have been modified not
to pass one.
Cc: "David Miller" <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
struct can_proto had a capability field which wasn't ever used. It is
dropped entirely.
struct inet_protosw had a capability field which can be more clearly
expressed in the code by just checking if sock->type = SOCK_RAW.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dst_negative_advice() should check for changed dst and reset
sk_tx_queue_mapping accordingly. Pass sock to the callers of
dst_negative_advice.
(sk_reset_txq is defined just for use by dst_negative_advice. The
only way I could find to get around this is to move dst_negative_()
from dst.h to dst.c, include sock.h in dst.c, etc)
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to have better cache layouts of struct sock (separate zones
for rx/tx paths), we need this preliminary patch.
Goal is to transfert fields used at lookup time in the first
read-mostly cache line (inside struct sock_common) and move sk_refcnt
to a separate cache line (only written by rx path)
This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr,
sport and id fields. This allows a future patch to define these
fields as macros, like sk_refcnt, without name clashes.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Storing the mask (size - 1) instead of the size allows fast path to be
a bit faster.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Might as well use the ipv6_addr_set_v4mapped() inline we created last
year.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>