There is a race between bonding_store_slaves_active() and the slave
manipulation functions: the bond_for_each_slave() walk in
bonding_store_slaves_active() is not protected by any synchronization
mechanism, so a NULL pointer dereference is easy to trigger.
Fix this by acquiring bond->lock for the slave walk.
v2: Make description text < 75 columns
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The module can be loaded with arp_ip_target="255.255.255.255", which
makes that target impossible to remove because the function in sysfs
checks for that value. Make the parameter checks consistent with sysfs.
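One way to keep two input paths consistent is to route both through a single predicate. The following is a minimal user-space sketch, not the bonding driver's code: the name ip_target_ok() is hypothetical, and rejecting 0.0.0.0 in addition to the broadcast address is an assumption made for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>

/*
 * Hypothetical shared check: both the module-parameter path and the
 * sysfs path would call this, so a value accepted by one is never
 * rejected by the other.
 */
bool ip_target_ok(const char *text)
{
	struct in_addr addr;

	/* Reject anything that is not a dotted-quad IPv4 address. */
	if (inet_pton(AF_INET, text, &addr) != 1)
		return false;

	/* Reject 255.255.255.255, the value named in the description. */
	if (addr.s_addr == htonl(INADDR_BROADCAST))
		return false;

	/* Assumption for this sketch: 0.0.0.0 is not a usable target. */
	if (addr.s_addr == htonl(INADDR_ANY))
		return false;

	return true;
}
```

With this shape, ip_target_ok("255.255.255.255") is false on both paths, while an ordinary address such as "192.168.1.1" is accepted.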
v2: Fix formatting
v3: Make description text < 75 columns
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
First, three observations that are used later.
Observation 1: if (delayed_work_pending(wq)) cancel_delayed_work(wq)
This usage is wrong because the pending bit is cleared just before the
work's fn is executed and if the function re-arms itself we might end up
with the work still running. It's safe to call cancel_delayed_work_sync()
even if the work is not queued at all.
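The interleaving behind Observation 1 can be stepped through with a small single-threaded model. This is only a sketch under stated assumptions: struct model_work and all four functions are hypothetical stand-ins, not the workqueue API, and cancel_sync() merely imitates "wait for the function, then clear the pending state".

```c
#include <stdbool.h>

/* Hypothetical model of a delayed work whose function re-arms itself. */
struct model_work {
	bool pending;	/* queued, function not started yet */
	bool running;	/* function currently executing */
};

/* Worker, step 1: the pending bit is cleared just before fn runs. */
void worker_begin(struct model_work *w)
{
	w->pending = false;
	w->running = true;
}

/* Worker, step 2: fn re-arms the work and returns. */
void worker_finish_and_rearm(struct model_work *w)
{
	w->running = false;
	w->pending = true;
}

/* The racy pattern: cancel only when the work looks pending. */
bool cancel_if_pending(struct model_work *w)
{
	if (!w->pending)
		return false;	/* fn may be running right now */
	w->pending = false;
	return true;
}

/* Model of an unconditional, synchronous cancel. */
void cancel_sync(struct model_work *w)
{
	if (w->running)
		worker_finish_and_rearm(w);	/* "wait" for fn to finish */
	w->pending = false;			/* then drop the re-armed work */
}
```

If worker_begin() runs first, cancel_if_pending() returns false and the later re-arm leaves the work pending again; cancel_sync() in the same position leaves it neither pending nor running.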
Observation 2: Use of INIT_DELAYED_WORK()
Work needs to be initialized only once prior to (de/en)queueing.
Observation 3: IFF_UP is set only after ndo_open is called
Related race conditions:
1. Race between bonding_store_miimon() and bonding_store_arp_interval()
Because of Obs.1 we can end up having both works enqueued.
2. Multiple races with INIT_DELAYED_WORK()
Since the works are not protected by anything between INIT_DELAYED_WORK()
and calls to (en/de)queue it is possible for races between the following
functions:
(races are also possible between the calls to INIT_DELAYED_WORK()
and workqueue code)
bonding_store_miimon() - bonding_store_arp_interval(), bond_close(),
bond_open(), enqueued functions
bonding_store_arp_interval() - bonding_store_miimon(), bond_close(),
bond_open(), enqueued functions
3. By Obs.1 we need to change bond_cancel_all()
Bugs 1 and 2 are fixed by moving all work initializations into
bond_open(). By Obs. 2 and Obs. 3, and because all works are cancelled
in bond_close(), no work can be enqueued at the time bond_open()
initializes them.
The RTNL lock is now also acquired in bonding_store_miimon() and
bonding_store_arp_interval(), so they can't race with bond_close() and
bond_open(). The opposing work is cancelled only if the IFF_UP flag is
set, and in that case it is cancelled unconditionally (no
delayed_work_pending() check). If the interface is down, the opposing
work has already been cancelled, so there is no need to cancel it
again. This way we don't need new synchronization for the bonding
workqueue. These bugs (and fixes) are tied together and belong in the
same patch.
Note: I have left 1 line intentionally over 80 characters (84) because I
didn't like how it looks broken down. If you'd prefer it otherwise,
then simply break it.
v2: Make description text < 75 columns
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
"Some more fixes trickled in over the past few days:
1) PIM device names can overflow the IFNAMSIZ buffer unless we
properly limit the allowed indexes, fix from Eric Dumazet.
2) Under heavy load we can OOPS in icmp reply processing due to an
unchecked inet_putpeer() call. Fix from Neal Cardwell.
3) SCTP round trip calculations need to use 64-bit math to avoid
overflows, fix from Schoch Christian.
4) Fix a memory leak and an error return flub in SCTP and IRDA
triggerable by userspace. Fix from Tommi Rantala and found by the
syscall fuzzer (trinity).
5) MLX4 driver gives bogus size to memcpy() call, fix from Amir
Vadai.
6) Fix length calculation in VHOST descriptor translation, from
Michael S Tsirkin.
7) Ambassador ATM driver loops forever while loading firmware, fix
from Dan Carpenter.
8) Over MTU packets in openvswitch warn about wrong device, fix from
Jesse Gross.
9) Netfilter IPSET's netlink code can overrun a string buffer because
it's not properly limited to IFNAMSIZ. Fix from Florian Westphal.
10) PCAN USB driver sets wrong timestamp in SKB, from Oliver Hartkopp.
11) Make sure the RX ifindex always has a valid value in the CAN BCM
driver, even if we haven't received a frame yet. Fix also from
Oliver Hartkopp."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
team: fix hw_features setup
atm: forever loop loading ambassador firmware
vhost: fix length for cross region descriptor
irda: irttp: fix memory leak in irttp_open_tsap() error path
net: qmi_wwan: add Huawei E173
net/mlx4_en: Can set maxrate only for TC0
sctp: Error in calculation of RTTvar
sctp: fix -ENOMEM result with invalid user space pointer in sendto() syscall
sctp: fix memory leak in sctp_datamsg_from_user() when copy from user space fails
net: ipmr: limit MRT_TABLE identifiers
ipv4: avoid passing NULL to inet_putpeer() in icmpv4_xrlim_allow()
can: bcm: initialize ifindex for timeouts without previous frame reception
can: peak_usb: fix hwtstamp assignment
netfilter: ipset: fix netiface set name overflow
openvswitch: Store flow key len if ARP opcode is not request or reply.
openvswitch: Print device when warning about over MTU packets.
Do this in the same way bonding does. This fixed setup resolves performance
issues when using some cards with certain offloading.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
There was a forever loop introduced here when we converted this to
request_firmware() back in 2008.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a single descriptor crosses a region boundary, the length of the
second chunk should be reduced by the size translated so far.
Instead, it includes the full descriptor length.
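The intended arithmetic can be sketched in user space as follows. The types and names here (struct region, struct chunk, find_region(), split_desc()) are hypothetical and are not the vhost data structures; the point is only that each chunk is capped by the bytes still remaining, not by the full descriptor length.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory region and output chunk, for illustration only. */
struct region { uint64_t start; uint64_t size; };
struct chunk  { uint64_t addr;  uint64_t len;  };

/* Return the region containing addr, or NULL if there is none. */
const struct region *find_region(const struct region *regs, size_t n,
				 uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= regs[i].start && addr - regs[i].start < regs[i].size)
			return &regs[i];
	return NULL;
}

/* Split [addr, addr + len) into per-region chunks; -1 on failure. */
int split_desc(const struct region *regs, size_t nregs,
	       uint64_t addr, uint64_t len, struct chunk *out, int max_out)
{
	uint64_t done = 0;	/* bytes translated so far */
	int n = 0;

	while (done < len) {
		const struct region *r = find_region(regs, nregs, addr);
		uint64_t avail, remaining;

		if (r == NULL || n >= max_out)
			return -1;
		avail = r->size - (addr - r->start);
		remaining = len - done;	/* not the full len */
		out[n].addr = addr;
		out[n].len = remaining < avail ? remaining : avail;
		done += out[n].len;
		addr += out[n].len;
		n++;
	}
	return n;
}
```

For two adjacent 0x1000-byte regions starting at 0x1000, a descriptor at 0x1f00 with length 0x200 yields two chunks of 0x100 bytes each; using the full length for the second chunk would have produced 0x200 there.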
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clean up the memory we allocated earlier in irttp_open_tsap() when we hit
this error path. The leak goes back to at least 1da177e4
("Linux-2.6.12-rc2").
Discovered with Trinity (the syscall fuzzer).
Signed-off-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Huawei E173 is a QMI/wwan device which normally appears
as 12d1:1436 in Linux. The descriptors displayed in that
mode will be picked up by cdc_ether. But the modem has
another mode with a different device ID and a slightly
different set of descriptors. This is the mode used by
Windows like this:
3Modem: USB\VID_12D1&PID_140C&MI_00\6&3A1D2012&0&0000
Networkcard: USB\VID_12D1&PID_140C&MI_01\6&3A1D2012&0&0001
Appli.Inter: USB\VID_12D1&PID_140C&MI_02\6&3A1D2012&0&0002
PC UI Inter: USB\VID_12D1&PID_140C&MI_03\6&3A1D2012&0&0003
Reported-by: Thomas Schäfer <tschaefer@t-online.de>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
The calculation of RTTVAR involves the subtraction of two unsigned
numbers, which may cause a rollover and result in very high values of
RTTVAR when RTT > SRTT.
With this patch it is possible to set RTOmin = 1 to get the minimum of
RTO at 4 times the clock granularity.
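The rollover is easy to reproduce outside the kernel. The sketch below uses hypothetical helper names and fixed-width user-space types; it shows only the |SRTT - RTT| term, not the full RTTVAR update.

```c
#include <stdint.h>

/* Problematic shape: the difference is taken in unsigned arithmetic. */
uint32_t rtt_diff_unsigned(uint32_t srtt, uint32_t rtt)
{
	return srtt - rtt;	/* wraps around when rtt > srtt */
}

/* Safer shape: widen to signed 64-bit, then take the absolute value. */
uint32_t rtt_diff_signed64(uint32_t srtt, uint32_t rtt)
{
	int64_t d = (int64_t)srtt - (int64_t)rtt;

	return (uint32_t)(d < 0 ? -d : d);
}
```

With srtt = 100 and rtt = 150, the unsigned version returns 4294967246 while the 64-bit version returns 50.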
Change Notes:
v2)
*Replaced abs() by abs64() and long by __s64, changed patch
description.
Signed-off-by: Christian Schoch <e0326715@student.tuwien.ac.at>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: Sridhar Samudrala <sri@us.ibm.com>
CC: Neil Horman <nhorman@tuxdriver.com>
CC: linux-sctp@vger.kernel.org
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Consider the following program, that sets the second argument to the
sendto() syscall incorrectly:
  #include <string.h>
  #include <arpa/inet.h>
  #include <sys/socket.h>

  int main(void)
  {
      int fd;
      struct sockaddr_in sa;

      fd = socket(AF_INET, SOCK_STREAM, 132 /*IPPROTO_SCTP*/);
      if (fd < 0)
          return 1;

      memset(&sa, 0, sizeof(sa));
      sa.sin_family = AF_INET;
      sa.sin_addr.s_addr = inet_addr("127.0.0.1");
      sa.sin_port = htons(11111);

      sendto(fd, NULL, 1, 0, (struct sockaddr *)&sa, sizeof(sa));

      return 0;
  }
We get -ENOMEM:
$ strace -e sendto ./demo
sendto(3, NULL, 1, 0, {sa_family=AF_INET, sin_port=htons(11111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ENOMEM (Cannot allocate memory)
Propagate the error code from sctp_user_addto_chunk(), so that we will
tell user space what actually went wrong:
$ strace -e sendto ./demo
sendto(3, NULL, 1, 0, {sa_family=AF_INET, sin_port=htons(11111), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EFAULT (Bad address)
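The difference between the two strace results comes down to whether the caller forwards the helper's error code or substitutes its own. A minimal user-space model, with hypothetical function names standing in for the real SCTP code paths:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a user copy: -EFAULT on a bad pointer. */
int model_copy_from_user(void *dst, const void *src, size_t len)
{
	if (src == NULL)
		return -EFAULT;
	memcpy(dst, src, len);
	return 0;
}

/* Old shape: every failure is reported as -ENOMEM. */
int build_msg_old(char *buf, const void *user, size_t len)
{
	if (model_copy_from_user(buf, user, len) < 0)
		return -ENOMEM;
	return 0;
}

/* New shape: hand the helper's error code back unchanged. */
int build_msg_new(char *buf, const void *user, size_t len)
{
	int err = model_copy_from_user(buf, user, len);

	if (err < 0)
		return err;
	return 0;
}
```

Called with a NULL source pointer, build_msg_old() returns -ENOMEM and build_msg_new() returns -EFAULT, mirroring the two outputs above.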
Noticed while running Trinity (the syscall fuzzer).
Signed-off-by: Tommi Rantala <tt.rantala@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trinity (the syscall fuzzer) discovered a memory leak in SCTP,
reproducible e.g. with the sendto() syscall by passing an invalid
user space pointer in the second argument:
  #include <string.h>
  #include <arpa/inet.h>
  #include <sys/socket.h>

  int main(void)
  {
      int fd;
      struct sockaddr_in sa;

      fd = socket(AF_INET, SOCK_STREAM, 132 /*IPPROTO_SCTP*/);
      if (fd < 0)
          return 1;

      memset(&sa, 0, sizeof(sa));
      sa.sin_family = AF_INET;
      sa.sin_addr.s_addr = inet_addr("127.0.0.1");
      sa.sin_port = htons(11111);

      sendto(fd, NULL, 1, 0, (struct sockaddr *)&sa, sizeof(sa));

      return 0;
  }
As far as I can tell, the leak has been around since ~2003.
Signed-off-by: Tommi Rantala <tt.rantala@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use synchronize_sched_expedited() instead of synchronize_sched()
to improve mount speed.
This patch improves mount time from 0.500s to 0.013s for Jeff's
test-case.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-and-tested-by: Jeff Chua <jeff.chua.linux@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull media fixes from Mauro Carvalho Chehab:
"For some media fixes:
- dvb_usb_v2: some fixes at the core
- Some fixes on some embedded drivers: soc_camera, adv7604, omap3isp,
exynos/s5p
- Several Exynos4/5 camera fixes
- a fix at stv0900 driver
- a few USB ID additions to detect more variants of rtl28xxu-based
sticks"
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (25 commits)
[media] rtl28xxu: 0ccd:00d7 TerraTec Cinergy T Stick+
[media] rtl28xxu: 1d19:1102 Dexatek DK mini DVB-T Dongle
[media] mt9v022: fix the V4L2_CID_EXPOSURE control
[media] mx2_camera: fix missing unlock on error in mx2_start_streaming()
[media] media: omap1_camera: fix const cropping related warnings
[media] media: mx1_camera: use the default .set_crop() implementation
[media] media: mx2_camera: fix const cropping related warnings
[media] media: mx3_camera: fix const cropping related warnings
[media] media: pxa_camera: fix const cropping related warnings
[media] media: sh_mobile_ceu_camera: fix const cropping related warnings
[media] media: sh_vou: fix const cropping related warnings
[media] adv7604: restart STDI once if format is not found
[media] adv7604: use presets where possible
[media] adv7604: Replace prim_mode by mode
[media] adv7604: cleanup references
[media] dvb_usb_v2: switch interruptible mutex to normal
[media] dvb_usb_v2: fix pid_filter callback error logging
[media] exynos-gsc: change driver compatible string
[media] omap3isp: Fix warning caused by bad subdev events operations prototypes
[media] omap3isp: video: Fix warning caused by bad vidioc_s_crop prototype
...
Merge misc fixes from Andrew Morton:
"8 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (8 patches)
futex: avoid wake_futex() for a PI futex_q
watchdog: using u64 in get_sample_period()
writeback: put unused inodes to LRU after writeback completion
mm: vmscan: check for fatal signals iff the process was throttled
Revert "mm: remove __GFP_NO_KSWAPD"
proc: check vma->vm_file before dereferencing
UAPI: strip the _UAPI prefix from header guards during header installation
include/linux/bug.h: fix sparse warning related to BUILD_BUG_ON_INVALID
Pull TTY fix from Greg Kroah-Hartman:
"Here is a single fix for a reported regression in 3.7-rc5 for the tty
layer. This fix has been in the linux-next tree and solves the
reported problem.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'tty-3.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty vt: Fix a regression in command line edition
Pull MFD fixes from Samuel Ortiz:
- A twl fix preventing a buffer overflow.
- A wm5102 register patch fix.
- A wm5110 error misreport fix.
- Arizona fixes: Use the right array size when adding subdevices,
correctly report underclocked events, synchronize register cache
after reset.
- A twl4030 fix that prevents the system from hanging on an interrupt
flood.
* tag 'mfd-for-linus-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
mfd: twl4030: Fix chained irq handling on resume from suspend
mfd: arizona: Sync regcache after reset
mfd: arizona: Correctly report when AIF2/AIF1 is underclocked
mfd: arizona: Use correct array for ARRAY_SIZE in mfd_add_devices call
mfd: wm5110: Disable control interface error report for WM5110 rev B
mfd: wm5102: Update register patch for latest evaluation
mfd: twl-core: Fix chip ID for the twl6030-pwm module
Pull ARM fixes from Russell King:
"Not much here, just a couple minor/cosmetic fixes and a patch for the
decompressor which fixes problems with modern GCC and CPUs."
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7583/1: decompressor: Enable unaligned memory access for v6 and above
ARM: 7572/1: proc-v6.S: fix comment
ARM: 7570/1: quiet down the non make -s output
Pull ext3 regression fix from Jan Kara:
"Fix an ext3 regression introduced during 3.7 merge window. It leads
to deadlock if you stress the filesystem in the right way (luckily
only if blocksize < pagesize)."
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
jbd: Fix lock ordering bug in journal_unmap_buffer()
Dave Jones reported a bug with futex_lock_pi() that his trinity test
exposed. Sometime between queue_me() and taking the q.lock_ptr, the
lock_ptr became NULL, resulting in a crash.
While futex_wake() is careful to not call wake_futex() on futex_q's with
a pi_state or an rt_waiter (which are either waiting for a
futex_unlock_pi() or a PI futex_requeue()), futex_wake_op() and
futex_requeue() do not perform the same test.
Update futex_wake_op() and futex_requeue() to test for q.pi_state and
q.rt_waiter and abort with -EINVAL if detected. To ensure any future
breakage is caught, add a WARN() to wake_futex() if the same condition
is true.
This fix has seen 3 hours of testing with "trinity -c futex" on an
x86_64 VM with 4 CPUs.
[akpm@linux-foundation.org: tidy up the WARN()]
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Reported-by: Dave Jones <davej@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: John Kacur <jkacur@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In get_sample_period(), unsigned long is not wide enough on 32-bit
systems to hold:
watchdog_thresh * 2 * (NSEC_PER_SEC / 5)
case1:
watchdog_thresh is 10 by default, so the sample value is 0xEE6B2800
case2:
if watchdog_thresh is set to 20, the sample value is 0x1 DCD6 5000
In case2 we need to use u64 to express the sample period. Otherwise,
changing the threshold through proc often does not succeed.
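The two cases can be checked with fixed-width types in user space. This sketch uses uint32_t to stand in for a 32-bit unsigned long; the function names and the MODEL_NSEC_PER_SEC macro are hypothetical, chosen only for this illustration.

```c
#include <stdint.h>

#define MODEL_NSEC_PER_SEC 1000000000u	/* nanoseconds per second */

/* 32-bit arithmetic: the product is reduced modulo 2^32. */
uint32_t sample_period_u32(uint32_t thresh)
{
	return thresh * 2u * (MODEL_NSEC_PER_SEC / 5u);
}

/* 64-bit arithmetic: widen before multiplying, so nothing is lost. */
uint64_t sample_period_u64(uint32_t thresh)
{
	return (uint64_t)thresh * 2u * (MODEL_NSEC_PER_SEC / 5u);
}
```

A threshold of 10 gives 0xEE6B2800 in both versions. A threshold of 20 gives 0x1DCD65000 in 64 bits but only 0xDCD65000 in 32 bits, because the top bit is dropped.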
Signed-off-by: liu chuansheng <chuansheng.liu@intel.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 169ebd9013 ("writeback: Avoid iput() from flusher thread")
removed the iget-iput pair from inode writeback. As a side effect, inodes
that are dirty during the iput_final() call will never be added to the
inode LRU (iput_final() doesn't add dirty inodes to the LRU, and later,
when the inode is cleaned, there is no one to add the inode there). Thus
inodes are effectively unreclaimable until someone looks them up again.
The practical effect of this bug is limited by the fact that inodes are
pinned by a dentry for long enough that the inode gets cleaned. But
still the bug can have nasty consequences leading up to OOM conditions
under certain circumstances. The following easily reproduces the
problem:
  for (( i = 0; i < 1000; i++ )); do
      mkdir $i
      for (( j = 0; j < 1000; j++ )); do
          touch $i/$j
          echo 2 > /proc/sys/vm/drop_caches
      done
  done
Afterwards, one needs to run 'sync; ls -lR' to make the inodes
reclaimable again.
We fix the issue by inserting unused clean inodes into the LRU after
writeback finishes in inode_sync_complete().
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: <stable@vger.kernel.org> [3.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>