Commit Graph

233187 Commits

Author SHA1 Message Date
Kashyap, Desai
bcfe42e980 [SCSI] mptfusion: Fix Incorrect return value in mptscsih_dev_reset
There's a branch at the end of this function that
is supposed to normalize the return value with what
the mid-layer expects. In this one case, we get it wrong.

Also increase the verbosity of the INFO level printk
at the end of mptscsih_abort to include the actual return value
and the scmd->serial_number. The reason being success
or failure is actually determined by the state of
the internal tag list when a TMF is issued, and not the
return value of the TMF cmd. The serial_number is also
used in this decision, thus it's useful to know for debugging
purposes.

Cc: stable@kernel.org
Reported-by: Peter M. Petrakis <peter.petrakis@canonical.com>
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-02-12 12:51:08 -06:00
Kashyap, Desai
84857c8bf8 [SCSI] mptfusion: mptctl_release is required in mptctl.c
Added missing release callback for file_operations mptctl_fops.
Without release callback there will be never freed. It remains on
mptctl's eent list even after the file is closed and released.

Relavent RHEL bugzilla is 660871

Cc: stable@kernel.org
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-02-12 12:50:48 -06:00
Nicholas Bellinger
1f6fe7cba1 [SCSI] target: fix use after free detected by SLUB poison
This patch moves a large number of memory release paths inside of the
configfs callback target_core_hba_item_ops->release() called from
within fs/configfs/item.c: config_item_cleanup() context.  This patch
resolves the SLUB 'Poison overwritten' warnings.

Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-02-12 12:32:41 -06:00
Nicholas Bellinger
e89d15eead [SCSI] target: Remove procfs based target_core_mib.c code
This patch removes the legacy procfs based target_core_mib.c code,
and moves the necessary scsi_index_tables functions and defines into
target_core_transport.c and target_core_base.h code to allow existing
fabric independent statistics to function.

This includes the removal of a handful of 'atomic_t mib_ref_count'
counters used in struct se_node_acl, se_session and se_hba to prevent
removal while using seq_list procfs walking logic.

[jejb: fix up compile failures]
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-02-12 12:15:47 -06:00
Nicholas Bellinger
e63af95888 [SCSI] target: Fix SCF_SCSI_CONTROL_SG_IO_CDB breakage
This patch fixes a bug introduced during the v4 control CDB emulation
refactoring that broke SCF_SCSI_CONTROL_SG_IO_CDB operation within
transport_map_control_cmd_to_task().  It moves the BUG_ON() into
transport_do_se_mem_map() after the TRANSPORT(dev)->do_se_mem_map()
RAMDISK_DR special case, and adds the proper struct se_mem assignment
when !list_empty() for normal non RAMDISK_DR backend device cases.

Reported-by: Kai-Thorsten Hambrecht <kai@hambrecht.org>
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-02-12 12:01:42 -06:00
Nicholas Bellinger
7c2bf6e925 [SCSI] target: Fix top-level configfs_subsystem default_group shutdown breakage
This patch fixes two bugs uncovered during testing with
slub_debug=FPUZ during module_exit() -> target_core_exit_configfs()
with release of configfs subsystem consumer default groups, namely how
this should be working with
fs/configfs/dir.c:configfs_unregister_subsystem() release logic for
struct config_group->default_group.

The first issue involves configfs_unregister_subsystem() expecting to
walk+drain the top-level subsys->su_group.default_groups directly in
unlink_group(), and not directly from the configfs subsystem consumer
for the top level struct config_group->default_groups.  This patch
drops the walk+drain of subsys->su_group.default_groups from TCM
configfs subsystem consumer code, and moves the top-level
->default_groups kfree() after configfs_unregister_subsystem() has
been called.

The second issue involves calling
core_alua_free_lu_gp(se_global->default_lu_gp) to release the
default_lu_gp->lu_gp_group before configfs_unregister_subsystem() has
been called.  This patches also moves the core_alua_free_lu_gp() call
to release default_lu_group->lu_gp_group after the subsys has been
unregistered.

Finally, this patch explictly clears the
[lu_gp,alua,hba]_cg->default_groups pointers after kfree() to ensure
that no stale memory is picked up from child struct
config_group->default_group[] while configfs_unregister_subsystem() is
called.

Reported-by: Fubo Chen <fubo.chen@gmail.com>
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-02-12 11:39:14 -06:00
Fubo Chen
85dc98d93f [SCSI] target: fixed missing lock drop in error path
The struct se_node_acl->device_list_lock needs to be released if either
sanity check for struct se_dev_entry->se_lun_acl or deve->se_lun fails.

Signed-off-by: Fubo Chen <fubo.chen@gmail.com>
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-02-12 11:38:17 -06:00
Nicholas Bellinger
29fe609d12 [SCSI] target: Fix demo-mode MappedLUN shutdown UA/PR breakage
This patch fixes a bug in core_update_device_list_for_node() where
individual demo-mode generated MappedLUN's UA + Persistent
Reservations metadata where being leaked, instead of falling through
and calling existing core_scsi3_ua_release_all() and
core_scsi3_free_pr_reg_from_nacl() at the end of
core_update_device_list_for_node().

This bug would manifest itself with the following OOPs w/ TPG
demo-mode endpoints (tfo->tpg_check_demo_mode()=1), and PROUT
REGISTER+RESERVE -> explict struct se_session logout -> struct
se_device shutdown:

[  697.021139] LIO_iblock used greatest stack depth: 2704 bytes left
[  702.235017] general protection fault: 0000 [#1] SMP
[  702.235074] last sysfs file: /sys/devices/virtual/net/lo/operstate
[  704.372695] CPU 0
[  704.372725] Modules linked in: crc32c target_core_stgt scsi_tgt target_core_pscsi target_core_file target_core_iblock target_core_mod configfs sr_mod cdrom sd_mod ata_piix mptspi mptscsih libata mptbase [last unloaded: iscsi_target_mod]
[  704.375442]
[  704.375563] Pid: 4964, comm: tcm_node Not tainted 2.6.37+ #1 440BX Desktop Reference Platform/VMware Virtual Platform
[  704.375912] RIP: 0010:[<ffffffffa00aaa16>]  [<ffffffffa00aaa16>] __core_scsi3_complete_pro_release+0x31/0x133 [target_core_mod]
[  704.376017] RSP: 0018:ffff88001e5ffcb8  EFLAGS: 00010296
[  704.376017] RAX: 6d32335b1b0a0d0a RBX: ffff88001d952cb0 RCX: 0000000000000015
[  704.376017] RDX: ffff88001b428000 RSI: ffff88001da5a4c0 RDI: ffff88001e5ffcd8
[  704.376017] RBP: ffff88001e5ffd28 R08: ffff88001e5ffcd8 R09: ffff88001d952080
[  704.377116] R10: ffff88001dfc5480 R11: ffff88001df8abb0 R12: ffff88001d952cb0
[  704.377319] R13: 0000000000000000 R14: ffff88001df8abb0 R15: ffff88001b428000
[  704.377521] FS:  00007f033d15c6e0(0000) GS:ffff88001fa00000(0000) knlGS:0000000000000000
[  704.377861] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  704.378043] CR2: 00007fff09281510 CR3: 000000001e5db000 CR4: 00000000000006f0
[  704.378110] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  704.378110] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  704.378110] Process tcm_node (pid: 4964, threadinfo ffff88001e5fe000, task ffff88001d99c260)
[  704.378110] Stack:
[  704.378110]  ffffea0000678980 ffff88001da5a4c0 ffffea0000678980 ffff88001f402b00
[  704.378110]  ffff88001e5ffd08 ffffffff810ea236 ffff88001e5ffd18 0000000000000282
[  704.379772]  ffff88001d952080 ffff88001d952cb0 ffff88001d952cb0 ffff88001dc79010
[  704.380082] Call Trace:
[  704.380220]  [<ffffffff810ea236>] ? __slab_free+0x89/0x11c
[  704.380403]  [<ffffffffa00ab781>] core_scsi3_free_all_registrations+0x3e/0x157 [target_core_mod]
[  704.380479]  [<ffffffffa00a752b>] se_release_device_for_hba+0xa6/0xd8 [target_core_mod]
[  704.380479]  [<ffffffffa00a7598>] se_free_virtual_device+0x3b/0x45 [target_core_mod]
[  704.383750]  [<ffffffffa00a3177>] target_core_drop_subdev+0x13a/0x18d [target_core_mod]
[  704.384068]  [<ffffffffa00960db>] client_drop_item+0x25/0x31 [configfs]
[  704.384263]  [<ffffffffa00967b5>] configfs_rmdir+0x1a1/0x223 [configfs]
[  704.384459]  [<ffffffff810fa8cd>] vfs_rmdir+0x7e/0xd3
[  704.384631]  [<ffffffff810fc3be>] do_rmdir+0xa3/0xf4
[  704.384895]  [<ffffffff810eed15>] ? filp_close+0x67/0x72
[  704.386485]  [<ffffffff810fc446>] sys_rmdir+0x11/0x13
[  704.387893]  [<ffffffff81002a92>] system_call_fastpath+0x16/0x1b
[  704.388083] Code: 4c 8d 45 b0 41 56 49 89 d7 41 55 41 89 cd 41 54 b9 15 00 00 00 53 48 89 fb 48 83 ec 48 4c 89 c7 48 89 75 98 48 8b 86 28 01 00 00 <48> 8b 80 90 01 00 00 48 89 45 a0 31 c0 f3 aa c7 45 ac 00 00 00
[  704.388763] RIP  [<ffffffffa00aaa16>] __core_scsi3_complete_pro_release+0x31/0x133 [target_core_mod]
[  704.389142]  RSP <ffff88001e5ffcb8>
[  704.389572] ---[ end trace 2a3614f3cd6261a5 ]---

Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-02-12 11:37:29 -06:00
Nicholas Bellinger
bc66552476 [SCSI] target/iblock: Fix failed bd claim NULL pointer dereference
This patch adds an explict check for struct iblock_dev->ibd_bd in
iblock_free_device() before calling blkdev_put(), which will otherwise hit
the following NULL pointer dereference @ ib_dev->ibd_bd when iblock_create_virtdevice()
fails to claim an already in-use struct block_device via blkdev_get_by_path().

[  112.528578] Target_Core_ConfigFS: Allocated struct se_subsystem_dev: ffff88001e750000 se_dev_su_ptr: ffff88001dd05d70
[  112.534681] Target_Core_ConfigFS: Calling t->free_device() for se_dev_su_ptr: ffff88001dd05d70
[  112.535029] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[  112.535029] IP: [<ffffffff814987a3>] mutex_lock+0x14/0x35
[  112.535029] PGD 1e5d0067 PUD 1e274067 PMD 0
[  112.535029] Oops: 0002 [#1] SMP
[  112.535029] last sysfs file: /sys/devices/pci0000:00/0000:00:07.1/host2/target2:0:0/2:0:0:0/type
[  112.535029] CPU 0
[  112.535029] Modules linked in: iscsi_target_mod target_core_stgt scsi_tgt target_core_pscsi target_core_file target_core_iblock target_core_mod configfs sr_mod cdrom sd_mod ata_piix mptspi mptscsih libata mptbase [last unloaded: scsi_wait_scan]
[  112.535029]
[  112.535029] Pid: 3345, comm: python2.5 Not tainted 2.6.37+ #1 440BX Desktop Reference Platform/VMware Virtual Platform
[  112.535029] RIP: 0010:[<ffffffff814987a3>]  [<ffffffff814987a3>] mutex_lock+0x14/0x35
[  112.535029] RSP: 0018:ffff88001e6d7d58  EFLAGS: 00010246
[  112.535029] RAX: 0000000000000000 RBX: 0000000000000020 RCX: 0000000000000082
[  112.535029] RDX: ffff88001e6d7fd8 RSI: 0000000000000083 RDI: 0000000000000020
[  112.535029] RBP: ffff88001e6d7d68 R08: 0000000000000000 R09: 0000000000000000
[  112.535029] R10: ffff8800000be860 R11: ffff88001f420000 R12: 0000000000000020
[  112.535029] R13: 0000000000000083 R14: ffff88001d809430 R15: ffff88001d8094f8
[  112.535029] FS:  00007ff17ca7d6e0(0000) GS:ffff88001fa00000(0000) knlGS:0000000000000000
[  112.535029] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  112.535029] CR2: 0000000000000020 CR3: 000000001e5d2000 CR4: 00000000000006f0
[  112.535029] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  112.535029] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  112.535029] Process python2.5 (pid: 3345, threadinfo ffff88001e6d6000, task ffff88001e2d0760)
[  112.535029] Stack:
[  112.535029]  ffff88001e6d7d88 0000000000000000 ffff88001e6d7d98 ffffffff811187fc
[  112.535029]  ffff88001d809430 ffff88001dd05d70 ffff88001e750860 ffff88001e750000
[  112.535029]  ffff88001e6d7db8 ffffffffa00e3757 ffff88001e6d7db8 0000000000000004
[  112.535029] Call Trace:
[  112.535029]  [<ffffffff811187fc>] blkdev_put+0x28/0x107
[  112.535029]  [<ffffffffa00e3757>] iblock_free_device+0x1d/0x36 [target_core_iblock]
[  112.535029]  [<ffffffffa00a319c>] target_core_drop_subdev+0x15f/0x18d [target_core_mod]
[  112.535029]  [<ffffffffa00960db>] client_drop_item+0x25/0x31 [configfs]
[  112.535029]  [<ffffffffa00967b5>] configfs_rmdir+0x1a1/0x223 [configfs]
[  112.535029]  [<ffffffff810fa8cd>] vfs_rmdir+0x7e/0xd3
[  112.535029]  [<ffffffff810fc3be>] do_rmdir+0xa3/0xf4
[  112.535029]  [<ffffffff810fc446>] sys_rmdir+0x11/0x13
[  112.535029]  [<ffffffff81002a92>] system_call_fastpath+0x16/0x1b
[  112.535029] Code: 8b 04 25 88 b5 00 00 48 2d d8 1f 00 00 48 89 43 18 31 c0 5e 5b c9 c3 55 48 89 e5 53 48 89 fb 48 83 ec 08 e8 c4 f7 ff ff 48 89 df <3e> ff 0f 79 05 e8 1e ff ff ff 65 48 8b 04 25 88 b5 00 00 48 2d
[  112.535029] RIP  [<ffffffff814987a3>] mutex_lock+0x14/0x35
[  112.535029]  RSP <ffff88001e6d7d58>
[  112.535029] CR2: 0000000000000020
[  132.679636] ---[ end trace 05754bb48eb828f0 ]---

Note it also adds an second explict check for ib_dev->ibd_bio_set before calling
bioset_free() to fix the same possible NULL pointer deference during an early
iblock_create_virtdevice() failure.

Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-02-12 11:37:00 -06:00
Dan Carpenter
3ae279d259 [SCSI] target: iblock/pscsi claim checking for NULL instead of IS_ERR
blkdev_get_by_path() returns an ERR_PTR() or error and it doesn't return
a NULL.  It looks like this bug would be easy to trigger by mistake.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-02-12 11:29:07 -06:00
Darrick J. Wong
a361cc0025 [SCSI] scsi_debug: Fix 32-bit overflow in do_device_access causing memory corruption
If I create a scsi_debug device that is larger than 4GB, the multiplication of
(block * scsi_debug_sector_size) can produce a 64-bit value.  Unfortunately,
the compiler sees two 32-bit quantities and performs a 32-bit multiplication,
thus truncating the bits above 2^32.  This causes the wrong memory location to
be read or written.  Change block and rest to be unsigned long long.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-02-12 11:21:56 -06:00
Madhuranath Iyengar
044d78e1ac [SCSI] qla2xxx: Change from irq to irqsave with host_lock
Make the driver safer by using irqsave/irqrestore with host_lock.

Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-02-12 10:52:40 -06:00
James Bottomley
563585ec4b [SCSI] qla2xxx: Fix race that could hang kthread_stop()
There is a small race window in qla2x00_do_dpc() between
checking for kthread_should_stop() and going to sleep after
setting TASK_INTERRUPTIBLE. If qla2x00_free_device() is called
in this window, kthread_stop will wait forever because there
will be no one to wake up the process.

Fix by making sure we only set TASK_INTERRUPTIBLE before checking
kthread_stop().

Reported-by: Bandan Das <bandan.das@stratus.com>
Acked-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-02-12 10:17:13 -06:00
Linus Torvalds
3c6c0d6ca3 Merge branch 'kvm-updates/2.6.38' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.38' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: SVM: Make sure KERNEL_GS_BASE is valid when loading gs_index
2011-02-11 16:30:09 -08:00
Linus Torvalds
5b49378ec1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  amd64_edac: Fix DIMMs per DCTs output
2011-02-11 16:30:05 -08:00
Linus Torvalds
d40b0c3482 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm:
  dlm: use single thread workqueues
2011-02-11 16:29:57 -08:00
Linus Torvalds
3aec46c1e0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: don't always drop malformed replies on the floor (try #3)
  cifs: clean up checks in cifs_echo_request
  [CIFS] Do not send SMBEcho requests on new sockets until SMBNegotiate
2011-02-11 16:29:50 -08:00
Linus Torvalds
68c3d4b266 Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging:
  hwmon: (emc1403) Fix I2C address range
  hwmon: (lm63) Consider LM64 temperature offset
2011-02-11 16:16:25 -08:00
Linus Torvalds
f7909fb835 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  pci: use security_capable() when checking capablities during config space read
  security: add cred argument to security_capable()
  tpm_tis: Use timeouts returned from TPM
2011-02-11 16:16:03 -08:00
Linus Torvalds
c41d40b533 Merge branch 's5p-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung
* 's5p-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  ARM: SAMSUNG: Ensure struct sys_device is declared in plat/pm.h
  ARM: S5PV310: Cleanup System MMU
  ARM: S5PV310: Add support System MMU on SMDKV310
2011-02-11 16:15:15 -08:00
Linus Torvalds
a288465fa8 Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze
* 'next' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: Fix msr instruction detection
  microblaze: Fix pte_update function
  microblaze: Fix asm compilation warning
  microblaze: Fix IRQ flag handling for MSR=0
2011-02-11 16:13:53 -08:00
Julia Lawall
80d02d2736 drivers/w1/masters/omap_hdq.c: add missing clk_put
This code makes two calls to clk_get, then test both return values and
fails if either failed.

The problem is that in the first inner if, where the first call to
clk_get has failed, it don't know if the second call has failed as well.
So it don't know whether clk_get should be called on the result of the
second call.  Of course, it would be possible to test that value again.
A simpler solution is just to test the result of calling clk_get
directly after each call.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r@
position p1,p2;
expression e;
statement S;
@@

e = clk_get@p1(...)
...
if@p2 (IS_ERR(e)) S

@@
expression e;
statement S;
identifier l;
position r.p1, p2 != r.p2;
@@

*e = clk_get@p1(...)
... when != clk_put(e)
*if@p2 (...)
{
  ... when != clk_put(e)
* return ...;
}// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Amit Kucheria <amit.kucheria@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-11 16:12:20 -08:00
KAMEZAWA Hiroyuki
678ff896a3 memcg: fix leak of accounting at failure path of hugepage collapsing
mem_cgroup_uncharge_page() should be called in all failure cases after
mem_cgroup_charge_newpage() is called in huge_memory.c::collapse_huge_page()

 [ 4209.076861] BUG: Bad page state in process khugepaged  pfn:1e9800
 [ 4209.077601] page:ffffea0006b14000 count:0 mapcount:0 mapping:          (null) index:0x2800
 [ 4209.078674] page flags: 0x40000000004000(head)
 [ 4209.079294] pc:ffff880214a30000 pc->flags:2146246697418756 pc->mem_cgroup:ffffc9000177a000
 [ 4209.082177] (/A)
 [ 4209.082500] Pid: 31, comm: khugepaged Not tainted 2.6.38-rc3-mm1 #1
 [ 4209.083412] Call Trace:
 [ 4209.083678]  [<ffffffff810f4454>] ? bad_page+0xe4/0x140
 [ 4209.084240]  [<ffffffff810f53e6>] ? free_pages_prepare+0xd6/0x120
 [ 4209.084837]  [<ffffffff8155621d>] ? rwsem_down_failed_common+0xbd/0x150
 [ 4209.085509]  [<ffffffff810f5462>] ? __free_pages_ok+0x32/0xe0
 [ 4209.086110]  [<ffffffff810f552b>] ? free_compound_page+0x1b/0x20
 [ 4209.086699]  [<ffffffff810fad6c>] ? __put_compound_page+0x1c/0x30
 [ 4209.087333]  [<ffffffff810fae1d>] ? put_compound_page+0x4d/0x200
 [ 4209.087935]  [<ffffffff810fb015>] ? put_page+0x45/0x50
 [ 4209.097361]  [<ffffffff8113f779>] ? khugepaged+0x9e9/0x1430
 [ 4209.098364]  [<ffffffff8107c870>] ? autoremove_wake_function+0x0/0x40
 [ 4209.099121]  [<ffffffff8113ed90>] ? khugepaged+0x0/0x1430
 [ 4209.099780]  [<ffffffff8107c236>] ? kthread+0x96/0xa0
 [ 4209.100452]  [<ffffffff8100dda4>] ? kernel_thread_helper+0x4/0x10
 [ 4209.101214]  [<ffffffff8107c1a0>] ? kthread+0x0/0xa0
 [ 4209.101842]  [<ffffffff8100dda0>] ? kernel_thread_helper+0x0/0x10

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-11 16:12:20 -08:00
Johannes Weiner
f0fdc5e8e6 vmscan: fix zone shrinking exit when scan work is done
Commit 3e7d344970 ("mm: vmscan: reclaim order-0 and use compaction
instead of lumpy reclaim") introduced an indefinite loop in
shrink_zone().

It meant to break out of this loop when no pages had been reclaimed and
not a single page was even scanned.  The way it would detect the latter
is by taking a snapshot of sc->nr_scanned at the beginning of the
function and comparing it against the new sc->nr_scanned after the scan
loop.  But it would re-iterate without updating that snapshot, looping
forever if sc->nr_scanned changed at least once since shrink_zone() was
invoked.

This is not the sole condition that would exit that loop, but it
requires other processes to change the zone state, as the reclaimer that
is stuck obviously can not anymore.

This is only happening for higher-order allocations, where reclaim is
run back to back with compaction.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Michal Hocko <mhocko@suse.cz>
Tested-by: Kent Overstreet<kent.overstreet@gmail.com>
Reported-by: Kent Overstreet <kent.overstreet@gmail.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-11 16:12:20 -08:00
Michel Lespinasse
419d8c96db mlock: do not munlock pages in __do_fault()
If the page is going to be written to, __do_page needs to break COW.

However, the old page (before breaking COW) was never mapped mapped into
the current pte (__do_fault is only called when the pte is not present),
so vmscan can't have marked the old page as PageMlocked due to being
mapped in __do_fault's VMA.  Therefore, __do_fault() does not need to
worry about clearing PageMlocked() on the old page.

Signed-off-by: Michel Lespinasse <walken@google.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-11 16:12:20 -08:00