Commit Graph

399502 Commits

Author SHA1 Message Date
Mark Tinguely
2900a579ab xfs: add the inode directory type support to XFS_IOC_FSGEOM
Add the inode type directory type support to XFS_IOC_FSGEOM
so that xfs_repair/xfs_info knows if the superblock v4 filesystem
enabled the feature.

Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-10-08 14:28:09 -05:00
Ben Myers
d948709b8e xfs: remove usage of is_bad_inode
XFS never calls mark_inode_bad or iget_failed, so it will never see a
bad inode.  Remove all checks for is_bad_inode because they are
unnecessary.

Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2013-10-01 17:38:16 -05:00
Jie Liu
17ec81c15f xfs: fix the wrong new_size/rnew_size at xfs_iext_realloc_direct()
At xfs_iext_realloc_direct(), the new_size is changed by adding
if_bytes if originally the extent records are stored at the inline
extent buffer, and we have to switch from it to a direct extent
list for those new allocated extents, this is wrong. e.g,

Create a file with three extents which was showing as following,

xfs_io -f -c "truncate 100m" /xfs/testme

for i in $(seq 0 5 10); do
	offset=$(($i * $((1 << 20))))
	xfs_io -c "pwrite $offset 1m" /xfs/testme
done

Inline
------
irec:	if_bytes	bytes_diff	new_size
1st	0		16		16
2nd	16		16		32

Switching
---------						rnew_size
3rd	32		16		48 + 32 = 80	roundup=128

In this case, the desired value of new_size should be 48, and then
it will be roundup to 64 and be assigned to rnew_size.

However, this issue has been covered by resetting the if_bytes to
the new_size which is calculated at the begnning of xfs_iext_add()
before leaving out this function, and in turn make the rnew_size
correctly again. Hence, this can not be detected via xfstestes.

This patch fix above problem and revise the new_size comments at
xfs_iext_realloc_direct() to make it more readable.  Also, fix the
comments while switching from the inline extent buffer to a direct
extent list to reflect this change.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-10-01 17:33:10 -05:00
Jie Liu
0799a3e808 xfs: get rid of count from xfs_iomap_write_allocate()
Get rid of function variable count from xfs_iomap_write_allocate() as
it is unused.

Additionally, checkpatch warn me of the following for this change:
WARNING: extern prototypes should be avoided in .h files
+extern int xfs_iomap_write_allocate(struct xfs_inode *, xfs_off_t,

So this patch also remove all extern function prototypes at xfs_iomap.h
to suppress it to make this code style in consistent manner in this file.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-10-01 15:42:34 -05:00
Thierry Reding
aaaae98022 xfs: Use kmem_free() instead of free()
This fixes a build failure caused by calling the free() function which
does not exist in the Linux kernel.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-10-01 10:26:24 -05:00
tinguely@sgi.com
519ccb81ac xfs: fix memory leak in xlog_recover_add_to_trans
Free the memory in error path of xlog_recover_add_to_trans().
Normally this memory is freed in recovery pass2, but is leaked
in the error path.

Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-09-30 17:52:43 -05:00
Dave Chinner
367993e7c6 xfs: dirent dtype presence is dependent on directory magic numbers
The determination of whether a directory entry contains a dtype
field originally was dependent on the filesystem having CRCs
enabled. This meant that the format for dtype beign enabled could be
determined by checking the directory block magic number rather than
doing a feature bit check. This was useful in that it meant that we
didn't need to pass a struct xfs_mount around to functions that
were already supplied with a directory block header.

Unfortunately, the introduction of dtype fields into the v4
structure via a feature bit meant this "use the directory block
magic number" method of discriminating the dirent entry sizes is
broken. Hence we need to convert the places that use magic number
checks to use feature bit checks so that they work correctly and not
by chance.

The current code works on v4 filesystems only because the dirent
size roundup covers the extra byte needed by the dtype field in the
places where this problem occurs.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-09-30 17:49:28 -05:00
Dave Chinner
f112a04971 xfs: lockdep needs to know about 3 dquot-deep nesting
Michael Semon reported that xfs/299 generated this lockdep warning:

=============================================
[ INFO: possible recursive locking detected ]
3.12.0-rc2+ #2 Not tainted
---------------------------------------------
touch/21072 is trying to acquire lock:
 (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64

but task is already holding lock:
 (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&xfs_dquot_other_class);
  lock(&xfs_dquot_other_class);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

7 locks held by touch/21072:
 #0:  (sb_writers#10){++++.+}, at: [<c11185b6>] mnt_want_write+0x1e/0x3e
 #1:  (&type->i_mutex_dir_key#4){+.+.+.}, at: [<c11078ee>] do_last+0x245/0xe40
 #2:  (sb_internal#2){++++.+}, at: [<c122c9e0>] xfs_trans_alloc+0x1f/0x35
 #3:  (&(&ip->i_lock)->mr_lock/1){+.+...}, at: [<c126cd1b>] xfs_ilock+0x100/0x1f1
 #4:  (&(&ip->i_lock)->mr_lock){++++-.}, at: [<c126cf52>] xfs_ilock_nowait+0x105/0x22f
 #5:  (&dqp->q_qlock){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64
 #6:  (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64

The lockdep annotation for dquot lock nesting only understands
locking for user and "other" dquots, not user, group and quota
dquots. Fix the annotations to match the locking heirarchy we now
have.

Reported-by: Michael L. Semon <mlsemon35@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-09-30 17:48:25 -05:00
Mark Tinguely
997def25e4 xfs: fix node forward in xfs_node_toosmall
Commit f5ea1100 cleans up the disk to host conversions for
node directory entries, but because a variable is reused in
xfs_node_toosmall() the next node is not correctly found.
If the original node is small enough (<= 3/8 of the node size),
this change may incorrectly cause a node collapse when it should
not. That will cause an assert in xfstest generic/319:

   Assertion failed: first <= last && last < BBTOB(bp->b_length),
   file: /root/newest/xfs/fs/xfs/xfs_trans_buf.c, line: 569

Keep the original node header to get the correct forward node.

(When a node is considered for a merge with a sibling, it overwrites the
 sibling pointers of the original incore nodehdr with the sibling's
 pointers.  This leads to loop considering the original node as a merge
 candidate with itself in the second pass, and so it incorrectly
 determines a merge should occur.)

Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

[v3: added Dave Chinner's (slightly modified) suggestion to the commit header,
	cleaned up whitespace.  -bpm]
2013-09-26 10:38:17 -05:00
Dave Chinner
566055d33a xfs: log recovery lsn ordering needs uuid check
After a fair number of xfstests runs, xfs/182 started to fail
regularly with a corrupted directory - a directory read verifier was
failing after recovery because it found a block with a XARM magic
number (remote attribute block) rather than a directory data block.

The first time I saw this repeated failure I did /something/ and the
problem went away, so I was never able to find the underlying
problem. Test xfs/182 failed again today, and I found the root
cause before I did /something else/ that made it go away.

Tracing indicated that the block in question was being correctly
logged, the log was being flushed by sync, but the buffer was not
being written back before the shutdown occurred. Tracing also
indicated that log recovery was also reading the block, but then
never writing it before log recovery invalidated the cache,
indicating that it was not modified by log recovery.

More detailed analysis of the corpse indicated that the filesystem
had a uuid of "a4131074-1872-4cac-9323-2229adbcb886" but the XARM
block had a uuid of "8f32f043-c3c9-e7f8-f947-4e7f989c05d3", which
indicated it was a block from an older filesystem. The reason that
log recovery didn't replay it was that the LSN in the XARM block was
larger than the LSN of the transaction being replayed, and so the
block was not overwritten by log recovery.

Hence, log recovery cant blindly trust the magic number and LSN in
the block - it must verify that it belongs to the filesystem being
recovered before using the LSN. i.e. if the UUIDs don't match, we
need to unconditionally recovery the change held in the log.

This patch was first tested on a block device that was repeatedly
causing xfs/182 to fail with the same failure on the same block with
the same directory read corruption signature (i.e. XARM block). It
did not fail, and hasn't failed since.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-09-24 12:35:57 -05:00
Dave Chinner
b771af2fcb xfs: fix XFS_IOC_FREE_EOFBLOCKS definition
It uses a kernel internal structure in it's definition rather than
the user visible structure that is passed to the ioctl.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-09-24 12:35:08 -05:00
Dave Chinner
b313a5f1cb xfs: asserting lock not held during freeing not valid
When we free an inode, we do so via RCU. As an RCU lookup can occur
at any time before we free an inode, and that lookup takes the inode
flags lock, we cannot safely assert that the flags lock is not held
just before marking it dead and running call_rcu() to free the
inode.

We check on allocation of a new inode structre that the lock is not
held, so we still have protection against locks being leaked and
hence not correctly initialised when allocated out of the slab.
Hence just remove the assert...

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-09-24 12:32:57 -05:00
Dave Chinner
4885235806 xfs: lock the AIL before removing the buffer item
Regression introduced by commit 46f9d2e ("xfs: aborted buf items can
be in the AIL") which fails to lock the AIL before removing the
item. Spinlock debugging throws a warning about this.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-09-24 12:31:41 -05:00
Linus Torvalds
272b98c645 Linux 3.12-rc1 2013-09-16 16:17:51 -04:00
Linus Torvalds
a4ae54f90e Merge branch 'timers/core' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer code update from Thomas Gleixner:
 - armada SoC clocksource overhaul with a trivial merge conflict
 - Minor improvements to various SoC clocksource drivers

* 'timers/core' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource: armada-370-xp: Add detailed clock requirements in devicetree binding
  clocksource: armada-370-xp: Get reference fixed-clock by name
  clocksource: armada-370-xp: Replace WARN_ON with BUG_ON
  clocksource: armada-370-xp: Fix device-tree binding
  clocksource: armada-370-xp: Introduce new compatibles
  clocksource: armada-370-xp: Use CLOCKSOURCE_OF_DECLARE
  clocksource: armada-370-xp: Simplify TIMER_CTRL register access
  clocksource: armada-370-xp: Use BIT()
  ARM: timer-sp: Set dynamic irq affinity
  ARM: nomadik: add dynamic irq flag to the timer
  clocksource: sh_cmt: 32-bit control register support
  clocksource: em_sti: Convert to devm_* managed helpers
2013-09-16 16:10:26 -04:00
Linus Torvalds
3369d11693 Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS fixes from Steve French:
 "Two minor cifs fixes and a minor documentation cleanup for cifs.txt"

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update cifs.txt and remove some outdated infos
  cifs: Avoid calling unlock_page() twice in cifs_readpage() when using fscache
  cifs: Do not take a reference to the page in cifs_readpage_worker()
2013-09-16 15:39:21 -04:00
Linus Torvalds
f1da3458e9 Merge tag 'upstream-3.12-rc1' of git://git.infradead.org/linux-ubi
Pull UBI fixes from Artem Bityutskiy:
 "Just a single fastmap fix plus a regression fix"

* tag 'upstream-3.12-rc1' of git://git.infradead.org/linux-ubi:
  UBI: Fix invalidate_fastmap()
  UBI: Fix PEB leak in wear_leveling_worker()
2013-09-16 15:37:52 -04:00
Linus Torvalds
098e7f1665 Merge tag 'upstream-3.12-rc1' of git://git.infradead.org/linux-ubifs
Pull ubifs fix from Artem Bityutskiy:
 "Just one patch which fixes the power-cut recovery testing mode.

  I'll start using a single UBI/UBIFS tree instead of 2 trees from now
  on.  So in the future you'll get 1 small pull request instead of 2
  tiny ones"

* tag 'upstream-3.12-rc1' of git://git.infradead.org/linux-ubifs:
  UBIFS: remove invalid warn msg with tst_recovery enabled
2013-09-16 15:36:55 -04:00
Linus Torvalds
d8efd82eec Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
 "These are four patches for three construction sites:

   - Fix register decoding for the combination of multi-core processors
     and multi-threading.

   - Two more fixes that are part of the ongoing DECstation resurrection
     work.  One of these touches a DECstation-only network driver.

   - Finally Markos' trivial build fix for the AP/SP support.

  (With this applied now all MIPS defconfigs are building again)"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: kernel: vpe: Make vpe_attrs an array of pointers.
  MIPS: Fix SMP core calculations when using MT support.
  MIPS: DECstation I/O ASIC DMA interrupt handling fix
  MIPS: DECstation HRT initialization rearrangement
2013-09-15 17:45:52 -04:00
Linus Torvalds
cd619e21ea Merge branch 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86
Pull x86 platform updates from Matthew Garrett:
 "Nothing amazing here, almost entirely cleanups and minor bugfixes and
  one bit of hardware enablement in the amilo-rfkill driver"

* 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86:
  platform/x86: panasonic-laptop: reuse module_acpi_driver
  samsung-laptop: fix config build error
  platform: x86: remove unnecessary platform_set_drvdata()
  amilo-rfkill: Enable using amilo-rfkill with the FSC Amilo L1310.
  wmi: parse_wdg() should return kernel error codes
  hp_wmi: Fix unregister order in hp_wmi_rfkill_setup()
  platform: replace strict_strto*() with kstrto*()
  x86: irst: use module_acpi_driver to simplify the code
  x86: smartconnect: use module_acpi_driver to simplify the code
  platform samsung-q10: use ACPI instead of direct EC calls
  thinkpad_acpi: add the ability setting TPACPI_LED_NONE by quirk
  thinkpad_acpi: return -NODEV while operating uninitialized LEDs
2013-09-15 17:42:59 -04:00
Linus Torvalds
0375ec5899 Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull misc SCSI driver updates from James Bottomley:
 "This patch set is a set of driver updates (megaraid_sas, fnic, lpfc,
  ufs, hpsa) we also have a couple of bug fixes (sd out of bounds and
  ibmvfc error handling) and the first round of esas2r checker fixes and
  finally the much anticipated big endian additions for megaraid_sas"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (47 commits)
  [SCSI] fnic: fnic Driver Tuneables Exposed through CLI
  [SCSI] fnic: Kernel panic while running sh/nosh with max lun cfg
  [SCSI] fnic: Hitting BUG_ON(io_req->abts_done) in fnic_rport_exch_reset
  [SCSI] fnic: Remove QUEUE_FULL handling code
  [SCSI] fnic: On system with >1.1TB RAM, VIC fails multipath after boot up
  [SCSI] fnic: FC stat param seconds_since_last_reset not getting updated
  [SCSI] sd: Fix potential out-of-bounds access
  [SCSI] lpfc 8.3.42: Update lpfc version to driver version 8.3.42
  [SCSI] lpfc 8.3.42: Fixed issue of task management commands having a fixed timeout
  [SCSI] lpfc 8.3.42: Fixed inconsistent spin lock usage.
  [SCSI] lpfc 8.3.42: Fix driver's abort loop functionality to skip IOs already getting aborted
  [SCSI] lpfc 8.3.42: Fixed failure to allocate SCSI buffer on PPC64 platform for SLI4 devices
  [SCSI] lpfc 8.3.42: Fix WARN_ON when driver unloads
  [SCSI] lpfc 8.3.42: Avoided making pci bar ioremap call during dual-chute WQ/RQ pci bar selection
  [SCSI] lpfc 8.3.42: Fixed driver iocbq structure's iocb_flag field running out of space
  [SCSI] lpfc 8.3.42: Fix crash on driver load due to cpu affinity logic
  [SCSI] lpfc 8.3.42: Fixed logging format of setting driver sysfs attributes hard to interpret
  [SCSI] lpfc 8.3.42: Fixed back to back RSCNs discovery failure.
  [SCSI] lpfc 8.3.42: Fixed race condition between BSG I/O dispatch and timeout handling
  [SCSI] lpfc 8.3.42: Fixed function mode field defined too small for not recognizing dual-chute mode
  ...
2013-09-15 17:41:30 -04:00
Linus Torvalds
bff157b3ad Merge branch 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux
Pull SLAB update from Pekka Enberg:
 "Nothing terribly exciting here apart from Christoph's kmalloc
  unification patches that brings sl[aou]b implementations closer to
  each other"

* 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
  slab: Use correct GFP_DMA constant
  slub: remove verify_mem_not_deleted()
  mm/sl[aou]b: Move kmallocXXX functions to common code
  mm, slab_common: add 'unlikely' to size check of kmalloc_slab()
  mm/slub.c: beautify code for removing redundancy 'break' statement.
  slub: Remove unnecessary page NULL check
  slub: don't use cpu partial pages on UP
  mm/slub: beautify code for 80 column limitation and tab alignment
  mm/slub: remove 'per_cpu' which is useless variable
2013-09-15 07:15:06 -04:00
Linus Torvalds
8bf5e36d04 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input update from Dmitry Torokhov:
 "The only change is David Hermann's new EVIOCREVOKE evdev ioctl that
  allows safely passing file descriptors to input devices to session
  processes and later being able to stop delivery of events through
  these fds so that inactive sessions will no longer receive user input
  that does not belong to them"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: evdev - add EVIOCREVOKE ioctl
2013-09-15 07:13:39 -04:00
Linus Torvalds
05a8252bde vfs: fix typo in comment in recent dentry work
Sedat points out that I transposed some letters in "LRU" and wrote "RLU"
instead in one of the new comments explaining the flow.  Let's just fix
it.

Reported-by: Sedat Dilek <sedat.dilek@jpberlin.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-15 07:11:01 -04:00
Davidlohr Bueso
6b02fa59a7 partitions/efi: loosen check fot pmbr size in lba
Matt found that commit 27a7c64217 ("partitions/efi: account for pmbr
size in lba") caused his GPT formatted eMMC device not to boot.  The
reason is that this commit enforced Linux to always check the lesser of
the whole disk or 2Tib for the pMBR size in LBA.  While most disk
partitioning tools out there create a pMBR with these characteristics,
Microsoft does not, as it always sets the entry to the maximum 32-bit
limitation - even though a drive may be smaller than that[1].

Loosen this check and only verify that the size is either the whole disk
or 0xFFFFFFFF.  No tool in its right mind would set it to any value
other than these.

[1] http://thestarman.pcministry.com/asm/mbr/GPT.htm#GPTPT

Reported-and-tested-by: Matt Porter <matt.porter@linaro.org>
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-15 07:10:16 -04:00