Commit Graph

362023 Commits

Author SHA1 Message Date
Theodore Ts'o 996bb9fddd ext4: support simple conversion of extent-mapped inodes to use i_blocks
In order to make it simpler to test the code which support
i_blocks/indirect-mapped inodes, support the conversion of inodes
which are less than 12 blocks and which are contained in no more than
a single extent.

The primary intended use of this code is to converting freshly created
zero-length files and empty directories.

Note that the version of chattr in e2fsprogs 1.42.7 and earlier has a
check that prevents the clearing of the extent flag.  A simple patch
which allows "chattr -e <file>" to work will be checked into the
e2fsprogs git repository.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-04-03 22:04:52 -04:00
Theodore Ts'o d76a3a7711 ext4/jbd2: don't wait (forever) for stale tid caused by wraparound
In the case where an inode has a very stale transaction id (tid) in
i_datasync_tid or i_sync_tid, it's possible that after a very large
(2**31) number of transactions, that the tid number space might wrap,
causing tid_geq()'s calculations to fail.

Commit deeeaf13 "jbd2: fix fsync() tid wraparound bug", later modified
by commit e7b04ac0 "jbd2: don't wake kjournald unnecessarily",
attempted to fix this problem, but it only avoided kjournald spinning
forever by fixing the logic in jbd2_log_start_commit().

Unfortunately, in the codepaths in fs/ext4/fsync.c and fs/ext4/inode.c
that might call jbd2_log_start_commit() with a stale tid, those
functions will subsequently call jbd2_log_wait_commit() with the same
stale tid, and then wait for a very long time.  To fix this, we
replace the calls to jbd2_log_start_commit() and
jbd2_log_wait_commit() with a call to a new function,
jbd2_complete_transaction(), which will correctly handle stale tid's.

As a bonus, jbd2_complete_transaction() will avoid locking
j_state_lock for writing unless a commit needs to be started.  This
should have a small (but probably not measurable) improvement for
ext4's scalability.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Reported-by: George Barnett <gbarnett@atlassian.com>
Cc: stable@vger.kernel.org
2013-04-03 22:02:52 -04:00
Theodore Ts'o b10a44c369 ext4: add might_sleep() annotations
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
2013-04-03 22:00:52 -04:00
Theodore Ts'o 19b5ef6157 ext4: add mutex_is_locked() assertion to ext4_truncate()
[ Added fixup from Lukáš Czerner which only checks the assertion when
  the inode is not new and is not being freed. ]

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-04-03 21:58:52 -04:00
Theodore Ts'o 819c4920b7 ext4: refactor truncate code
Move common code in ext4_ind_truncate() and ext4_ext_truncate() into
ext4_truncate().  This saves over 60 lines of code.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-04-03 12:47:17 -04:00
Theodore Ts'o 26a4c0c6cc ext4: refactor punch hole code
Move common code in ext4_ind_punch_hole() and ext4_ext_punch_hole()
into ext4_punch_hole().  This saves over 150 lines of code.

This also fixes a potential bug when the punch_hole() code is racing
against indirect-to-extents or extents-to-indirect migation.  We are
currently using i_mutex to protect against changes to the inode flag;
specifically, the append-only, immutable, and extents inode flags.  So
we need to take i_mutex before deciding whether to use the
extents-specific or indirect-specific punch_hole code.

Also, there was a missing call to ext4_inode_block_unlocked_dio() in
the indirect punch codepath.  This was added in commit 02d262dffc
to block DIO readers racing against the punch operation in the
codepath for extent-mapped inodes, but it was missing for
indirect-block mapped inodes.  One of the advantages of refactoring
the code is that it makes such oversights much less likely.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-04-03 12:45:17 -04:00
Theodore Ts'o 781f143ea0 ext4: fold ext4_alloc_blocks() in ext4_alloc_branch()
The older code was far more complicated than it needed to be because
of how we spliced in the ext4's new multiblock allocator into ext3's
indirect block code.  By folding ext4_alloc_blocks() into
ext4_alloc_branch(), we make the code far more understable, shave off
over 130 lines of code and half a kilobyte of compiled object code.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-04-03 12:43:17 -04:00
Zheng Liu eed4333f08 ext4: fold ext4_generic_write_end() into ext4_write_end()
After collapsing the handling of data ordered and data writeback
codepath, ext4_generic_write_end() has only one caller,
ext4_write_end().  So we fold it into ext4_write_end().

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
2013-04-03 12:41:17 -04:00
Theodore Ts'o 74d553aad7 ext4: collapse handling of data=ordered and data=writeback codepaths
The only difference between how we handle data=ordered and
data=writeback is a single call to ext4_jbd2_file_inode().  Eliminate
code duplication by factoring out redundant the code paths.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
2013-04-03 12:39:17 -04:00
Zheng Liu 8cde7ad17e ext4: fix big-endian bugs which could cause fs corruptions
When an extent was zeroed out, we forgot to do convert from cpu to le16.
It could make us hit a BUG_ON when we try to write dirty pages out.  So
fix it.

[ Also fix a bug found by Dmitry Monakhov where we were missing
  le32_to_cpu() calls in the new indirect punch hole code.

  There are a number of other big endian warnings found by static code
  analyzers, but we'll wait for the next merge window to fix them all
  up.  These fixes are designed to be Obviously Correct by code
  inspection, and easy to demonstrate that it won't make any
  difference (and hence, won't introduce any bugs) on little endian
  architectures such as x86.  --tytso ]

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: CAI Qian <caiqian@redhat.com>
Reported-by: Christian Kujau <lists@nerdbynature.de>
Cc: Dmitry Monakhov <dmonakhov@openvz.org>
2013-04-03 12:37:17 -04:00
Linus Torvalds 07961ac7c0 Linux 3.9-rc5 2013-03-31 15:12:43 -07:00
Linus Torvalds 0bb44280b5 Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma
Pull slave-dmaengine fixes from Vinod Koul:
 "Two fixes for slave-dmaengine.

  The first one is for making slave_id value correct for dw_dmac and
  the other one fixes the endieness in DT parsing"

* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
  dw_dmac: adjust slave_id accordingly to request line base
  dmaengine: dw_dma: fix endianess for DT xlate function
2013-03-31 11:41:47 -07:00
Linus Torvalds a7b436d356 Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
 "For a some fixes for Kernel 3.9:
   - subsystem build fix when VIDEO_DEV=y, VIDEO_V4L2=m and I2C=m
   - compilation fix for arm multiarch preventing IR_RX51 to be selected
   - regression fix at bttv crop logic
   - s5p-mfc/m5mols/exynos: a few fixes for cameras on exynos hardware"

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] [REGRESSION] bt8xx: Fix too large height in cropcap
  [media] fix compilation with both V4L2 and I2C as 'm'
  [media] m5mols: Fix bug in stream on handler
  [media] s5p-fimc: Do not attempt to disable not enabled media pipeline
  [media] s5p-mfc: Fix encoder control 15 issue
  [media] s5p-mfc: Fix frame skip bug
  [media] s5p-fimc: send valid m2m ctx to fimc_m2m_job_finish
  [media] exynos-gsc: send valid m2m ctx to gsc_m2m_job_finish
  [media] fimc-lite: Fix the variable type to avoid possible crash
  [media] fimc-lite: Initialize 'step' field in fimc_lite_ctrl structure
  [media] ir: IR_RX51 only works on OMAP2
2013-03-31 11:40:33 -07:00
Linus Torvalds d299c29039 Merge tag 'for-linus-20130331' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "Alright, this time from 10K up in the air.

  Collection of fixes that have been queued up since the merge window
  opened, hence postponed until later in the cycle.  The pull request
  contains:

   - A bunch of fixes for the xen blk front/back driver.

   - A round of fixes for the new IBM RamSan driver, fixing various
     nasty issues.

   - Fixes for multiple drives from Wei Yongjun, bad handling of return
     values and wrong pointer math.

   - A fix for loop properly killing partitions when being detached."

* tag 'for-linus-20130331' of git://git.kernel.dk/linux-block: (25 commits)
  mg_disk: fix error return code in mg_probe()
  rsxx: remove unused variable
  rsxx: enable error return of rsxx_eeh_save_issued_dmas()
  block: removes dynamic allocation on stack
  Block: blk-flush: Fixed indent code style
  cciss: fix invalid use of sizeof in cciss_find_cfgtables()
  loop: cleanup partitions when detaching loop device
  loop: fix error return code in loop_add()
  mtip32xx: fix error return code in mtip_pci_probe()
  xen-blkfront: remove frame list from blk_shadow
  xen-blkfront: pre-allocate pages for requests
  xen-blkback: don't store dev_bus_addr
  xen-blkfront: switch from llist to list
  xen-blkback: fix foreach_grant_safe to handle empty lists
  xen-blkfront: replace kmalloc and then memcpy with kmemdup
  xen-blkback: fix dispatch_rw_block_io() error path
  rsxx: fix missing unlock on error return in rsxx_eeh_remap_dmas()
  Adding in EEH support to the IBM FlashSystem 70/80 device driver
  block: IBM RamSan 70/80 error message bug fix.
  block: IBM RamSan 70/80 branding changes.
  ...
2013-03-31 11:38:59 -07:00
Paul Walmsley dbf520a9d7 Revert "lockdep: check that no locks held at freeze time"
This reverts commit 6aa9707099.

Commit 6aa9707099 ("lockdep: check that no locks held at freeze time")
causes problems with NFS root filesystems.  The failures were noticed on
OMAP2 and 3 boards during kernel init:

  [ BUG: swapper/0/1 still has locks held! ]
  3.9.0-rc3-00344-ga937536 #1 Not tainted
  -------------------------------------
  1 lock held by swapper/0/1:
   #0:  (&type->s_umount_key#13/1){+.+.+.}, at: [<c011e84c>] sget+0x248/0x574

  stack backtrace:
    rpc_wait_bit_killable
    __wait_on_bit
    out_of_line_wait_on_bit
    __rpc_execute
    rpc_run_task
    rpc_call_sync
    nfs_proc_get_root
    nfs_get_root
    nfs_fs_mount_common
    nfs_try_mount
    nfs_fs_mount
    mount_fs
    vfs_kern_mount
    do_mount
    sys_mount
    do_mount_root
    mount_root
    prepare_namespace
    kernel_init_freeable
    kernel_init

Although the rootfs mounts, the system is unstable.  Here's a transcript
from a PM test:

  http://www.pwsan.com/omap/testlogs/test_v3.9-rc3/20130317194234/pm/37xxevm/37xxevm_log.txt

Here's what the test log should look like:

  http://www.pwsan.com/omap/testlogs/test_v3.8/20130218214403/pm/37xxevm/37xxevm_log.txt

Mailing list discussion is here:

  http://lkml.org/lkml/2013/3/4/221

Deal with this for v3.9 by reverting the problem commit, until folks can
figure out the right long-term course of action.

Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: <maciej.rutecki@gmail.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ben Chan <benchan@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-31 11:38:33 -07:00
Linus Torvalds 13d2080db3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
 "This includes the bug-fix for a >= v3.8-rc1 regression specific to
  iscsi-target persistent reservation conflict handling (CC'ed to
  stable), and a tcm_vhost patch to drop VIRTIO_RING_F_EVENT_IDX usage
  so that in-flight qemu vhost-scsi-pci device code can detect the
  proper vhost feature bits.

  Also, there are two more tcm_vhost patches still being discussed by
  MST and Asias for v3.9 that will be required for the in-flight qemu
  vhost-scsi-pci device patch to function properly, and that should
  (hopefully) be the last target fixes for this round."

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target: Fix RESERVATION_CONFLICT status regression for iscsi-target special case
  tcm_vhost: Avoid VIRTIO_RING_F_EVENT_IDX feature bit
2013-03-30 13:13:05 -07:00
Andy Shevchenko bce95c63ef dw_dmac: adjust slave_id accordingly to request line base
On some hardware configurations we have got the request line with the offset.
The patch introduces convert_slave_id() helper for that cases. The request line
base is came from the driver data provided by the platform_device_id table.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-03-30 04:34:07 +05:30
Arnd Bergmann f73bb9b355 dmaengine: dw_dma: fix endianess for DT xlate function
As reported by Wu Fengguang's build robot tracking sparse warnings, the
dma_spec arguments in the dw_dma_xlate are already byte swapped on
little-endian platforms and must not get swapped again. This code is
currently not used anywhere, but will be used in Linux 3.10 when the
ARM SPEAr platform starts using the generic DMA DT binding.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-03-30 04:34:07 +05:30
Rafael J. Wysocki 46a1f21a67 PNP: List Rafael Wysocki as a maintainer
The Adam Belay's e-mail address in MAINTAINERS under PNP SUPPORT
is not valid any more and I started to maintain that code in the
meantime as a matter of fact, so list myself as a maintainer of it
along with Bjorn and remove the Adam's entry from it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-29 15:28:33 -07:00
Linus Torvalds b92eded4b7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull ceph fix from Sage Weil:
 "This fixes a regression introduced during the last merge window when
  mapping non-existent images."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  rbd: don't zero-fill non-image object requests
2013-03-29 11:47:43 -07:00
Alex Elder 6e2a4505db rbd: don't zero-fill non-image object requests
A result of ENOENT from a read request for an object that's part of
an rbd image indicates that there is a hole in that portion of the
image.  Similarly, a short read for such an object indicates that
the remainder of the read should be interpreted a full read with
zeros filling out the end of the request.

This behavior is not correct for objects that are not backing rbd
image data.  Currently rbd_img_obj_request_callback() assumes it
should be done for all objects.

Change rbd_img_obj_request_callback() so it only does this zeroing
for image objects.  Encapsulate that special handling in its own
function.  Add an assertion that the image object request is a bio
request, since we assume that (and we currently don't support any
other types).

This resolves a problem identified here:
    http://tracker.ceph.com/issues/4559

The regression was introduced by bf0d5f503d.

Reported-by: Dan van der Ster <dan@vanderster.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-off-by: Sage Weil <sage@inktank.com>
2013-03-29 11:32:07 -07:00
Linus Torvalds 3615db41c4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
 "We've had a busy two weeks of bug fixing.  The biggest patches in here
  are some long standing early-enospc problems (Josef) and a very old
  race where compression and mmap combine forces to lose writes (me).
  I'm fairly sure the mmap bug goes all the way back to the introduction
  of the compression code, which is proof that fsx doesn't trigger every
  possible mmap corner after all.

  I'm sure you'll notice one of these is from this morning, it's a small
  and isolated use-after-free fix in our scrub error reporting.  I
  double checked it here."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: don't drop path when printing out tree errors in scrub
  Btrfs: fix wrong return value of btrfs_lookup_csum()
  Btrfs: fix wrong reservation of csums
  Btrfs: fix double free in the btrfs_qgroup_account_ref()
  Btrfs: limit the global reserve to 512mb
  Btrfs: hold the ordered operations mutex when waiting on ordered extents
  Btrfs: fix space accounting for unlink and rename
  Btrfs: fix space leak when we fail to reserve metadata space
  Btrfs: fix EIO from btrfs send in is_extent_unchanged for punched holes
  Btrfs: fix race between mmap writes and compression
  Btrfs: fix memory leak in btrfs_create_tree()
  Btrfs: fix locking on ROOT_REPLACE operations in tree mod log
  Btrfs: fix missing qgroup reservation before fallocating
  Btrfs: handle a bogus chunk tree nicely
  Btrfs: update to use fs_state bit
2013-03-29 11:13:25 -07:00
Len Brown ed176886b6 ia64 idle: delete stale (*idle)() function pointer
Commit 3e7fc708eb ("ia64 idle: delete pm_idle") in 3.9-rc1 didn't
finish the job, leaving an un-initialized reference to (*idle)().

[ Haven't seen a crash from this - but seems like we are just being
  lucky that "idle" is zero so it does get initialized before we jump to
  randomland  - Len ]

Reported-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-29 11:12:25 -07:00
Linus Torvalds 67e17c1100 Merge branch 'for-curr' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull arc architecture fixes from Vineet Gupta:
 "This includes fix for a serious bug in DMA mapping API, make
  allyesconfig wreckage, removal of bogus email-list placeholder in
  MAINTAINERS, a typo in ptrace helper code and last remaining changes
  for syscall ABI v3 which we are finally starting to transition-to
  internally.

  The request is late than I intended to - but I was held up with
  debugging a timer link list corruption, for which a proposed fix to
  generic timer code was sent out to lkml/tglx earlier today."

* 'for-curr' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: Fix the typo in event identifier flags used by ptrace
  arc: fix dma_address assignment during dma_map_sg()
  ARC: Remove SET_PERSONALITY (tracks cross-arch change)
  ARC: ABIv3: fork/vfork wrappers not needed in "no-legacy-syscall" ABI
  ARC: ABIv3: Print the correct ABI ver
  ARC: make allyesconfig build breakages
  ARC: MAINTAINERS update for ARC
2013-03-29 11:00:43 -07:00
Josef Bacik d8fe29e9de Btrfs: don't drop path when printing out tree errors in scrub
A user reported a panic where we were panicing somewhere in
tree_backref_for_extent from scrub_print_warning.  He only captured the trace
but looking at scrub_print_warning we drop the path right before we mess with
the extent buffer to print out a bunch of stuff, which isn't right.  So fix this
by dropping the path after we use the eb if we need to.  Thanks,

Cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-03-29 10:18:59 -04:00