Commit Graph

518938 Commits

Author SHA1 Message Date
Christoph Hellwig
ac7cdff00a block: remove REQ_TYPE_PM_SHUTDOWN
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-05 13:40:10 -06:00
Christoph Hellwig
b0b93b48a3 block: move REQ_TYPE_SENSE to the ide driver
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-05 13:40:07 -06:00
Christoph Hellwig
b42171ef7d block: move REQ_TYPE_ATA_TASKFILE and REQ_TYPE_ATA_PC to ide.h
These values are only used by the IDE driver, so move them into it
by allowing drivers to take cmd_type values after the first private
one.  Note that we have to turn cmd_type into a plain unsigned integer
so that gcc doesn't complain about mismatching enum types.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-05 13:40:05 -06:00
Christoph Hellwig
4f8c9510ba block: rename REQ_TYPE_SPECIAL to REQ_TYPE_DRV_PRIV
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-05 13:40:03 -06:00
Jens Axboe
dac56212e8 bio: skip atomic inc/dec of ->bi_cnt for most use cases
Struct bio has a reference count that controls when it can be freed.
Most uses cases is allocating the bio, which then returns with a
single reference to it, doing IO, and then dropping that single
reference. We can remove this atomic_dec_and_test() in the completion
path, if nobody else is holding a reference to the bio.

If someone does call bio_get() on the bio, then we flag the bio as
now having valid count and that we must properly honor the reference
count when it's being put.

Tested-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-05 13:32:49 -06:00
Jens Axboe
c4cf5261f8 bio: skip atomic inc/dec of ->bi_remaining for non-chains
Struct bio has an atomic ref count for chained bio's, and we use this
to know when to end IO on the bio. However, most bio's are not chained,
so we don't need to always introduce this atomic operation as part of
ending IO.

Add a helper to elevate the bi_remaining count, and flag the bio as
now actually needing the decrement at end_io time. Rename the field
to __bi_remaining to catch any current users of this doing the
incrementing manually.

For high IOPS workloads, this reduces the overhead of bio_endio()
substantially.

Tested-by: Robert Elliott <elliott@hp.com>
Acked-by: Kent Overstreet <kent.overstreet@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-05 13:32:47 -06:00
Linus Torvalds
d9cee5d4f6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This fixes a build problem with bcm63xx and yet another fix to the
  memzero_explicit function to ensure that the memset is not elided"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  hwrng: bcm63xx - Fix driver compilation
  lib: make memzero_explicit more robust against dead store elimination
2015-05-05 09:03:52 -07:00
Linus Torvalds
c02d7da3dd Merge tag 'media/v4.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
 "Three driver fixes:

   - fix for omap4, fixing a regression due to a subsystem API that got
     removed for 4.1 (commit efde234674);

   - fix for one of the formats supported by Marvel ccic driver;

   - fix rcar_vin driver that, when stopping abnormally, the driver
     can't return from wait_for_completion"

* tag 'media/v4.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] v4l: omap4iss: Replace outdated OMAP4 control pad API with syscon
  [media] media: soc_camera: rcar_vin: Fix wait_for_completion
  [media] marvell-ccic: fix Y'CbCr ordering
2015-05-05 08:42:06 -07:00
Álvaro Fernández Rojas
f440c4ee3e hwrng: bcm63xx - Fix driver compilation
- s/clk_didsable_unprepare/clk_disable_unprepare
- s/prov/priv
- s/error/ret (bcm63xx_rng_probe)

Fixes: 6229c16060 ("hwrng: bcm63xx - make use of devm_hwrng_register")
Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-05-04 17:49:52 +08:00
Daniel Borkmann
7829fb09a2 lib: make memzero_explicit more robust against dead store elimination
In commit 0b053c9518 ("lib: memzero_explicit: use barrier instead
of OPTIMIZER_HIDE_VAR"), we made memzero_explicit() more robust in
case LTO would decide to inline memzero_explicit() and eventually
find out it could be elimiated as dead store.

While using barrier() works well for the case of gcc, recent efforts
from LLVMLinux people suggest to use llvm as an alternative to gcc,
and there, Stephan found in a simple stand-alone user space example
that llvm could nevertheless optimize and thus elimitate the memset().
A similar issue has been observed in the referenced llvm bug report,
which is regarded as not-a-bug.

Based on some experiments, icc is a bit special on its own, while it
doesn't seem to eliminate the memset(), it could do so with an own
implementation, and then result in similar findings as with llvm.

The fix in this patch now works for all three compilers (also tested
with more aggressive optimization levels). Arguably, in the current
kernel tree it's more of a theoretical issue, but imho, it's better
to be pedantic about it.

It's clearly visible with gcc/llvm though, with the below code: if we
would have used barrier() only here, llvm would have omitted clearing,
not so with barrier_data() variant:

  static inline void memzero_explicit(void *s, size_t count)
  {
    memset(s, 0, count);
    barrier_data(s);
  }

  int main(void)
  {
    char buff[20];
    memzero_explicit(buff, sizeof(buff));
    return 0;
  }

  $ gcc -O2 test.c
  $ gdb a.out
  (gdb) disassemble main
  Dump of assembler code for function main:
   0x0000000000400400  <+0>: lea   -0x28(%rsp),%rax
   0x0000000000400405  <+5>: movq  $0x0,-0x28(%rsp)
   0x000000000040040e <+14>: movq  $0x0,-0x20(%rsp)
   0x0000000000400417 <+23>: movl  $0x0,-0x18(%rsp)
   0x000000000040041f <+31>: xor   %eax,%eax
   0x0000000000400421 <+33>: retq
  End of assembler dump.

  $ clang -O2 test.c
  $ gdb a.out
  (gdb) disassemble main
  Dump of assembler code for function main:
   0x00000000004004f0  <+0>: xorps  %xmm0,%xmm0
   0x00000000004004f3  <+3>: movaps %xmm0,-0x18(%rsp)
   0x00000000004004f8  <+8>: movl   $0x0,-0x8(%rsp)
   0x0000000000400500 <+16>: lea    -0x18(%rsp),%rax
   0x0000000000400505 <+21>: xor    %eax,%eax
   0x0000000000400507 <+23>: retq
  End of assembler dump.

As gcc, clang, but also icc defines __GNUC__, it's sufficient to define
this in compiler-gcc.h only to be picked up. For a fallback or otherwise
unsupported compiler, we define it as a barrier. Similarly, for ecc which
does not support gcc inline asm.

Reference: https://llvm.org/bugs/show_bug.cgi?id=15495
Reported-by: Stephan Mueller <smueller@chronox.de>
Tested-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Stephan Mueller <smueller@chronox.de>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: mancha security <mancha1@zoho.com>
Cc: Mark Charlebois <charlebm@gmail.com>
Cc: Behan Webster <behanw@converseincode.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-05-04 17:49:51 +08:00
Linus Torvalds
5ebe6afaf0 Linux 4.1-rc2 2015-05-03 19:22:23 -07:00
Linus Torvalds
8663da2c09 Merge tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
 "Some miscellaneous bug fixes and some final on-disk and ABI changes
  for ext4 encryption which provide better security and performance"

* tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix growing of tiny filesystems
  ext4: move check under lock scope to close a race.
  ext4: fix data corruption caused by unwritten and delayed extents
  ext4 crypto: remove duplicated encryption mode definitions
  ext4 crypto: do not select from EXT4_FS_ENCRYPTION
  ext4 crypto: add padding to filenames before encrypting
  ext4 crypto: simplify and speed up filename encryption
2015-05-03 18:23:53 -07:00
Linus Torvalds
101a6fd387 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "One intel fix, one rockchip fix, and a bunch of radeon fixes for some
  regressions from audio rework and vm stability"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/i915/chv: Implement WaDisableShadowRegForCpd
  drm/radeon: fix userptr return value checking (v2)
  drm/radeon: check new address before removing old one
  drm/radeon: reset BOs address after clearing it.
  drm/radeon: fix lockup when BOs aren't part of the VM on release
  drm/radeon: add SI DPM quirk for Sapphire R9 270 Dual-X 2G GDDR5
  drm/radeon: adjust pll when audio is not enabled
  drm/radeon: only enable audio streams if the monitor supports it
  drm/radeon: only mark audio as connected if the monitor supports it (v3)
  drm/radeon/audio: don't enable packets until the end
  drm/radeon: drop dce6_dp_enable
  drm/radeon: fix ordering of AVI packet setup
  drm/radeon: Use drm_calloc_ab for CS relocs
  drm/rockchip: fix error check when getting irq
  MAINTAINERS: add entry for Rockchip drm drivers
2015-05-03 18:15:48 -07:00
Dave Airlie
71aee81937 Merge tag 'drm-intel-fixes-2015-04-30' of git://anongit.freedesktop.org/drm-intel into drm-fixes
Just a single intel fix
* tag 'drm-intel-fixes-2015-04-30' of git://anongit.freedesktop.org/drm-intel:
  drm/i915/chv: Implement WaDisableShadowRegForCpd
2015-05-04 08:56:47 +10:00
Dave Airlie
df9ebeb2da Merge branch 'drm-next0420' of https://github.com/markyzq/kernel-drm-rockchip into drm-fixes
one fix and maintainers update
* 'drm-next0420' of https://github.com/markyzq/kernel-drm-rockchip:
  drm/rockchip: fix error check when getting irq
  MAINTAINERS: add entry for Rockchip drm drivers
2015-05-04 08:56:27 +10:00
Linus Torvalds
61f06db00e Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "This is three logical fixes (as 5 patches).

  The 3ware class of drivers were causing an oops with multiqueue by
  tearing down the command mappings after completing the command (where
  the variables in the command used to tear down the mapping were
  no-longer valid).  There's also a fix for the qnap iscsi target which
  was choking on us sending it commands that were too long and a fix for
  the reworked aha1542 allocating GFP_KERNEL under a lock"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  3w-9xxx: fix command completion race
  3w-xxxx: fix command completion race
  3w-sas: fix command completion race
  aha1542: Allocate memory before taking a lock
  SCSI: add 1024 max sectors black list flag
2015-05-03 13:22:32 -07:00
Linus Torvalds
3333222484 Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma
Pull slave dmaengine fixes from Vinod Koul:
 "Here are the fixes in dmaengine subsystem for rc2:

   - privatecnt fix for slave dma request API by Christopher

   - warn fix for PM ifdef in usb-dmac by Geert

   - fix hardware dependency for xgene by Jean"

* 'next' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: increment privatecnt when using dma_get_any_slave_channel
  dmaengine: xgene: Set hardware dependency
  dmaengine: usb-dmac: Protect PM-only functions to kill warning
2015-05-03 10:49:04 -07:00
Linus Torvalds
180d89f6ef Merge tag 'powerpc-4.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux
Pull powerpc fixes from Michael Ellerman:
 - build fix for SMP=n in book3s_xics.c
 - fix for Daniel's pci_controller_ops on powernv.
 - revert the TM syscall abort patch for now.
 - CPU affinity fix from Nathan.
 - two EEH fixes from Gavin.
 - fix for CR corruption from Sam.
 - selftest build fix.

* tag 'powerpc-4.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
  powerpc/powernv: Restore non-volatile CRs after nap
  powerpc/eeh: Delay probing EEH device during hotplug
  powerpc/eeh: Fix race condition in pcibios_set_pcie_reset_state()
  powerpc/pseries: Correct cpu affinity for dlpar added cpus
  selftests/powerpc: Fix the pmu install rule
  Revert "powerpc/tm: Abort syscalls in active transactions"
  powerpc/powernv: Fix early pci_controller_ops loading.
  powerpc/kvm: Fix SMP=n build error in book3s_xics.c
2015-05-03 10:28:36 -07:00
Jan Kara
2c869b262a ext4: fix growing of tiny filesystems
The estimate of necessary transaction credits in ext4_flex_group_add()
is too pessimistic. It reserves credit for sb, resize inode, and resize
inode dindirect block for each group added in a flex group although they
are always the same block and thus it is enough to account them only
once. Also the number of modified GDT block is overestimated since we
fit EXT4_DESC_PER_BLOCK(sb) descriptors in one block.

Make the estimation more precise. That reduces number of requested
credits enough that we can grow 20 MB filesystem (which has 1 MB
journal, 79 reserved GDT blocks, and flex group size 16 by default).

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2015-05-02 23:58:32 -04:00
Davide Italiano
280227a75b ext4: move check under lock scope to close a race.
fallocate() checks that the file is extent-based and returns
EOPNOTSUPP in case is not. Other tasks can convert from and to
indirect and extent so it's safe to check only after grabbing
the inode mutex.

Signed-off-by: Davide Italiano <dccitaliano@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2015-05-02 23:21:15 -04:00
Lukas Czerner
d2dc317d56 ext4: fix data corruption caused by unwritten and delayed extents
Currently it is possible to lose whole file system block worth of data
when we hit the specific interaction with unwritten and delayed extents
in status extent tree.

The problem is that when we insert delayed extent into extent status
tree the only way to get rid of it is when we write out delayed buffer.
However there is a limitation in the extent status tree implementation
so that when inserting unwritten extent should there be even a single
delayed block the whole unwritten extent would be marked as delayed.

At this point, there is no way to get rid of the delayed extents,
because there are no delayed buffers to write out. So when a we write
into said unwritten extent we will convert it to written, but it still
remains delayed.

When we try to write into that block later ext4_da_map_blocks() will set
the buffer new and delayed and map it to invalid block which causes
the rest of the block to be zeroed loosing already written data.

For now we can fix this by simply not allowing to set delayed status on
written extent in the extent status tree. Also add WARN_ON() to make
sure that we notice if this happens in the future.

This problem can be easily reproduced by running the following xfs_io.

xfs_io -f -c "pwrite -S 0xaa 4096 2048" \
          -c "falloc 0 131072" \
          -c "pwrite -S 0xbb 65536 2048" \
          -c "fsync" /mnt/test/fff

echo 3 > /proc/sys/vm/drop_caches
xfs_io -c "pwrite -S 0xdd 67584 2048" /mnt/test/fff

This can be theoretically also reproduced by at random by running fsx,
but it's not very reliable, though on machines with bigger page size
(like ppc) this can be seen more often (especially xfstest generic/127)

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2015-05-02 21:36:55 -04:00
Chanho Park
9402bdcacd ext4 crypto: remove duplicated encryption mode definitions
This patch removes duplicated encryption modes which were already in
ext4.h. They were duplicated from commit 3edc18d and commit f542fb.

Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Michael Halcrow <mhalcrow@google.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Chanho Park <chanho61.park@samsung.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-02 10:29:22 -04:00
Herbert Xu
fb63e5489f ext4 crypto: do not select from EXT4_FS_ENCRYPTION
This patch adds a tristate EXT4_ENCRYPTION to do the selections
for EXT4_FS_ENCRYPTION because selecting from a bool causes all
the selected options to be built-in, even if EXT4 itself is a
module.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-02 10:29:19 -04:00
Linus Torvalds
6c3c1eb3c3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Receive packet length needs to be adjust by 2 on RX to accomodate
    the two padding bytes in altera_tse driver.  From Vlastimil Setka.

 2) If rx frame is dropped due to out of memory in macb driver, we leave
    the receive ring descriptors in an undefined state.  From Punnaiah
    Choudary Kalluri

 3) Some netlink subsystems erroneously signal NLM_F_MULTI.  That is
    only for dumps.  Fix from Nicolas Dichtel.

 4) Fix mis-use of raw rt->rt_pmtu value in ipv4, one must always go via
    the ipv4_mtu() helper.  From Herbert Xu.

 5) Fix null deref in bridge netfilter, and miscalculated lengths in
    jump/goto nf_tables verdicts.  From Florian Westphal.

 6) Unhash ping sockets properly.

 7) Software implementation of BPF divide did 64/32 rather than 64/64
    bit divide.  The JITs got it right.  Fix from Alexei Starovoitov.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (30 commits)
  ipv4: Missing sk_nulls_node_init() in ping_unhash().
  net: fec: Fix RGMII-ID mode
  net/mlx4_en: Schedule napi when RX buffers allocation fails
  netxen_nic: use spin_[un]lock_bh around tx_clean_lock
  net/mlx4_core: Fix unaligned accesses
  mlx4_en: Use correct loop cursor in error path.
  cxgb4: Fix MC1 memory offset calculation
  bnx2x: Delay during kdump load
  net: Fix Kernel Panic in bonding driver debugfs file: rlb_hash_table
  net: dsa: Fix scope of eeprom-length property
  net: macb: Fix race condition in driver when Rx frame is dropped
  hv_netvsc: Fix a bug in netvsc_start_xmit()
  altera_tse: Correct rx packet length
  mlx4: Fix tx ring affinity_mask creation
  tipc: fix problem with parallel link synchronization mechanism
  tipc: remove wrong use of NLM_F_MULTI
  bridge/nl: remove wrong use of NLM_F_MULTI
  bridge/mdb: remove wrong use of NLM_F_MULTI
  net: sched: act_connmark: don't zap skb->nfct
  trivial: net: systemport: bcmsysport.h: fix 0x0x prefix
  ...
2015-05-01 20:51:04 -07:00
Stefan Hajnoczi
e412d3a32b virtio: fix typo in vring_need_event() doc comment
Here the "other side" refers to the guest or host.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-05-01 20:46:32 -07:00