Pull char/misc driver updates from Greg KH:
"Here is the big set of char/misc patches for 4.20-rc1.
Loads of things here, we have new code in all of these driver
subsystems:
- fpga
- stm
- extcon
- nvmem
- eeprom
- hyper-v
- gsmi
- coresight
- thunderbolt
- vmw_balloon
- goldfish
- soundwire
along with lots of fixes and minor changes to other small drivers.
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (245 commits)
Documentation/security-bugs: Clarify treatment of embargoed information
lib: Fix ia64 bootloader linkage
MAINTAINERS: Clarify UIO vs UIOVEC maintainer
docs/uio: fix a grammar nitpick
docs: fpga: document programming fpgas using regions
fpga: add devm_fpga_region_create
fpga: bridge: add devm_fpga_bridge_create
fpga: mgr: add devm_fpga_mgr_create
hv_balloon: Replace spin_is_locked() with lockdep
sgi-xp: Replace spin_is_locked() with lockdep
eeprom: New ee1004 driver for DDR4 memory
eeprom: at25: remove unneeded 'at25_remove'
w1: IAD Register is yet readable trough iad sys file. Fix snprintf (%u for unsigned, count for max size).
misc: mic: scif: remove set but not used variables 'src_dma_addr, dst_dma_addr'
misc: mic: fix a DMA pool free failure
platform: goldfish: pipe: Add a blank line to separate varibles and code
platform: goldfish: pipe: Remove redundant casting
platform: goldfish: pipe: Call misc_deregister if init fails
platform: goldfish: pipe: Move the file-scope goldfish_pipe_dev variable into the driver state
platform: goldfish: pipe: Move the file-scope goldfish_pipe_miscdev variable into the driver state
...
Add umoddi3 and udivmoddi4 support for 32-bit.
The RV32 need the umoddi3 to do modulo when the operands are long long
type, like other libraries implementation such as ucmpdi2, lshrdi3 and
so on.
I encounter the undefined reference 'umoddi3' when I use the in
house dma driver, although it is in house driver, but I think that
umoddi3 is a common function for RV32.
The udivmoddi4 and umoddi3 are copies from libgcc in gcc. There are other
functions use the udivmoddi4 in libgcc, so I separate the umoddi3 and
udivmoddi4 for flexible extension in the future.
Signed-off-by: Zong Li <zong@andestech.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
kbuild robot reports that since commit ce76d938dd ("lib: Add memcat_p():
paste 2 pointer arrays together") the ia64/hp/sim/boot fails to link:
> LD arch/ia64/hp/sim/boot/bootloader
> lib/string.o: In function `__memcat_p':
> string.c:(.text+0x1f22): undefined reference to `__kmalloc'
> string.c:(.text+0x1ff2): undefined reference to `__kmalloc'
> make[1]: *** [arch/ia64/hp/sim/boot/Makefile:37: arch/ia64/hp/sim/boot/bootloader] Error 1
The reason is, the above commit, via __memcat_p(), adds a call to
__kmalloc to string.o, which happens to be used in the bootloader, but
there's no kmalloc or slab or anything.
Since the linker would only pull in objects that contain referenced
symbols, moving __memcat_p() to a different compilation unit solves the
problem.
Fixes: ce76d938dd ("lib: Add memcat_p(): paste 2 pointer arrays together")
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reported-by: kbuild test robot <lkp@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The previous patch introduced very large kernel stack usage and a Makefile
change to hide the warning about it.
From what I can tell, a number of things went wrong here:
- The BCH_MAX_T constant was set to the maximum value for 'n',
not the maximum for 't', which is much smaller.
- The stack usage is actually larger than the entire kernel stack
on some architectures that can use 4KB stacks (m68k, sh, c6x), which
leads to an immediate overrun.
- The justification in the patch description claimed that nothing
changed, however that is not the case even without the two points above:
the configuration is machine specific, and most boards never use the
maximum BCH_ECC_WORDS() length but instead have something much smaller.
That maximum would only apply to machines that use both the maximum
block size and the maximum ECC strength.
The largest value for 't' that I could find is '32', which in turn leads
to a 60 byte array instead of 2048 bytes. Making it '64' for future
extension seems also worthwhile, with 120 bytes for the array. Anything
larger won't fit into the OOB area on NAND flash.
With that changed, the warning can be enabled again.
Only linux-4.19+ contains the breakage, so this is only needed
as a stable backport if it does not make it into the release.
Fixes: 02361bc778 ("lib/bch: Remove VLA usage")
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
This adds a helper to paste 2 pointer arrays together, useful for merging
various types of attribute arrays. There are a few places in the kernel
tree where this is open coded, and I just added one more in the STM class.
The naming is inspired by memset_p() and memcat(), and partial credit for
it goes to Andy Shevchenko.
This patch adds the function wrapped in a type-enforcing macro and a test
module.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull IDA updates from Matthew Wilcox:
"A better IDA API:
id = ida_alloc(ida, GFP_xxx);
ida_free(ida, id);
rather than the cumbersome ida_simple_get(), ida_simple_remove().
The new IDA API is similar to ida_simple_get() but better named. The
internal restructuring of the IDA code removes the bitmap
preallocation nonsense.
I hope the net -200 lines of code is convincing"
* 'ida-4.19' of git://git.infradead.org/users/willy/linux-dax: (29 commits)
ida: Change ida_get_new_above to return the id
ida: Remove old API
test_ida: check_ida_destroy and check_ida_alloc
test_ida: Convert check_ida_conv to new API
test_ida: Move ida_check_max
test_ida: Move ida_check_leaf
idr-test: Convert ida_check_nomem to new API
ida: Start new test_ida module
target/iscsi: Allocate session IDs from an IDA
iscsi target: fix session creation failure handling
drm/vmwgfx: Convert to new IDA API
dmaengine: Convert to new IDA API
ppc: Convert vas ID allocation to new IDA API
media: Convert entity ID allocation to new IDA API
ppc: Convert mmu context allocation to new IDA API
Convert net_namespace to new IDA API
cb710: Convert to new IDA API
rsxx: Convert to new IDA API
osd: Convert to new IDA API
sd: Convert to new IDA API
...
Patch series "add crc64 calculation as kernel library", v5.
This patchset adds basic implementation of crc64 calculation as a Linux
kernel library. Since bcache already does crc64 by itself, this patchset
also modifies bcache code to use the new crc64 library routine.
Currently bcache is the only user of crc64 calculation, another potential
user is bcachefs which is on the way to be in mainline kernel. Therefore
it makes sense to make crc64 calculation to be a public library.
bcache uses crc64 as storage checksum, if a change of crc lib routines
results an inconsistent result, the unmatched checksum may make bcache
'think' the on-disk is corrupted, such a change should be avoided or
detected as early as possible. Therefore a patch is being prepared which
adds a crc test framework, to check consistency of different calculations.
This patch (of 2):
Add the re-write crc64 calculation routines for Linux kernel. The CRC64
polynomical arithmetic follows ECMA-182 specification, inspired by CRC
paper of Dr. Ross N. Williams (see
http://www.ross.net/crc/download/crc_v3.txt) and other public domain
implementations.
All the changes work in this way,
- When Linux kernel is built, host program lib/gen_crc64table.c will be
compiled to lib/gen_crc64table and executed.
- The output of gen_crc64table execution is an array called as lookup
table (a.k.a POLY 0x42f0e1eba9ea369) which contain 256 64-bit long
numbers, this table is dumped into header file lib/crc64table.h.
- Then the header file is included by lib/crc64.c for normal 64bit crc
calculation.
- Function declaration of the crc64 calculation routines is placed in
include/linux/crc64.h
Currently bcache is the only user of crc64_be(), another potential user is
bcachefs which is on the way to be in mainline kernel. Therefore it makes
sense to move crc64 calculation into lib/crc64.c as public code.
[colyli@suse.de: fix review comments from v4]
Link: http://lkml.kernel.org/r/20180726053352.2781-2-colyli@suse.de
Link: http://lkml.kernel.org/r/20180718165545.1622-2-colyli@suse.de
Signed-off-by: Coly Li <colyli@suse.de>
Co-developed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Michael Lyle <mlyle@lyle.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Noah Massey <noah.massey@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull SCSI updates from James Bottomley:
"This is mostly updates to the usual drivers: mpt3sas, lpfc, qla2xxx,
hisi_sas, smartpqi, megaraid_sas, arcmsr.
In addition, with the continuing absence of Nic we have target updates
for tcmu and target core (all with reviews and acks).
The biggest observable change is going to be that we're (again) trying
to switch to mulitqueue as the default (a user can still override the
setting on the kernel command line).
Other major core stuff is the removal of the remaining Microchannel
drivers, an update of the internal timers and some reworks of
completion and result handling"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (203 commits)
scsi: core: use blk_mq_run_hw_queues in scsi_kick_queue
scsi: ufs: remove unnecessary query(DM) UPIU trace
scsi: qla2xxx: Fix issue reported by static checker for qla2x00_els_dcmd2_sp_done()
scsi: aacraid: Spelling fix in comment
scsi: mpt3sas: Fix calltrace observed while running IO & reset
scsi: aic94xx: fix an error code in aic94xx_init()
scsi: st: remove redundant pointer STbuffer
scsi: qla2xxx: Update driver version to 10.00.00.08-k
scsi: qla2xxx: Migrate NVME N2N handling into state machine
scsi: qla2xxx: Save frame payload size from ICB
scsi: qla2xxx: Fix stalled relogin
scsi: qla2xxx: Fix race between switch cmd completion and timeout
scsi: qla2xxx: Fix Management Server NPort handle reservation logic
scsi: qla2xxx: Flush mailbox commands on chip reset
scsi: qla2xxx: Fix unintended Logout
scsi: qla2xxx: Fix session state stuck in Get Port DB
scsi: qla2xxx: Fix redundant fc_rport registration
scsi: qla2xxx: Silent erroneous message
scsi: qla2xxx: Prevent sysfs access when chip is down
scsi: qla2xxx: Add longer window for chip reset
...
Pull networking updates from David Miller:
"Highlights:
- Gustavo A. R. Silva keeps working on the implicit switch fallthru
changes.
- Support 802.11ax High-Efficiency wireless in cfg80211 et al, From
Luca Coelho.
- Re-enable ASPM in r8169, from Kai-Heng Feng.
- Add virtual XFRM interfaces, which avoids all of the limitations of
existing IPSEC tunnels. From Steffen Klassert.
- Convert GRO over to use a hash table, so that when we have many
flows active we don't traverse a long list during accumluation.
- Many new self tests for routing, TC, tunnels, etc. Too many
contributors to mention them all, but I'm really happy to keep
seeing this stuff.
- Hardware timestamping support for dpaa_eth/fsl-fman from Yangbo Lu.
- Lots of cleanups and fixes in L2TP code from Guillaume Nault.
- Add IPSEC offload support to netdevsim, from Shannon Nelson.
- Add support for slotting with non-uniform distribution to netem
packet scheduler, from Yousuk Seung.
- Add UDP GSO support to mlx5e, from Boris Pismenny.
- Support offloading of Team LAG in NFP, from John Hurley.
- Allow to configure TX queue selection based upon RX queue, from
Amritha Nambiar.
- Support ethtool ring size configuration in aquantia, from Anton
Mikaev.
- Support DSCP and flowlabel per-transport in SCTP, from Xin Long.
- Support list based batching and stack traversal of SKBs, this is
very exciting work. From Edward Cree.
- Busyloop optimizations in vhost_net, from Toshiaki Makita.
- Introduce the ETF qdisc, which allows time based transmissions. IGB
can offload this in hardware. From Vinicius Costa Gomes.
- Add parameter support to devlink, from Moshe Shemesh.
- Several multiplication and division optimizations for BPF JIT in
nfp driver, from Jiong Wang.
- Lots of prepatory work to make more of the packet scheduler layer
lockless, when possible, from Vlad Buslov.
- Add ACK filter and NAT awareness to sch_cake packet scheduler, from
Toke Høiland-Jørgensen.
- Support regions and region snapshots in devlink, from Alex Vesker.
- Allow to attach XDP programs to both HW and SW at the same time on
a given device, with initial support in nfp. From Jakub Kicinski.
- Add TLS RX offload and support in mlx5, from Ilya Lesokhin.
- Use PHYLIB in r8169 driver, from Heiner Kallweit.
- All sorts of changes to support Spectrum 2 in mlxsw driver, from
Ido Schimmel.
- PTP support in mv88e6xxx DSA driver, from Andrew Lunn.
- Make TCP_USER_TIMEOUT socket option more accurate, from Jon
Maxwell.
- Support for templates in packet scheduler classifier, from Jiri
Pirko.
- IPV6 support in RDS, from Ka-Cheong Poon.
- Native tproxy support in nf_tables, from Máté Eckl.
- Maintain IP fragment queue in an rbtree, but optimize properly for
in-order frags. From Peter Oskolkov.
- Improvde handling of ACKs on hole repairs, from Yuchung Cheng"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1996 commits)
bpf: test: fix spelling mistake "REUSEEPORT" -> "REUSEPORT"
hv/netvsc: Fix NULL dereference at single queue mode fallback
net: filter: mark expected switch fall-through
xen-netfront: fix warn message as irq device name has '/'
cxgb4: Add new T5 PCI device ids 0x50af and 0x50b0
net: dsa: mv88e6xxx: missing unlock on error path
rds: fix building with IPV6=m
inet/connection_sock: prefer _THIS_IP_ to current_text_addr
net: dsa: mv88e6xxx: bitwise vs logical bug
net: sock_diag: Fix spectre v1 gadget in __sock_diag_cmd()
ieee802154: hwsim: using right kind of iteration
net: hns3: Add vlan filter setting by ethtool command -K
net: hns3: Set tx ring' tc info when netdev is up
net: hns3: Remove tx ring BD len register in hns3_enet
net: hns3: Fix desc num set to default when setting channel
net: hns3: Fix for phy link issue when using marvell phy driver
net: hns3: Fix for information of phydev lost problem when down/up
net: hns3: Fix for command format parsing error in hclge_is_all_function_id_zero
net: hns3: Add support for serdes loopback selftest
bnxt_en: take coredump_record structure off stack
...
Pull NAND updates from Miquel Raynal:
"
NAND core changes:
- Add the SPI-NAND framework.
- Create a helper to find the best ECC configuration.
- Create NAND controller operations.
- Allocate dynamically ONFI parameters structure.
- Add defines for ONFI version bits.
- Add manufacturer fixup for ONFI parameter page.
- Add an option to specify NAND chip as a boot device.
- Add Reed-Solomon error correction algorithm.
- Better name for the controller structure.
- Remove unused caller_is_module() definition.
- Make subop helpers return unsigned values.
- Expose _notsupp() helpers for raw page accessors.
- Add default values for dynamic timings.
- Kill the chip->scan_bbt() hook.
- Rename nand_default_bbt() into nand_create_bbt().
- Start to clean the nand_chip structure.
- Remove stale prototype from rawnand.h.
Raw NAND controllers drivers changes:
- Qcom: structuring cleanup.
- Denali: use core helper to find the best ECC configuration.
- Possible build of almost all drivers by adding a dependency on
COMPILE_TEST for almost all of them in Kconfig, implies various
fixes, Kconfig cleanup, GPIO headers inclusion cleanup, and even
changes in sparc64 and ia64 architectures.
- Clean the ->probe() functions error path of a lot of drivers.
- Migrate all drivers to use nand_scan() instead of
nand_scan_ident()/nand_scan_tail() pair.
- Use mtd_device_register() where applicable to simplify the code.
- Marvell:
* Handle on-die ECC.
* Better clocks handling.
* Remove bogus comment.
* Add suspend and resume support.
- Tegra: add NAND controller driver.
- Atmel:
* Add module param to avoid using dma.
* Drop Wenyou Yang from MAINTAINERS.
- Denali: optimize timings handling.
- FSMC: Stop using chip->read_buf().
- FSL:
* Switch to SPDX license tag identifiers.
* Fix qualifiers in MXC init functions.
Raw NAND chip drivers changes:
- Micron:
* Add fixup for ONFI revision.
* Update ecc_stats.corrected.
* Make ECC activation stateful.
* Avoid enabling/disabling ECC when it can't be disabled.
* Get the actual number of bitflips.
* Allow forced on-die ECC.
* Support 8/512 on-die ECC.
* Fix on-die ECC detection logic.
- Hynix:
* Fix decoding the OOB size on H27UCG8T2BTR.
* Use ->exec_op() in hynix_nand_reg_write_op().
"
Kalle Valo says:
====================
wireless-drivers-next patches for 4.19
The first set of patches for 4.19. Only smaller features and bug
fixes, not really anything major. Also included are changes to
include/linux/bitfield.h, we agreed with Johannes that it makes sense
to apply them via wireless-drivers-next.
Major changes:
ath10k
* support channel 173
* fix spectral scan for QCA9984 and QCA9888 chipsets
ath6kl
* add support for Dell Wireless 1537
ti wlcore
* add support for runtime PM
* enable runtime PM autosuspend support
qtnfmac
* support changing MAC address
* enable source MAC address randomization support
libertas
* fix suspend and resume for SDIO cards
mt76
* add software DFS radar pattern detector for mt76x2 based devices
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add tests for the bitfield helpers. The constant ones will all
be folded to nothing by the compiler (if everything is correct
in the header file), and the variable ones do some tests against
open-coding the necessary shifts.
A few test cases that should fail/warn compilation are provided
under ifdef.
Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Pull locking fixes from Thomas Gleixner:
"A set of fixes and updates for the locking code:
- Prevent lockdep from updating irq state within its own code and
thereby confusing itself.
- Buid fix for older GCCs which mistreat anonymous unions
- Add a missing lockdep annotation in down_read_non_onwer() which
causes up_read_non_owner() to emit a lockdep splat
- Remove the custom alpha dec_and_lock() implementation which is
incorrect in terms of ordering and use the generic one.
The remaining two commits are not strictly fixes. They provide irqsave
variants of atomic_dec_and_lock() and refcount_dec_and_lock(). These
are required to merge the relevant updates and cleanups into different
maintainer trees for 4.19, so routing them into mainline without
actual users is the sanest approach.
They should have been in -rc1, but last weekend I took the liberty to
just avoid computers in order to regain some mental sanity"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/qspinlock: Fix build for anonymous union in older GCC compilers
locking/lockdep: Do not record IRQ state within lockdep code
locking/rwsem: Fix up_read_non_owner() warning with DEBUG_RWSEMS
locking/refcounts: Implement refcount_dec_and_lock_irqsave()
atomic: Add irqsave variant of atomic_dec_and_lock()
alpha: Remove custom dec_and_lock() implementation
In the quest to remove all stack VLA usage from the kernel[1], this
allocates a fixed size stack array to cover the range needed for
bch. This was done instead of a preallocation on the SLAB due to
performance reasons, shown by Ivan Djelic:
little-endian, type sizes: int=4 long=8 longlong=8
cpu: Intel(R) Core(TM) i5 CPU 650 @ 3.20GHz
calibration: iter=4.9143µs niter=2034 nsamples=200 m=13 t=4
Buffer allocation | Encoding throughput (Mbit/s)
---------------------------------------------------
on-stack, VLA | 3988
on-stack, fixed | 4494
kmalloc | 1967
So this change actually improves performance too, it seems.
The resulting stack allocation can get rather large; without
CONFIG_BCH_CONST_PARAMS, it will allocate 4096 bytes, which
trips the stack size checking:
lib/bch.c: In function ‘encode_bch’:
lib/bch.c:261:1: warning: the frame size of 4432 bytes is larger than 2048 bytes [-Wframe-larger-than=]
Even the default case for "allmodconfig" (with CONFIG_BCH_CONST_M=14 and
CONFIG_BCH_CONST_T=4) would have started throwing a warning:
lib/bch.c: In function ‘encode_bch’:
lib/bch.c:261:1: warning: the frame size of 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=]
But this is how large it's always been; it was just hidden from
the checker because it was a VLA. So the Makefile has been adjusted to
silence this warning for anything smaller than 4500 bytes, which should
provide room for normal cases, but still low enough to catch any future
pathological situations.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ivan Djelic <ivan.djelic@parrot.com>
Tested-by: Ivan Djelic <ivan.djelic@parrot.com>
Acked-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Currently the code is split over various files with dma- prefixes in the
lib/ and drives/base directories, and the number of files keeps growing.
Move them into a single directory to keep the code together and remove
the file name prefixes. To match the irq infrastructure this directory
is placed under the kernel/ directory.
Signed-off-by: Christoph Hellwig <hch@lst.de>
We already have exact config symbols to select the direct, non-coherent,
or virt dma ops. So use the normal obj- scheme to select them.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Alpha provides a custom implementation of dec_and_lock(). The functions
is split into two parts:
- atomic_add_unless() + return 0 (fast path in assembly)
- remaining part including locking (slow path in C)
Comparing the result of the alpha implementation with the generic
implementation compiled by gcc it looks like the fast path is optimized
by avoiding a stack frame (and reloading the GP), register store and all
this. This is only done in the slowpath.
After marking the slowpath (atomic_dec_and_lock_1()) as "noinline" and
doing the slowpath in C (the atomic_add_unless(atomic, -1, 1) part) I
noticed differences in the resulting assembly:
- the GP is still reloaded
- atomic_add_unless() adds more memory barriers compared to the custom
assembly
- the custom assembly here does "load, sub, beq" while
atomic_add_unless() does "load, cmpeq, add, bne". This is okay because
it compares against zero after subtraction while the generic code
compares against 1 before.
I'm not sure if avoiding the stack frame (and GP reloading) brings a lot
in terms of performance. Regarding the different barriers, Peter
Zijlstra says:
|refcount decrement needs to be a RELEASE operation, such that all the
|load/stores to the object happen before we decrement the refcount.
|
|Otherwise things like:
|
| obj->foo = 5;
| refcnt_dec(&obj->ref);
|
|can be re-ordered, which then allows fun scenarios like:
|
| CPU0 CPU1
|
| refcnt_dec(&obj->ref);
| if (dec_and_test(&obj->ref))
| free(obj);
| obj->foo = 5; // oops UaF
|
|
|This means (for alpha) that there should be a memory barrier _before_
|the decrement, however the dec_and_lock asm thing only has one _after_,
|which, per the above, is too late.
|
|The generic version using add_unless will result in memory barrier
|before and after (because that is the rule for atomic ops with a return
|value) which is strictly too many barriers for the refcount story, but
|who knows what other ordering requirements code has.
Remove the custom alpha implementation of dec_and_lock() and if it is an
issue (performance wise) then the fast path could still be inlined.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: linux-alpha@vger.kernel.org
Link: https://lkml.kernel.org/r/20180606115918.GG12198@hirez.programming.kicks-ass.net
Link: https://lkml.kernel.org/r20180612161621.22645-2-bigeasy@linutronix.de
Pull MIPS updates from James Hogan:
"These are the main MIPS changes for 4.18.
Rough overview:
- MAINTAINERS: Add Paul Burton as MIPS co-maintainer
- Misc: Generic compiler intrinsics, Y2038 improvements, Perf+MT fixes
- Platform support: Netgear WNR1000 V3, Microsemi Ocelot integrated
switch, Ingenic watchdog cleanups
More detailed summary:
Maintainers:
- Add Paul Burton as MIPS co-maintainer, as I soon won't have access
to much MIPS hardware, nor enough time to properly maintain MIPS on
my own.
Miscellaneous:
- Use generic GCC library routines from lib/
- Add notrace to generic ucmpdi2 implementation
- Rename compiler intrinsic selects to GENERIC_LIB_*
- vmlinuz: Use generic ashldi3
- y2038: Convert update/read_persistent_clock() to *_clock64()
- sni: Remove read_persistent_clock()
- perf: Fix perf with MT counting other threads
- Probe for per-TC perf counters in cpu-probe.c
- Use correct VPE ID for VPE tracing
Minor cleanups:
- Avoid unneeded built-in.a in DTS dirs
- sc-debugfs: Re-use kstrtobool_from_user
- memset.S: Reinstate delay slot indentation
- VPE: Fix spelling "uneeded" -> "Unneeded"
Platform support:
BCM47xx:
- Add support for Netgear WNR1000 V3
- firmware: Support small NVRAM partitions
- Use __initdata for LEDs platform data
Ingenic:
- Watchdog driver & platform code improvements:
- Disable clock after stopping counter
- Use devm_* functions
- Drop module remove function
- Move platform reset code to restart handler in driver
- JZ4740: Convert watchdog instantiation to DT
- JZ4780: Fix watchdog DT node
- qi_lb60_defconfig: Enable watchdog driver
Microsemi:
- Ocelot: Add support for integrated switch
- pcb123: Connect phys to ports"
* tag 'mips_4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (30 commits)
MAINTAINERS: Add Paul Burton as MIPS co-maintainer
MIPS: ptrace: Make FPU context layout comments match reality
MIPS: memset.S: Reinstate delay slot indentation
MIPS: perf: Fix perf with MT counting other threads
MIPS: perf: Use correct VPE ID when setting up VPE tracing
MIPS: perf: More robustly probe for the presence of per-tc counters
MIPS: Probe for MIPS MT perf counters per TC
MIPS: mscc: Connect phys to ports on ocelot_pcb123
MIPS: mscc: Add switch to ocelot
MIPS: JZ4740: Drop old platform reset code
MIPS: qi_lb60: Enable the jz4740-wdt driver
MIPS: JZ4780: dts: Fix watchdog node
MIPS: JZ4740: dts: Add bindings for the jz4740-wdt driver
watchdog: JZ4740: Drop module remove function
watchdog: JZ4740: Register a restart handler
watchdog: JZ4740: Use devm_* functions
watchdog: JZ4740: Disable clock after stopping counter
MIPS: VPE: Fix spelling mistake: "uneeded" -> "unneeded"
MIPS: Re-use kstrtobool_from_user()
MIPS: Convert update_persistent_clock() to update_persistent_clock64()
...
Pull overflow updates from Kees Cook:
"This adds the new overflow checking helpers and adds them to the
2-factor argument allocators. And this adds the saturating size
helpers and does a treewide replacement for the struct_size() usage.
Additionally this adds the overflow testing modules to make sure
everything works.
I'm still working on the treewide replacements for allocators with
"simple" multiplied arguments:
*alloc(a * b, ...) -> *alloc_array(a, b, ...)
and
*zalloc(a * b, ...) -> *calloc(a, b, ...)
as well as the more complex cases, but that's separable from this
portion of the series. I expect to have the rest sent before -rc1
closes; there are a lot of messy cases to clean up.
Summary:
- Introduce arithmetic overflow test helper functions (Rasmus)
- Use overflow helpers in 2-factor allocators (Kees, Rasmus)
- Introduce overflow test module (Rasmus, Kees)
- Introduce saturating size helper functions (Matthew, Kees)
- Treewide use of struct_size() for allocators (Kees)"
* tag 'overflow-v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
treewide: Use struct_size() for devm_kmalloc() and friends
treewide: Use struct_size() for vmalloc()-family
treewide: Use struct_size() for kmalloc()-family
device: Use overflow helpers for devm_kmalloc()
mm: Use overflow helpers in kvmalloc()
mm: Use overflow helpers in kmalloc_array*()
test_overflow: Add memory allocation overflow tests
overflow.h: Add allocation size calculation helpers
test_overflow: Report test failures
test_overflow: macrofy some more, do more tests for free
lib: add runtime test of check_*_overflow functions
compiler.h: enable builtin overflow checkers and add fallback code
This adds a small module for testing that the check_*_overflow
functions work as expected, whether implemented in C or using gcc
builtins.
Example output:
test_overflow: u8 : 18 tests
test_overflow: s8 : 19 tests
test_overflow: u16: 17 tests
test_overflow: s16: 17 tests
test_overflow: u32: 17 tests
test_overflow: s32: 17 tests
test_overflow: u64: 17 tests
test_overflow: s64: 21 tests
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
[kees: add output to commit log, drop u64 tests on 32-bit]
Signed-off-by: Kees Cook <keescook@chromium.org>
Add a new dma_map_ops implementation that uses dma-direct for the
address mapping of streaming mappings, and which requires arch-specific
implemenations of coherent allocate/free.
Architectures have to provide flushing helpers to ownership trasnfers
to the device and/or CPU, and can provide optional implementations of
the coherent mmap functionality, and the cache_flush routines for
non-coherent long term allocations.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Alexey Brodkin <abrodkin@synopsys.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
This code is only used by sparc, and all new iommu drivers should use the
drivers/iommu/ framework. Also remove the unused exports.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>