Reorganize the E400 detection now that we have everything in place:
switch the CPUs to broadcast mode after the LAPIC has been initialized
and remove the facilities that were used previously on the idle path.
Unfortunately static_cpu_has_bug() cannpt be used in the E400 idle routine
because alternatives have been applied when the actual detection happens,
so the static switching does not take effect and the test will stay
false. Use boot_cpu_has_bug() instead which is definitely an improvement
over the RDMSR and the cpumask handling.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20161209182912.2726-5-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
AMD CPUs affected by the E400 erratum suffer from the issue that the
local APIC timer stops when the CPU goes into C1E. Unfortunately there
is no way to detect the affected CPUs on early boot. It's only possible
to determine the range of possibly affected CPUs from the family/model
range.
The actual decision whether to enter C1E and thus cause the bug is done
by the firmware and we need to detect that case late, after ACPI has
been initialized.
The current solution is to check in the idle routine whether the CPU is
affected by reading the MSR_K8_INT_PENDING_MSG MSR and checking for the
K8_INTP_C1E_ACTIVE_MASK bits. If one of the bits is set then the CPU is
affected and the system is switched into forced broadcast mode.
This is ineffective and on non-affected CPUs every entry to idle does
the extra RDMSR.
After doing some research it turns out that the bits are visible on the
boot CPU right after the ACPI subsystem is initialized in the early
boot process. So instead of polling for the bits in the idle loop, add
a detection function after acpi_subsystem_init() and check for the MSR
bits. If set, then the X86_BUG_AMD_APIC_C1E is set on the boot CPU and
the TSC is marked unstable when X86_FEATURE_NONSTOP_TSC is not set as it
will stop in C1E state as well.
The switch to broadcast mode cannot be done at this point because the
boot CPU still uses HPET as a clockevent device and the local APIC timer
is not yet calibrated and installed. The switch to broadcast mode on the
affected CPUs needs to be done when the local APIC timer is actually set
up.
This allows to cleanup the amd_e400_idle() function in the next step.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20161209182912.2726-4-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The workaround for the AMD Erratum E400 (Local APIC timer stops in C1E
state) is a two step process:
- Selection of the E400 aware idle routine
- Detection whether the platform is affected
The idle routine selection happens for possibly affected CPUs depending on
family/model/stepping information. These range of CPUs is not necessarily
affected as the decision whether to enable the C1E feature is made by the
firmware. Unfortunately there is no way to query this at early boot.
The current implementation polls a MSR in the E400 aware idle routine to
detect whether the CPU is affected. This is inefficient on non affected
CPUs because every idle entry has to do the MSR read.
There is a better way to detect this before going idle for the first time
which requires to seperate the bug flags:
X86_BUG_AMD_E400 - Selects the E400 aware idle routine and
enables the detection
X86_BUG_AMD_APIC_C1E - Set when the platform is affected by E400
Replace the current X86_BUG_AMD_APIC_C1E usage by the new X86_BUG_AMD_E400
bug bit to select the idle routine which currently does an unconditional
detection poll. X86_BUG_AMD_APIC_C1E is going to be used in later patches
to remove the MSR polling and simplify the handling of this misfeature.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20161209182912.2726-3-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In preparation for removing the idle_notifier, remove its only user, the
i7300_idle driver.
i7300_idle was deployed in 2008 to reduce idle memory power on systems
using the i7300 chipset. The driver worked by throttling the
fully-buffered DIMMs during idle periods using the IOAT DMA engine.
The driver ran only on the i7300 chip-set, and no other hardware has used
this mechanism. The driver no longer has a maintainer.
Removing this driver will increase idle power on i7300 systems when they
run the new kernel without the driver.
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/ad6a044e57cc75f44cc8621abe846e58f7882243.1479449716.git.len.brown@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull KVM fixes from Paolo Bonzini:
"ARM fixes. There are a couple pending x86 patches but they'll have to
wait for next week"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: arm/arm64: vgic: Kick VCPUs when queueing already pending IRQs
KVM: arm/arm64: vgic: Prevent access to invalid SPIs
arm/arm64: KVM: Perform local TLB invalidation when multiplexing vcpus on a single CPU
Merge media fixes from Mauro Carvalho Chehab:
"This contains two patches fixing problems with my patch series meant
to make USB drivers to work again after the DMA on stack changes.
The last patch on this series is actually not related to DMA on stack.
It solves a longstanding bug affecting module unload, causing
module_put() to be called twice. It was reported by the user who
reported and tested the issues with the gp8psk driver with the DMA
fixup patches. As we're late at -rc cycle, maybe you prefer to not
apply it right now. If this is the case, I'll add to the pile of
patches for 4.10.
Exceptionally this time, I'm sending the patches via e-mail, because
I'm on another trip, and won't be able to use the usual procedure
until Monday. Also, it is only three patches, and you followed already
the discussions about the first one"
* emailed patches from Mauro Carvalho Chehab <mchehab@osg.samsung.com>:
gp8psk: Fix DVB frontend attach
gp8psk: fix gp8psk_usb_in_op() logic
dvb-usb: move data_mutex to struct dvb_usb_device
Pull char/misc fixes from Greg KH:
"Here are three small driver fixes for some reported issues for
4.9-rc5.
One for the hyper-v subsystem, fixing up a naming issue that showed up
in 4.9-rc1, one mei driver fix, and one fix for parallel ports,
resolving a reported regression.
All have been in linux-next with no reported issues"
* tag 'char-misc-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
ppdev: fix double-free of pp->pdev->name
vmbus: make sysfs names consistent with PCI
mei: bus: fix received data size check in NFC fixup
Pull driver core fixes from Greg KH:
"Here are two driver core fixes for 4.9-rc5.
The first resolves an issue with some drivers not liking to be unbound
and bound again (if CONFIG_DEBUG_TEST_DRIVER_REMOVE is enabled), which
solves some reported problems with graphics and storage drivers. The
other resolves a smatch error with the 4.9-rc1 driver core changes
around this feature.
Both have been in linux-next with no reported issues"
* tag 'driver-core-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: fix smatch warning on dev->bus check
driver core: skip removal test for non-removable drivers
Pull staging/IIO fixes from Grek KH:
"Here are a few small staging and iio driver fixes for reported issues.
The last one was cherry-picked from my -next branch to resolve a build
warning that Arnd fixed, in his quest to be able to turn
-Wmaybe-uninitialized back on again. That patch, and all of the
others, have been in linux-next for a while with no reported issues"
* tag 'staging-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
iio: maxim_thermocouple: detect invalid storage size in read()
staging: nvec: remove managed resource from PS2 driver
Revert "staging: nvec: ps2: change serio type to passthrough"
drivers: staging: nvec: remove bogus reset command for PS/2 interface
staging: greybus: arche-platform: fix device reference leak
staging: comedi: ni_tio: fix buggy ni_tio_clock_period_ps() return value
staging: sm750fb: Fix bugs introduced by early commits
iio: hid-sensors: Increase the precision of scale to fix wrong reading interpretation.
iio: orientation: hid-sensor-rotation: Add PM function (fix non working driver)
iio: st_sensors: fix scale configuration for h3lis331dl
staging: iio: ad5933: avoid uninitialized variable in error case
Pull USB / PHY fixes from Greg KH:
"Here are a number of small USB and PHY driver fixes for 4.9-rc5
Nothing major, just small fixes for reported issues, all of these have
been in linux-next for a while with no reported issues"
* tag 'usb-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: cdc-acm: fix TIOCMIWAIT
cdc-acm: fix uninitialized variable
drivers/usb: Skip auto handoff for TI and RENESAS usb controllers
usb: musb: remove duplicated actions
usb: musb: da8xx: Don't print phy error on -EPROBE_DEFER
phy: sun4i: check PMU presence when poking unknown bit of pmu
phy-rockchip-pcie: remove deassert of phy_rst from exit callback
phy: da8xx-usb: rename the ohci device to ohci-da8xx
phy: Add reset callback for not generic phy
uwb: fix device reference leaks
usb: gadget: u_ether: remove interrupt throttling
usb: dwc3: st: add missing <linux/pinctrl/consumer.h> include
usb: dwc3: Fix error handling for core init
Pull more block fixes from Jens Axboe:
"Since I mistakenly left out the lightnvm regression fix yesterday and
the aoeblk seems adequately tested at this point, might as well send
out another pull to make -rc5"
* 'for-linus' of git://git.kernel.dk/linux-block:
aoe: fix crash in page count manipulation
lightnvm: invalid offset calculation for lba_shift
Pull SCSI fixes from James Bottomley:
"The megaraid_sas patch in here fixes a major regression in the last
fix set that made all megaraid_sas cards unusable. It turns out no-one
had actually tested such an "obvious" fix, sigh. The fix for the fix
has been tested ...
The next most serious is the vmw_pvscsi abort problem which basically
means that aborts don't work on the vmware paravirt devices and error
handling always escalates to reset.
The rest are an assortment of missed reference counting in certain
paths and corner case bugs that show up on some architectures"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: megaraid_sas: fix macro MEGASAS_IS_LOGICAL to avoid regression
scsi: qla2xxx: fix invalid DMA access after command aborts in PCI device remove
scsi: qla2xxx: do not queue commands when unloading
scsi: libcxgbi: fix incorrect DDP resource cleanup
scsi: qla2xxx: Fix scsi scan hang triggered if adapter fails during init
scsi: scsi_dh_alua: Fix a reference counting bug
scsi: vmw_pvscsi: return SUCCESS for successful command aborts
scsi: mpt3sas: Fix for block device of raid exists even after deleting raid disk
scsi: scsi_dh_alua: fix missing kref_put() in alua_rtpg_work()
Pull clk fixes from Stephen Boyd:
"The typical collection of minor bug fixes in clk drivers. We don't
have anything in the core framework here, just driver fixes.
There's a boot fix for Samsung devices and a safety measure for qoriq
to prevent CPUs from running too fast. There's also a fix for i.MX6Q
to properly handle audio clock rates. We also have some "that's
obviously wrong" fixes like bad NULL pointer checks in the MPP driver
and a poor usage of __pa in the xgene clk driver that are fixed here"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: mmp: pxa910: fix return value check in pxa910_clk_init()
clk: mmp: pxa168: fix return value check in pxa168_clk_init()
clk: mmp: mmp2: fix return value check in mmp2_clk_init()
clk: qoriq: Don't allow CPU clocks higher than starting value
clk: imx: fix integer overflow in AV PLL round rate
clk: xgene: Don't call __pa on ioremaped address
clk/samsung: Use CLK_OF_DECLARE_DRIVER initialization method for CLKOUT
clk: rockchip: don't return NULL when failing to register ddrclk branch
The DVB binding schema at the DVB core assumes that the frontend is a
separate driver. Faling to do that causes OOPS when the module is
removed, as it tries to do a symbol_put_addr on an internal symbol,
causing craches like:
WARNING: CPU: 1 PID: 28102 at kernel/module.c:1108 module_put+0x57/0x70
Modules linked in: dvb_usb_gp8psk(-) dvb_usb dvb_core nvidia_drm(PO) nvidia_modeset(PO) snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore nvidia(PO) [last unloaded: rc_core]
CPU: 1 PID: 28102 Comm: rmmod Tainted: P WC O 4.8.4-build.1 #1
Hardware name: MSI MS-7309/MS-7309, BIOS V1.12 02/23/2009
Call Trace:
dump_stack+0x44/0x64
__warn+0xfa/0x120
module_put+0x57/0x70
module_put+0x57/0x70
warn_slowpath_null+0x23/0x30
module_put+0x57/0x70
gp8psk_fe_set_frontend+0x460/0x460 [dvb_usb_gp8psk]
symbol_put_addr+0x27/0x50
dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]
From Derek's tests:
"Attach bug is fixed, tuning works, module unloads without
crashing. Everything seems ok!"
Reported-by: Derek <user.vdr@gmail.com>
Tested-by: Derek <user.vdr@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit bc29131ecb10 ("[media] gp8psk: don't do DMA on stack") fixed the
usage of DMA on stack, but the memcpy was wrong for gp8psk_usb_in_op().
Fix it.
From Derek's email:
"Fix confirmed using 2 different Skywalker models with
HD mpeg4, SD mpeg2."
Suggested-by: Johannes Stezenbach <js@linuxtv.org>
Fixes: bc29131ecb10 ("[media] gp8psk: don't do DMA on stack")
Tested-by: Derek <user.vdr@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As found by gcc -Wmaybe-uninitialized, having a storage_bytes value other
than 2 or 4 will result in undefined behavior:
drivers/iio/temperature/maxim_thermocouple.c: In function 'maxim_thermocouple_read':
drivers/iio/temperature/maxim_thermocouple.c:141:5: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized]
This probably cannot happen, but returning -EINVAL here is appropriate
and makes gcc happy and the code more robust.
Fixes: 231147ee77 ("iio: maxim_thermocouple: Align 16 bit big endian value of raw reads")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
(cherry picked from commit 32cb7d27e6)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
aoeblk contains some mysterious code, that wants to elevate the bio
vec page counts while it's under IO. That is not needed, it's
fragile, and it's causing kernel oopses for some.
Reported-by: Tested-by: Don Koch <kochd@us.ibm.com>
Tested-by: Tested-by: Don Koch <kochd@us.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The ns->lba_shift assumes its value to be the logarithmic of the
LA size. A previous patch duplicated the lba_shift calculation into
lightnvm. It prematurely also subtracted a 512byte shift, which commonly
is applied per-command. The 512byte shift being subtracted twice led to
data loss when restoring the logical to physical mapping table from
device and when issuing I/O commands using rrpc.
Fix offset by removing the 512byte shift subtraction when calculating
lba_shift.
Fixes: b0b4e09c1a "lightnvm: control life of nvm_dev in driver"
Reported-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>