Commit Graph

301 Commits

Author SHA1 Message Date
Dexuan Cui ca1c4b7457 Drivers: hv: vmbus: fix init_vp_index() for reloading hv_netvsc
This fixes the recent commit 3b71107d73:
Drivers: hv: vmbus: Further improve CPU affiliation logic

Without the fix, reloading hv_netvsc hangs the guest.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-20 22:44:51 -07:00
Vitaly Kuznetsov f39c4280a3 Drivers: hv: vmbus: use cpu_hotplug_enable/disable
Commit e513229b4c ("Drivers: hv: vmbus: prevent cpu offlining on newer
hypervisors") was altering smp_ops.cpu_disable to prevent CPU offlining.
We can bo better by using cpu_hotplug_enable/disable functions instead of
such hard-coding.

Reported-by: Radim Kr.má  <rkrcmar@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:46:44 -07:00
Dexuan Cui 042ab0313b Drivers: hv: vmbus: add a sysfs attr to show the binding of channel/VP
This is useful to analyze performance issue.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:44:29 -07:00
K. Y. Srinivasan ca9357bd26 Drivers: hv: vmbus: Implement a clocksource based on the TSC page
The current Hyper-V clock source is based on the per-partition reference counter
and this counter is being accessed via s synthetic MSR - HV_X64_MSR_TIME_REF_COUNT.
Hyper-V has a more efficient way of computing the per-partition reference
counter value that does not involve reading a synthetic MSR. We implement
a time source based on this mechanism.

Tested-by: Vivek Yadav <vyadav@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:44:29 -07:00
Viresh Kumar bc609cb47f drivers/hv: Migrate to new 'set-state' interface
Migrate hv driver to the new 'set-state' interface provided by
clockevents core, the earlier 'set-mode' interface is marked obsolete
now.

This also enables us to implement callbacks for new states of clockevent
devices, for example: ONESHOT_STOPPED.

Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: devel@linuxdriverproject.org
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:44:29 -07:00
Christopher Oo a5cca686ce Drivers: hv_vmbus: Fix signal to host condition
Fixes a bug where previously hv_ringbuffer_read would pass in the old
number of bytes available to read instead of the expected old read index
when calculating when to signal to the host that the ringbuffer is empty.
Since the previous write size is already saved, also changes the
hv_need_to_signal_on_read to use the previously read value rather than
recalculating it.

Signed-off-by: Christopher Oo <t-chriso@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:44:29 -07:00
Dexuan Cui 3b71107d73 Drivers: hv: vmbus: Further improve CPU affiliation logic
Keep track of CPU affiliations of sub-channels within the scope of the primary
channel. This will allow us to better distribute the load amongst available
CPUs.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:44:28 -07:00
K. Y. Srinivasan 9f01ec5345 Drivers: hv: vmbus: Improve the CPU affiliation for channels
The current code tracks the assigned CPUs within a NUMA node in the context of
the primary channel. So, if we have a VM with a single NUMA node with 8 VCPUs, we may
end up unevenly distributing the channel load. Fix the issue by tracking affiliations
globally.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:41:31 -07:00
Jake Oshins 3546448338 drivers:hv: Move MMIO range picking from hyper_fb to hv_vmbus
This patch deletes the logic from hyperv_fb which picked a range of MMIO space
for the frame buffer and adds new logic to hv_vmbus which picks ranges for
child drivers.  The new logic isn't quite the same as the old, as it considers
more possible ranges.

Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:41:31 -07:00
Jake Oshins 7f163a6fd9 drivers:hv: Modify hv_vmbus to search for all MMIO ranges available.
This patch changes the logic in hv_vmbus to record all of the ranges in the
VM's firmware (BIOS or UEFI) that offer regions of memory-mapped I/O space for
use by paravirtual front-end drivers.  The old logic just found one range
above 4GB and called it good.  This logic will find any ranges above 1MB.

It would have been possible with this patch to just use existing resource
allocation functions, rather than keep track of the entire set of Hyper-V
related MMIO regions in VMBus.  This strategy, however, is not sufficient
when the resource allocator needs to be aware of the constraints of a
Hyper-V virtual machine, which is what happens in the next patch in the series.
So this first patch exists to show the first steps in reworking the MMIO
allocation paths for Hyper-V front-end drivers.

Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:41:30 -07:00
K. Y. Srinivasan 379e4f756b Drivers: hv: vmbus: Consider ND NIC in binding channels to CPUs
We cycle through all the "high performance" channels to distribute
load across the available CPUs. Process the NetworkDirect as a
high performance device.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:30:44 -07:00
Denis V. Lunev cc2dd4027a mshyperv: fix recognition of Hyper-V guest crash MSR's
Hypervisor Top Level Functional Specification v3.1/4.0 notes that cpuid
(0x40000003) EDX's 10th bit should be used to check that Hyper-V guest
crash MSR's functionality available.

This patch should fix this recognition. Currently the code checks EAX
register instead of EDX.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:30:44 -07:00
Vitaly Kuznetsov 4a54243fc0 Drivers: hv: vmbus: don't send CHANNELMSG_UNLOAD on pre-Win2012R2 hosts
Pre-Win2012R2 hosts don't properly handle CHANNELMSG_UNLOAD and
wait_for_completion() hangs. Avoid sending such request on old hosts.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:29:45 -07:00
Nik Nyby e26009aad0 Drivers: hv: vmbus: fix typo in hv_port_info struct
This fixes a typo: base_flag_bumber to base_flag_number

Signed-off-by: Nik Nyby <nikolas@gnu.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:29:45 -07:00
Dan Carpenter 9dd6a06430 hv: util: checking the wrong variable
We don't catch this allocation failure because there is a typo and we
check the wrong variable.

Fixes: 14b50f80c3 ('Drivers: hv: util: introduce hv_utils_transport abstraction')

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:29:45 -07:00
K. Y. Srinivasan b81658cf5d Drivers: hv: vmbus: Permit sending of packets without payload
The guest may have to send a completion packet back to the host.
To support this usage, permit sending a packet without a payload -
we would be only sending the descriptor in this case.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:28:39 -07:00
Alex Ng b6ddeae160 Drivers: hv: balloon: Enable dynamic memory protocol negotiation with Windows 10 hosts
Support Win10 protocol for Dynamic Memory. Thia patch allows guests on Win10 hosts
to hot-add memory even when dynamic memory is not enabled on the guest.

Signed-off-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:28:39 -07:00
Vitaly Kuznetsov 25ef06fe27 Drivers: hv: fcopy: dynamically allocate smsg_out in fcopy_send_data()
struct hv_start_fcopy is too big to be on stack on i386, the following
warning is reported:

>> drivers/hv/hv_fcopy.c:159:1: warning: the frame size of 1088 bytes is larger than 1024 bytes [-Wframe-larger-than=]

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:28:38 -07:00
Vitaly Kuznetsov b36fda3397 Drivers: hv: kvp: check kzalloc return value
kzalloc() return value check was accidentally lost in 11bc3a5fa9:
"Drivers: hv: kvp: convert to hv_utils_transport" commit.

We don't need to reset kvp_transaction.state here as we have the
kvp_timeout_func() timeout function and in case we're in OOM situation
it is preferable to wait.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:28:38 -07:00
Vitaly Kuznetsov 510f7aef65 Drivers: hv: vmbus: prefer 'die' notification chain to 'panic'
current_pt_regs() sometimes returns regs of the userspace process and in
case of a kernel crash this is not what we need to report. E.g. when we
trigger crash with sysrq we see the following:
...
 RIP: 0010:[<ffffffff815b8696>]  [<ffffffff815b8696>] sysrq_handle_crash+0x16/0x20
 RSP: 0018:ffff8800db0a7d88  EFLAGS: 00010246
 RAX: 000000000000000f RBX: ffffffff820a0660 RCX: 0000000000000000
...
at the same time current_pt_regs() give us:
ip=7f899ea7e9e0, ax=ffffffffffffffda, bx=26c81a0, cx=7f899ea7e9e0, ...
These registers come from the userspace process triggered the crash. As we
don't even know which process it was this information is rather useless.

When kernel crash happens through 'die' proper regs are being passed to
all receivers on the die_chain (and panic_notifier_list is being notified
with the string passed to panic() only). If panic() is called manually
(e.g. on BUG()) we won't get 'die' notification so keep the 'panic'
notification reporter as well but guard against double reporting.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:28:38 -07:00
Vitaly Kuznetsov b4370df2b1 Drivers: hv: vmbus: add special crash handler
Full kernel hang is observed when kdump kernel starts after a crash. This
hang happens in vmbus_negotiate_version() function on
wait_for_completion() as Hyper-V host (Win2012R2 in my testing) never
responds to CHANNELMSG_INITIATE_CONTACT as it thinks the connection is
already established. We need to perform some mandatory minimalistic
cleanup before we start new kernel.

Reported-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:28:38 -07:00
Vitaly Kuznetsov d7646eaa76 Drivers: hv: don't do hypercalls when hypercall_page is NULL
At the very late stage of kexec a driver (which are not being unloaded) can
try to post a message or signal an event. This will crash the kernel as we
already did hv_cleanup() and the hypercall page is NULL.

Move all common (between 32 and 64 bit code) declarations to the beginning
of the do_hypercall() function. Unfortunately we have to write the
!hypercall_page check twice to not mix declarations and code.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:25:29 -07:00
Vitaly Kuznetsov 2517281d63 Drivers: hv: vmbus: add special kexec handler
When general-purpose kexec (not kdump) is being performed in Hyper-V guest
the newly booted kernel fails with an MCE error coming from the host. It
is the same error which was fixed in the "Drivers: hv: vmbus: Implement
the protocol for tearing down vmbus state" commit - monitor pages remain
special and when they're being written to (as the new kernel doesn't know
these pages are special) bad things happen. We need to perform some
minimalistic cleanup before booting a new kernel on kexec. To do so we
need to register a special machine_ops.shutdown handler to be executed
before the native_machine_shutdown(). Registering a shutdown notification
handler via the register_reboot_notifier() call is not sufficient as it
happens to early for our purposes. machine_ops is not being exported to
modules (and I don't think we want to export it) so let's do this in
mshyperv.c

The minimalistic cleanup consists of cleaning up clockevents, synic MSRs,
guest os id MSR, and hypercall MSR.

Kdump doesn't require all this stuff as it lives in a separate memory
space.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:25:29 -07:00
Vitaly Kuznetsov 06210b42f3 Drivers: hv: vmbus: remove hv_synic_free_cpu() call from hv_synic_cleanup()
We already have hv_synic_free() which frees all per-cpu pages for all
CPUs, let's remove the hv_synic_free_cpu() call from hv_synic_cleanup()
so it will be possible to do separate cleanup (writing to MSRs) and final
freeing. This is going to be used to assist kexec.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:25:28 -07:00
K. Y. Srinivasan 294409d205 Drivers: hv: vmbus: Allocate ring buffer memory in NUMA aware fashion
Allocate ring buffer memory from the NUMA node assigned to the channel.
Since this is a performance and not a correctness issue, if the node specific
allocation were to fail, fall back and allocate without specifying the node.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-06-12 16:58:33 -07:00