Commit Graph

374 Commits

Author SHA1 Message Date
Wei Liu
ccc9d90a9a xenbus_client: Extend interface to support multi-page ring
Originally Xen PV drivers only use single-page ring to pass along
information. This might limit the throughput between frontend and
backend.

The patch extends Xenbus driver to support multi-page ring, which in
general should improve throughput if ring is the bottleneck. Changes to
various frontend / backend to adapt to the new interface are also
included.

Affected Xen drivers:
* blkfront/back
* netfront/back
* pcifront/back
* scsifront/back
* vtpmfront

The interface is documented, as before, in xenbus_client.c.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Cc: Konrad Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-04-15 10:56:47 +01:00
David Vrabel
4e8c0c8c4b xen/privcmd: improve performance of MMAPBATCH_V2
Make the IOCTL_PRIVCMD_MMAPBATCH_V2 (and older V1 version) map
multiple frames at a time rather than one at a time, despite the pages
being non-consecutive GFNs.

xen_remap_foreign_mfn_array() is added which maps an array of GFNs
(instead of a consecutive range of GFNs).

Since per-frame errors are returned in an array, privcmd must set the
MMAPBATCH_V1 error bits as part of the "report errors" phase, after
all the frames are mapped.

Migrate times are significantly improved (when using a PV toolstack
domain).  For example, for an idle 12 GiB PV guest:

        Before     After
  real  0m38.179s  0m26.868s
  user  0m15.096s  0m13.652s
  sys   0m28.988s  0m18.732s

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2015-03-16 14:49:15 +00:00
David Vrabel
628c28eefd xen: unify foreign GFN map/unmap for auto-xlated physmap guests
Auto-translated physmap guests (arm, arm64 and x86 PVHVM/PVH) map and
unmap foreign GFNs using the same method (updating the physmap).
Unify the two arm and x86 implementations into one commont one.

Note that on arm and arm64, the correct error code will be returned
(instead of always -EFAULT) and map/unmap failure warnings are no
longer printed.  These changes are required if the foreign domain is
paging (-ENOENT failures are expected and must be propagated up to the
caller).

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2015-03-16 14:49:15 +00:00
Juergen Gross
16b12d6057 xen: synchronize include/xen/interface/xen.h with xen
The header include/xen/interface/xen.h doesn't contain all definitions
from Xen's version of that header. Update it accordingly.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-03-16 14:49:13 +00:00
Yuval Shaia
604b91fee4 xen: Remove trailing semicolon from xenbus_register_frontend() definition
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-03-02 10:38:59 +00:00
David Vrabel
fdfd811ddd x86/xen: allow privcmd hypercalls to be preempted
Hypercalls submitted by user space tools via the privcmd driver can
take a long time (potentially many 10s of seconds) if the hypercall
has many sub-operations.

A fully preemptible kernel may deschedule such as task in any upcall
called from a hypercall continuation.

However, in a kernel with voluntary or no preemption, hypercall
continuations in Xen allow event handlers to be run but the task
issuing the hypercall will not be descheduled until the hypercall is
complete and the ioctl returns to user space.  These long running
tasks may also trigger the kernel's soft lockup detection.

Add xen_preemptible_hcall_begin() and xen_preemptible_hcall_end() to
bracket hypercalls that may be preempted.  Use these in the privcmd
driver.

When returning from an upcall, call xen_maybe_preempt_hcall() which
adds a schedule point if if the current task was within a preemptible
hypercall.

Since _cond_resched() can move the task to a different CPU, clear and
set xen_in_preemptible_hcall around the call.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2015-02-23 16:30:24 +00:00
Linus Torvalds
c5ce28df0e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) More iov_iter conversion work from Al Viro.

    [ The "crypto: switch af_alg_make_sg() to iov_iter" commit was
      wrong, and this pull actually adds an extra commit on top of the
      branch I'm pulling to fix that up, so that the pre-merge state is
      ok.   - Linus ]

 2) Various optimizations to the ipv4 forwarding information base trie
    lookup implementation.  From Alexander Duyck.

 3) Remove sock_iocb altogether, from CHristoph Hellwig.

 4) Allow congestion control algorithm selection via routing metrics.
    From Daniel Borkmann.

 5) Make ipv4 uncached route list per-cpu, from Eric Dumazet.

 6) Handle rfs hash collisions more gracefully, also from Eric Dumazet.

 7) Add xmit_more support to r8169, e1000, and e1000e drivers.  From
    Florian Westphal.

 8) Transparent Ethernet Bridging support for GRO, from Jesse Gross.

 9) Add BPF packet actions to packet scheduler, from Jiri Pirko.

10) Add support for uniqu flow IDs to openvswitch, from Joe Stringer.

11) New NetCP ethernet driver, from Muralidharan Karicheri and Wingman
    Kwok.

12) More sanely handle out-of-window dupacks, which can result in
    serious ACK storms.  From Neal Cardwell.

13) Various rhashtable bug fixes and enhancements, from Herbert Xu,
    Patrick McHardy, and Thomas Graf.

14) Support xmit_more in be2net, from Sathya Perla.

15) Group Policy extensions for vxlan, from Thomas Graf.

16) Remove Checksum Offload support for vxlan, from Tom Herbert.

17) Like ipv4, support lockless transmit over ipv6 UDP sockets.  From
    Vlad Yasevich.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1494+1 commits)
  crypto: fix af_alg_make_sg() conversion to iov_iter
  ipv4: Namespecify TCP PMTU mechanism
  i40e: Fix for stats init function call in Rx setup
  tcp: don't include Fast Open option in SYN-ACK on pure SYN-data
  openvswitch: Only set TUNNEL_VXLAN_OPT if VXLAN-GBP metadata is set
  ipv6: Make __ipv6_select_ident static
  ipv6: Fix fragment id assignment on LE arches.
  bridge: Fix inability to add non-vlan fdb entry
  net: Mellanox: Delete unnecessary checks before the function call "vunmap"
  cxgb4: Add support in cxgb4 to get expansion rom version via ethtool
  ethtool: rename reserved1 memeber in ethtool_drvinfo for expansion ROM version
  net: dsa: Remove redundant phy_attach()
  IB/mlx4: Reset flow support for IB kernel ULPs
  IB/mlx4: Always use the correct port for mirrored multicast attachments
  net/bonding: Fix potential bad memory access during bonding events
  tipc: remove tipc_snprintf
  tipc: nl compat add noop and remove legacy nl framework
  tipc: convert legacy nl stats show to nl compat
  tipc: convert legacy nl net id get to nl compat
  tipc: convert legacy nl net id set to nl compat
  ...
2015-02-10 20:01:30 -08:00
David Vrabel
923b2919e2 xen/gntdev: mark userspace PTEs as special on x86 PV guests
In an x86 PV guest, get_user_pages_fast() on a userspace address range
containing foreign mappings does not work correctly because the M2P
lookup of the MFN from a userspace PTE may return the wrong page.

Force get_user_pages_fast() to fail on such addresses by marking the PTEs
as special.

If Xen has XENFEAT_gnttab_map_avail_bits (available since at least
4.0), we can do so efficiently in the grant map hypercall.  Otherwise,
it needs to be done afterwards.  This is both inefficient and racy
(the mapping is visible to the task before we fixup the PTEs), but
will be fine for well-behaved applications that do not use the mapping
until after the mmap() system call returns.

Guests with XENFEAT_auto_translated_physmap (ARM and x86 HVM or PVH)
do not need this since get_user_pages() has always worked correctly
for them.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2015-01-28 14:04:21 +00:00
Jennifer Herbert
3f9f1c6757 xen/grant-table: add a mechanism to safely unmap pages that are in use
Introduce gnttab_unmap_refs_async() that can be used to safely unmap
pages that may be in use (ref count > 1).  If the pages are in use the
unmap is deferred and retried later.  This polling is not very clever
but it should be good enough if the cases where the delay is necessary
are rare.

The initial delay is 5 ms and is increased linearly on each subsequent
retry (to reduce load if the page is in use for a long time).

This is needed to allow block backends using grant mapping to safely
use network storage (block or filesystem based such as iSCSI or NFS).

The network storage driver may complete a block request whilst there
is a queued network packet retry (because the ack from the remote end
races with deciding to queue the retry).  The pages for the retried
packet would be grant unmapped and the network driver (or hardware)
would access the unmapped page.

Signed-off-by: Jennifer Herbert <jennifer.herbert@citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-01-28 14:03:14 +00:00
Jennifer Herbert
8da7633f16 xen: mark grant mapped pages as foreign
Use the "foreign" page flag to mark pages that have a grant map.  Use
page->private to store information of the grant (the granting domain
and the grant reference).

Signed-off-by: Jennifer Herbert <jennifer.herbert@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-01-28 14:03:12 +00:00
David Vrabel
ff4b156f16 xen/grant-table: add helpers for allocating pages
Add gnttab_alloc_pages() and gnttab_free_pages() to allocate/free pages
suitable to for granted maps.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2015-01-28 14:03:12 +00:00
David Vrabel
853d028934 xen/grant-table: pre-populate kernel unmap ops for xen_gnttab_unmap_refs()
When unmapping grants, instead of converting the kernel map ops to
unmap ops on the fly, pre-populate the set of unmap ops.

This allows the grant unmap for the kernel mappings to be trivially
batched in the future.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2015-01-28 14:03:10 +00:00
David S. Miller
3f3558bb51 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/xen-netfront.c

Minor overlapping changes in xen-netfront.c, mostly to do
with some buffer management changes alongside the split
of stats into TX and RX.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15 00:53:17 -05:00
David Vrabel
28e98c2c20 xen: add page_to_mfn()
pfn_to_mfn(page_to_pfn(p)) is a common use case so add a generic
helper for it.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-14 00:22:00 -05:00
Jan Beulich
f221b04fe0 x86/xen: properly retrieve NMI reason
Using the native code here can't work properly, as the hypervisor would
normally have cleared the two reason bits by the time Dom0 gets to see
the NMI (if passed to it at all). There's a shared info field for this,
and there's an existing hook to use - just fit the two together. This
is particularly relevant so that NMIs intended to be handled by APEI /
GHES actually make it to the respective handler.

Note that the hook can (and should) be used irrespective of whether
being in Dom0, as accessing port 0x61 in a DomU would be even worse,
while the shared info field would just hold zero all the time. Note
further that hardware NMI handling for PVH doesn't currently work
anyway due to missing code in the hypervisor (but it is expected to
work the native rather than the PV way).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-01-13 09:39:50 +00:00
Stefano Stabellini
da095a9960 xen/arm: introduce GNTTABOP_cache_flush
Introduce support for new hypercall GNTTABOP_cache_flush.
Use it to perform cache flashing on pages used for dma when necessary.

If GNTTABOP_cache_flush is supported by the hypervisor, we don't need to
bounce dma map operations that involve foreign grants and non-coherent
devices.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
2014-12-04 12:41:54 +00:00
Stefano Stabellini
e9e87eb3f9 xen/arm: remove handling of XENFEAT_grant_map_identity
The feature has been removed from Xen. Also Linux cannot use it on ARM32
without CONFIG_ARM_LPAE.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
2014-12-04 12:41:45 +00:00
David Vrabel
95afae4814 xen: remove DEFINE_XENBUS_DRIVER() macro
The DEFINE_XENBUS_DRIVER() macro looks a bit weird and causes sparse
errors.

Replace the uses with standard structure definitions instead.  This is
similar to pci and usb device registration.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-10-06 10:27:57 +01:00
Juergen Gross
bca9b68558 xen: sync some headers with xen tree
To be able to use an initially unmapped initrd with xen the following
header files must be synced to a newer version from the xen tree:

include/xen/interface/elfnote.h
include/xen/interface/xen.h

As the KEXEC and DUMPCORE related ELFNOTES are not relevant for the
kernel they are omitted from elfnote.h.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-10-03 12:34:52 +01:00
Juergen Gross
e124c9a2c3 xen: Add Xen pvSCSI protocol description
Add the definition of pvSCSI protocol used between the pvSCSI frontend
in a XEN domU and the pvSCSI backend in a XEN driver domain (usually
Dom0).

This header was originally provided by Fujitsu for Xen based on Linux
2.6.18.  Changes are:
- Added comments.
- Adapt to Linux style guide.
- Add support for larger SG-lists by putting them in an own granted
  page.
- Remove stale definitions.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-09-23 13:36:19 +00:00
Juergen Gross
854072dd0f xen/events: support threaded irqs for interdomain event channels
Export bind_interdomain_evtchn_to_irq() so drivers can use threaded
interrupt handlers with:

 irq = bind_interdomain_evtchn_to_irq(remote_dom, remote_port);
 if (irq < 0)
     /* error */
 ret = request_threaded_irq(...);

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-09-23 13:36:19 +00:00
Stefano Stabellini
5ebc77de83 xen/arm: introduce XENFEAT_grant_map_identity
The flag tells us that the hypervisor maps a grant page to guest
physical address == machine address of the page in addition to the
normal grant mapping address. It is needed to properly issue cache
maintenance operation at the completion of a DMA operation involving a
foreign grant.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-by: Denis Schneider <v1ne2go@gmail.com>
2014-09-11 18:11:52 +00:00
Linus Torvalds
e306e3be1c Merge tag 'stable/for-linus-3.17-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen updates from David Vrabel:
 - remove unused V2 grant table support
 - note that Konrad is xen-blkkback/front maintainer
 - add 'xen_nopv' option to disable PV extentions for x86 HVM guests
 - misc minor cleanups

* tag 'stable/for-linus-3.17-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen-pciback: Document the 'quirks' sysfs file
  xen/pciback: Fix error return code in xen_pcibk_attach()
  xen/events: drop negativity check of unsigned parameter
  xen/setup: Remove Identity Map Debug Message
  xen/events/fifo: remove a unecessary use of BM()
  xen/events/fifo: ensure all bitops are properly aligned even on x86
  xen/events/fifo: reset control block and local HEADs on resume
  xen/arm: use BUG_ON
  xen/grant-table: remove support for V2 tables
  x86/xen: safely map and unmap grant frames when in atomic context
  MAINTAINERS: Make me the Xen block subsystem (front and back) maintainer
  xen: Introduce 'xen_nopv' to disable PV extensions for HVM guests.
2014-08-07 11:33:15 -07:00
Linus Torvalds
76f09aa464 Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI changes from Ingo Molnar:
 "Main changes in this cycle are:

   - arm64 efi stub fixes, preservation of FP/SIMD registers across
     firmware calls, and conversion of the EFI stub code into a static
     library - Ard Biesheuvel

   - Xen EFI support - Daniel Kiper

   - Support for autoloading the efivars driver - Lee, Chun-Yi

   - Use the PE/COFF headers in the x86 EFI boot stub to request that
     the stub be loaded with CONFIG_PHYSICAL_ALIGN alignment - Michael
     Brown

   - Consolidate all the x86 EFI quirks into one file - Saurabh Tangri

   - Additional error logging in x86 EFI boot stub - Ulf Winkelvos

   - Support loading initrd above 4G in EFI boot stub - Yinghai Lu

   - EFI reboot patches for ACPI hardware reduced platforms"

* 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
  efi/arm64: Handle missing virtual mapping for UEFI System Table
  arch/x86/xen: Silence compiler warnings
  xen: Silence compiler warnings
  x86/efi: Request desired alignment via the PE/COFF headers
  x86/efi: Add better error logging to EFI boot stub
  efi: Autoload efivars
  efi: Update stale locking comment for struct efivars
  arch/x86: Remove efi_set_rtc_mmss()
  arch/x86: Replace plain strings with constants
  xen: Put EFI machinery in place
  xen: Define EFI related stuff
  arch/x86: Remove redundant set_bit(EFI_MEMMAP) call
  arch/x86: Remove redundant set_bit(EFI_SYSTEM_TABLES) call
  efi: Introduce EFI_PARAVIRT flag
  arch/x86: Do not access EFI memory map if it is not available
  efi: Use early_mem*() instead of early_io*()
  arch/ia64: Define early_memunmap()
  x86/reboot: Add EFI reboot quirk for ACPI Hardware Reduced flag
  efi/reboot: Allow powering off machines using EFI
  efi/reboot: Add generic wrapper around EfiResetSystem()
  ...
2014-08-04 17:13:50 -07:00
Linus Torvalds
acba648dca Merge tag 'stable/for-linus-3.16-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen fix from David Vrabel:
 "Fix BUG when trying to expand the grant table.  This seems to occur
  often during boot with Ubuntu 14.04 PV guests"

* tag 'stable/for-linus-3.16-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: safely map and unmap grant frames when in atomic context
2014-07-30 09:00:20 -07:00