Commit Graph

619644 Commits

Author SHA1 Message Date
Zhi Wang e39c5add32 drm/i915/gvt: vGPU MMIO virtualization
This patch introduces the generic vGPU MMIO emulation intercept
framework.  The MPT modules will request GVT-g core logic to
emulate MMIO read/write through IO emulation operations
callback when hypervisor trapped a guest GTTMMIO read/write.

Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-10-14 18:12:59 +08:00
Zhi Wang 4d60c5fd3f drm/i915/gvt: vGPU PCI configuration space virtualization
This patch introduces vGPU PCI configuration space virtualization.

- Adjust the trapped GPFN(Guest Page Frame Number) window of virtual GEN
PCI BAR 0 when guest initializes PCI BAR 0 address.

- Emulate OpRegion when guest touches OpRegion.

- Pass-through a part of aperture to guest when guest initializes
aperture BAR.

Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-10-14 18:12:46 +08:00
Zhi Wang 2707e44466 drm/i915/gvt: vGPU graphics memory virtualization
The vGPU graphics memory emulation framework is responsible for graphics
memory table virtualization. Under virtualization environment, a VM will
populate the page table entry with guest page frame number(GPFN/GFN), while
HW needs a page table filled with MFN(Machine frame number). The
relationship between GFN and MFN(Machine frame number) is managed by
hypervisor, while GEN HW doesn't have such knowledge to translate a GFN.

To solve this gap, shadow GGTT/PPGTT page table is introdcued.

For GGTT, the GFN inside the guest GGTT page table entry will be translated
into MFN and written into physical GTT MMIO registers when guest write
virtual GTT MMIO registers.

For PPGTT, a shadow PPGTT page table will be created and write-protected
translated from guest PPGTT page table.  And the shadow page table root
pointers will be written into the shadow context after a guest workload
is shadowed.

vGPU graphics memory emulation framework consists:

- Per-GEN HW platform page table entry bits extract/de-extract routines.
- GTT MMIO register emulation handlers, which will call hypercall to do
GFN->MFN translation when guest write GTT MMIO register
- PPGTT shadow page table routines, e.g. shadow create/destroy/out-of-sync

Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-10-14 18:12:33 +08:00
Zhi Wang c8fe6a6811 drm/i915/gvt: vGPU interrupt virtualization.
This patch introduces vGPU interrupt emulation framework.

The vGPU intrerrupt emulation framework is an event-based interrupt
emulation framework. It's responsible for emulating GEN hardware interrupts
during emulating other HW behaviour.

It consists several components:

- Descriptions of interrupt register bit
- Upper level <-> lower level interrupt mapping
- GEN HW IER/IMR/IIR register emulation routines
- Event-based interrupt propagation interface

When a GVT-g component wants to inject an interrupt to a VM during a
emulation, first it should specify the event needs to be emulated and the
framework will deal with the rest of emulation:

- Generating related virtual IIR bit according to virtual IER and IMRs,
- Generate related virtual upper level virtual IIR bit accodring to the
per-platform interrupt mapping
- Injecting a MSI to VM

Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-10-14 18:12:22 +08:00
Zhi Wang 3f728236c5 drm/i915/gvt: trace stub
v2:
- Make checkpatch.pl happy(Joonas)

Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-10-14 18:12:11 +08:00
Zhi Wang 82d375d1b5 drm/i915/gvt: Introduce basic vGPU life cycle management
A vGPU represents a virtual Intel GEN hardware, which consists following
virtual resources:

- Configuration space (virtualized)
- HW registers (virtualized)
- GGTT memory space (partitioned)
- GPU page table (shadowed)
- Fence registers (partitioned)

* virtualized: fully emulated by GVT-g.
* partitioned: Only a part of the HW resource is allowed to be accessed
by VM.
* shadowed: Resource needs to be translated and shadowed before getting
applied into HW.

This patch introduces vGPU life cycle management framework, which is
responsible for creating/destroying a vGPU and preparing/free resources
related to a vGPU.

Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-10-14 18:11:59 +08:00
Zhi Wang 579cea5f30 drm/i915/gvt: golden virtual HW state management
Each vGPU expects a golden virtual HW state, which is just the state after
system is freshly powered on. GVT-g will try to load the golden virtual HW
state via kernel firmware interface.

Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-10-14 18:11:46 +08:00
Zhi Wang 12d14cc43b drm/i915/gvt: Introduce a framework for tracking HW registers.
This patch introduces a framework for tracking HW registers on different
GEN platforms.

Accesses to GEN HW registers from VMs will be trapped by hypervisor. It
will forward these emulation requests to GVT-g device model, which
requires this framework to search for related register descriptions.

Each MMIO entry in this framework describes a GEN HW registers, e.g.
offset, length, whether it contains RO bits, whether it can be accessed by
LRIs...and also emulation handlers for emulating register reading and
writing.

- Use i915 MMIO register definition & statement.(Joonas)

Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-10-14 18:11:33 +08:00
Zhi Wang 28a60dee2c drm/i915/gvt: vGPU HW resource management
This patch introduces the GVT-g vGPU HW resource management. Under
GVT-g virtualizaion environment, each vGPU requires portions HW
resources, including aperture, hidden GM space, and fence registers.

When creating a vGPU, GVT-g will request these HW resources from host,
and return them to host after a vGPU is destroyed.

Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2016-10-14 18:11:19 +08:00
Chris Wilson 86e83e35d1 drm/i915: Merge duplicate gen4 and vlv/chv enable vblank callbacks
gen4/vlv/chv all use the same bits in pipestat to enable the vblank
interrupt, so they can share the same callbacks to enable/disable.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161007194953.15616-1-chris@chris-wilson.co.uk
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-10-13 21:58:52 +01:00
Dan Carpenter f7170e2eb8 drm/i915: fix a read size argument
We want to read 3 bytes here, but because the parenthesis are in the
wrong place we instead read:

	sizeof(intel_dp->edp_dpcd) == sizeof(intel_dp->edp_dpcd)

which is one byte.

Fixes: fe5a66f91c ("drm/i915: Read PSR caps/intermediate freqs/etc. only once on eDP")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <drm-intel-fixes@lists.freedesktop.org>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161013085508.GJ16198@mwanda
2016-10-13 17:30:21 +02:00
Chris Wilson ad16d2ed8f drm/i915: Skip unbinding large unmappable global buffers
If the user requests a mappable binding to the global GTT, we will first
unbind an existing mapping if it doesn't match. We will unbind even if
there is no possibility that the object can fit in the mappable
aperture. This may lead to a ping-pong migration of the object, for
example igt/gem_exec_big.

v2: Comment upon the reasoning, or lack thereof!, behind the choice of
magic numbers.

Testcase: igt/gem_exec_big
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20161013085504.30705-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com
2016-10-13 15:47:47 +01:00
Chris Wilson 06392e3b21 drm/i915: Fix misplaced '\n' in printing the GPU error's RING_HEAD
'\n' is supposed to be at the end of the line, not in the middle.

Fixes: cdb324bde5 ("drm/i915: Show bounds of active request in the ring...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161013101815.26978-2-chris@chris-wilson.co.uk
2016-10-13 13:29:13 +01:00
Chris Wilson 35ca039e0a drm/i915: Record the current requests queue for execlists upon hang
Mika wanted to know what requests were pending at the time of a hang as
we now track which requests we have submitted to the hardware.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161013101815.26978-1-chris@chris-wilson.co.uk
2016-10-13 13:29:13 +01:00
Tvrtko Ursulin db49296ba1 drm/i915: Shrink TV modes const data
Make struct video_levels and struct tv_mode use data types
of sufficient width to save approximately one kilobyte in
the .rodata section.

v2: Do not align struct members. (Jani Nikula, Joonas Lahtinen)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1476353366-13931-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-10-13 13:06:41 +01:00
Tvrtko Ursulin ae9400cab1 drm/i915: Shrink per-platform watermark configuration
Use types of more appropriate size in struct
intel_watermark_params to save 512 bytes of .rodata.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-10-13 13:06:40 +01:00
Tvrtko Ursulin 579627ea18 drm/i915: Shrink sdvo_cmd_names
Pack the struct _sdvo_cmd_name to save 736 bytes of .rodata.

This is fine since the name pointers are used only for debug.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-10-13 13:06:40 +01:00
Tvrtko Ursulin 44a655cae3 drm/i915: Shrink cxsr_latency_table
unsigned long is too wide - use smaller types in
struct cxsr_latency to save 800-something bytes of .rodata.

v2: All data even fits in u16 for even more saving. (Ville Syrjala)
v3: Move bitfields to the end of the struct. (Joonas Lahtinen)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-10-13 13:06:40 +01:00
Imre Deak 1c777c5d1d drm/i915/hsw: Fix GPU hang during resume from S3-devices state
Currently resuming on HSW from S3 pm_test/devices state leads to an
unrecoverable GPU hang. Resetting the GPU during suspend fixes this. For
a full S3 cycle this change only means the reset happens earlier (before
reaching S3). For S4 the reset will happen now both during the freeze
and quiesce phases, which is a benefit since it will guarantee that the
GPU is idle before creating and loading the hibernation image.

Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1476283597-580-1-git-send-email-imre.deak@intel.com
2016-10-13 12:11:31 +03:00
Chris Wilson 45353ce59b drm/i915: Treat a framebuffer reference as an active reference whilst shrinking
Treat a framebuffer reference with the same priority as an active
reference whilst shrinking. Framebuffers are likely to be reused and
typically cost more to migrate to and from GPU memory (on LLC
architectures we need to clflush), so defer the temptation to purge them
during a kswapd run until we have run out of cheap buffers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: John Harrison <john.c.harrison@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161012124824.23521-1-chris@chris-wilson.co.uk
2016-10-12 17:17:20 +01:00
Chris Wilson 8baa1f04b9 drm/i915: Update debugfs describe_obj() to show fault-mappable
The current meaning of whether an object has a GGTT vma is very
ill-defined (and note we don't check for any partials either), it just
means that at some point it was in the GGTT but it may not be now. The
information we really care about here is whether it is taking up
precious mappable aperture space. This is the obj->fault_mappable flag.
We have a redundant long form reprinting of this information, so remove
that in favour of the compact flag.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161012114827.17031-2-chris@chris-wilson.co.uk
2016-10-12 16:27:14 +01:00
Chris Wilson 4676dc838b drm/i915: Use fence_write() from rpm resume
During rpm resume we restore the fences, but we do not have the
protection of struct_mutex. This rules out updating the activity
tracking on the fences, and requires us to rely on the rpm as the
serialisation barrier instead.

[  350.298052] [drm:intel_runtime_resume [i915]] Resuming device
[  350.308606]
[  350.310520] ===============================
[  350.315560] [ INFO: suspicious RCU usage. ]
[  350.320554] 4.8.0-rc8-bsw-rapl+ #3133 Tainted: G     U  W
[  350.327208] -------------------------------
[  350.331977] ../drivers/gpu/drm/i915/i915_gem_request.h:371 suspicious rcu_dereference_protected() usage!
[  350.342619]
[  350.342619] other info that might help us debug this:
[  350.342619]
[  350.351593]
[  350.351593] rcu_scheduler_active = 1, debug_locks = 0
[  350.358952] 3 locks held by Xorg/320:
[  350.363077]  #0:  (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa030589c>] drm_modeset_lock_all+0x3c/0xd0 [drm]
[  350.375162]  #1:  (crtc_ww_class_acquire){+.+.+.}, at: [<ffffffffa03058a6>] drm_modeset_lock_all+0x46/0xd0 [drm]
[  350.387022]  #2:  (crtc_ww_class_mutex){+.+.+.}, at: [<ffffffffa0305056>] drm_modeset_lock+0x36/0x110 [drm]
[  350.398236]
[  350.398236] stack backtrace:
[  350.403196] CPU: 1 PID: 320 Comm: Xorg Tainted: G     U  W       4.8.0-rc8-bsw-rapl+ #3133
[  350.412457] Hardware name: Intel Corporation CHERRYVIEW C0 PLATFORM/Braswell CRB, BIOS BRAS.X64.X088.R00.1510270350 10/27/2015
[  350.425212]  0000000000000000 ffff8801680a78c8 ffffffff81332187 ffff88016c5c5000
[  350.433611]  0000000000000001 ffff8801680a78f8 ffffffff810ca6da ffff88016cc8b0f0
[  350.442012]  ffff88016cc80000 ffff88016cc80000 ffff880177ad0000 ffff8801680a7948
[  350.450409] Call Trace:
[  350.453165]  [<ffffffff81332187>] dump_stack+0x67/0x90
[  350.458931]  [<ffffffff810ca6da>] lockdep_rcu_suspicious+0xea/0x120
[  350.466002]  [<ffffffffa039e8dd>] fence_update+0xbd/0x670 [i915]
[  350.472766]  [<ffffffffa039efe2>] i915_gem_restore_fences+0x52/0x70 [i915]
[  350.480496]  [<ffffffffa0368f42>] vlv_resume_prepare+0x72/0x570 [i915]
[  350.487839]  [<ffffffffa0369802>] intel_runtime_resume+0x102/0x210 [i915]
[  350.495442]  [<ffffffff8137f26f>] pci_pm_runtime_resume+0x7f/0xb0
[  350.502274]  [<ffffffff8137f1f0>] ? pci_restore_standard_config+0x40/0x40
[  350.509883]  [<ffffffff814401c5>] __rpm_callback+0x35/0x70
[  350.516037]  [<ffffffff8137f1f0>] ? pci_restore_standard_config+0x40/0x40
[  350.523646]  [<ffffffff81440224>] rpm_callback+0x24/0x80
[  350.529604]  [<ffffffff8137f1f0>] ? pci_restore_standard_config+0x40/0x40
[  350.537212]  [<ffffffff814417bd>] rpm_resume+0x4ad/0x740
[  350.543161]  [<ffffffff81441aa1>] __pm_runtime_resume+0x51/0x80
[  350.549824]  [<ffffffffa03889c8>] intel_runtime_pm_get+0x28/0x90 [i915]
[  350.557265]  [<ffffffffa0388a53>] intel_display_power_get+0x23/0x50 [i915]
[  350.565001]  [<ffffffffa03ef23d>] intel_atomic_commit_tail+0xdfd/0x10b0 [i915]
[  350.573106]  [<ffffffffa034b2e9>] ? drm_atomic_helper_swap_state+0x159/0x300 [drm_kms_helper]
[  350.582659]  [<ffffffff81615091>] ? _raw_spin_unlock+0x31/0x50
[  350.589205]  [<ffffffffa034b2e9>] ? drm_atomic_helper_swap_state+0x159/0x300 [drm_kms_helper]
[  350.598787]  [<ffffffffa03ef8a5>] intel_atomic_commit+0x3b5/0x500 [i915]
[  350.606319]  [<ffffffffa03061dc>] ? drm_atomic_set_crtc_for_connector+0xcc/0x100 [drm]
[  350.615209]  [<ffffffffa0306b49>] drm_atomic_commit+0x49/0x50 [drm]
[  350.622242]  [<ffffffffa034dee8>] drm_atomic_helper_set_config+0x88/0xc0 [drm_kms_helper]
[  350.631419]  [<ffffffffa02f94ac>] drm_mode_set_config_internal+0x6c/0x120 [drm]
[  350.639623]  [<ffffffffa02fa94c>] drm_mode_setcrtc+0x22c/0x4d0 [drm]
[  350.646760]  [<ffffffffa02f0f19>] drm_ioctl+0x209/0x460 [drm]
[  350.653217]  [<ffffffffa02fa720>] ? drm_mode_getcrtc+0x150/0x150 [drm]
[  350.660536]  [<ffffffff810c984a>] ? __lock_is_held+0x4a/0x70
[  350.666885]  [<ffffffff81202303>] do_vfs_ioctl+0x93/0x6b0
[  350.672939]  [<ffffffff8120f843>] ? __fget+0x113/0x200
[  350.678797]  [<ffffffff8120f735>] ? __fget+0x5/0x200
[  350.684361]  [<ffffffff81202964>] SyS_ioctl+0x44/0x80
[  350.690030]  [<ffffffff81001deb>] do_syscall_64+0x5b/0x120
[  350.696184]  [<ffffffff81615ada>] entry_SYSCALL64_slow_path+0x25/0x25

Note we also have to remember the lesson from commit 4fc788f5ee
("drm/i915: Flush delayed fence releases after reset") where we have to
flush any changes to the fence on restore.

v2: Replace call to release user mmaps with an assertion that they have
already been zapped.

Fixes: 49ef5294cd ("drm/i915: Move fence tracking from object to vma")
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161012114827.17031-1-chris@chris-wilson.co.uk
2016-10-12 16:27:14 +01:00
Chris Wilson 0a97015d45 drm/i915: Compress GPU objects in error state
Our error states are quickly growing, pinning kernel memory with them.
The majority of the space is taken up by the error objects. These
compress well using zlib and without decode are mostly meaningless, so
encoding them does not hinder quickly parsing the error state for
familiarity.

v2: Make the zlib dependency optional

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-6-chris@chris-wilson.co.uk
2016-10-12 12:00:33 +01:00
Chris Wilson fc4c79c37e drm/i915: Consolidate error object printing
Leave all the pretty printing to userspace and simplify the error
capture to only have a single common object printer. It makes the kernel
code more compact, and the refactoring allows us to apply more complex
transformations like compressing the output.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-5-chris@chris-wilson.co.uk
2016-10-12 12:00:33 +01:00
Chris Wilson 95374d759a drm/i915: Always use the GTT for error capture
Since the GTT provides universal access to any GPU page, we can use it
to reduce our plethora of read methods to just one. It also has the
important characteristic of being exactly what the GPU sees - if there
are incoherency problems, seeing the batch as executed (rather than as
trapped inside the cpu cache) is important.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-4-chris@chris-wilson.co.uk
2016-10-12 12:00:32 +01:00