Commit Graph

150 Commits

Author SHA1 Message Date
Ben Widawsky e178f7057b drm/i915: Provide PDP updates via MMIO
The initial implementation of this function used MMIO to write the PDPs.
Upon review it was determined (correctly) that the docs say to use LRI.
The issue is there are times where we want to do a synchronous write
(GPU reset).

I've tested this, and it works. I've verified with as many people as
possible that it should work.

This should fix the failing reset problems.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-18 15:27:43 +01:00
Ben Widawsky 5ed1678206 drm/i915: Move the gtt mm takedown to cleanup
Our VM code already has a cleanup function, and this is a nice place to
put the drm_mm_takedown. This should have no functional impact, it just
leaves the unload function a bit cleaer, and is more logical IMO

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-26 10:21:27 +01:00
Ben Widawsky 686e1f6f87 drm/i915: Add a few missed bits to the mm
This should really have been added in BDW integration, as well as:

commit 93bd8649db
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Tue Jul 16 16:50:06 2013 -0700

    drm/i915: Put the mm in the parent address space

It didn't really matter before, but it will in the future.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-26 10:06:22 +01:00
Ben Widawsky d595bd4bbd drm/i915: Fix BDW PPGTT error path
When we fail for some reason on loading the PDPs, it would be wise to
disable the PPGTT in the ring registers. If we do not do this, we have
undefined results.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-26 09:59:05 +01:00
Daniel Vetter c09cd6e969 Merge branch 'backlight-rework' into drm-intel-next-queued
Pull in Jani's backlight rework branch. This was merged through a
separate branch to be able to sort out the Broadwell conflicts
properly before pulling it into the main development branch.

Conflicts:
	drivers/gpu/drm/i915/intel_display.c

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-15 10:02:39 +01:00
Ben Widawsky 3a2ffb65ee drm/i915/bdw: Limit GTT to 2GB
Because of the way in which we're allocating the pages for the Aliasing
PPGTT, we cannot actually successfully alloc enough space for anything
greater than 2GB.

Instead of a quick hack to fix this, we should defer until we have the
real solution in place (allocating much less contiguous space).

This wasn't found sooner because we didn't not have any systems
supporting more than a 2GB GTT.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-14 09:33:11 +01:00
Ben Widawsky 230f955f73 drm/i915/bdw: Free correct number of ppgtt pages
I am unclear how this got messed up in the shuffle, but it did.

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-14 09:33:10 +01:00
Jesse Barnes b53c8c3577 drm/i915: drop duplicate ggtt vma list add in setup_global_gtt
Preallocated objects will already have been added to the vma_list when
creating their ggtt vma entry, and coincidentally also marked as holding
a ggtt mapping. Repeating the vma_list manipulation when setting up the
ggtt after preallocation is a recipe for an unhappy kernel.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Use the improve commit message suggest by Chris.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-13 01:05:01 +01:00
Ville Syrjälä b42218c19f drm/i915/bdw: Don't muck with gtt_size on Gen8 when PPGTT setup fails
v2: Resolve rebase conflicts and switch to gen < 8 color for GenX
checking.

v3: Rebase on top of the address space refactoring.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:49 +01:00
Ben Widawsky 28cf541543 drm/i915/bdw: unleash PPGTT
v2: Squash in fix from Ben: Set PPGTT batches as necessary

This fixes the regression in the last couple of days when we enabled
PPGTT.

v3: Squash in fixup to still use GTT for secure batches from Ville:

BDW doesn't have a separate secure vs. non-secure bit in
MI_BATCH_BUFFER_START. So for secure batches we have to simply
leave the PPGTT bit unset. Fortunately older generations (except
HSW) had similar limitations so execbuffer already creates a GTT
mapping for all secure batches.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:48 +01:00
Ben Widawsky 94e409c144 drm/i915/bdw: Implement PPGTT enable
Legacy PPGTT on GEN8 requires programming 4 PDP registers per ring.
Since all rings are using the same address space with the current code
the logic is simply to program all the tables we've setup for the PPGTT.

v2: Turn on PPGTT in GFX_MODE

v3: v2 was the wrong patch

v4: Resolve conflicts due to patch series reordering.

v5: Squash in fixup from Ben: Use LRI to write PDPs

The docs (and simulator seems to back up) suggest that we can only
program legacy PPGTT PDPs with LRI commands.

v6: Rebase around context differences conflicts.

v7: Use #defines for per ring PDPs. (Damien)

v8: Don't use typede'f private_t.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (up to v3 and v7)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:47 +01:00
Ben Widawsky 9df15b499b drm/i915/bdw: Implement PPGTT insert
GEN8 insertion is very similar to GEN6.

v2: Rebase on top of Imre's for_each_sg_page helpers.

v3: Fixup my conversion (spotted by Ville).

v4: Rebase on top of the address space refactoring.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:47 +01:00
Ben Widawsky 459108b8cc drm/i915/bdw: Implement PPGTT clear range
GEN8 PPGTT range clearing is very similar to GEN6 if we assume that our
PDEs are all valid, which they should be.

v2: Rebase on top of the address space refactoring.

v3: Rebase on top of the bool use_scratch addition to the clear_range interface.

Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:46 +01:00
Ben Widawsky b1fe667329 drm/i915/bdw: Initialize the PDEs
The upcoming clear and insert routines will expect that PDEs all point
to valid Page Directories. Doing that lazily doesn't really buy us
anything.

The page allocation is done regardless earlier in init so it shouldn't
hurt set the PDEs.

v2: Squash in patches to implement fixed PDE write function:

- If I had done this in the first place, the bug that's going to be
  fixed in an upcoming patch would have been much easier to find.

- Use WB for PDEs.

  The PAT bit is used for page size. 2ME PDEs aren't even supported in
  BDW, so this was completely invalid. The solution is to make our
  PDEs WB+LLC instead of the pervious WB+eLLC. As far as I can guess,
  this change won't matter for performance.

  Thanks to Ville for the quick correction when discussing on IRC.

v3: Return the pde type for pde encoding (Damien)

Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:46 +01:00
Ben Widawsky 37aca44ad5 drm/i915/bdw: PPGTT init & cleanup
Aside from the potential size increase of the PPGTT, the primary
difference from previous hardware is the Page Directories are no longer
carved out of the Global GTT.

Note that the PDE allocation is done as a 8MB contiguous allocation,
this needs to be eventually fixed (since driver reloading will be a
pain otherwise). Also, this will be a no-go for real PPGTT support.

v2: Move vtable initialization

v3: Resolve conflicts due to patch series reordering.

v4: Rebase on top of the address space refactoring of the PPGTT
support. Drop Imre's r-b tag for v2, too outdated by now.

v5: Free the correct amount of memory, "get_order takes size not a page
count." (Imre)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:45 +01:00
Ben Widawsky fbe5d36e77 drm/i915/bdw: Support BDW caching
BDW caching works differently than the previous generations. Instead of
having bits in the PTE which directly control how the page is cached,
the 3 PTE bits PWT PCD and PAT provide an index into a PAT defined by
register 0x40e0. This style of caching is functionally equivalent to how
it works on HSW and before.

v2: Tiny bikeshed as discussed on internal irc.

v3: Squash in patch from Ville to mirror the x86 PAT setup more like
in arch/x86/mm/pat.c. Primarily, the 0th index will be WB, and not
uncached.

v4: Comment for reason to not use a 64b write on the PPAT.

v5: Add a FIXME comment that the caching bits in the PAT registers
might be wrong due to doc confusion.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:45 +01:00
Ben Widawsky 94ec8f6130 drm/i915/bdw: Add GTT functions
With the PTE clarifications, the bind and clear functions can now be
added for gen8.

v2: Use for_each_sg_pages in gen8_ggtt_insert_entries.

v3: Drop dev argument to pte encode functions, upstream lost it. Also
rebase on top of the scratch page movement.

v4: Rebase on top of the new address space vfuncs.

v5: Add the bool use_scratch argument to clear_range and the bool valid argument
to the PTE encode function to follow upstream changes.

v6: Add a FIXME(BDW) about the size mismatch of the readback check
that Jon Bloomfield spotted.

v7: Squash in fixup patch from Ben for the posting read to match the
64bit ptes and so shut up the WARN.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:44 +01:00
Ben Widawsky d31eb10e6c drm/i915/bdw: Create gen8_gtt_pte_t
With gen6 PTE type in place, pave the way for the new gen8 type.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:44 +01:00
Ben Widawsky 63340133f3 drm/i915/bdw: Make gen8_gmch_probe
Probing gen8 is similar to gen6. To make the code cleaner and more
maintainable however we can use the probe functions to split it out.

v2: Rebased on top of update gtt probe infrastructure.

v3: Rebased on top of Kenneth' Graunke's ->pte_encode refactoring.

V4: Resolve conflicts with Ben's latest ppgtt patches, also switch to
gen < 8 testing instead of gen <= 7.

v5: Resolve conflicts with address space vfunc changes in upstream.

v6: Use 39b DMA mask. At least, for this mode, it is the correct mask.
(Imre)

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:43 +01:00
Ben Widawsky 9459d25237 drm/i915/bdw: support GMS and GGMS changes
All the BARs have the ability to grow.

v2: Pulled out the simulator workaround to a separate patch.
Rebased.

v3: Rebase onto latest vlv patches from Jesse.

v4: Rebased on top of the early stolen quirk patch from Jesse.

v5: Use the new macro names.
s/INTEL_BDW_PCI_IDS_D/INTEL_BDW_D_IDS
s/INTEL_BDW_PCI_IDS_M/INTEL_BDW_M_IDS
It's Jesse's fault for not following the convention I originally set.

Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:39 +01:00
Daniel Vetter 8fe6bd239a drm/i915/bdw: Disable PPGTT for now
This will be changed once the gen8 code is fully implemented.

v2: Use ENOSYS instead of ENXIO as suggested by Chris.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-08 18:09:35 +01:00
Daniel Vetter 7f16e5c141 Merge tag 'v3.12' into drm-intel-next
I want to merge in the new Broadwell support as a late hw enabling
pull request. But since the internal branch was based upon our
drm-intel-nightly integration branch I need to resolve all the
oustanding conflicts in drm/i915 with a backmerge to make the 60+
patches apply properly.

We'll propably have some fun because Linus will come up with a
slightly different merge solution.

Conflicts:
	drivers/gpu/drm/i915/i915_dma.c
	drivers/gpu/drm/i915/i915_drv.c
	drivers/gpu/drm/i915/intel_crt.c
	drivers/gpu/drm/i915/intel_ddi.c
	drivers/gpu/drm/i915/intel_display.c
	drivers/gpu/drm/i915/intel_dp.c
	drivers/gpu/drm/i915/intel_drv.h

All rather simple adjacent lines changed or partial backports from
-next to -fixes, with the exception of the thaw code in i915_dma.c.
That one needed a bit of shuffling to restore the intent.

Oh and the massive header file reordering in intel_drv.h is a bit
trouble. But not much.

v2: Also don't forget the fixup for the silent conflict that results
in compile fail ...

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-11-04 16:28:52 +01:00
Ben Widawsky 828c79087c drm/i915: Disable GGTT PTEs on GEN6+ suspend
Once the machine gets to a certain point in the suspend process, we
expect the GPU to be idle. If it is not, we might corrupt memory.
Empirically (with an early version of this patch) we have seen this is
not the case. We cannot currently explain why the latent GPU writes
occur.

In the technical sense, this patch is a workaround in that we have an
issue we can't explain, and the patch indirectly solves the issue.
However, it's really better than a workaround because we understand why
it works, and it really should be a safe thing to do in all cases.

The noticeable effect other than the debug messages would be an increase
in the suspend time. I have not measure how expensive it actually is.

I think it would be good to spend further time to root cause why we're
seeing these latent writes, but it shouldn't preclude preventing the
fallout.

NOTE: It should be safe (and makes some sense IMO) to also keep the
VALID bit unset on resume when we clear_range(). I've opted not to do
this as properly clearing those bits at some later point would be extra
work.

v2: Fix bugzilla link

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=65496
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=59321
Tested-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-By: Todd Previte <tprevite@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-18 15:44:47 +02:00
Ben Widawsky b35b380ed4 drm/i915: Make PTE valid encoding optional
We need this to work around a corruption when the boot kernel image
loads the hibernated kernel image from swap on Haswell systems -
somehow not everything is properly shut off.

This is just the prep work, the next patch will implement the actual
workaround.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Add a commit message suitable for -fixes and add cc: stable]
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-18 15:40:21 +02:00
Daniel Vetter a1e2265332 drm/i915: Use kcalloc more
No buffer overflows here, but better safe than sorry.

v2:
- Fixup the sizeof conversion, I've missed the pointer deref (Jani).
- Drop the redundant GFP_ZERO, kcalloc alreads memsets (Jani).
- Use kmalloc_array for the execbuf fastpath to avoid the memset
  (Chris). I've opted to leave all other conversions as-is since they
  aren't in a fastpath and dealing with cleared memory instead of
  random garbage is just generally nicer.

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
[danvet: Drop the contentious kmalloc_array hunk in execbuf.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-01 07:45:01 +02:00