Manual backmerge of -rc7 to resolve a silent conflict leading to
compile failure in drivers/gpu/drm/i915/intel_hdmi.c.
This is due to the bugfix in -rc7:
commit b98b601672
Author: Wang Xingchao <xingchao.wang@intel.com>
Date: Thu Sep 13 07:43:22 2012 +0800
drm/i915: HDMI - Clear Audio Enable bit for Hot Plug
Since this code moved around a lot in -next git put that snippet at
the wrong spot. I've tried to fix this by making the conflict explicit
by merging a version for next with:
commit 3cce574f01
Author: Wang Xingchao <xingchao.wang@intel.com>
Date: Thu Sep 13 11:19:00 2012 +0800
drm/i915: HDMI - Clear Audio Enable bit for Hot Plug unconditionally
But that failed to solve the entire problem. To avoid pushing out
further -nightly branch to our QA where this is broken, do the
backmerge and manually add the stuff git adds to -next from the patch
in -fixes.
Note that this doesn't show up in git's merge diff (and hence is also
not handled by git rerere), which adds to the reasons why I'd like to
fix this with a verbose backmerge. The git merge diff only shows a
bunch of trivial conflicts of the "code changed in lines next to each
another" kind.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
With the new "standardized" sysfs interfaces we need to be a bit more
careful about setting the RPS values.
Because the sysfs code and the rps workqueue can run at the same time,
if the sysfs setter wins the race to the mutex, the workqueue can come
in and set a value which is out of range (ie. we're no longer protecting
by RPINTLIM).
I was not able to actually make this error occur in testing.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Currently we've only frobbed this bit at irq_init time, but did
not restore it at resume time. Move it to the gen3 clock gating
function to fix this.
Notice while reading through code.
Cc: stable@vger.kernel.org (for 3.5 only)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The same designer from the previous patch has told us to never read
FORCEWAKE. We only do this for the POSTING_READ(), so simply change that
to something within the same cacheline (for no reason in particular
other than it sounds nice). In the _mt case we can leverage
the gtfifodbg check for the POSTING_READ.
This partially reverts
commit 6af2d180f8
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Jul 26 16:24:50 2012 +0200
drm/i915: fix forcewake related hangs on snb
v2: commit message, comments about posting read from (Daniel)
Note: vlv forcewake doesn't need any changes for this special
treatment since FORCEWAKE_VLV is in a totally different register
range, and the readback FORCEWAKE_ACK_VLV readback that follows is in
the same range.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Added note.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
A designer familiar with the hardware has stated that the forcewake
timeout can theoretically be as high as a little over 1ms. Therefore we
modify our code to use 2ms (appropriate fudge and because we don't want
to round down).
Hopefully this can't prevent spurious timeouts.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Chris Wilson <chris@chris-wilson.oc.uk>
[danvet: again fix conflict with vlv patch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It's used all over the place, and we want to be able to play around with
the value, apparently. Note that it doesn't touch other timeouts of the
same value (like gtfifo, and thread C0 wait).
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.oc.uk>
[danvet: fixup conflict with vlv forcewake patches.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
<ickle> danvet: in the force wake, both DRM_ERRORs have the same string.
<ickle> useful for .txt shrinkage, horrible for debugging
Acked-by: Paul Menzel <paulepanter@users.sourceforge.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
For some odd reasons, the vlv forcewake code is rather different from
all other platforms, with no clear justification. Adjust things:
- Don't check whether the gt is awake already (and bail out early), we
need to grab a forcewake anyway. Otherwise the chip might go to
sleep too early. And this would also screw up our forcewake
accounting.
- Like all other platforms, check whether the gt has cleared the
forcewake bit in the _ACK register before setting it again.
- Use _MASKED_BIT_ENABLE/DISABLE macros
- Only use bit0 of the forcewake reg, not all 16 bits.
- check the gtfifodb reg like on all other platforms in _put.
- Drop the POSTING_READs for consistency.
v2: Failure to git add ... again.
v3: Fixup the spelling fail a bit.
Tested-by: "Purushothaman, Vijay A" <vijay.a.purushothaman@intel.com>
Tested-by: "Widawsky, Benjamin" <benjamin.widawsky@intel.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We've had and still have too many issues where the gpu turbo doesn't
quite to what it's supposed to do (or what we want it to do).
Adding a tracepoint to track when the desired gpu frequency changes
should help a lot in characterizing and understanding problematic
workloads.
Also, this should be fairly interesting for power tuning (and
especially noticing when the gpu is stuck in high frequencies, as has
happened in the past) and hence for integration into powertop and
similar tools.
Cc: Arjan van de Ven <arjan@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Paul Menzel <paulepanter@users.sourceforge.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Like with the equivalent change for gen6+ rps state, this helps in
clarifying the code (and in fixing a few places that have fallen through
the cracks in the locking review).
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel writes:
"New stuff for -next. Highlights:
- prep patches for the modeset rework. Note that one of those patches
touches the fb helper in the common drm code.
- hasw hdmi audio support (Wang Xingchao)
- improved instdone dumping for gen7 (Ben)
- unbound tracking and a few follow-up patches from Chris
- dma_buf->begin/end_cpu_access plus fix for drm/udl (Dave)
- improve mmio error reporting for hsw
- prep patch for WQ_NON_REENTRANT removal (Tejun Heo)
"
* 'for-airlied' of git://people.freedesktop.org/~danvet/drm-intel: (41 commits)
drm/i915: Remove __GFP_NO_KSWAPD
drm/i915: disable rc6 on ilk when vt-d is enabled
drm/i915: Avoid unbinding due to an interrupted pin_and_fence during execbuffer
drm/i915: Use new INSTDONE registers (Gen7+)
drm/i915: Add new INSTDONE registers
drm/i915: Extract reading INSTDONE
drm/i915: Use a non-blocking wait for set-to-domain ioctl
drm/i915: Juggle code order to ease flow of the next patch
drm/i915: Use cpu relocations if the object is in the GTT but not mappable
drm/i915: Extract general object init routine
drm/i915: Protect private gem objects from truncate (such as imported dmabuf)
drm/i915: Only pwrite through the GTT if there is space in the aperture
i915: use alloc_ordered_workqueue() instead of explicit UNBOUND w/ max_active = 1
drm/i915: Find unclaimed MMIO writes.
drm/i915: Add ERR_INT to gen7 error state
drm/i915: Cantiga+ cannot handle a hsync front porch of 0
drm/i915: fix reassignment of variable "intel_dp->DP"
drm/i915: Try harder to allocate an mmap_offset
drm/i915: Show pin count in debugfs
drm/i915: Show (count, size) of purgeable objects in i915_gem_objects
...
There was some merge conflicts in -next and they weren't so pretty, so
backmerge now to avoid them.
Conflicts:
drivers/gpu/drm/i915/i915_gem.c
drivers/gpu/drm/i915/intel_modes.c
It blows up. And hopefully this is the root-cause of the mysterious
rc6 related hang on ilk. For reference, the commit that enabled rc6 on
ilk again is:
commit 456470eb58
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Wed Aug 8 23:35:40 2012 +0200
drm/i915: enable rc6 on ilk again
Reported-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Avoid stalling and waiting for the GPU by checking to see if there is
sufficient inactive space in the aperture for us to bind the buffer
prior to writing through the GTT. If there is inadequate space we will
have to stall waiting for the GPU, and incur overheads moving objects
about. Instead, only incur the clflush overhead on the target object by
writing through shmem.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
James Bottomley reported [1] a massive power regression, due to the
enabling of semaphores by default in 3.5. A workaround for him is to
again disable semaphores. And indeed, his system has a very hard time
to enter rc6 with semaphores enabled.
Ben Widawsky run around with a kill-a-watt a lot and noticed:
- There are indeed a few rare systems that seem to have a hard time
entering rc6 when desktop-idle.
- One machine, The Indestructible Toshiba regressed in this behaviour
between 3.5 and 3.6 in a merge commit! So rc6 behaviour with the
current setting seems to be highly timing dependent and not robust
at all.
- The behaviour James reported wrt semaphores seems to be a freak
timing thing that only happens on his specific machine, confirming
that enabling semaphores shouldn't reduce rc6 residency.
Now furthermore the Google ChromeOS guys reported [2] a while ago that
at least on some machines a simply a blinking cursor can keep the gpu
turbo at the highest frequency. This is because the current rps limits
used on snb/ivb are highly asymmetric.
On the theory that gpu turbo and rc6 tuning values are related, we've
tried whether the much saner looking (since much less asymmetric) rps
tuning values used for hsw would also help entering rc6 more robustly.
And it seems to mostly work, and we don't really have the resources to
through-roughly tune things in any better way: The values from the
ChromeOS ppl seem to fare a bit worse for James' machine, so I guess
we better stick with something vpg (the gpu hw/windows group)
provided, hoping that they've done their jobs.
Reference[1]: http://lists.freedesktop.org/archives/dri-devel/2012-July/025675.html
Reference[2]: http://lists.freedesktop.org/archives/intel-gfx/2012-July/018692.html
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53393
Tested-by: Ben Widawsky <ben@bwidawsk.net>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Backmerge Linux 3.6-rc2 to resolve a few funny conflicts before we put
even more madness on top:
- drivers/gpu/drm/i915/i915_irq.c: Just a spurious WARN removed in
-fixes, that has been changed in a variable-rename in -next, too.
- drivers/gpu/drm/i915/intel_ringbuffer.c: -next remove scratch_addr
(since all their users have been extracted in another fucntion),
-fixes added another user for a hw workaroudn.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I have the faint hope that the total absence of any locking for the
rps code wasn't too good an idea and could very well have caused some
rc6 related regressions.
Unfortunately we've never managed to reproduce these issues on any of
our own machines, so the only way to go about this is to enable it and
see what happens.
While at it, kill some stale comments and improve the logging.
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We change the drps/ips sw/hw state from different callers: Our own irq
handler, the external intel-ips module and from process context. Most
of these callers don't take any lock at all.
Protect everything by making the mchdev_lock irqsave and grabbing it in
all relevant callsites. Note that we have to convert a few sleeps in the
drps enable/disable code to delays, but alas, I'm not volunteering to
restructure the code around a few work items.
For paranoia add a spin_locked assert to ironlake_set_drps, too.
v2: Move one access inside the lock protection. Caught by the
dev_priv->ips mass-rename ...
v3: Resolve rebase conflict.
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This way it's easier so see what belongs together, and what is used
by the ilk ips code. Also add some comments that explain the locking.
Note that (cur|min|max)_delay need to be duplicated, because
they're also used by the ips code.
v2: Missed one place that the dev_priv->ips change caught ...
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
- Take the dev->struct_mutex around access the corresponding state
(and adjusting the rps hw state).
- Add an assert to gen6_set_rps to ensure we don't forget about this
in the future.
- Don't set up the min/max_freq files if it doesn't apply to the hw.
And do the same for the gen6+ cache sharing file while at it.
v2: Move the gen6+ checks into the read/write callbacks. Thanks to the
awesome drm midlayer we can't check that when registering the debugfs
files, because the driver is not yet fully set up, specifically the
->load callback hasn't run yet.
Oh how I despise this disaster ...
v3: Also add a WARN_ON(mutex_is_locked) in set_rps to check the
locking.
v4: Use mutex_lock_interruptible, suggested by Chris Wilson.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (for v2)
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The update_gfx_val function called from mark_busy wasn't taking the
mchdev_lock, as it should have. Also sprinkle a few spinlock asserts
over the code to document things better.
Things are still rather confusing, especially since a few variables
in dev_priv are used by both the gen6+ rps code and the ilk ips code.
But protected by totally different locks. Follow-on patches will clean
that up.
v2: Don't add a deadlock ... hence split up update_gfx_val into a
wrapper that grabs the lock and an internal __ variant for callsites
within intel_pm.c that already have taken the lock.
v3: Mark the internal helper as static, noticed by Ben Widawsky.
v4: Damien Lespiau had questions about the safety of the ips setup
sequence, explain in a comment why it works.
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>