Commit Graph

334894 Commits

Author SHA1 Message Date
Daniel Vetter b3fcabb15b drm/i915: drop the double-OP_STOREDW usage in blt_ring_flush
This has been introduced in "drm/i915: TLB invalidation with
MI_FLUSH_DW requires a post-sync op".

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:42 +01:00
Jesse Barnes 1abd02e2dd drm/i915: don't rewrite the GTT on resume v4
The BIOS shouldn't be touching this memory across suspend/resume, so
just leave it alone.  This saves us ~6ms on resume on my T420 (retested
with write combined PTEs).

v2: change gtt restore default on pre-gen4 (Chris)
    move needs_gtt_restore flag into dev_priv
v3: make sure we restore GTT on resume from hibernate (Daniel)
    use opregion support as the cutoff for restore from resume (Chris)
v4: use a better check for opregion (Chris)

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Kill the needs_gtt_restore indirection and check directly for
OpRegion. Also explain in a comment what's going on.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:42 +01:00
Jesse Barnes 4fc688ce79 drm/i915: protect RPS/RC6 related accesses (including PCU) with a new mutex
This allows the power related code to run independently of the rest of
the pipeline, extending the resume and init time improvements into
userspace, which would otherwise have been blocked on the struct mutex
if we were doing PCU communication.

v2: Also convert the locking for the rps sysfs interface.

Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:41 +01:00
Jesse Barnes 1a01ab3b2d drm/i915: put ring frequency and turbo setup into a work queue v5
Communicating via the mailbox registers with the PCU can take quite
awhile.  And updating the ring frequency or enabling turbo is not
something that needs to happen synchronously, so take it out of our init
and resume paths to speed things up (~200ms on my T420).

v2: add comment about why we use a work queue (Daniel)
    make sure work queue is idle on suspend (Daniel)
    use a delayed work queue since there's no hurry (Daniel)
v3: make cleanup symmetric and just call cancel work directly (Daniel)
v4: schedule the work using round_jiffies_up to batch work better (Chris)
v5: fix the right schedule_delayed_work call (Chris)

References: https://bugs.freedesktop.org/show_bug.cgi?id=54089

Signed-of-by: Jesse Barnes <jbarnes@virtuougseek.org>
[danvet: bikeshed the placement of the new delayed work, move it to
all the other gen6 power mgmt stuff.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:41 +01:00
Jesse Barnes 073f34d9d4 drm/i915: don't block resume on fb console resume v2
The console lock can be contended, so rather than prevent other drivers
after us from being held up, queue the console suspend into the global
work queue that can happen anytime.  I've measured this to take around
200ms on my T420.  Combined with the ring freq/turbo change, we should
save almost 1/2 a second on resume.

v2: use console_trylock() to try to resume the console immediately (Chris)

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: move dev_priv->console_resume_work next to the fbdev
pointer.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:40 +01:00
Daniel Vetter a4da4fa4e5 drm/i915: extract l3_parity substruct from dev_priv
Pretty astonishing how far apart these two members landed ... Especially since
I've already removed almost 200 lines in between.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:40 +01:00
Daniel Vetter 231f42a48f drm/i915: move dri1 dungeon out of dev_priv
Also, move dev_priv->counter there, it's only used in i915_dma.c

And also move the dri1 dungeon at the end of dev_priv where no one
cares about it.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:39 +01:00
Daniel Vetter 3e37394802 drm/i915: move pwrctx/renderctx to the other ilk power state
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:39 +01:00
Daniel Vetter c85aa8855a drm/i915: move dev_priv->(rps|ips) out of line
And give the structs slightly more generic names. I've decided to keep
the short rps/ips prefix, since that's just easier and less churn.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:38 +01:00
Daniel Vetter f4c956adc7 drm/i915: move the suspend/resume register file out of dev_priv
dev_priv has grown way too big, and grouping memebers into substructs
and moving them out of line helps re-gain some overview.

Unfortunatley I couldn't just call the substruct save and drop the prefix, since
that will make most member names clash with registers #defines. Changes in
i915_drv.h done by hand, everything else changed with
s/\<save\([A-Z]*\)/regfile.save\1/ in vim.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:38 +01:00
Jesse Barnes 310c53a84f drm/i915: add clock gating regs to VLV offset check function
So we can write them properly.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Antti Koskipää <antti.koskipaa@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:37 +01:00
Jesse Barnes 3ac7831314 drm/i915: PIPE_CONTROL TLB invalidate requires CS stall
"If ENABLED, PIPE_CONTROL command will flush the in flight data  written
out by render engine to Global Observation point on flush done. Also
Requires stall bit ([20] of DW1) set."

So set the stall bit to ensure proper invalidation.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Antti Koskipää <antti.koskipaa@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:37 +01:00
Jesse Barnes 9a28977181 drm/i915: TLB invalidation with MI_FLUSH_DW requires a post-sync op v3
So store into the scratch space of the HWS to make sure the invalidate
occurs.

v2: use GTT address space for store, clean up #defines (Chris)
v3: use correct #define in blt ring flush (Chris)

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Antti Koskipää <antti.koskipaa@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
References: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/1063252
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:36 +01:00
Jesse Barnes 12f3382bc0 drm/i915: implement WaDisablePSDDualDispatchEnable on IVB & VLV
Workaround for dual port PS dispatch on GT1.

v2: pull in register definition & offset handling
v3: use IVB GT1 macro to get the right regs (Ben)
v4: add for VLV too (Ben)
v5: don't read the reg, it's masked so we'll only enable the one extra bit (Chris)
v6: use a _GT2 suffix for the second reg (Chris)

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Antti Koskipää <antti.koskipaa@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:36 +01:00
Jesse Barnes 2d809570c8 drm/i915: implement WaDisableVLVClockGating_VBIIssue on VLV
This allows us to get the right vblank interrupt frequency.

v2: pull in register definition

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Antti Koskipää <antti.koskipaa@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:35 +01:00
Jesse Barnes 5c9664d75a drm/i915: implement WaForceL3Serialization on VLV and IVB
References: https://bugs.freedesktop.org/show_bug.cgi?id=50250
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Antti Koskipää <antti.koskipaa@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:35 +01:00
Jesse Barnes 8ab4397640 drm/i915: implement WaDisableDopClockGatingisable on VLV and IVB
v2: use correct register
v3: remove extra hunks, pull in register definitions & offset check directly
v4: add GT1 vs GT2 distinction for IVB portion (Ben)

References: https://bugs.freedesktop.org/show_bug.cgi?id=50233
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Antti Koskipää <antti.koskipaa@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:34 +01:00
Jesse Barnes d0cf5eadc0 drm/i915: implement WaDisableL3CacheAging on VLV
Needs to be set on every context restore as well, so set it as part of
the initial state so we can save/restore it.  Note this removes the IVB
workaround value from VLV and uses the default value, just adding in the
L3 cache aging disable bit, since the IVB value is wrong for VLV.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Antti Koskipää <antti.koskipaa@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:34 +01:00
Paulo Zanoni 1ad960f25c drm/i915: fix Haswell FDI link disable path
This covers the "Disable FDI" section from the CRT mode set sequence.
This disables the FDI receiver and also the FDI pll.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:33 +01:00
Paulo Zanoni 049456416f drm/i915: fix Haswell FDI link training code
This commit makes hsw_fdi_link_train responsible for implementing
everything described in the "Enable and train FDI" section from the
Hawell CRT mode set sequence documentation. We completely rewrite
hsw_fdi_link_train to match the documentation and we also call it in
the right place.

This patch was initially sent as a series of tiny patches fixing every
little problem of the function, but since there were too many patches
fixing the same function it got a little difficult to get the "big
picture" of how the function would be in the end, so here we amended
all the patches into a single big patch fixing the whole function.

Problems we fixed:

  1 - Train Haswell FDI at the right time.

    We need to train the FDI before enabling the pipes and planes, so
    we're moving the call from lpt_pch_enable to haswell_crtc_enable
    directly.

    We are also removing ironlake_fdi_pll_enable since the PLL
    enablement on Haswell is completely different and is also done
    during the link training steps.

  2 - Use the right FDI_RX_CTL register on Haswell

    There is only one PCH transcoder, so it's always _FDI_RXA_CTL.
    Using "pipe" here is wrong.

  3 - Don't rely on DDI_BUF_CTL previous values

    Just set the bits we want, everything else is zero. Also
    POSTING_READ the register before sleeping.

  4 - Program the FDI RX TUSIZE register on hsw_fdi_link_train

    According to the mode set sequence documentation, this is the
    right place. According to the FDI_RX_TUSIZE register description,
    this is the value we should set.

    Also remove the code that sets this register from the old
    location: lpt_pch_enable.

  5 - Properly program FDI_RX_MISC pwrdn lane values on HSW

  6 - Wait only 35us for the FDI link training

    First we wait 30us for the FDI receiver lane calibration, then we
    wait 5us for the FDI auto training time.

  7 - Remove an useless indentation level on hsw_fdi_link_train

    We already "break" when the link training succeeds.

  8 - Disable FDI_RX_ENABLE, not FDI_RX_PLL_ENABLE

    When we fail the training.

  9 - Change Haswell FDI link training error messages

    We shouldn't call DRM_ERROR when still looping through voltage
    levels since this is expected and not really a failure. So in this
    commit we adjust the error path to only DRM_ERROR when we really
    fail after trying everything.

    While at it, replace DRM_DEBUG_DRIVER with DRM_DEBUG_KMS since
    it's what we use everywhere.

  10 - Try each voltage twice at hsw_fdi_link_train

    Now with Daniel Vetter's suggestion to use "/2" instead of ">>1".

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
[danvet: Applied tiny bikesheds:
- mention in comment that we test each voltage/emphasis level twice
- realing arguments of the only untouched reg write, it spilled over
  the 80 char limit ...]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:33 +01:00
Jani Nikula 547dc041df drm/i915: remove HAS_eDP as unnecessary and inconsistent indirection
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:32 +01:00
Paulo Zanoni 349d7e5d7b drm/i915: set the correct number of FDI lanes on Haswell
We had 2 places using X2 and one place using X1.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:32 +01:00
Daniel Vetter 3107bd48bf drm/i915: kill pch_init_clock_gating indirection
Now that we no longer pretend to have flexibility in matching any
north display block with any pch, we can ditch this.

v2: Fix the embarassing rebase fail that Paulo Zanoni spotted.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:31 +01:00
Daniel Vetter ce40141f55 drm/i915: implement WADP0ClockGatingDisable
Found in Bspec vol4h South Display Engine Registers [CPT, PPT],
section "5.3.1  TRANS_CHICKEN_1—Transcoder Chicken Bits 1"

v2: Make it compile.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:31 +01:00
Daniel Vetter 23670b322c drm/i915: CPT+ pch transcoder workaround
We need to set the timing override chicken bit after fdi link training
has completed and before we enable the transcoder. We also have to
clear that bit again after disabling the pch transcoder.

See "Graphics BSpec: vol4g North Display Engine Registers [IVB],
Display Mode Set Sequence" and "Graphics BSpec: vol4h South Display
Engine Registers [CPT, PPT], South Display Engine Transcoder and FDI
Control, Transcoder Debug and DFT, TRANS_CHICKEN_2" bit 31:

"Workaround : Enable the override prior to enabling the transcoder.
Disable the override after disabling the transcoder."

While at it, use the _PIPE macro for the other TRANS_DP register.

v2: Keep the w/a as-is, but kill the original (but wrongly placed)
workaround introduced in

commit 3bcf603f6d
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Wed Jul 27 11:51:40 2011 -0700

    drm/i915: apply timing generator bug workaround on CPT and PPT

and

commit d4270e57ef
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Tue Oct 11 10:43:02 2011 -0700

    drm/i915: export a CPT mode set verification function

Note that this old code has unconditionally set the w/a, which might
explain why fdi link training sometimes silently fails, and especially
why the auto-train did not seem to work properly.

v3: Paulo Zanoni pointed out that this workaround is also required on
the LPT PCH. And Arthur Ranyan confirmed that this workaround is
requierd for all ports on the pch, not just DP: The important part
is that the bit is set whenever the pch transcoder is enabled, and
that it is _not_ set while the fdi link is trained. It is also
important that the pch transcoder is fully disabled, i.e. we have to
wait for bit 30 to clear before clearing the w/a bit.

Hence move to workaround into enable/disable_transcoder, where the pch
transcoder gets enabled/disabled.

v4: Whitespace changes dropped.

v5: Don't run the w/a on IBX, we only need it on CPT/PPT and LPT.

v6:
- resolve conflicts with Paulo's big hsw vga rework
- s/!IBX/CPT since hsw paths are now all separate, and Paulo's patch
  to implement the equivalent w/a for LPT is already merged.

Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Paulo Zanoni <przanoni@gmail.com>
Cc: Arthur Ranyan <arthur.j.runyan@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> (v5)
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v5)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-11-11 23:51:30 +01:00