Commit Graph

20563 Commits

Author SHA1 Message Date
Alexey Skidanov
edad40239f drm/radeon: Add ATC VMID<-->PASID functions to kfd->kgd
This patch adds three new interfaces to kfd2kgd interface file of radeon.

The interfaces are:

- Check if a specific VMID has a valid PASID mapping
- Retrieve the PASID which is mapped to a specific VMID
- Issue a VMID invalidation request to the ATC

Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:34:46 +03:00
Yair Shachar
f8bd13338a drm/amdkfd: Implement address watch debugger IOCTL
v2:

- rename get_dbgmgr_mutex to kfd_get_dbgmgr_mutex to namespace it
- change void* to uint64_t inside ioctl arguments
- use kmalloc instead of kzalloc because we use copy_from_user
  immediately after it

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:34:35 +03:00
Yair Shachar
9448458998 drm/amdkfd: Implement wave control debugger IOCTL
v2:

- rename get_dbgmgr_mutex to kfd_get_dbgmgr_mutex to namespace it
- change void* to uint64_t inside ioctl arguments
- use kmalloc instead of kzalloc because we use copy_from_user
  immediately after it

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:33:26 +03:00
Yair Shachar
037ed9a2ac drm/amdkfd: Implement (un)register debugger IOCTLs
v2: rename get_dbgmgr_mutex to kfd_get_dbgmgr_mutex to namespace it

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:33:07 +03:00
Yair Shachar
e2e9afc4a3 drm/amdkfd: Add address watch operation to debugger
The address watch operation gives the ability to specify watch points
which will generate a shader breakpoint, based on a specified single
address or range of addresses.

There is support for read/write/any access modes.

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:33:06 +03:00
Yair Shachar
788bf83db3 drm/amdkfd: Add wave control operation to debugger
The wave control operation supports several command types executed upon
existing wave fronts that belong to the currently debugged process.

The available commands are:

HALT   - Freeze wave front(s) execution
RESUME - Resume freezed wave front(s) execution
KILL   - Kill existing wave front(s)

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:33:06 +03:00
Yair Shachar
fbeb661bfa drm/amdkfd: Add skeleton H/W debugger module support
This patch adds the skeleton H/W debugger module support. This code
enables registration and unregistration of a single HSA process at a
time.

The module saves the process's pasid and use it to verify that only the
registered process is allowed to execute debugger operations through the
kernel driver.

v2: rename get_dbgmgr_mutex to kfd_get_dbgmgr_mutex to namespace it

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:32:28 +03:00
Yair Shachar
992839ad64 drm/amdkfd: Add static user-mode queues support
This patch adds support for static user-mode queues in QCM.
Queues which are designated as static can NOT be preempted by
the CP microcode when it is executing its scheduling algorithm.

This is needed for supporting the debugger feature, because we
can't allow the CP to preempt queues which are currently being debugged.

The number of queues that can be designated as static is limited by the
number of HQDs (Hardware Queue Descriptors).

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:32:28 +03:00
Yair Shachar
aef11009c4 drm/amdkfd: add H/W debugger IOCTL set definitions
This patch adds four new IOCTLs to amdkfd. These IOCTLs expose a H/W
debugger functionality to the userspace.

The IOCTLs are:

- AMDKFD_IOC_DBG_REGISTER:

The purpose of this IOCTL is to notify amdkfd that a process wants to use
GPU debugging facilities on itself only.
It is expected that this IOCTL would be called before any other H/W
debugger requests are sent to amdkfd and for each GPU where the H/W
debugging needs to be enabled. The use of this IOCTL ensures that only
one instance of a debugger is active in the system.

- AMDKFD_IOC_DBG_UNREGISTER:

This IOCTL detaches the debugger/debugged process from the H/W
Debug which was established by the AMDKFD_IOC_DBG_REGISTER IOCTL.

- AMDKFD_IOC_DBG_ADDRESS_WATCH:

This IOCTL allows to set different watchpoints with various conditions as
indicated by the IOCTL's arguments. The available number of watchpoints
is retrieved from topology. This operation is confined to the current
debugged process, which was registered through AMDKFD_IOC_DBG_REGISTER.

- AMDKFD_IOC_DBG_WAVE_CONTROL:

This IOCTL allows to control a wavefront as indicated by the IOCTL's
arguments. For example, you can halt/resume or kill either a
single wavefront or a set of wavefronts. This operation is confined to
the current debugged process, which was registered through
AMDKFD_IOC_DBG_REGISTER.

Because the arguments for the address watch IOCTL and wave control IOCTL
are dynamic, meaning that they could vary in size, the userspace passes a
pointer to a structure (in userspace) that contains the value of the
arguments. The kernel driver is responsible to parse this structure and
validate its contents.

v2: change void* to uint64_t inside ioctl arguments

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:32:07 +03:00
Yair Shachar
a6186f4d6f drm/radeon: Add H/W debugger kfd->kgd functions
This patch adds new interface functions to the kfd2kgd interface file. The
new functions allow to perform H/W debugger operations by writing to GPU
registers.

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:31:12 +03:00
Joe Perches
f761d8bd80 drm/amdkfd: Use DECLARE_BITMAP
Use the generic mechanism to declare a bitmap instead of unsigned long.

It seems that "struct kfd_process.allocated_queue_bitmap" is unused.
Maybe it could be deleted instead.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-06-03 11:31:12 +03:00
Dave Airlie
3e8d222f2a Merge tag 'drm-intel-next-fixes-2015-05-29' of git://anongit.freedesktop.org/drm-intel into drm-next
Fixes for 4.2. Nothing too serious (given that it's still pre merge
window). With that it's off for 2 weeks of vacation for me and taking care
of 4.2 fixes for Jani.

* tag 'drm-intel-next-fixes-2015-05-29' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: limit PPGTT size to 2GB in 32-bit platforms
  drm/i915: Another fbdev hack to avoid PSR on fbcon.
  drm/i915: Return the frontbuffer flip to enable intel_crtc_enable_planes.
  drm/i915: disable IPS while getting the sink CRCs
  drm/i915: Disable 12bpc hdmi for now
  drm/i915: Adjust sideband locking a bit for CHV/VLV
  drm/i915: s/dpio_lock/sb_lock/
  drm/i915: Kill intel_flush_primary_plane()
  drm/i915: Throw out WIP CHV power well definitions
  drm/i915: Use the default 600ns LDO programming sequence delay
  drm/i915: Remove unnecessary null check in execlists_context_unqueue
  drm/i915: Use spinlocks for checking when to waitboost
  drm/i915: Fix the confusing comment about the ioctl limits
  Revert "drm/i915: Force clean compilation with -Werror"
2015-06-02 18:10:50 +10:00
Alexandre Courbot
1c34d824bd drm/ttm: dma: Don't crash on memory in the vmalloc range
dma_alloc_coherent() can return memory in the vmalloc range.
virt_to_page() cannot handle such addresses and crashes. This
patch detects such cases and obtains the struct page * using
vmalloc_to_page() instead.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-06-02 17:24:49 +10:00
Michel Thierry
501fd70fca drm/i915: limit PPGTT size to 2GB in 32-bit platforms
We already set this limit for the GGTT.

This is a temporary patch until a full replacement of size_t variables
(inadequate in 32-bit kernel) is in place.

Regression from:
	commit a4e0bedca6
	Author: Michel Thierry <michel.thierry@intel.com>
	Date:   Wed Apr 8 12:13:35 2015 +0100

		drm/i915: Use complete address space in true PPGTT

v2: Prettify code and explain why this is needed. (Chris)
v3: Don't hide the compilation warning in 32-bit. (Chris)

Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-29 19:08:22 +02:00
Rodrigo Vivi
d9a946b523 drm/i915: Another fbdev hack to avoid PSR on fbcon.
With unified modeset and flip paths introduced recently when switching
to fbcon PSR was being disabled on fb_set_par path but re-enabled on
fb_pan_display one, causing missed screen updates and un unusable
console.

Regression introduced with:

commit bb54662350
Author: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Date:   Tue Apr 21 17:13:13 2015 +0300

    drm/i915: Unify modeset and flip paths of intel_crtc_set_config()

Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-29 10:18:32 +02:00
Rodrigo Vivi
2d847d45b2 drm/i915: Return the frontbuffer flip to enable intel_crtc_enable_planes.
Without this frontbuffer flip when enabling planes PSR got compromised
and wasn't being enabled waiting forever on the flush that never
arrived.

Another solution would to create a enable_cursor function and split this
frontbuffer flip among the different plane enable and disable functions.
But if necessary this can be done in a follow up work. For now let's
just fix the regression.

It was removed by:

commit 87d4300a7d
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Tue Apr 21 17:12:54 2015 +0300

    drm/i915: Move intel_(pre_disable/post_enable)_primary to intel_display.c, and use it there.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-29 10:18:07 +02:00
Dave Airlie
95872b49ce Merge branch 'drm-tda998x-devel' of git://ftp.arm.linux.org.uk/~rmk/linux-arm into drm-next
warning fix for tda998x

* 'drm-tda998x-devel' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  drm/i2c: tda998x: fix compiler warning for ssize_t
2015-05-29 09:19:59 +10:00
Russell King
2f15791c28 drm: clean up drm_mm debugfs output
The drm_mm debugfs output is difficult to read as two different formats
are used for the addresses:

0x00000080000000-0x0000008000b000: 45056: used
0x8000b000-0x80016000: 45056: free
0x00000080016000-0x0000008001b000: 20480: used
0x8001b000-0x817a1000: 24666112: free
0x000000817a1000-0x000000817a8000: 28672: used
0x000000817a8000-0x00000081ba8000: 4194304: used

Fix this by using %#018llx for all addresses, thus making the output:

0x0000000080000000-0x000000008000b000: 45056: used
0x000000008000b000-0x0000000080016000: 45056: free
0x0000000080016000-0x000000008001b000: 20480: used
0x000000008001b000-0x00000000817a1000: 24666112: free
0x00000000817a1000-0x00000000817a8000: 28672: used
0x00000000817a8000-0x0000000081ba8000: 4194304: used

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-05-29 09:17:57 +10:00
Dave Airlie
c99d153013 Merge tag 'drm-intel-next-2015-05-22' of git://anongit.freedesktop.org/drm-intel into drm-next
- cpt modeset sequence fixes from Ville
- more rps boosting tuning from Chris
- S3 support for skl (Damien)
- a pile of w/a for bxt from various people
- cleanup of primary plane pixel formats (Damien)
- a big pile of small patches with fixes and cleanups all over

* tag 'drm-intel-next-2015-05-22' of git://anongit.freedesktop.org/drm-intel: (90 commits)
  drm/i915: Update DRIVER_DATE to 20150522
  drm/i915: Introduce DRM_I915_THROTTLE_JIFFIES
  drm/i915: Use the correct destructor for freeing requests on error
  drm/i915/skl: don't fail colorkey + scaler request
  drm/i915: Enable GTT caching on gen8
  drm/i915: Move WaProgramL3SqcReg1Default:bdw to init_clock_gating()
  drm/i915: Use ilk_init_lp_watermarks() on BDW
  drm/i915: Disable FDI RX/TX before the ports
  drm/i915: Disable CRT port after pipe on PCH platforms
  drm/i915: Disable SDVO port after the pipe on PCH platforms
  drm/i915: Disable HDMI port after the pipe on PCH platforms
  drm/i915: Fix the IBX transcoder B workarounds
  drm/i915: Write the SDVO reg twice on IBX
  drm/i915: Fix DP enhanced framing for CPT
  drm/i915: Clean up the CPT DP .get_hw_state() port readout
  drm/i915: Clarfify the DP code platform checks
  drm/i915: Remove the double register write from intel_disable_hdmi()
  drm/i915: Remove a bogus 12bpc "toggle" from intel_disable_hdmi()
  drm/i915/skl: Deinit/init the display at suspend/resume
  drm/i915: Free RPS boosts for all laggards
  ...
2015-05-29 09:11:49 +10:00
Denys Vlasenko
9e5acbc213 radeon: Deinline indirect register accessor functions
This patch deinlines indirect register accessor functions.

These functions perform two mmio accesses, framed by spin lock/unlock.
Spin lock/unlock by itself takes more than 50 cycles in ideal case
(if lock is exclusively cached on current CPU).

With this .config: http://busybox.net/~vda/kernel_config,
after uninlining these functions have sizes and callsite counts
as follows:

r600_uvd_ctx_rreg: 111 bytes, 4 callsites
r600_uvd_ctx_wreg: 113 bytes, 5 callsites
eg_pif_phy0_rreg: 106 bytes, 13 callsites
eg_pif_phy0_wreg: 108 bytes, 13 callsites
eg_pif_phy1_rreg: 107 bytes, 13 callsites
eg_pif_phy1_wreg: 108 bytes, 13 callsites
rv370_pcie_rreg: 111 bytes, 21 callsites
rv370_pcie_wreg: 113 bytes, 24 callsites
r600_rcu_rreg: 111 bytes, 16 callsites
r600_rcu_wreg: 113 bytes, 25 callsites
cik_didt_rreg: 106 bytes, 10 callsites
cik_didt_wreg: 107 bytes, 10 callsites
tn_smc_rreg: 106 bytes, 126 callsites
tn_smc_wreg: 107 bytes, 116 callsites
eg_cg_rreg: 107 bytes, 20 callsites
eg_cg_wreg: 108 bytes, 52 callsites

Functions r100_mm_rreg() and r100_mm_rreg() have a fast path and
a locked (slow) path. This patch deinlines only slow path.

r100_mm_rreg_slow: 78 bytes, 2083 callsites
r100_mm_wreg_slow: 81 bytes, 3570 callsites

Reduction in code size is more than 65,000 bytes:

    text     data      bss       dec     hex filename
85740176 22294680 20627456 128662312 7ab3b28 vmlinux.before
85674192 22294776 20627456 128598664 7aa4288 vmlinux

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-05-28 14:52:40 -04:00
Paulo Zanoni
4373f0f24e drm/i915: disable IPS while getting the sink CRCs
This commit is the "sink CRC" version of:

commit 8c740dcea2
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
Date:   Fri Oct 17 18:42:03 2014 -0300
    drm/i915: disable IPS while getting the pipe CRCs.

For some unknown reason, when IPS gets enabled, the sink CRC changes.
Since hsw_enable_ips() doesn't really guarantee to enable IPS (it
depends on package C-states), we can't really predict if IPS is
enabled or disabled while running our CRC tests, so let's just
completely disable IPS while sink CRCs are being used.

If we find a way to make IPS not change the pipe CRC result, we may
want to fix IPS and then revert this patch (and 8c740dcea too). While
this doesn't happen, let's merge this patch, so the IGT tests relying
on sink CRCs can work properly.

This was discovered while developing a new IGT test, which will
probably be called kms_frontbuffer_tracking.

Testcase: igt/kms_frontbuffer_tracking (not on upstream IGT yet)
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-28 11:13:53 +02:00
Daniel Vetter
5e3daaca09 drm/i915: Disable 12bpc hdmi for now
It's totally broken, and since

commit d328c9d78d
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Fri Apr 10 16:22:37 2015 +0200

    drm/i915: Select starting pipe bpp irrespective or the primary plane

the kernel will try to use it even for the common rgb888 framebuffers.
Ville has patches to fix it all up properly, but unfortunately they're
stuck in review limbo. And since the 4.2 feature cutoff has passed we
need to somehow  handle this regression.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Paulo Zanoni <przanoni@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2015-05-28 11:13:52 +02:00
Ville Syrjälä
54433e91a6 drm/i915: Adjust sideband locking a bit for CHV/VLV
chv_enable_pll() doesn't need to hold sb_lock for the entire duration of
the function. Drop the lock as soon as possible.

valleyview_set_cdclk() does a potential lock+unlock+lock+unlock cycle
with sb_lock. Grab the lock a few lines earlier so we can make do
with a single lock+unlock cycle always.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-28 11:13:52 +02:00
Ville Syrjälä
a580516d9f drm/i915: s/dpio_lock/sb_lock/
Rename dpio_lock to sb_lock to inform the reader that its primary
purpose is to protect the sideband mailbox rather than some DPIO
state.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-28 11:13:51 +02:00
Ville Syrjälä
b12ce1d84f drm/i915: Kill intel_flush_primary_plane()
The primary plane frobbing was removed from the sprite code in
 commit ecce87ea3a
 Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
 Date:   Tue Apr 21 17:12:50 2015 +0300

    drm/i915: Remove implicitly disabling primary plane for now

but the intel_flush_primary_plane() calls were left behind. Replace them
with straight forward POSTING_READ() of the sprite surface address
register.

The other user of intel_flush_primary_plane() is g4x_disable_trickle_feed()
where we can just inline the steps directly.

This allows intel_flush_primary_plane() to be killed off.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-05-28 11:13:51 +02:00