Files
linux-rockchip/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
Liviu Dudau d6b2c7b99e dt-bindings: gpu: mali-valhall-csf: Add support for Arm Mali CSF GPUs
Arm has introduced a new v10 GPU architecture that replaces the Job Manager
interface with a new Command Stream Frontend. It adds firmware driven
command stream queues that can be used by kernel and user space to submit
jobs to the GPU.

Add the initial schema for the device tree that is based on support for
RK3588 SoC. The minimum number of clocks is one for the IP, but on Rockchip
platforms they will tend to expose the semi-independent clocks for better
power management.

v5:
- Move the opp-table node under the gpu node

v4:
- Fix formatting issue

v3:
- Cleanup commit message to remove redundant text
- Added opp-table property and re-ordered entries
- Clarified power-domains and power-domain-names requirements for RK3588.
- Cleaned up example

Note: power-domains and power-domain-names requirements for other platforms
are still work in progress, hence the bindings are left incomplete here.

v2:
- New commit

Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Cc: Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Conor Dooley <conor+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Rob Herring <robh@kernel.org>

drm: execution context for GEM buffers v7

This adds the infrastructure for an execution context for GEM buffers
which is similar to the existing TTMs execbuf util and intended to replace
it in the long term.

The basic functionality is that we abstracts the necessary loop to lock
many different GEM buffers with automated deadlock and duplicate handling.

v2: drop xarray and use dynamic resized array instead, the locking
    overhead is unnecessary and measurable.
v3: drop duplicate tracking, radeon is really the only one needing that.
v4: fixes issues pointed out by Danilo, some typos in comments and a
    helper for lock arrays of GEM objects.
v5: some suggestions by Boris Brezillon, especially just use one retry
    macro, drop loop in prepare_array, use flags instead of bool
v6: minor changes suggested by Thomas, Boris and Danilo
v7: minor typos pointed out by checkpatch.pl fixed

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Tested-by: Danilo Krummrich <dakr@redhat.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230711133122.3710-2-christian.koenig@amd.com

drm: manager to keep track of GPUs VA mappings

Add infrastructure to keep track of GPU virtual address (VA) mappings
with a decicated VA space manager implementation.

New UAPIs, motivated by Vulkan sparse memory bindings graphics drivers
start implementing, allow userspace applications to request multiple and
arbitrary GPU VA mappings of buffer objects. The DRM GPU VA manager is
intended to serve the following purposes in this context.

1) Provide infrastructure to track GPU VA allocations and mappings,
   using an interval tree (RB-tree).

2) Generically connect GPU VA mappings to their backing buffers, in
   particular DRM GEM objects.

3) Provide a common implementation to perform more complex mapping
   operations on the GPU VA space. In particular splitting and merging
   of GPU VA mappings, e.g. for intersecting mapping requests or partial
   unmap requests.

Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Tested-by: Matthew Brost <matthew.brost@intel.com>
Tested-by: Donald Robson <donald.robson@imgtec.com>
Suggested-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230720001443.2380-2-dakr@redhat.com

drm: manager: Fix printk format for size_t

sizeof() returns a size_t which may be different to an unsigned long.
Use the correct format specifier of '%zu' to prevent compiler warnings.

Fixes: e6303f323b1a ("drm: manager to keep track of GPUs VA mappings")
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/2bf64010-c40a-8b84-144c-5387412b579e@arm.com

drm/gpuva_mgr: remove unused prev pointer in __drm_gpuva_sm_map()

The prev pointer in __drm_gpuva_sm_map() was used to implement automatic
merging of mappings. Since automatic merging did not make its way
upstream, remove this leftover.

Fixes: e6303f323b1a ("drm: manager to keep track of GPUs VA mappings")
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230823233119.2891-1-dakr@redhat.com

drm/gpuvm: rename struct drm_gpuva_manager to struct drm_gpuvm

Rename struct drm_gpuva_manager to struct drm_gpuvm including
corresponding functions. This way the GPUVA manager's structures align
very well with the documentation of VM_BIND [1] and VM_BIND locking [2].

It also provides a better foundation for the naming of data structures
and functions introduced for implementing a common dma-resv per GPU-VM
including tracking of external and evicted objects in subsequent
patches.

[1] Documentation/gpu/drm-vm-bind-async.rst
[2] Documentation/gpu/drm-vm-bind-locking.rst

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230920144343.64830-2-dakr@redhat.com

drm/gpuvm: allow building as module

HB:
drivers/gpu/drm/nouveau/Kconfig
skipped because there is no gpuvm support of nouveau in 6.1

Currently, the DRM GPUVM does not have any core dependencies preventing
a module build.

Also, new features from subsequent patches require helpers (namely
drm_exec) which can be built as module.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230920144343.64830-3-dakr@redhat.com

drm/gpuvm: convert WARN() to drm_WARN() variants

HB:
drivers/gpu/drm/nouveau/nouveau_uvmm.c
skipped since 6.1 does not support gpuvm on nv

Use drm_WARN() and drm_WARN_ON() variants to indicate drivers the
context the failing VM resides in.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231108001259.15123-2-dakr@redhat.com

drm/gpuvm: don't always WARN in drm_gpuvm_check_overflow()

Don't always WARN in drm_gpuvm_check_overflow() and separate it into a
drm_gpuvm_check_overflow() and a dedicated
drm_gpuvm_warn_check_overflow() variant.

This avoids printing warnings due to invalid userspace requests.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231108001259.15123-3-dakr@redhat.com

drm/gpuvm: export drm_gpuvm_range_valid()

Drivers may use this function to validate userspace requests in advance,
hence export it.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231108001259.15123-4-dakr@redhat.com

drm/gpuvm: add common dma-resv per struct drm_gpuvm

hb:
drivers/gpu/drm/nouveau/nouveau_uvmm.c
skipped

Provide a common dma-resv for GEM objects not being used outside of this
GPU-VM. This is used in a subsequent patch to generalize dma-resv,
external and evicted object handling and GEM validation.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231108001259.15123-6-dakr@redhat.com

drm/gpuvm: add drm_gpuvm_flags to drm_gpuvm

HB:
drivers/gpu/drm/nouveau/nouveau_uvmm.c
skipped

Introduce flags for struct drm_gpuvm, this required by subsequent
commits.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231108001259.15123-8-dakr@redhat.com

drm/gpuvm: reference count drm_gpuvm structures

HB:
drivers/gpu/drm/nouveau/nouveau_uvmm.c
skipped

Implement reference counting for struct drm_gpuvm.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231108001259.15123-10-dakr@redhat.com

drm/gpuvm: add an abstraction for a VM / BO combination

HB:
drivers/gpu/drm/nouveau/nouveau_uvmm.c
skipped

Add an abstraction layer between the drm_gpuva mappings of a particular
drm_gem_object and this GEM object itself. The abstraction represents a
combination of a drm_gem_object and drm_gpuvm. The drm_gem_object holds
a list of drm_gpuvm_bo structures (the structure representing this
abstraction), while each drm_gpuvm_bo contains list of mappings of this
GEM object.

This has multiple advantages:

1) We can use the drm_gpuvm_bo structure to attach it to various lists
   of the drm_gpuvm. This is useful for tracking external and evicted
   objects per VM, which is introduced in subsequent patches.

2) Finding mappings of a certain drm_gem_object mapped in a certain
   drm_gpuvm becomes much cheaper.

3) Drivers can derive and extend the structure to easily represent
   driver specific states of a BO for a certain GPUVM.

The idea of this abstraction was taken from amdgpu, hence the credit for
this idea goes to the developers of amdgpu.

Cc: Christian König <christian.koenig@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231108001259.15123-11-dakr@redhat.com

drm/gpuvm: track/lock/validate external/evicted objects

Currently the DRM GPUVM offers common infrastructure to track GPU VA
allocations and mappings, generically connect GPU VA mappings to their
backing buffers and perform more complex mapping operations on the GPU VA
space.

However, there are more design patterns commonly used by drivers, which
can potentially be generalized in order to make the DRM GPUVM represent
a basis for GPU-VM implementations. In this context, this patch aims
at generalizing the following elements.

1) Provide a common dma-resv for GEM objects not being used outside of
   this GPU-VM.

2) Provide tracking of external GEM objects (GEM objects which are
   shared with other GPU-VMs).

3) Provide functions to efficiently lock all GEM objects dma-resv the
   GPU-VM contains mappings of.

4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
   of, such that validation of evicted GEM objects is accelerated.

5) Provide some convinience functions for common patterns.

Big thanks to Boris Brezillon for his help to figure out locking for
drivers updating the GPU VA space within the fence signalling path.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231108001259.15123-12-dakr@redhat.com

drm/gpuvm: fall back to drm_exec_lock_obj()

Fall back to drm_exec_lock_obj() if num_fences is zero for the
drm_gpuvm_prepare_* function family.

Otherwise dma_resv_reserve_fences() would actually allocate slots even
though num_fences is zero.

Cc: Christian König <christian.koenig@amd.com>
Acked-by: Donald Robson <donald.robson@imgtec.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231129220835.297885-2-dakr@redhat.com

drm/gpuvm: Let drm_gpuvm_bo_put() report when the vm_bo object is destroyed

Some users need to release resources attached to the vm_bo object when
it's destroyed. In Panthor's case, we need to release the pin ref so
BO pages can be returned to the system when all GPU mappings are gone.

This could be done through a custom drm_gpuvm::vm_bo_free() hook, but
this has all sort of locking implications that would force us to expose
a drm_gem_shmem_unpin_locked() helper, not to mention the fact that
having a ::vm_bo_free() implementation without a ::vm_bo_alloc() one
seems odd. So let's keep things simple, and extend drm_gpuvm_bo_put()
to report when the object is destroyed.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231204151406.1977285-1-boris.brezillon@collabora.com

drm/exec: Pass in initial # of objects

HB: skipped
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c
drivers/gpu/drm/amd/amdkfd/kfd_svm.c
drivers/gpu/drm/imagination/pvr_job.c
drivers/gpu/drm/nouveau/nouveau_uvmm.c

In cases where the # is known ahead of time, it is silly to do the table
resize dance.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Patchwork: https://patchwork.freedesktop.org/patch/568338/

drm/gem-shmem: When drm_gem_object_init failed, should release object

when goto err_free, the object had init, so it should be release when fail.

Signed-off-by: ChunyouTang <tangchunyou@163.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20221119064131.364-1-tangchunyou@163.com

drm: Remove usage of deprecated DRM_DEBUG_PRIME

drm_print.h says DRM_DEBUG_PRIME is deprecated in favor of
drm_dbg_prime().

Signed-off-by: Siddh Raman Pant <code@siddh.me>
Reviewed-by: Simon Ser <contact@emersion.fr>
Signed-off-by: Simon Ser <contact@emersion.fr>
Link: https://patchwork.freedesktop.org/patch/msgid/cd663b1bc42189e55898cddecdb3b73c591b341a.1673269059.git.code@siddh.me

drm/shmem: Cleanup drm_gem_shmem_create_with_handle()

Once we create the handle, the handle owns the reference.  Currently
nothing was doing anything with the shmem ptr after the handle was
created, but let's change drm_gem_shmem_create_with_handle() to not
return the pointer, so-as to not encourage problematic use of this
function in the future.  As a bonus, it makes the code a bit cleaner.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230123154831.3191821-1-robdclark@gmail.com

drm/shmem-helper: Fix locking for drm_gem_shmem_get_pages_sgt()

Other functions touching shmem->sgt take the pages lock, so do that here
too. drm_gem_shmem_get_pages() & co take the same lock, so move to the
_locked() variants to avoid recursive locking.

Discovered while auditing locking to write the Rust abstractions.

Fixes: 2194a63a81 ("drm: Add library for shmem backed GEM objects")
Fixes: 4fa3d66f13 ("drm/shmem: Do dma_unmap_sg before purging pages")
Signed-off-by: Asahi Lina <lina@asahilina.net>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230205125124.2260-1-lina@asahilina.net

drm/shmem-helper: Switch to use drm_* debug helpers

Ease debugging of a multi-GPU system by using drm_WARN_*() and
drm_dbg_kms() helpers that print out DRM device name corresponding
to shmem GEM.

Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Suggested-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://lore.kernel.org/all/20230108210445.3948344-6-dmitry.osipenko@collabora.com/

drm/shmem-helper: Don't use vmap_use_count for dma-bufs

DMA-buf core has its own refcounting of vmaps, use it instead of drm-shmem
counting. This change prepares drm-shmem for addition of memory shrinker
support where drm-shmem will use a single dma-buf reservation lock for
all operations performed over dma-bufs.

Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://lore.kernel.org/all/20230108210445.3948344-7-dmitry.osipenko@collabora.com/

drm/shmem-helper: Switch to reservation lock

Replace all drm-shmem locks with a GEM reservation lock. This makes locks
consistent with dma-buf locking convention where importers are responsible
for holding reservation lock for all operations performed over dma-bufs,
preventing deadlock between dma-buf importers and exporters.

Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://lore.kernel.org/all/20230108210445.3948344-8-dmitry.osipenko@collabora.com/

drm/shmem-helper: Revert accidental non-GPL export

The referenced commit added a wrapper for drm_gem_shmem_get_pages_sgt(),
but in the process it accidentally changed the export type from GPL to
non-GPL. Switch it back to GPL.

Reported-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Fixes: ddddedaa0db9 ("drm/shmem-helper: Fix locking for drm_gem_shmem_get_pages_sgt()")
Signed-off-by: Asahi Lina <lina@asahilina.net>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230227-shmem-export-fix-v1-1-8880b2c25e81@asahilina.net

Revert "drm/shmem-helper: Switch to reservation lock"

This reverts commit 67b7836d4458790f1261e31fe0ce3250989784f0.

The locking appears incomplete. A caller of SHMEM helper's pin
function never acquires the dma-buf reservation lock. So we get

  WARNING: CPU: 3 PID: 967 at drivers/gpu/drm/drm_gem_shmem_helper.c:243 drm_gem_shmem_pin+0x42/0x90 [drm_shmem_helper]

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230228152612.19971-1-tzimmermann@suse.de

drm/shmem-helper: Switch to reservation lock

Replace all drm-shmem locks with a GEM reservation lock. This makes locks
consistent with dma-buf locking convention where importers are responsible
for holding reservation lock for all operations performed over dma-bufs,
preventing deadlock between dma-buf importers and exporters.

Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230529223935.2672495-7-dmitry.osipenko@collabora.com

drm/shmem-helper: Reset vma->vm_ops before calling dma_buf_mmap()

The dma-buf backend is supposed to provide its own vm_ops, but some
implementation just have nothing special to do and leave vm_ops
untouched, probably expecting this field to be zero initialized (this
is the case with the system_heap implementation for instance).
Let's reset vma->vm_ops to NULL to keep things working with these
implementations.

Fixes: 26d3ac3cb0 ("drm/shmem-helpers: Redirect mmap for imported dma-buf")
Cc: <stable@vger.kernel.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reported-by: Roman Stratiienko <r.stratiienko@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Tested-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230724112610.60974-1-boris.brezillon@collabora.com

iommu: Allow passing custom allocators to pgtable drivers

This will be useful for GPU drivers who want to keep page tables in a
pool so they can:

- keep freed page tables in a free pool and speed-up upcoming page
  table allocations
- batch page table allocation instead of allocating one page at a time
- pre-reserve pages for page tables needed for map/unmap operations,
  to ensure map/unmap operations don't try to allocate memory in paths
  they're allowed to block or fail

It might also be valuable for other aspects of GPU and similar
use-cases, like fine-grained memory accounting and resource limiting.

We will extend the Arm LPAE format to support custom allocators in a
separate commit.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20231124142434.1577550-2-boris.brezillon@collabora.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>

iommu: Extend LPAE page table format to support custom allocators

We need that in order to implement the VM_BIND ioctl in the GPU driver
targeting new Mali GPUs.

VM_BIND is about executing MMU map/unmap requests asynchronously,
possibly after waiting for external dependencies encoded as dma_fences.
We intend to use the drm_sched framework to automate the dependency
tracking and VM job dequeuing logic, but this comes with its own set
of constraints, one of them being the fact we are not allowed to
allocate memory in the drm_gpu_scheduler_ops::run_job() to avoid this
sort of deadlocks:

- VM_BIND map job needs to allocate a page table to map some memory
  to the VM. No memory available, so kswapd is kicked
- GPU driver shrinker backend ends up waiting on the fence attached to
  the VM map job or any other job fence depending on this VM operation.

With custom allocators, we will be able to pre-reserve enough pages to
guarantee the map/unmap operations we queued will take place without
going through the system allocator. But we can also optimize
allocation/reservation by not free-ing pages immediately, so any
upcoming page table allocation requests can be serviced by some free
page table pool kept at the driver level.

I might also be valuable for other aspects of GPU and similar
use-cases, like fine-grained memory accounting and resource limiting.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20231124142434.1577550-3-boris.brezillon@collabora.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>

drm/sched: Add FIFO sched policy to run queue

When many entities are competing for the same run queue
on the same scheduler, we observe an unusually long wait
times and some jobs get starved. This has been observed on GPUVis.

The issue is due to the Round Robin policy used by schedulers
to pick up the next entity's job queue for execution. Under stress
of many entities and long job queues within entity some
jobs could be stuck for very long time in it's entity's
queue before being popped from the queue and executed
while for other entities with smaller job queues a job
might execute earlier even though that job arrived later
then the job in the long queue.

Fix:
Add FIFO selection policy to entities in run queue, chose next entity
on run queue in such order that if job on one entity arrived
earlier then job on another entity the first job will start
executing earlier regardless of the length of the entity's job
queue.

v2:
Switch to rb tree structure for entities based on TS of
oldest job waiting in the job queue of an entity. Improves next
entity extraction to O(1). Entity TS update
O(log N) where N is the number of entities in the run-queue

Drop default option in module control parameter.

v3:
Various cosmetical fixes and minor refactoring of fifo update function. (Luben)

v4:
Switch drm_sched_rq_select_entity_fifo to in order search (Luben)

v5: Fix up drm_sched_rq_select_entity_fifo loop (Luben)

v6: Add missing drm_sched_rq_remove_fifo_locked

v7: Fix ts sampling bug and more cosmetic stuff (Luben)

v8: Fix module parameter string (Luben)

Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Direct Rendering Infrastructure - Development <dri-devel@lists.freedesktop.org>
Cc: AMD Graphics <amd-gfx@lists.freedesktop.org>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Tested-by: Yunxiang Li (Teddy) <Yunxiang.Li@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220930041258.1050247-1-luben.tuikov@amd.com

drm/scheduler: Set the FIFO scheduling policy as the default

The currently default Round-Robin GPU scheduling can result in starvation
of entities which have a large number of jobs, over entities which have
a very small number of jobs (single digit).

This can be illustrated in the following diagram, where jobs are
alphabetized to show their chronological order of arrival, where job A is
the oldest, B is the second oldest, and so on, to J, the most recent job to
arrive.

    ---> entities
j | H-F-----A--E--I--
o | --G-----B-----J--
b | --------C--------
s\/ --------D--------

WLOG, assuming all jobs are "ready", then a R-R scheduling will execute them
in the following order (a slice off of the top of the entities' list),

H, F, A, E, I, G, B, J, C, D.

However, to mitigate job starvation, we'd rather execute C and D before E,
and so on, given, of course, that they're all ready to be executed.

So, if all jobs are ready at this instant, the order of execution for this
and the next 9 instances of picking the next job to execute, should really
be,

A, B, C, D, E, F, G, H, I, J,

which is their chronological order. The only reason for this order to be
broken, is if an older job is not yet ready, but a younger job is ready, at
an instant of picking a new job to execute. For instance if job C wasn't
ready at time 2, but job D was ready, then we'd pick job D, like this:

0 +1 +2  ...
A, B, D, ...

And from then on, C would be preferred before all other jobs, if it is ready
at the time when a new job for execution is picked. So, if C became ready
two steps later, the execution order would look like this:

......0 +1 +2  ...
A, B, D, E, C, F, G, H, I, J

This is what the FIFO GPU scheduling algorithm achieves. It uses a
Red-Black tree to keep jobs sorted in chronological order, where picking
the oldest job is O(1) (we use the "cached" structure), and balancing the
tree is O(log n). IOW, it picks the *oldest ready* job to execute now.

The implementation is already in the kernel, and this commit only changes
the default GPU scheduling algorithm to use.

This was tested and achieves about 1% faster performance over the Round
Robin algorithm.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Cc: Direct Rendering Infrastructure - Development <dri-devel@lists.freedesktop.org>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221024212634.27230-1-luben.tuikov@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>

drm/scheduler: add drm_sched_job_add_resv_dependencies

Add a new function to update job dependencies from a resv obj.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221014084641.128280-3-christian.koenig@amd.com

drm/scheduler: remove drm_sched_dependency_optimized

Not used any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221014084641.128280-12-christian.koenig@amd.com

drm/scheduler: rework entity flush, kill and fini

This was buggy because when we had to wait for entities which were
killed as well we would just deadlock.

Instead move all the dependency handling into the callbacks so that
will all happen asynchronously.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221014084641.128280-13-christian.koenig@amd.com

drm/scheduler: rename dependency callback into prepare_job

This now matches much better what this is doing.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221014084641.128280-14-christian.koenig@amd.com

drm/amdgpu: revert "implement tdr advanced mode"

This reverts commit e6c6338f39.

This feature basically re-submits one job after another to
figure out which one was the one causing a hang.

This is obviously incompatible with gang-submit which requires
that multiple jobs run at the same time. It's also absolutely
not helpful to crash the hardware multiple times if a clean
recovery is desired.

For testing and debugging environments we should rather disable
recovery alltogether to be able to inspect the state with a hw
debugger.

Additional to that the sw implementation is clearly buggy and causes
reference count issues for the hardware fence.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/scheduler: Fix lockup in drm_sched_entity_kill()

The drm_sched_entity_kill() is invoked twice by drm_sched_entity_destroy()
while userspace process is exiting or being killed. First time it's invoked
when sched entity is flushed and second time when entity is released. This
causes a lockup within wait_for_completion(entity_idle) due to how completion
API works.

Calling wait_for_completion() more times than complete() was invoked is a
error condition that causes lockup because completion internally uses
counter for complete/wait calls. The complete_all() must be used instead
in such cases.

This patch fixes lockup of Panfrost driver that is reproducible by killing
any application in a middle of 3d drawing operation.

Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221123001303.533968-1-dmitry.osipenko@collabora.com

drm/scheduler: deprecate drm_sched_resubmit_jobs

This interface is not working as it should.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221109095010.141189-5-christian.koenig@amd.com

drm/scheduler: track GPU active time per entity

Track the accumulated time that jobs from this entity were active
on the GPU. This allows drivers using the scheduler to trivially
implement the DRM fdinfo when the hardware doesn't provide more
specific information than signalling job completion anyways.

[Bagas: Append missing colon to @elapsed_ns]
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

drm/sched: Create wrapper to add a syncobj dependency to job

In order to add a syncobj's fence as a dependency to a job, it is
necessary to call drm_syncobj_find_fence() to find the fence and then
add the dependency with drm_sched_job_add_dependency(). So, wrap these
steps in one single function, drm_sched_job_add_syncobj_dependency().

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mairacanal@riseup.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20230209124447.467867-2-mcanal@igalia.com

drm/scheduler: Fix variable name in function description

Compiling AMD GPU drivers displays two warnings:

drivers/gpu/drm/scheduler/sched_main.c:738: warning: Function parameter or member 'file' not described in 'drm_sched_job_add_syncobj_dependency'
drivers/gpu/drm/scheduler/sched_main.c:738: warning: Excess function
parameter 'file_private' description in
'drm_sched_job_add_syncobj_dependency'

Get rid of them by renaming the variable name on the function description

Signed-off-by: Caio Novais <caionovais@usp.br>
Link: https://lore.kernel.org/r/20230325131532.6356-1-caionovais@usp.br
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

drm/scheduler: Add fence deadline support

As the finished fence is the one that is exposed to userspace, and
therefore the one that other operations, like atomic update, would
block on, we need to propagate the deadline from from the finished
fence to the actual hw fence.

v2: Split into drm_sched_fence_set_parent() (ckoenig)
v3: Ensure a thread calling drm_sched_fence_set_deadline_finished() sees
    fence->parent set before drm_sched_fence_set_parent() does this
    test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>

Revert "drm/scheduler: track GPU active time per entity"

This reverts commit df622729ddbf as it introduces a use-after-free,
which isn't easy to fix without going back to the design drawing board.

Reported-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>

drm/scheduler: Fix UAF race in drm_sched_entity_push_job()

After a job is pushed into the queue, it is owned by the scheduler core
and may be freed at any time, so we can't write nor read the submit
timestamp after that point.

Fixes oopses observed with the drm/asahi driver, found with kASAN.

Signed-off-by: Asahi Lina <lina@asahilina.net>
Link: https://lore.kernel.org/r/20230406-scheduler-uaf-2-v1-1-972531cf0a81@asahilina.net
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

drm/sched: Check scheduler ready before calling timeout handling

During an IGT GPU reset test we see the following oops,

[  +0.000003] ------------[ cut here ]------------
[  +0.000000] WARNING: CPU: 9 PID: 0 at kernel/workqueue.c:1656 __queue_delayed_work+0x6d/0xa0
[  +0.000004] Modules linked in: iptable_filter bpfilter amdgpu(OE) nls_iso8859_1 snd_hda_codec_realtek snd_hda_codec_generic intel_rapl_msr ledtrig_audio snd_hda_codec_hdmi intel_rapl_common snd_hda_intel edac_mce_amd snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core iommu_v2 gpu_sched(OE) kvm_amd drm_buddy snd_hwdep kvm video drm_ttm_helper snd_pcm ttm snd_seq_midi drm_display_helper snd_seq_midi_event snd_rawmidi cec crct10dif_pclmul ghash_clmulni_intel sha512_ssse3 snd_seq aesni_intel rc_core crypto_simd cryptd binfmt_misc drm_kms_helper rapl snd_seq_device input_leds joydev snd_timer i2c_algo_bit syscopyarea snd ccp sysfillrect sysimgblt wmi_bmof k10temp soundcore mac_hid sch_fq_codel msr parport_pc ppdev drm lp parport ramoops reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4 hid_generic usbhid hid r8169 ahci xhci_pci gpio_amdpt realtek i2c_piix4 wmi crc32_pclmul xhci_pci_renesas libahci gpio_generic
[  +0.000070] CPU: 9 PID: 0 Comm: swapper/9 Tainted: G        W OE      6.1.11+ #2
[  +0.000003] Hardware name: Gigabyte Technology Co., Ltd. AB350-Gaming 3/AB350-Gaming 3-CF, BIOS F7 06/16/2017
[  +0.000001] RIP: 0010:__queue_delayed_work+0x6d/0xa0
[  +0.000003] Code: 7a 50 48 01 c1 48 89 4a 30 81 ff 00 20 00 00 75 38 4c 89 cf e8 64 3e 0a 00 5d e9 1e c5 11 01 e8 99 f7 ff ff 5d e9 13 c5 11 01 <0f> 0b eb c1 0f 0b 48 81 7a 38 70 5c 0e 81 74 9f 0f 0b 48 8b 42 28
[  +0.000002] RSP: 0018:ffffc90000398d60 EFLAGS: 00010007
[  +0.000002] RAX: ffff88810d589c60 RBX: 0000000000000000 RCX: 0000000000000000
[  +0.000002] RDX: ffff88810d589c58 RSI: 0000000000000000 RDI: 0000000000002000
[  +0.000001] RBP: ffffc90000398d60 R08: 0000000000000000 R09: ffff88810d589c78
[  +0.000002] R10: 72705f305f39765f R11: 7866673a6d72645b R12: ffff88810d589c58
[  +0.000001] R13: 0000000000002000 R14: 0000000000000000 R15: 0000000000000000
[  +0.000002] FS:  0000000000000000(0000) GS:ffff8887fee40000(0000) knlGS:0000000000000000
[  +0.000001] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0.000002] CR2: 00005562c4797fa0 CR3: 0000000110da0000 CR4: 00000000003506e0
[  +0.000002] Call Trace:
[  +0.000001]  <IRQ>
[  +0.000001]  mod_delayed_work_on+0x5e/0xa0
[  +0.000004]  drm_sched_fault+0x23/0x30 [gpu_sched]
[  +0.000007]  gfx_v9_0_fault.isra.0+0xa6/0xd0 [amdgpu]
[  +0.000258]  gfx_v9_0_priv_reg_irq+0x29/0x40 [amdgpu]
[  +0.000254]  amdgpu_irq_dispatch+0x1ac/0x2b0 [amdgpu]
[  +0.000243]  amdgpu_ih_process+0x89/0x130 [amdgpu]
[  +0.000245]  amdgpu_irq_handler+0x24/0x60 [amdgpu]
[  +0.000165]  __handle_irq_event_percpu+0x4f/0x1a0
[  +0.000003]  handle_irq_event_percpu+0x15/0x50
[  +0.000001]  handle_irq_event+0x39/0x60
[  +0.000002]  handle_edge_irq+0xa8/0x250
[  +0.000003]  __common_interrupt+0x7b/0x150
[  +0.000002]  common_interrupt+0xc1/0xe0
[  +0.000003]  </IRQ>
[  +0.000000]  <TASK>
[  +0.000001]  asm_common_interrupt+0x27/0x40
[  +0.000002] RIP: 0010:native_safe_halt+0xb/0x10
[  +0.000003] Code: 46 ff ff ff cc cc cc cc cc cc cc cc cc cc cc eb 07 0f 00 2d 69 f2 5e 00 f4 e9 f1 3b 3e 00 90 eb 07 0f 00 2d 59 f2 5e 00 fb f4 <e9> e0 3b 3e 00 0f 1f 44 00 00 55 48 89 e5 53 e8 b1 d4 fe ff 66 90
[  +0.000002] RSP: 0018:ffffc9000018fdc8 EFLAGS: 00000246
[  +0.000002] RAX: 0000000000004000 RBX: 000000000002e5a8 RCX: 000000000000001f
[  +0.000001] RDX: 0000000000000001 RSI: ffff888101298800 RDI: ffff888101298864
[  +0.000001] RBP: ffffc9000018fdd0 R08: 000000527f64bd8b R09: 000000000001dc90
[  +0.000001] R10: 000000000001dc90 R11: 0000000000000003 R12: 0000000000000001
[  +0.000001] R13: ffff888101298864 R14: ffffffff832d9e20 R15: ffff888193aa8c00
[  +0.000003]  ? acpi_idle_do_entry+0x5e/0x70
[  +0.000002]  acpi_idle_enter+0xd1/0x160
[  +0.000003]  cpuidle_enter_state+0x9a/0x6e0
[  +0.000003]  cpuidle_enter+0x2e/0x50
[  +0.000003]  call_cpuidle+0x23/0x50
[  +0.000002]  do_idle+0x1de/0x260
[  +0.000002]  cpu_startup_entry+0x20/0x30
[  +0.000002]  start_secondary+0x120/0x150
[  +0.000003]  secondary_startup_64_no_verify+0xe5/0xeb
[  +0.000004]  </TASK>
[  +0.000000] ---[ end trace 0000000000000000 ]---
[  +0.000003] BUG: kernel NULL pointer dereference, address: 0000000000000102
[  +0.006233] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_low timeout, signaled seq=3, emitted seq=4
[  +0.000734] #PF: supervisor read access in kernel mode
[  +0.009670] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process amd_deadlock pid 2002 thread amd_deadlock pid 2002
[  +0.005135] #PF: error_code(0x0000) - not-present page
[  +0.000002] PGD 0 P4D 0
[  +0.000002] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  +0.000002] CPU: 9 PID: 0 Comm: swapper/9 Tainted: G        W OE      6.1.11+ #2
[  +0.000002] Hardware name: Gigabyte Technology Co., Ltd. AB350-Gaming 3/AB350-Gaming 3-CF, BIOS F7 06/16/2017
[  +0.012101] amdgpu 0000:0c:00.0: amdgpu: GPU reset begin!
[  +0.005136] RIP: 0010:__queue_work+0x1f/0x4e0
[  +0.000004] Code: 87 cd 11 01 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 49 89 d5 41 54 49 89 f4 53 48 83 ec 10 89 7d d4 <f6> 86 02 01 00 00 01 0f 85 6c 03 00 00 e8 7f 36 08 00 8b 45 d4 48

For gfx_rings the schedulers may not be initialized by
amdgpu_device_init_schedulers() due to ring->no_scheduler flag being set to
true and thus the timeout_wq is NULL. As a result, since all ASICs call
drm_sched_fault() unconditionally even for schedulers which have not been
initialized, it is simpler to use the ready condition which indicates whether
the given scheduler worker thread runs and whether the timeout_wq of the reset
domain has been initialized.

Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20230406200054.633379-1-luben.tuikov@amd.com

drm/scheduler: set entity to NULL in drm_sched_entity_pop_job()

It already happend a few times that patches slipped through which
implemented access to an entity through a job that was already removed
from the entities queue. Since jobs and entities might have different
lifecycles, this can potentially cause UAF bugs.

In order to make it obvious that a jobs entity pointer shouldn't be
accessed after drm_sched_entity_pop_job() was called successfully, set
the jobs entity pointer to NULL once the job is removed from the entity
queue.

Moreover, debugging a potential NULL pointer dereference is way easier
than potentially corrupted memory through a UAF.

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://lore.kernel.org/r/20230418100453.4433-1-dakr@redhat.com
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

drm/scheduler: properly forward fence errors

When a hw fence is signaled with an error properly forward that to the
finished fence.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230420115752.31470-1-christian.koenig@amd.com

drm/scheduler: add drm_sched_entity_error and use rcu for last_scheduled

Switch to using RCU handling for the last scheduled job and add a
function to return the error code of it.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230420115752.31470-2-christian.koenig@amd.com

drm/scheduler: mark jobs without fence as canceled

When no hw fence is provided for a job that means that the job didn't executed.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230427122726.1290170-1-christian.koenig@amd.com

drm/sched: Check scheduler work queue before calling timeout handling

During an IGT GPU reset test we see again oops despite of
commit 0c8c901aaaebc9 (drm/sched: Check scheduler ready before calling
timeout handling).

It uses ready condition whether to call drm_sched_fault which unwind
the TDR leads to GPU reset.
However it looks the ready condition is overloaded with other meanings,
for example, for the following stack is related GPU reset :

0  gfx_v9_0_cp_gfx_start
1  gfx_v9_0_cp_gfx_resume
2  gfx_v9_0_cp_resume
3  gfx_v9_0_hw_init
4  gfx_v9_0_resume
5  amdgpu_device_ip_resume_phase2

does the following:
	/* start the ring */
	gfx_v9_0_cp_gfx_start(adev);
	ring->sched.ready = true;

The same approach is for other ASICs as well :
gfx_v8_0_cp_gfx_resume
gfx_v10_0_kiq_resume, etc...

As a result, our GPU reset test causes GPU fault which calls unconditionally gfx_v9_0_fault
and then drm_sched_fault. However now it depends on whether the interrupt service routine
drm_sched_fault is executed after gfx_v9_0_cp_gfx_start is completed which sets the ready
field of the scheduler to true even  for uninitialized schedulers and causes oops vs
no fault or when ISR  drm_sched_fault is completed prior  gfx_v9_0_cp_gfx_start and
NULL pointer dereference does not occur.

Use the field timeout_wq  to prevent oops for uninitialized schedulers.
The field could be initialized by the work queue of resetting the domain.

v1: Corrections to commit message (Luben)

Fixes: 11b3b9f461c5c4 ("drm/sched: Check scheduler ready before calling timeout handling")
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Link: https://lore.kernel.org/r/20230510135111.58631-1-vitaly.prosyak@amd.com
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

drm/sched: Remove redundant check

The rq pointer points inside the drm_gpu_scheduler structure. Thus
it can't be NULL.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: c61cdbdbff ("drm/scheduler: Fix hang when sched_entity released")
Signed-off-by: Vladislav Efanov <VEfanov@ispras.ru>
Link: https://lore.kernel.org/r/20230517125247.434103-1-VEfanov@ispras.ru
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>

drm/sched: Rename to drm_sched_can_queue()

Rename drm_sched_ready() to drm_sched_can_queue(). "ready" can mean many
things and is thus meaningless in this context. Instead, rename to a name
which precisely conveys what is being checked.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Link: https://lore.kernel.org/r/20230517233550.377847-1-luben.tuikov@amd.com

drm/sched: Rename to drm_sched_wakeup_if_can_queue()

Rename drm_sched_wakeup() to drm_sched_wakeup_if_canqueue() since the former
is misleading, as it wakes up the GPU scheduler _only if_ more jobs can be
queued to the underlying hardware.

This distinction is important to make, since the wake conditional in the GPU
scheduler thread wakes up when other conditions are also true, e.g. when there
are jobs to be cleaned. For instance, a user might want to wake up the
scheduler only because there are more jobs to clean, but whether we can queue
more jobs is irrelevant.

v2: Separate "canqueue" to "can_queue". (Alex D.)

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20230517233550.377847-2-luben.tuikov@amd.com
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>

drm/scheduler: avoid infinite loop if entity's dependency is a scheduled error fence

[Why]
drm_sched_entity_add_dependency_cb ignores the scheduled fence and return false.
If entity's dependency is a scheduler error fence and drm_sched_stop is called
due to TDR, drm_sched_entity_pop_job will wait for the dependency infinitely.

[How]
Do not wait or ignore the scheduled error fence, add drm_sched_entity_wakeup
callback for the dependency with scheduled error fence.

Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/sched: Make sure we wait for all dependencies in kill_jobs_cb()

drm_sched_entity_kill_jobs_cb() logic is omitting the last fence popped
from the dependency array that was waited upon before
drm_sched_entity_kill() was called (drm_sched_entity::dependency field),
so we're basically waiting for all dependencies except one.

In theory, this wait shouldn't be needed because resources should have
their users registered to the dma_resv object, thus guaranteeing that
future jobs wanting to access these resources wait on all the previous
users (depending on the access type, of course). But we want to keep
these explicit waits in the kill entity path just in case.

Let's make sure we keep all dependencies in the array in
drm_sched_job_dependency(), so we can iterate over the array and wait
in drm_sched_entity_kill_jobs_cb().

We also make sure we wait on drm_sched_fence::finished if we were
originally asked to wait on drm_sched_fence::scheduled. In that case,
we assume the intent was to delegate the wait to the firmware/GPU or
rely on the pipelining done at the entity/scheduler level, but when
killing jobs, we really want to wait for completion not just scheduling.

v2:
- Don't evict deps in drm_sched_job_dependency()

v3:
- Always wait for drm_sched_fence::finished fences in
  drm_sched_entity_kill_jobs_cb() when we see a sched_fence

v4:
- Fix commit message
- Fix a use-after-free bug

v5:
- Flag deps on which we should only wait for the scheduled event
  at insertion time

v6:
- Back to v4 implementation
- Add Christian's R-b

Cc: Frank Binns <frank.binns@imgtec.com>
Cc: Sarah Walker <sarah.walker@imgtec.com>
Cc: Donald Robson <donald.robson@imgtec.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Suggested-by: "Christian König" <christian.koenig@amd.com>
Reviewed-by: "Christian König" <christian.koenig@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230619071921.3465992-1-boris.brezillon@collabora.com

drm/sched: Call drm_sched_fence_set_parent() from drm_sched_fence_scheduled()

Drivers that can delegate waits to the firmware/GPU pass the scheduled
fence to drm_sched_job_add_dependency(), and issue wait commands to
the firmware/GPU at job submission time. For this to be possible, they
need all their 'native' dependencies to have a valid parent since this
is where the actual HW fence information are encoded.

In drm_sched_main(), we currently call drm_sched_fence_set_parent()
after drm_sched_fence_scheduled(), leaving a short period of time
during which the job depending on this fence can be submitted.

Since setting parent and signaling the fence are two things that are
kinda related (you can't have a parent if the job hasn't been
scheduled),
it probably makes sense to pass the parent fence to
drm_sched_fence_scheduled() and let it call drm_sched_fence_set_parent()
before it signals the scheduled fence.

Here is a detailed description of the race we are fixing here:

Thread A				Thread B

- calls drm_sched_fence_scheduled()
- signals s_fence->scheduled which
  wakes up thread B

					- entity dep signaled, checking
					  the next dep
					- no more deps waiting
					- entity is picked for job
					  submission by drm_gpu_scheduler
					- run_job() is called
					- run_job() tries to
					  collect native fence info from
					  s_fence->parent, but it's
					  NULL =>
					  BOOM, we can't do our native
					  wait

- calls drm_sched_fence_set_parent()

v2:
* Fix commit message

v3:
* Add a detailed description of the race to the commit message
* Add Luben's R-b

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Frank Binns <frank.binns@imgtec.com>
Cc: Sarah Walker <sarah.walker@imgtec.com>
Cc: Donald Robson <donald.robson@imgtec.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230623075204.382350-1-boris.brezillon@collabora.com

dma-buf: add dma_fence_timestamp helper

When a fence signals there is a very small race window where the timestamp
isn't updated yet. sync_file solves this by busy waiting for the
timestamp to appear, but on other ocassions didn't handled this
correctly.

Provide a dma_fence_timestamp() helper function for this and use it in
all appropriate cases.

Another alternative would be to grab the spinlock when that happens.

v2 by teddy: add a wait parameter to wait for the timestamp to show up, in case
   the accurate timestamp is needed and/or the timestamp is not based on
   ktime (e.g. hw timestamp)
v3 chk: drop the parameter again for unified handling

Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 1774baa64f ("drm/scheduler: Change scheduled fence track v2")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
CC: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20230929104725.2358-1-christian.koenig@amd.com

drm/sched: Convert the GPU scheduler to variable number of run-queues

The GPU scheduler has now a variable number of run-queues, which are set
up at
drm_sched_init() time. This way, each driver announces how many
run-queues it
requires (supports) per each GPU scheduler it creates. Note, that
run-queues
correspond to scheduler "priorities", thus if the number of run-queues
is set
to 1 at drm_sched_init(), then that scheduler supports a single
run-queue,
i.e. single "priority". If a driver further sets a single entity per
run-queue, then this creates a 1-to-1 correspondence between a scheduler
and
a scheduled entity.

Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Abhinav Kumar <quic_abhinavk@quicinc.com>
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20231023032251.164775-1-luben.tuikov@amd.com

drm/sched: Add drm_sched_wqueue_* helpers

Add scheduler wqueue ready, stop, and start helpers to hide the
implementation details of the scheduler from the drivers.

v2:
  - s/sched_wqueue/sched_wqueue (Luben)
  - Remove the extra white line after the return-statement (Luben)
  - update drm_sched_wqueue_ready comment (Luben)

Cc: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20231031032439.1558703-2-matthew.brost@intel.com
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

drm/sched: Convert drm scheduler to use a work queue rather than kthread

In Xe, the new Intel GPU driver, a choice has made to have a 1 to 1
mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
seems a bit odd but let us explain the reasoning below.

1. In Xe the submission order from multiple drm_sched_entity is not
guaranteed to be the same completion even if targeting the same hardware
engine. This is because in Xe we have a firmware scheduler, the GuC,
which allowed to reorder, timeslice, and preempt submissions. If a using
shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls
apart as the TDR expects submission order == completion order. Using a
dedicated drm_gpu_scheduler per drm_sched_entity solve this problem.

2. In Xe submissions are done via programming a ring buffer (circular
buffer), a drm_gpu_scheduler provides a limit on number of jobs, if the
limit of number jobs is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow
control on the ring for free.

A problem with this design is currently a drm_gpu_scheduler uses a
kthread for submission / job cleanup. This doesn't scale if a large
number of drm_gpu_scheduler are used. To work around the scaling issue,
use a worker rather than kthread for submission / job cleanup.

v2:
  - (Rob Clark) Fix msm build
  - Pass in run work queue
v3:
  - (Boris) don't have loop in worker
v4:
  - (Tvrtko) break out submit ready, stop, start helpers into own patch
v5:
  - (Boris) default to ordered work queue
v6:
  - (Luben / checkpatch) fix alignment in msm_ringbuffer.c
  - (Luben) s/drm_sched_submit_queue/drm_sched_wqueue_enqueue
  - (Luben) Update comment for drm_sched_wqueue_enqueue
  - (Luben) Positive check for submit_wq in drm_sched_init
  - (Luben) s/alloc_submit_wq/own_submit_wq
v7:
  - (Luben) s/drm_sched_wqueue_enqueue/drm_sched_run_job_queue
v8:
  - (Luben) Adjust var names / comments

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20231031032439.1558703-3-matthew.brost@intel.com
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

drm/sched: Split free_job into own work item

Rather than call free_job and run_job in same work item have a dedicated
work item for each. This aligns with the design and intended use of work
queues.

v2:
   - Test for DMA_FENCE_FLAG_TIMESTAMP_BIT before setting
     timestamp in free_job() work item (Danilo)
v3:
  - Drop forward dec of drm_sched_select_entity (Boris)
  - Return in drm_sched_run_job_work if entity NULL (Boris)
v4:
  - Replace dequeue with peek and invert logic (Luben)
  - Wrap to 100 lines (Luben)
  - Update comments for *_queue / *_queue_if_ready functions (Luben)
v5:
  - Drop peek argument, blindly reinit idle (Luben)
  - s/drm_sched_free_job_queue_if_ready/drm_sched_free_job_queue_if_done (Luben)
  - Update work_run_job & work_free_job kernel doc (Luben)
v6:
  - Do not move drm_sched_select_entity in file (Luben)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20231031032439.1558703-4-matthew.brost@intel.com
Reviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

drm/sched: Add drm_sched_start_timeout_unlocked helper

Also add a lockdep assert to drm_sched_start_timeout.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20231031032439.1558703-5-matthew.brost@intel.com
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

drm/sched: Add a helper to queue TDR immediately

Add a helper whereby a driver can invoke TDR immediately.

v2:
 - Drop timeout args, rename function, use mod delayed work (Luben)
v3:
 - s/XE/Xe (Luben)
 - present tense in commit message (Luben)
 - Adjust comment for drm_sched_tdr_queue_imm (Luben)
v4:
 - Adjust commit message (Luben)

Cc: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20231031032439.1558703-6-matthew.brost@intel.com
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

drm/sched: Rename drm_sched_get_cleanup_job to be more descriptive

"Get cleanup job" makes it sound like helper is returning a job which will
execute some cleanup, or something, while the kerneldoc itself accurately
says "fetch the next _finished_ job". So lets rename the helper to be self
documenting.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Luben Tuikov <ltuikov89@gmail.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231102105538.391648-2-tvrtko.ursulin@linux.intel.com
Reviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

drm/sched: Move free worker re-queuing out of the if block

Whether or not there are more jobs to clean up does not depend on the
existance of the current job, given both drm_sched_get_finished_job and
drm_sched_free_job_queue_if_done take and drop the job list lock.
Therefore it is confusing to make it read like there is a dependency.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Luben Tuikov <ltuikov89@gmail.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231102105538.391648-3-tvrtko.ursulin@linux.intel.com
Reviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

drm/sched: Rename drm_sched_free_job_queue to be more descriptive

The current name makes it sound like helper will free a queue, while what
it does is it enqueues the free job worker.

Rename it to drm_sched_run_free_queue to align with existing
drm_sched_run_job_queue.

Despite that creating an illusion there are two queues, while in reality
there is only one, at least it creates a consistent naming for the two
enqueuing helpers.

At the same time simplify the "if done" helper by dropping the suffix and
adding a double underscore prefix to the one which just enqueues.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Luben Tuikov <ltuikov89@gmail.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231102105538.391648-4-tvrtko.ursulin@linux.intel.com
Reviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

drm/sched: Rename drm_sched_run_job_queue_if_ready and clarify kerneldoc

"If ready" is not immediately clear what it means - is the scheduler
ready or something else? Drop the suffix, clarify kerneldoc, and employ
the same naming scheme as in drm_sched_run_free_queue:

 - drm_sched_run_job_queue   - enqueues if there is something to enqueue
                               *and* scheduler is ready (can queue)
 - __drm_sched_run_job_queue - low-level helper to simply queue the job

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Luben Tuikov <ltuikov89@gmail.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231102105538.391648-5-tvrtko.ursulin@linux.intel.com
Reviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

drm/sched: Drop suffix from drm_sched_wakeup_if_can_queue

Because a) helper is exported to other parts of the scheduler and
b) there isn't a plain drm_sched_wakeup to begin with, I think we can
drop the suffix and by doing so separate the intimiate knowledge
between the scheduler components a bit better.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Luben Tuikov <ltuikov89@gmail.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231102105538.391648-6-tvrtko.ursulin@linux.intel.com
Reviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

drm/sched: Don't disturb the entity when in RR-mode scheduling

Don't call drm_sched_select_entity() in drm_sched_run_job_queue().  In fact,
rename __drm_sched_run_job_queue() to just drm_sched_run_job_queue(), and let
it do just that, schedule the work item for execution.

The problem is that drm_sched_run_job_queue() calls drm_sched_select_entity()
to determine if the scheduler has an entity ready in one of its run-queues,
and in the case of the Round-Robin (RR) scheduling, the function
drm_sched_rq_select_entity_rr() does just that, selects the _next_ entity
which is ready, sets up the run-queue and completion and returns that
entity. The FIFO scheduling algorithm is unaffected.

Now, since drm_sched_run_job_work() also calls drm_sched_select_entity(), then
in the case of RR scheduling, that would result in drm_sched_select_entity()
having been called twice, which may result in skipping a ready entity if more
than one entity is ready. This commit fixes this by eliminating the call to
drm_sched_select_entity() from drm_sched_run_job_queue(), and leaves it only
in drm_sched_run_job_work().

v2: Rebased on top of Tvrtko's renames series of patches. (Luben)
    Add fixes-tag. (Tvrtko)

Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
Fixes: f7fe64ad0f22ff ("drm/sched: Split free_job into own work item")
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231107041020.10035-2-ltuikov89@gmail.com

drm/sched: Qualify drm_sched_wakeup() by drm_sched_entity_is_ready()

Don't "wake up" the GPU scheduler unless the entity is ready, as well as we
can queue to the scheduler, i.e. there is no point in waking up the scheduler
for the entity unless the entity is ready.

Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
Fixes: bc8d6a9df99038 ("drm/sched: Don't disturb the entity when in RR-mode scheduling")
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231110000123.72565-2-ltuikov89@gmail.com

drm/sched: implement dynamic job-flow control

Currently, job flow control is implemented simply by limiting the number
of jobs in flight. Therefore, a scheduler is initialized with a credit
limit that corresponds to the number of jobs which can be sent to the
hardware.

This implies that for each job, drivers need to account for the maximum
job size possible in order to not overflow the ring buffer.

However, there are drivers, such as Nouveau, where the job size has a
rather large range. For such drivers it can easily happen that job
submissions not even filling the ring by 1% can block subsequent
submissions, which, in the worst case, can lead to the ring run dry.

In order to overcome this issue, allow for tracking the actual job size
instead of the number of jobs. Therefore, add a field to track a job's
credit count, which represents the number of credits a job contributes
to the scheduler's credit limit.

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231110001638.71750-1-dakr@redhat.com

drm/sched: Fix bounds limiting when given a malformed entity

If we're given a malformed entity in drm_sched_entity_init()--shouldn't
happen, but we verify--with out-of-bounds priority value, we set it to an
allowed value. Fix the expression which sets this limit.

Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
Fixes: 56e449603f0ac5 ("drm/sched: Convert the GPU scheduler to variable number of run-queues")
Link: https://patchwork.freedesktop.org/patch/msgid/20231123122422.167832-2-ltuikov89@gmail.com
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/dbb91dbe-ef77-4d79-aaf9-2adb171c1d7a@amd.com

drm/sched: Rename priority MIN to LOW

Rename DRM_SCHED_PRIORITY_MIN to DRM_SCHED_PRIORITY_LOW.

This mirrors DRM_SCHED_PRIORITY_HIGH, for a list of DRM scheduler priorities
in ascending order,
  DRM_SCHED_PRIORITY_LOW,
  DRM_SCHED_PRIORITY_NORMAL,
  DRM_SCHED_PRIORITY_HIGH,
  DRM_SCHED_PRIORITY_KERNEL.

Cc: Rob Clark <robdclark@gmail.com>
Cc: Abhinav Kumar <quic_abhinavk@quicinc.com>
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231124052752.6915-5-ltuikov89@gmail.com

drm/sched: Reverse run-queue priority enumeration

Reverse run-queue priority enumeration such that the higest priority is now 0,
and for each consecutive integer the prioirty diminishes.

Run-queues correspond to priorities. To an external observer a scheduler
created with a single run-queue, and another created with
DRM_SCHED_PRIORITY_COUNT number of run-queues, should always schedule
sched->sched_rq[0] with the same "priority", as that index run-queue exists in
both schedulers, i.e. a scheduler with one run-queue or many. This patch makes
it so.

In other words, the "priority" of sched->sched_rq[n], n >= 0, is the same for
any scheduler created with any allowable number of run-queues (priorities), 0
to DRM_SCHED_PRIORITY_COUNT.

Cc: Rob Clark <robdclark@gmail.com>
Cc: Abhinav Kumar <quic_abhinavk@quicinc.com>
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231124052752.6915-6-ltuikov89@gmail.com

drm/sched: Partial revert of "Qualify drm_sched_wakeup() by drm_sched_entity_is_ready()"

Commit f3123c2590005c, in combination with the use of work queues by the GPU
scheduler, leads to random lock-ups of the GUI.

This is a partial revert of of commit f3123c2590005c since drm_sched_wakeup() still
needs its entity argument to pass it to drm_sched_can_queue().

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2994
Link: https://lists.freedesktop.org/archives/dri-devel/2023-November/431606.html

Signed-off-by: Bert Karwatzki <spasswolf@web.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20231127160955.87879-1-spasswolf@web.de
Link: https://lore.kernel.org/r/36bece178ff5dc705065e53d1e5e41f6db6d87e4.camel@web.de
Fixes: f3123c2590005c ("drm/sched: Qualify drm_sched_wakeup() by drm_sched_entity_is_ready()")
Reviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

drm/sched: One function call less in drm_sched_init() after error detection

The kfree() function was called in one case by the
drm_sched_init() function during error handling
even if the passed data structure member contained a null pointer.
This issue was detected by using the Coccinelle software.

Thus adjust a jump target.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Link: https://patchwork.freedesktop.org/patch/msgid/85066512-983d-480c-a44d-32405ab1b80e@web.de
Reviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

drm/sched: Return an error code only as a constant in drm_sched_init()

Return an error code without storing it in an intermediate variable.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Link: https://patchwork.freedesktop.org/patch/msgid/85f8004e-f0c9-42d9-8c59-30f1b4e0b89e@web.de
Reviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

drm/sched: Drain all entities in DRM sched run job worker

All entities must be drained in the DRM scheduler run job worker to
avoid the following case. An entity found that is ready, no job found
ready on entity, and run job worker goes idle with other entities + jobs
ready. Draining all ready entities (i.e. loop over all ready entities)
in the run job worker ensures all job that are ready will be scheduled.

Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Closes: https://lore.kernel.org/all/CABXGCsM2VLs489CH-vF-1539-s3in37=bwuOWtoeeE+q26zE+Q@mail.gmail.com/
Reported-and-tested-by: Mario Limonciello <mario.limonciello@amd.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3124
Link: https://lore.kernel.org/all/20240123021155.2775-1-mario.limonciello@amd.com/
Reported-and-tested-by: Vlastimil Babka <vbabka@suse.cz>
Closes: https://lore.kernel.org/dri-devel/05ddb2da-b182-4791-8ef7-82179fd159a8@amd.com/T/#m0c31d4d1b9ae9995bb880974c4f1dbaddc33a48a
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240124210811.1639040-1-matthew.brost@intel.com

drm/drm_exec: Work around a WW mutex lockdep oddity

If *any* object of a certain WW mutex class is locked, lockdep will
consider *all* mutexes of that class as locked. Also the lock allocation
tracking code will apparently register only the address of the first
mutex of a given class locked in a sequence.
This has the odd consequence that if that first mutex is unlocked while
other mutexes of the same class remain locked and then its memory then
freed, the lock alloc tracking code will incorrectly assume that memory
is freed with a held lock in there.

For now, work around that for drm_exec by releasing the first grabbed
object lock last.

v2:
- Fix a typo (Danilo Krummrich)
- Reword the commit message a bit.
- Add a Fixes: tag

Related lock alloc tracking warning:
[  322.660067] =========================
[  322.660070] WARNING: held lock freed!
[  322.660074] 6.5.0-rc7+ #155 Tainted: G     U           N
[  322.660078] -------------------------
[  322.660081] kunit_try_catch/4981 is freeing memory ffff888112adc000-ffff888112adc3ff, with a lock still held there!
[  322.660089] ffff888112adc1a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_exec_lock_obj+0x11a/0x600 [drm_exec]
[  322.660104] 2 locks held by kunit_try_catch/4981:
[  322.660108]  #0: ffffc9000343fe18 (reservation_ww_class_acquire){+.+.}-{0:0}, at: test_early_put+0x22f/0x490 [drm_exec_test]
[  322.660123]  #1: ffff888112adc1a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_exec_lock_obj+0x11a/0x600 [drm_exec]
[  322.660135]
               stack backtrace:
[  322.660139] CPU: 7 PID: 4981 Comm: kunit_try_catch Tainted: G     U           N 6.5.0-rc7+ #155
[  322.660146] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 0403 01/26/2021
[  322.660152] Call Trace:
[  322.660155]  <TASK>
[  322.660158]  dump_stack_lvl+0x57/0x90
[  322.660164]  debug_check_no_locks_freed+0x20b/0x2b0
[  322.660172]  slab_free_freelist_hook+0xa1/0x160
[  322.660179]  ? drm_exec_unlock_all+0x168/0x2a0 [drm_exec]
[  322.660186]  __kmem_cache_free+0xb2/0x290
[  322.660192]  drm_exec_unlock_all+0x168/0x2a0 [drm_exec]
[  322.660200]  drm_exec_fini+0xf/0x1c0 [drm_exec]
[  322.660206]  test_early_put+0x289/0x490 [drm_exec_test]
[  322.660215]  ? __pfx_test_early_put+0x10/0x10 [drm_exec_test]
[  322.660222]  ? __kasan_check_byte+0xf/0x40
[  322.660227]  ? __ksize+0x63/0x140
[  322.660233]  ? drmm_add_final_kfree+0x3e/0xa0 [drm]
[  322.660289]  ? _raw_spin_unlock_irqrestore+0x30/0x60
[  322.660294]  ? lockdep_hardirqs_on+0x7d/0x100
[  322.660301]  ? __pfx_kunit_try_run_case+0x10/0x10 [kunit]
[  322.660310]  ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10 [kunit]
[  322.660319]  kunit_generic_run_threadfn_adapter+0x4a/0x90 [kunit]
[  322.660328]  kthread+0x2e7/0x3c0
[  322.660334]  ? __pfx_kthread+0x10/0x10
[  322.660339]  ret_from_fork+0x2d/0x70
[  322.660345]  ? __pfx_kthread+0x10/0x10
[  322.660349]  ret_from_fork_asm+0x1b/0x30
[  322.660358]  </TASK>
[  322.660818]     ok 8 test_early_put

Cc: Christian König <christian.koenig@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Fixes: 09593216bff1 ("drm: execution context for GEM buffers v7")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230906095039.3320-4-thomas.hellstrom@linux.intel.com

drm/panthor: Add uAPI

Panthor follows the lead of other recently submitted drivers with
ioctls allowing us to support modern Vulkan features, like sparse memory
binding:

- Pretty standard GEM management ioctls (BO_CREATE and BO_MMAP_OFFSET),
  with the 'exclusive-VM' bit to speed-up BO reservation on job
submission
- VM management ioctls (VM_CREATE, VM_DESTROY and VM_BIND). The VM_BIND
  ioctl is loosely based on the Xe model, and can handle both
  asynchronous and synchronous requests
- GPU execution context creation/destruction, tiler heap context
creation
  and job submission. Those ioctls reflect how the hardware/scheduler
  works and are thus driver specific.

We also have a way to expose IO regions, such that the usermode driver
can directly access specific/well-isolate registers, like the
LATEST_FLUSH register used to implement cache-flush reduction.

This uAPI intentionally keeps usermode queues out of the scope, which
explains why doorbell registers and command stream ring-buffers are not
directly exposed to userspace.

v6:
- Add Maxime's and Heiko's acks

v5:
- Fix typo
- Add Liviu's R-b

v4:
- Add a VM_GET_STATE ioctl
- Fix doc
- Expose the CORE_FEATURES register so we can deal with variants in the
  UMD
- Add Steve's R-b

v3:
- Add the concept of sync-only VM operation
- Fix support for 32-bit userspace
- Rework drm_panthor_vm_create to pass the user VA size instead of
  the kernel VA size (suggested by Robin Murphy)
- Typo fixes
- Explicitly cast enums with top bit set to avoid compiler warnings in
  -pedantic mode.
- Drop property core_group_count as it can be easily calculated by the
  number of bits set in l2_present.

Co-developed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229162230.2634044-2-boris.brezillon@collabora.com

drm/panthor: Add GPU register definitions

Those are the registers directly accessible through the MMIO range.

FW registers are exposed in panthor_fw.h.

v6:
- Add Maxime's and Heiko's acks

v4:
- Add the CORE_FEATURES register (needed for GPU variants)
- Add Steve's R-b

v3:
- Add macros to extract GPU ID info
- Formatting changes
- Remove AS_TRANSCFG_ADRMODE_LEGACY - it doesn't exist post-CSF
- Remove CSF_GPU_LATEST_FLUSH_ID_DEFAULT
- Add GPU_L2_FEATURES_LINE_SIZE for extracting the GPU cache line size

Co-developed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm
Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro
Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229162230.2634044-3-boris.brezillon@collabora.com

drm/panthor: Add the device logical block

The panthor driver is designed in a modular way, where each logical
block is dealing with a specific HW-block or software feature. In order
for those blocks to communicate with each other, we need a central
panthor_device collecting all the blocks, and exposing some common
features, like interrupt handling, power management, reset, ...

This what this panthor_device logical block is about.

v6:
- Add Maxime's and Heiko's acks
- Keep header inclusion alphabetically ordered

v5:
- Suspend the MMU/GPU blocks if panthor_fw_resume() fails in
  panthor_device_resume()
- Move the pm_runtime_use_autosuspend() call before drm_dev_register()
- Add Liviu's R-b

v4:
- Check drmm_mutex_init() return code
- Fix panthor_device_reset_work() out path
- Fix the race in the unplug logic
- Fix typos
- Unplug blocks when something fails in panthor_device_init()
- Add Steve's R-b

v3:
- Add acks for the MIT+GPL2 relicensing
- Fix 32-bit support
- Shorten the sections protected by panthor_device::pm::mmio_lock to fix
  lock ordering issues.
- Rename panthor_device::pm::lock into panthor_device::pm::mmio_lock to
  better reflect what this lock is protecting
- Use dev_err_probe()
- Make sure we call drm_dev_exit() when something fails half-way in
  panthor_device_reset_work()
- Replace CSF_GPU_LATEST_FLUSH_ID_DEFAULT with a constant '1' and a
  comment to explain. Also remove setting the dummy flush ID on suspend.
- Remove drm_WARN_ON() in panthor_exception_name()
- Check pirq->suspended in panthor_xxx_irq_raw_handler()

Co-developed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm
Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro
Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229162230.2634044-4-boris.brezillon@collabora.com

drm/panthor: Add the GPU logical block

Handles everything that's not related to the FW, the MMU or the
scheduler. This is the block dealing with the GPU property retrieval,
the GPU block power on/off logic, and some global operations, like
global cache flushing.

v6:
- Add Maxime's and Heiko's acks

v5:
- Fix GPU_MODEL() kernel doc
- Fix test in panthor_gpu_block_power_off()
- Add Steve's R-b

v4:
- Expose CORE_FEATURES through DEV_QUERY

v3:
- Add acks for the MIT/GPL2 relicensing
- Use macros to extract GPU ID info
- Make sure we reset clear pending_reqs bits when wait_event_timeout()
  times out but the corresponding bit is cleared in GPU_INT_RAWSTAT
  (can happen if the IRQ is masked or HW takes to long to call the IRQ
  handler)
- GPU_MODEL now takes separate arch and product majors to be more
  readable.
- Drop GPU_IRQ_MCU_STATUS_CHANGED from interrupt mask.
- Handle GPU_IRQ_PROTM_FAULT correctly (don't output registers that are
  not updated for protected interrupts).
- Minor code tidy ups

Cc: Alexey Sheplyakov <asheplyakov@basealt.ru> # MIT+GPL2 relicensing
Co-developed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm
Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro
Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229162230.2634044-5-boris.brezillon@collabora.com

drm/panthor: Add GEM logical block

Anything relating to GEM object management is placed here. Nothing
particularly interesting here, given the implementation is based on
drm_gem_shmem_object, which is doing most of the work.

v6:
- Add Maxime's and Heiko's acks
- Return a page-aligned BO size to userspace when creating a BO
- Keep header inclusion alphabetically ordered

v5:
- Add Liviu's and Steve's R-b

v4:
- Force kernel BOs to be GPU mapped
- Make panthor_kernel_bo_destroy() robust against ERR/NULL BO pointers
  to simplify the call sites

v3:
- Add acks for the MIT/GPL2 relicensing
- Provide a panthor_kernel_bo abstraction for buffer objects managed by
  the kernel (will replace panthor_fw_mem and be used everywhere we were
  using panthor_gem_create_and_map() before)
- Adjust things to match drm_gpuvm changes
- Change return of panthor_gem_create_with_handle() to int

Co-developed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm
Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro
Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229162230.2634044-6-boris.brezillon@collabora.com

drm/panthor: Add the devfreq logical block

Every thing related to devfreq in placed in panthor_devfreq.c, and
helpers that can be called by other logical blocks are exposed through
panthor_devfreq.h.

This implementation is loosely based on the panfrost implementation,
the only difference being that we don't count device users, because
the idle/active state will be managed by the scheduler logic.

v6:
- Add Maxime's and Heiko's acks
- Keep header inclusion alphabetically ordered

v4:
- Add Clément's A-b for the relicensing

v3:
- Add acks for the MIT/GPL2 relicensing

v2:
- Added in v2

Cc: Clément Péron <peron.clem@gmail.com> # MIT+GPL2 relicensing
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm
Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro
Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora
Acked-by: Clément Péron <peron.clem@gmail.com> # MIT+GPL2 relicensing
Acked-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229162230.2634044-7-boris.brezillon@collabora.com

drm/panthor: Add the MMU/VM logical block

MMU and VM management is related and placed in the same source file.

Page table updates are delegated to the io-pgtable-arm driver that's in
the iommu subsystem.

The VM management logic is based on drm_gpuva_mgr, and is assuming the
VA space is mostly managed by the usermode driver, except for a reserved
portion of this VA-space that's used for kernel objects (like the heap
contexts/chunks).

Both asynchronous and synchronous VM operations are supported, and
internal helpers are exposed to allow other logical blocks to map their
buffers in the GPU VA space.

There's one VM_BIND queue per-VM (meaning the Vulkan driver can only
expose one sparse-binding queue), and this bind queue is managed with
a 1:1 drm_sched_entity:drm_gpu_scheduler, such that each VM gets its own
independent execution queue, avoiding VM operation serialization at the
device level (things are still serialized at the VM level).

The rest is just implementation details that are hopefully well explained
in the documentation.

v6:
- Add Maxime's and Heiko's acks
- Add Steve's R-b
- Adjust the TRANSCFG value to account for SW VA space limitation on
  32-bit systems
- Keep header inclusion alphabetically ordered

v5:
- Fix a double panthor_vm_cleanup_op_ctx() call
- Fix a race between panthor_vm_prepare_map_op_ctx() and
  panthor_vm_bo_put()
- Fix panthor_vm_pool_destroy_vm() kernel doc
- Fix paddr adjustment in panthor_vm_map_pages()
- Fix bo_offset calculation in panthor_vm_get_bo_for_va()

v4:
- Add an helper to return the VM state
- Check drmm_mutex_init() return code
- Remove the VM from the AS reclaim list when panthor_vm_active() is
  called
- Count the number of active VM users instead of considering there's
  at most one user (several scheduling groups can point to the same
  vM)
- Pre-allocate a VMA object for unmap operations (unmaps can trigger
  a sm_step_remap() call)
- Check vm->root_page_table instead of vm->pgtbl_ops to detect if
  the io-pgtable is trying to allocate the root page table
- Don't memset() the va_node in panthor_vm_alloc_va(), make it a
  caller requirement
- Fix the kernel doc in a few places
- Drop the panthor_vm::base offset constraint and modify
  panthor_vm_put() to explicitly check for a NULL value
- Fix unbalanced vm_bo refcount in panthor_gpuva_sm_step_remap()
- Drop stale comments about the shared_bos list
- Patch mmu_features::va_bits on 32-bit builds to reflect the
  io_pgtable limitation and let the UMD know about it

v3:
- Add acks for the MIT/GPL2 relicensing
- Propagate MMU faults to the scheduler
- Move pages pinning/unpinning out of the dma_signalling path
- Fix 32-bit support
- Rework the user/kernel VA range calculation
- Make the auto-VA range explicit (auto-VA range doesn't cover the full
  kernel-VA range on the MCU VM)
- Let callers of panthor_vm_alloc_va() allocate the drm_mm_node
  (embedded in panthor_kernel_bo now)
- Adjust things to match the latest drm_gpuvm changes (extobj tracking,
  resv prep and more)
- Drop the per-AS lock and use slots_lock (fixes a race on vm->as.id)
- Set as.id to -1 when reusing an address space from the LRU list
- Drop misleading comment about page faults
- Remove check for irq being assigned in panthor_mmu_unplug()

Co-developed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm
Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro
Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229162230.2634044-8-boris.brezillon@collabora.com

drm/panthor: Add the FW logical block

Contains everything that's FW related, that includes the code dealing
with the microcontroller unit (MCU) that's running the FW, and anything
related to allocating memory shared between the FW and the CPU.

A few global FW events are processed in the IRQ handler, the rest is
forwarded to the scheduler, since scheduling is the primary reason for
the FW existence, and also the main source of FW <-> kernel
interactions.

v6:
- Add Maxime's and Heiko's acks
- Keep header inclusion alphabetically ordered

v5:
- Fix typo in GLB_PERFCNT_SAMPLE definition
- Fix unbalanced panthor_vm_idle/active() calls
- Fallback to a slow reset when the fast reset fails
- Add extra information when reporting a FW boot failure

v4:
- Add a MODULE_FIRMWARE() entry for gen 10.8
- Fix a wrong return ERR_PTR() in panthor_fw_load_section_entry()
- Fix typos
- Add Steve's R-b

v3:
- Make the FW path more future-proof (Liviu)
- Use one waitqueue for all FW events
- Simplify propagation of FW events to the scheduler logic
- Drop the panthor_fw_mem abstraction and use panthor_kernel_bo instead
- Account for the panthor_vm changes
- Replace magic number with 0x7fffffff with ~0 to better signify that
  it's the maximum permitted value.
- More accurate rounding when computing the firmware timeout.
- Add a 'sub iterator' helper function. This also adds a check that a
  firmware entry doesn't overflow the firmware image.
- Drop __packed from FW structures, natural alignment is good enough.
- Other minor code improvements.

Co-developed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229162230.2634044-9-boris.brezillon@collabora.com

drm/panthor: Add the heap logical block

Tiler heap growing requires some kernel driver involvement: when the
tiler runs out of heap memory, it will raise an exception which is
either directly handled by the firmware if some free heap chunks are
available in the heap context, or passed back to the kernel otherwise.
The heap helpers will be used by the scheduler logic to allocate more
heap chunks to a heap context, when such a situation happens.

Heap context creation is explicitly requested by userspace (using
the TILER_HEAP_CREATE ioctl), and the returned context is attached to a
queue through some command stream instruction.

All the kernel does is keep the list of heap chunks allocated to a
context, so they can be freed when TILER_HEAP_DESTROY is called, or
extended when the FW requests a new chunk.

v6:
- Add Maxime's and Heiko's acks

v5:
- Fix FIXME comment
- Add Steve's R-b

v4:
- Rework locking to allow concurrent calls to panthor_heap_grow()
- Add a helper to return a heap chunk if we couldn't pass it to the
  FW because the group was scheduled out

v3:
- Add a FIXME for the heap OOM deadlock
- Use the panthor_kernel_bo abstraction for the heap context and heap
  chunks
- Drop the panthor_heap_gpu_ctx struct as it is opaque to the driver
- Ensure that the heap context is aligned to the GPU cache line size
- Minor code tidy ups

Co-developed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229162230.2634044-10-boris.brezillon@collabora.com

drm/panthor: Add the scheduler logical block

This is the piece of software interacting with the FW scheduler, and
taking care of some scheduling aspects when the FW comes short of slots
scheduling slots. Indeed, the FW only expose a few slots, and the kernel
has to give all submission contexts, a chance to execute their jobs.

The kernel-side scheduler is timeslice-based, with a round-robin queue
per priority level.

Job submission is handled with a 1:1 drm_sched_entity:drm_gpu_scheduler,
allowing us to delegate the dependency tracking to the core.

All the gory details should be documented inline.

v6:
- Add Maxime's and Heiko's acks
- Make sure the scheduler is initialized before queueing the tick work
  in the MMU fault handler
- Keep header inclusion alphabetically ordered

v5:
- Fix typos
- Call panthor_kernel_bo_destroy(group->syncobjs) unconditionally
- Don't move the group to the waiting list tail when it was already
  waiting for a different syncobj
- Fix fatal_queues flagging in the tiler OOM path
- Don't warn when more than one job timesout on a group
- Add a warning message when we fail to allocate a heap chunk
- Add Steve's R-b

v4:
- Check drmm_mutex_init() return code
- s/drm_gem_vmap_unlocked/drm_gem_vunmap_unlocked/ in
  panthor_queue_put_syncwait_obj()
- Drop unneeded WARN_ON() in cs_slot_sync_queue_state_locked()
- Use atomic_xchg() instead of atomic_fetch_and(0)
- Fix typos
- Let panthor_kernel_bo_destroy() check for IS_ERR_OR_NULL() BOs
- Defer TILER_OOM event handling to a separate workqueue to prevent
  deadlocks when the heap chunk allocation is blocked on mem-reclaim.
  This is just a temporary solution, until we add support for
  non-blocking/failable allocations
- Pass the scheduler workqueue to drm_sched instead of instantiating
  a separate one (no longer needed now that heap chunk allocation
  happens on a dedicated wq)
- Set WQ_MEM_RECLAIM on the scheduler workqueue, so we can handle
  job timeouts when the system is under mem pressure, and hopefully
  free up some memory retained by these jobs

v3:
- Rework the FW event handling logic to avoid races
- Make sure MMU faults kill the group immediately
- Use the panthor_kernel_bo abstraction for group/queue buffers
- Make in_progress an atomic_t, so we can check it without the reset lock
  held
- Don't limit the number of groups per context to the FW scheduler
  capacity. Fix the limit to 128 for now.
- Add a panthor_job_vm() helper
- Account for panthor_vm changes
- Add our job fence as DMA_RESV_USAGE_WRITE to all external objects
  (was previously DMA_RESV_USAGE_BOOKKEEP). I don't get why, given
  we're supposed to be fully-explicit, but other drivers do that, so
  there must be a good reason
- Account for drm_sched changes
- Provide a panthor_queue_put_syncwait_obj()
- Unconditionally return groups to their idle list in
  panthor_sched_suspend()
- Condition of sched_queue_{,delayed_}work fixed to be only when a reset
  isn't pending or in progress.
- Several typos in comments fixed.

Co-developed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229162230.2634044-11-boris.brezillon@collabora.com

drm/panthor: Add the driver frontend block

This is the last piece missing to expose the driver to the outside
world.

This is basically a wrapper between the ioctls and the other logical
blocks.

v6:
- Add Maxime's and Heiko's acks
- Return a page-aligned BO size to userspace
- Keep header inclusion alphabetically ordered

v5:
- Account for the drm_exec_init() prototype change
- Include platform_device.h

v4:
- Add an ioctl to let the UMD query the VM state
- Fix kernel doc
- Let panthor_device_init() call panthor_device_init()
- Fix cleanup ordering in the panthor_init() error path
- Add Steve's and Liviu's R-b

v3:
- Add acks for the MIT/GPL2 relicensing
- Fix 32-bit support
- Account for panthor_vm and panthor_sched changes
- Simplify the resv preparation/update logic
- Use a linked list rather than xarray for list of signals.
- Simplify panthor_get_uobj_array by returning the newly allocated
  array.
- Drop the "DOC" for job submission helpers and move the relevant
  comments to panthor_ioctl_group_submit().
- Add helpers sync_op_is_signal()/sync_op_is_wait().
- Simplify return type of panthor_submit_ctx_add_sync_signal() and
  panthor_submit_ctx_get_sync_signal().
- Drop WARN_ON from panthor_submit_ctx_add_job().
- Fix typos in comments.

Co-developed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm
Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro
Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229162230.2634044-12-boris.brezillon@collabora.com

drm/panthor: Allow driver compilation

Now that all blocks are available, we can add/update Kconfig/Makefile
files to allow compilation.

v6:
- Add Maxime's and Heiko's acks
- Keep source files alphabetically ordered in the Makefile

v4:
- Add Steve's R-b

v3:
- Add a dep on DRM_GPUVM
- Fix dependencies in Kconfig
- Expand help text to (hopefully) describe which GPUs are to be
  supported by this driver and which are for panfrost.

Co-developed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm
Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro
Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229162230.2634044-13-boris.brezillon@collabora.com

dt-bindings: gpu: mali-valhall-csf: Add support for Arm Mali CSF GPUs

Arm has introduced a new v10 GPU architecture that replaces the Job
Manager
interface with a new Command Stream Frontend. It adds firmware driven
command stream queues that can be used by kernel and user space to
submit
jobs to the GPU.

Add the initial schema for the device tree that is based on support for
RK3588 SoC. The minimum number of clocks is one for the IP, but on
Rockchip
platforms they will tend to expose the semi-independent clocks for
better
power management.

v6:
- Add Maxime's and Heiko's acks

v5:
- Move the opp-table node under the gpu node

v4:
- Fix formatting issue

v3:
- Cleanup commit message to remove redundant text
- Added opp-table property and re-ordered entries
- Clarified power-domains and power-domain-names requirements for
RK3588.
- Cleaned up example

Note: power-domains and power-domain-names requirements for other
platforms
are still work in progress, hence the bindings are left incomplete here.

v2:
- New commit

Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Cc: Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Conor Dooley <conor+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229162230.2634044-14-boris.brezillon@collabora.com

drm/panthor: Add an entry to MAINTAINERS

Add an entry for the Panthor driver to the MAINTAINERS file.

v6:
- Add Maxime's and Heiko's acks

v4:
- Add Steve's R-b

v3:
- Add bindings document as an 'F:' line.
- Add Steven and Liviu as co-maintainers.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229162230.2634044-15-boris.brezillon@collabora.com

drm/panthor: remove debugfs

dma-buf/dma-fence: Add deadline awareness

Add a way to hint to the fence signaler of an upcoming deadline, such as
vblank, which the fence waiter would prefer not to miss.  This is to aid
the fence signaler in making power management decisions, like boosting
frequency as the deadline approaches and awareness of missing deadlines
so that can be factored in to the frequency scaling.

v2: Drop dma_fence::deadline and related logic to filter duplicate
    deadlines, to avoid increasing dma_fence size.  The fence-context
    implementation will need similar logic to track deadlines of all
    the fences on the same timeline.  [ckoenig]
v3: Clarify locking wrt. set_deadline callback
v4: Clarify in docs comment that this is a hint
v5: Drop DMA_FENCE_FLAG_HAS_DEADLINE_BIT.
v6: More docs
v7: Fix typo, clarify past deadlines

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>

drm/gem: Take reservation lock for vmap/vunmap operations

The new common dma-buf locking convention will require buffer importers
to hold the reservation lock around mapping operations. Make DRM GEM core
to take the lock around the vmapping operations and update DRM drivers to
use the locked functions for the case where DRM core now holds the lock.
This patch prepares DRM core and drivers to the common dynamic dma-buf
locking convention.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221017172229.42269-4-dmitry.osipenko@collabora.com

dma-buf: Add unlocked variant of vmapping functions

Add unlocked variant of dma_buf_vmap/vunmap() that will be utilized
by drivers that don't take the reservation lock explicitly.

Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221017172229.42269-2-dmitry.osipenko@collabora.com

dma-buf: Add unlocked variant of attachment-mapping functions

Add unlocked variant of dma_buf_map/unmap_attachment() that will
be used by drivers that don't take the reservation lock explicitly.

Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221017172229.42269-3-dmitry.osipenko@collabora.com

dma-buf: Move dma_buf_vmap() to dynamic locking specification

Move dma_buf_vmap/vunmap() functions to the dynamic locking
specification by asserting that the reservation lock is held.

Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221017172229.42269-16-dmitry.osipenko@collabora.com

drm/gpuvm: Helper to get range of unmap from a remap op.

Determining the start and range of the unmap stage of a remap op is a
common piece of code currently implemented by multiple drivers. Add a
helper for this.

Changes since v7:
- Renamed helper to drm_gpuva_op_remap_to_unmap_range()
- Improved documentation

Changes since v6:
- Remove use of __always_inline

Signed-off-by: Donald Robson <donald.robson@imgtec.com>
Signed-off-by: Sarah Walker <sarah.walker@imgtec.com>
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Link: https://lore.kernel.org/r/8a0a5b5eeec459d3c60fcdaa5a638ad14a18a59e.1700668843.git.donald.robson@imgtec.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>

drm: Enable PRIME import/export for all drivers

Call drm_gem_prime_handle_to_fd() and drm_gem_prime_fd_to_handle() by
default if no PRIME import/export helpers have been set. Both functions
are the default for almost all drivers.

DRM drivers implement struct drm_driver.gem_prime_import_sg_table
to import dma-buf objects from other drivers. Having the function
drm_gem_prime_fd_to_handle() functions set by default allows each
driver to import dma-buf objects to itself, even without support for
other drivers.

For drm_gem_prime_handle_to_fd() it is similar: using it by default
allows each driver to export to itself, even without support for other
drivers.

This functionality enables userspace to share per-driver buffers
across process boundaries via PRIME (e.g., wlroots requires this
functionality). The patch generalizes a pattern that has previously
been implemented by GEM VRAM helpers [1] to work with any driver.
For example, gma500 can now run the wlroots-based sway compositor.

v2:
	* clean up docs and TODO comments (Simon, Zack)
	* clean up style in drm_getcap()

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/dri-devel/20230302143502.500661-1-contact@emersion.fr/ # 1
Reviewed-by: Simon Ser <contact@emersion.fr>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230620080252.16368-2-tzimmermann@suse.de

drm: Clear fd/handle callbacks in struct drm_driver

Clear all assignments of struct drm_driver's fd/handle callbacks to
drm_gem_prime_fd_to_handle() and drm_gem_prime_handle_to_fd(). These
functions are called by default. Add a TODO item to convert vmwgfx
to the defaults as well.

v2:
	* remove TODO item (Zack)
	* also update amdgpu's amdgpu_partition_driver

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Simon Ser <contact@emersion.fr>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Jeffrey Hugo <quic_jhugo@quicinc.com> # qaic
Link: https://patchwork.freedesktop.org/patch/msgid/20230620080252.16368-3-tzimmermann@suse.de

drm/panthor: Fix IO-page mmap() for 32-bit userspace on 64-bit kernel

When mapping an IO region, the pseudo-file offset is dependent on the
userspace architecture. panthor_device_mmio_offset() abstracts that
away for us by turning a userspace MMIO offset into its kernel
equivalent, but we were not updating vm_area_struct::vm_pgoff
accordingly, leading us to attach the MMIO region to the wrong file
offset.

This has implications when we start mixing 64 bit and 32 bit apps, but
that's only really a problem when we start having more that 2^43 bytes of
memory allocated, which is very unlikely to happen.

What's more problematic is the fact this turns our
unmap_mapping_range(DRM_PANTHOR_USER_MMIO_OFFSET) calls, which are
supposed to kill the MMIO mapping when entering suspend, into NOPs.
Which means we either keep the dummy flush_id mapping active at all
times, or we risk a BUS_FAULT if the MMIO region was mapped, and the
GPU is suspended after that.

Solve that by patching vm_pgoff early in panthor_mmap(). With
this in place, we no longer need the panthor_device_mmio_offset()
helper.

v3:
- No changes

v2:
- Kill panthor_device_mmio_offset()

Fixes: 5fe909cae118 ("drm/panthor: Add the device logical block")
Reported-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Reported-by: Lukas F. Hartmann <lukas@mntmn.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10835
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>

drm/panthor: Fix ordering in _irq_suspend()

Make sure we set suspended=true last to avoid generating an irq storm
in the unlikely case where an IRQ happens between the suspended=true
assignment and the _INT_MASK update.

We also move the mask=0 assignment before writing to the _INT_MASK
register to prevent the thread handler from unmasking the interrupt
behind our back. This means we might lose events if there were some
pending when we get to suspend the IRQ, but that's fine.
The synchronize_irq() we have in the _irq_suspend() path was not
there to make sure all IRQs are processed, just to make sure we don't
have registers accesses coming from the irq handlers after
_irq_suspend() has been called. If there's a need to have all pending
IRQs processed, it should happen before _irq_suspend() is called.

v3:
- Add Steve's R-b

v2:
- New patch

Fixes: 5fe909cae118 ("drm/panthor: Add the device logical block")
Reported-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>

drm/panthor: Drop the dev_enter/exit() sections in _irq_suspend/resume()

There's no reason for _irq_suspend/resume() to be called after the
device has been unplugged, and keeping this dev_enter/exit()
section in _irq_suspend() is turns _irq_suspend() into a NOP
when called from the _unplug() functions, which we don't want.

v3:
- New patch

Fixes: 5fe909cae118 ("drm/panthor: Add the device logical block")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
2025-09-08 21:42:52 +02:00

2906 lines
90 KiB
C

/*
* Copyright 2000 VA Linux Systems, Inc., Sunnyvale, California.
* All Rights Reserved.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* VA LINUX SYSTEMS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
* OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#include <drm/amdgpu_drm.h>
#include <drm/drm_drv.h>
#include <drm/drm_gem.h>
#include <drm/drm_vblank.h>
#include <drm/drm_managed.h>
#include "amdgpu_drv.h"
#include <drm/drm_pciids.h>
#include <linux/module.h>
#include <linux/pm_runtime.h>
#include <linux/vga_switcheroo.h>
#include <drm/drm_probe_helper.h>
#include <linux/mmu_notifier.h>
#include <linux/suspend.h>
#include <linux/cc_platform.h>
#include <linux/fb.h>
#include <linux/dynamic_debug.h>
#include "amdgpu.h"
#include "amdgpu_irq.h"
#include "amdgpu_dma_buf.h"
#include "amdgpu_sched.h"
#include "amdgpu_fdinfo.h"
#include "amdgpu_amdkfd.h"
#include "amdgpu_ras.h"
#include "amdgpu_xgmi.h"
#include "amdgpu_reset.h"
/*
* KMS wrapper.
* - 3.0.0 - initial driver
* - 3.1.0 - allow reading more status registers (GRBM, SRBM, SDMA, CP)
* - 3.2.0 - GFX8: Uses EOP_TC_WB_ACTION_EN, so UMDs don't have to do the same
* at the end of IBs.
* - 3.3.0 - Add VM support for UVD on supported hardware.
* - 3.4.0 - Add AMDGPU_INFO_NUM_EVICTIONS.
* - 3.5.0 - Add support for new UVD_NO_OP register.
* - 3.6.0 - kmd involves use CONTEXT_CONTROL in ring buffer.
* - 3.7.0 - Add support for VCE clock list packet
* - 3.8.0 - Add support raster config init in the kernel
* - 3.9.0 - Add support for memory query info about VRAM and GTT.
* - 3.10.0 - Add support for new fences ioctl, new gem ioctl flags
* - 3.11.0 - Add support for sensor query info (clocks, temp, etc).
* - 3.12.0 - Add query for double offchip LDS buffers
* - 3.13.0 - Add PRT support
* - 3.14.0 - Fix race in amdgpu_ctx_get_fence() and note new functionality
* - 3.15.0 - Export more gpu info for gfx9
* - 3.16.0 - Add reserved vmid support
* - 3.17.0 - Add AMDGPU_NUM_VRAM_CPU_PAGE_FAULTS.
* - 3.18.0 - Export gpu always on cu bitmap
* - 3.19.0 - Add support for UVD MJPEG decode
* - 3.20.0 - Add support for local BOs
* - 3.21.0 - Add DRM_AMDGPU_FENCE_TO_HANDLE ioctl
* - 3.22.0 - Add DRM_AMDGPU_SCHED ioctl
* - 3.23.0 - Add query for VRAM lost counter
* - 3.24.0 - Add high priority compute support for gfx9
* - 3.25.0 - Add support for sensor query info (stable pstate sclk/mclk).
* - 3.26.0 - GFX9: Process AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE.
* - 3.27.0 - Add new chunk to AMDGPU_CS to enable BO_LIST creation.
* - 3.28.0 - Add AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES
* - 3.29.0 - Add AMDGPU_IB_FLAG_RESET_GDS_MAX_WAVE_ID
* - 3.30.0 - Add AMDGPU_SCHED_OP_CONTEXT_PRIORITY_OVERRIDE.
* - 3.31.0 - Add support for per-flip tiling attribute changes with DC
* - 3.32.0 - Add syncobj timeline support to AMDGPU_CS.
* - 3.33.0 - Fixes for GDS ENOMEM failures in AMDGPU_CS.
* - 3.34.0 - Non-DC can flip correctly between buffers with different pitches
* - 3.35.0 - Add drm_amdgpu_info_device::tcc_disabled_mask
* - 3.36.0 - Allow reading more status registers on si/cik
* - 3.37.0 - L2 is invalidated before SDMA IBs, needed for correctness
* - 3.38.0 - Add AMDGPU_IB_FLAG_EMIT_MEM_SYNC
* - 3.39.0 - DMABUF implicit sync does a full pipeline sync
* - 3.40.0 - Add AMDGPU_IDS_FLAGS_TMZ
* - 3.41.0 - Add video codec query
* - 3.42.0 - Add 16bpc fixed point display support
* - 3.43.0 - Add device hot plug/unplug support
* - 3.44.0 - DCN3 supports DCC independent block settings: !64B && 128B, 64B && 128B
* - 3.45.0 - Add context ioctl stable pstate interface
* - 3.46.0 - To enable hot plug amdgpu tests in libdrm
* - 3.47.0 - Add AMDGPU_GEM_CREATE_DISCARDABLE and AMDGPU_VM_NOALLOC flags
* - 3.48.0 - Add IP discovery version info to HW INFO
* 3.49.0 - Add gang submit into CS IOCTL
*/
#define KMS_DRIVER_MAJOR 3
#define KMS_DRIVER_MINOR 49
#define KMS_DRIVER_PATCHLEVEL 0
int amdgpu_vram_limit;
int amdgpu_vis_vram_limit;
int amdgpu_gart_size = -1; /* auto */
int amdgpu_gtt_size = -1; /* auto */
int amdgpu_moverate = -1; /* auto */
int amdgpu_audio = -1;
int amdgpu_disp_priority;
int amdgpu_hw_i2c;
int amdgpu_pcie_gen2 = -1;
int amdgpu_msi = -1;
char amdgpu_lockup_timeout[AMDGPU_MAX_TIMEOUT_PARAM_LENGTH];
int amdgpu_dpm = -1;
int amdgpu_fw_load_type = -1;
int amdgpu_aspm = -1;
int amdgpu_runtime_pm = -1;
uint amdgpu_ip_block_mask = 0xffffffff;
int amdgpu_bapm = -1;
int amdgpu_deep_color;
int amdgpu_vm_size = -1;
int amdgpu_vm_fragment_size = -1;
int amdgpu_vm_block_size = -1;
int amdgpu_vm_fault_stop;
int amdgpu_vm_debug;
int amdgpu_vm_update_mode = -1;
int amdgpu_exp_hw_support;
int amdgpu_dc = -1;
int amdgpu_sched_jobs = 32;
int amdgpu_sched_hw_submission = 2;
uint amdgpu_pcie_gen_cap;
uint amdgpu_pcie_lane_cap;
u64 amdgpu_cg_mask = 0xffffffffffffffff;
uint amdgpu_pg_mask = 0xffffffff;
uint amdgpu_sdma_phase_quantum = 32;
char *amdgpu_disable_cu = NULL;
char *amdgpu_virtual_display = NULL;
/*
* OverDrive(bit 14) disabled by default
* GFX DCS(bit 19) disabled by default
*/
uint amdgpu_pp_feature_mask = 0xfff7bfff;
uint amdgpu_force_long_training;
int amdgpu_job_hang_limit;
int amdgpu_lbpw = -1;
int amdgpu_compute_multipipe = -1;
int amdgpu_gpu_recovery = -1; /* auto */
int amdgpu_emu_mode;
uint amdgpu_smu_memory_pool_size;
int amdgpu_smu_pptable_id = -1;
/*
* FBC (bit 0) disabled by default
* MULTI_MON_PP_MCLK_SWITCH (bit 1) enabled by default
* - With this, for multiple monitors in sync(e.g. with the same model),
* mclk switching will be allowed. And the mclk will be not foced to the
* highest. That helps saving some idle power.
* DISABLE_FRACTIONAL_PWM (bit 2) disabled by default
* PSR (bit 3) disabled by default
* EDP NO POWER SEQUENCING (bit 4) disabled by default
*/
uint amdgpu_dc_feature_mask = 2;
uint amdgpu_dc_debug_mask;
uint amdgpu_dc_visual_confirm;
int amdgpu_async_gfx_ring = 1;
int amdgpu_mcbp;
int amdgpu_discovery = -1;
int amdgpu_mes;
int amdgpu_mes_kiq;
int amdgpu_noretry = -1;
int amdgpu_force_asic_type = -1;
int amdgpu_tmz = -1; /* auto */
uint amdgpu_freesync_vid_mode;
int amdgpu_reset_method = -1; /* auto */
int amdgpu_num_kcq = -1;
int amdgpu_smartshift_bias;
int amdgpu_use_xgmi_p2p = 1;
int amdgpu_vcnfw_log;
int amdgpu_sg_display = -1; /* auto */
static void amdgpu_drv_delayed_reset_work_handler(struct work_struct *work);
DECLARE_DYNDBG_CLASSMAP(drm_debug_classes, DD_CLASS_TYPE_DISJOINT_BITS, 0,
"DRM_UT_CORE",
"DRM_UT_DRIVER",
"DRM_UT_KMS",
"DRM_UT_PRIME",
"DRM_UT_ATOMIC",
"DRM_UT_VBL",
"DRM_UT_STATE",
"DRM_UT_LEASE",
"DRM_UT_DP",
"DRM_UT_DRMRES");
struct amdgpu_mgpu_info mgpu_info = {
.mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
.delayed_reset_work = __DELAYED_WORK_INITIALIZER(
mgpu_info.delayed_reset_work,
amdgpu_drv_delayed_reset_work_handler, 0),
};
int amdgpu_ras_enable = -1;
uint amdgpu_ras_mask = 0xffffffff;
int amdgpu_bad_page_threshold = -1;
struct amdgpu_watchdog_timer amdgpu_watchdog_timer = {
.timeout_fatal_disable = false,
.period = 0x0, /* default to 0x0 (timeout disable) */
};
/**
* DOC: vramlimit (int)
* Restrict the total amount of VRAM in MiB for testing. The default is 0 (Use full VRAM).
*/
MODULE_PARM_DESC(vramlimit, "Restrict VRAM for testing, in megabytes");
module_param_named(vramlimit, amdgpu_vram_limit, int, 0600);
/**
* DOC: vis_vramlimit (int)
* Restrict the amount of CPU visible VRAM in MiB for testing. The default is 0 (Use full CPU visible VRAM).
*/
MODULE_PARM_DESC(vis_vramlimit, "Restrict visible VRAM for testing, in megabytes");
module_param_named(vis_vramlimit, amdgpu_vis_vram_limit, int, 0444);
/**
* DOC: gartsize (uint)
* Restrict the size of GART in Mib (32, 64, etc.) for testing. The default is -1 (The size depends on asic).
*/
MODULE_PARM_DESC(gartsize, "Size of GART to setup in megabytes (32, 64, etc., -1=auto)");
module_param_named(gartsize, amdgpu_gart_size, uint, 0600);
/**
* DOC: gttsize (int)
* Restrict the size of GTT domain in MiB for testing. The default is -1 (It's VRAM size if 3GB < VRAM < 3/4 RAM,
* otherwise 3/4 RAM size).
*/
MODULE_PARM_DESC(gttsize, "Size of the GTT domain in megabytes (-1 = auto)");
module_param_named(gttsize, amdgpu_gtt_size, int, 0600);
/**
* DOC: moverate (int)
* Set maximum buffer migration rate in MB/s. The default is -1 (8 MB/s).
*/
MODULE_PARM_DESC(moverate, "Maximum buffer migration rate in MB/s. (32, 64, etc., -1=auto, 0=1=disabled)");
module_param_named(moverate, amdgpu_moverate, int, 0600);
/**
* DOC: audio (int)
* Set HDMI/DPAudio. Only affects non-DC display handling. The default is -1 (Enabled), set 0 to disabled it.
*/
MODULE_PARM_DESC(audio, "Audio enable (-1 = auto, 0 = disable, 1 = enable)");
module_param_named(audio, amdgpu_audio, int, 0444);
/**
* DOC: disp_priority (int)
* Set display Priority (1 = normal, 2 = high). Only affects non-DC display handling. The default is 0 (auto).
*/
MODULE_PARM_DESC(disp_priority, "Display Priority (0 = auto, 1 = normal, 2 = high)");
module_param_named(disp_priority, amdgpu_disp_priority, int, 0444);
/**
* DOC: hw_i2c (int)
* To enable hw i2c engine. Only affects non-DC display handling. The default is 0 (Disabled).
*/
MODULE_PARM_DESC(hw_i2c, "hw i2c engine enable (0 = disable)");
module_param_named(hw_i2c, amdgpu_hw_i2c, int, 0444);
/**
* DOC: pcie_gen2 (int)
* To disable PCIE Gen2/3 mode (0 = disable, 1 = enable). The default is -1 (auto, enabled).
*/
MODULE_PARM_DESC(pcie_gen2, "PCIE Gen2 mode (-1 = auto, 0 = disable, 1 = enable)");
module_param_named(pcie_gen2, amdgpu_pcie_gen2, int, 0444);
/**
* DOC: msi (int)
* To disable Message Signaled Interrupts (MSI) functionality (1 = enable, 0 = disable). The default is -1 (auto, enabled).
*/
MODULE_PARM_DESC(msi, "MSI support (1 = enable, 0 = disable, -1 = auto)");
module_param_named(msi, amdgpu_msi, int, 0444);
/**
* DOC: lockup_timeout (string)
* Set GPU scheduler timeout value in ms.
*
* The format can be [Non-Compute] or [GFX,Compute,SDMA,Video]. That is there can be one or
* multiple values specified. 0 and negative values are invalidated. They will be adjusted
* to the default timeout.
*
* - With one value specified, the setting will apply to all non-compute jobs.
* - With multiple values specified, the first one will be for GFX.
* The second one is for Compute. The third and fourth ones are
* for SDMA and Video.
*
* By default(with no lockup_timeout settings), the timeout for all non-compute(GFX, SDMA and Video)
* jobs is 10000. The timeout for compute is 60000.
*/
MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms (default: for bare metal 10000 for non-compute jobs and 60000 for compute jobs; "
"for passthrough or sriov, 10000 for all jobs."
" 0: keep default value. negative: infinity timeout), "
"format: for bare metal [Non-Compute] or [GFX,Compute,SDMA,Video]; "
"for passthrough or sriov [all jobs] or [GFX,Compute,SDMA,Video].");
module_param_string(lockup_timeout, amdgpu_lockup_timeout, sizeof(amdgpu_lockup_timeout), 0444);
/**
* DOC: dpm (int)
* Override for dynamic power management setting
* (0 = disable, 1 = enable)
* The default is -1 (auto).
*/
MODULE_PARM_DESC(dpm, "DPM support (1 = enable, 0 = disable, -1 = auto)");
module_param_named(dpm, amdgpu_dpm, int, 0444);
/**
* DOC: fw_load_type (int)
* Set different firmware loading type for debugging, if supported.
* Set to 0 to force direct loading if supported by the ASIC. Set
* to -1 to select the default loading mode for the ASIC, as defined
* by the driver. The default is -1 (auto).
*/
MODULE_PARM_DESC(fw_load_type, "firmware loading type (3 = rlc backdoor autoload if supported, 2 = smu load if supported, 1 = psp load, 0 = force direct if supported, -1 = auto)");
module_param_named(fw_load_type, amdgpu_fw_load_type, int, 0444);
/**
* DOC: aspm (int)
* To disable ASPM (1 = enable, 0 = disable). The default is -1 (auto, enabled).
*/
MODULE_PARM_DESC(aspm, "ASPM support (1 = enable, 0 = disable, -1 = auto)");
module_param_named(aspm, amdgpu_aspm, int, 0444);
/**
* DOC: runpm (int)
* Override for runtime power management control for dGPUs. The amdgpu driver can dynamically power down
* the dGPUs when they are idle if supported. The default is -1 (auto enable).
* Setting the value to 0 disables this functionality.
*/
MODULE_PARM_DESC(runpm, "PX runtime pm (2 = force enable with BAMACO, 1 = force enable with BACO, 0 = disable, -1 = auto)");
module_param_named(runpm, amdgpu_runtime_pm, int, 0444);
/**
* DOC: ip_block_mask (uint)
* Override what IP blocks are enabled on the GPU. Each GPU is a collection of IP blocks (gfx, display, video, etc.).
* Use this parameter to disable specific blocks. Note that the IP blocks do not have a fixed index. Some asics may not have
* some IPs or may include multiple instances of an IP so the ordering various from asic to asic. See the driver output in
* the kernel log for the list of IPs on the asic. The default is 0xffffffff (enable all blocks on a device).
*/
MODULE_PARM_DESC(ip_block_mask, "IP Block Mask (all blocks enabled (default))");
module_param_named(ip_block_mask, amdgpu_ip_block_mask, uint, 0444);
/**
* DOC: bapm (int)
* Bidirectional Application Power Management (BAPM) used to dynamically share TDP between CPU and GPU. Set value 0 to disable it.
* The default -1 (auto, enabled)
*/
MODULE_PARM_DESC(bapm, "BAPM support (1 = enable, 0 = disable, -1 = auto)");
module_param_named(bapm, amdgpu_bapm, int, 0444);
/**
* DOC: deep_color (int)
* Set 1 to enable Deep Color support. Only affects non-DC display handling. The default is 0 (disabled).
*/
MODULE_PARM_DESC(deep_color, "Deep Color support (1 = enable, 0 = disable (default))");
module_param_named(deep_color, amdgpu_deep_color, int, 0444);
/**
* DOC: vm_size (int)
* Override the size of the GPU's per client virtual address space in GiB. The default is -1 (automatic for each asic).
*/
MODULE_PARM_DESC(vm_size, "VM address space size in gigabytes (default 64GB)");
module_param_named(vm_size, amdgpu_vm_size, int, 0444);
/**
* DOC: vm_fragment_size (int)
* Override VM fragment size in bits (4, 5, etc. 4 = 64K, 9 = 2M). The default is -1 (automatic for each asic).
*/
MODULE_PARM_DESC(vm_fragment_size, "VM fragment size in bits (4, 5, etc. 4 = 64K (default), Max 9 = 2M)");
module_param_named(vm_fragment_size, amdgpu_vm_fragment_size, int, 0444);
/**
* DOC: vm_block_size (int)
* Override VM page table size in bits (default depending on vm_size and hw setup). The default is -1 (automatic for each asic).
*/
MODULE_PARM_DESC(vm_block_size, "VM page table size in bits (default depending on vm_size)");
module_param_named(vm_block_size, amdgpu_vm_block_size, int, 0444);
/**
* DOC: vm_fault_stop (int)
* Stop on VM fault for debugging (0 = never, 1 = print first, 2 = always). The default is 0 (No stop).
*/
MODULE_PARM_DESC(vm_fault_stop, "Stop on VM fault (0 = never (default), 1 = print first, 2 = always)");
module_param_named(vm_fault_stop, amdgpu_vm_fault_stop, int, 0444);
/**
* DOC: vm_debug (int)
* Debug VM handling (0 = disabled, 1 = enabled). The default is 0 (Disabled).
*/
MODULE_PARM_DESC(vm_debug, "Debug VM handling (0 = disabled (default), 1 = enabled)");
module_param_named(vm_debug, amdgpu_vm_debug, int, 0644);
/**
* DOC: vm_update_mode (int)
* Override VM update mode. VM updated by using CPU (0 = never, 1 = Graphics only, 2 = Compute only, 3 = Both). The default
* is -1 (Only in large BAR(LB) systems Compute VM tables will be updated by CPU, otherwise 0, never).
*/
MODULE_PARM_DESC(vm_update_mode, "VM update using CPU (0 = never (default except for large BAR(LB)), 1 = Graphics only, 2 = Compute only (default for LB), 3 = Both");
module_param_named(vm_update_mode, amdgpu_vm_update_mode, int, 0444);
/**
* DOC: exp_hw_support (int)
* Enable experimental hw support (1 = enable). The default is 0 (disabled).
*/
MODULE_PARM_DESC(exp_hw_support, "experimental hw support (1 = enable, 0 = disable (default))");
module_param_named(exp_hw_support, amdgpu_exp_hw_support, int, 0444);
/**
* DOC: dc (int)
* Disable/Enable Display Core driver for debugging (1 = enable, 0 = disable). The default is -1 (automatic for each asic).
*/
MODULE_PARM_DESC(dc, "Display Core driver (1 = enable, 0 = disable, -1 = auto (default))");
module_param_named(dc, amdgpu_dc, int, 0444);
/**
* DOC: sched_jobs (int)
* Override the max number of jobs supported in the sw queue. The default is 32.
*/
MODULE_PARM_DESC(sched_jobs, "the max number of jobs supported in the sw queue (default 32)");
module_param_named(sched_jobs, amdgpu_sched_jobs, int, 0444);
/**
* DOC: sched_hw_submission (int)
* Override the max number of HW submissions. The default is 2.
*/
MODULE_PARM_DESC(sched_hw_submission, "the max number of HW submissions (default 2)");
module_param_named(sched_hw_submission, amdgpu_sched_hw_submission, int, 0444);
/**
* DOC: ppfeaturemask (hexint)
* Override power features enabled. See enum PP_FEATURE_MASK in drivers/gpu/drm/amd/include/amd_shared.h.
* The default is the current set of stable power features.
*/
MODULE_PARM_DESC(ppfeaturemask, "all power features enabled (default))");
module_param_named(ppfeaturemask, amdgpu_pp_feature_mask, hexint, 0444);
/**
* DOC: forcelongtraining (uint)
* Force long memory training in resume.
* The default is zero, indicates short training in resume.
*/
MODULE_PARM_DESC(forcelongtraining, "force memory long training");
module_param_named(forcelongtraining, amdgpu_force_long_training, uint, 0444);
/**
* DOC: pcie_gen_cap (uint)
* Override PCIE gen speed capabilities. See the CAIL flags in drivers/gpu/drm/amd/include/amd_pcie.h.
* The default is 0 (automatic for each asic).
*/
MODULE_PARM_DESC(pcie_gen_cap, "PCIE Gen Caps (0: autodetect (default))");
module_param_named(pcie_gen_cap, amdgpu_pcie_gen_cap, uint, 0444);
/**
* DOC: pcie_lane_cap (uint)
* Override PCIE lanes capabilities. See the CAIL flags in drivers/gpu/drm/amd/include/amd_pcie.h.
* The default is 0 (automatic for each asic).
*/
MODULE_PARM_DESC(pcie_lane_cap, "PCIE Lane Caps (0: autodetect (default))");
module_param_named(pcie_lane_cap, amdgpu_pcie_lane_cap, uint, 0444);
/**
* DOC: cg_mask (ullong)
* Override Clockgating features enabled on GPU (0 = disable clock gating). See the AMD_CG_SUPPORT flags in
* drivers/gpu/drm/amd/include/amd_shared.h. The default is 0xffffffffffffffff (all enabled).
*/
MODULE_PARM_DESC(cg_mask, "Clockgating flags mask (0 = disable clock gating)");
module_param_named(cg_mask, amdgpu_cg_mask, ullong, 0444);
/**
* DOC: pg_mask (uint)
* Override Powergating features enabled on GPU (0 = disable power gating). See the AMD_PG_SUPPORT flags in
* drivers/gpu/drm/amd/include/amd_shared.h. The default is 0xffffffff (all enabled).
*/
MODULE_PARM_DESC(pg_mask, "Powergating flags mask (0 = disable power gating)");
module_param_named(pg_mask, amdgpu_pg_mask, uint, 0444);
/**
* DOC: sdma_phase_quantum (uint)
* Override SDMA context switch phase quantum (x 1K GPU clock cycles, 0 = no change). The default is 32.
*/
MODULE_PARM_DESC(sdma_phase_quantum, "SDMA context switch phase quantum (x 1K GPU clock cycles, 0 = no change (default 32))");
module_param_named(sdma_phase_quantum, amdgpu_sdma_phase_quantum, uint, 0444);
/**
* DOC: disable_cu (charp)
* Set to disable CUs (It's set like se.sh.cu,...). The default is NULL.
*/
MODULE_PARM_DESC(disable_cu, "Disable CUs (se.sh.cu,...)");
module_param_named(disable_cu, amdgpu_disable_cu, charp, 0444);
/**
* DOC: virtual_display (charp)
* Set to enable virtual display feature. This feature provides a virtual display hardware on headless boards
* or in virtualized environments. It will be set like xxxx:xx:xx.x,x;xxxx:xx:xx.x,x. It's the pci address of
* the device, plus the number of crtcs to expose. E.g., 0000:26:00.0,4 would enable 4 virtual crtcs on the pci
* device at 26:00.0. The default is NULL.
*/
MODULE_PARM_DESC(virtual_display,
"Enable virtual display feature (the virtual_display will be set like xxxx:xx:xx.x,x;xxxx:xx:xx.x,x)");
module_param_named(virtual_display, amdgpu_virtual_display, charp, 0444);
/**
* DOC: job_hang_limit (int)
* Set how much time allow a job hang and not drop it. The default is 0.
*/
MODULE_PARM_DESC(job_hang_limit, "how much time allow a job hang and not drop it (default 0)");
module_param_named(job_hang_limit, amdgpu_job_hang_limit, int ,0444);
/**
* DOC: lbpw (int)
* Override Load Balancing Per Watt (LBPW) support (1 = enable, 0 = disable). The default is -1 (auto, enabled).
*/
MODULE_PARM_DESC(lbpw, "Load Balancing Per Watt (LBPW) support (1 = enable, 0 = disable, -1 = auto)");
module_param_named(lbpw, amdgpu_lbpw, int, 0444);
MODULE_PARM_DESC(compute_multipipe, "Force compute queues to be spread across pipes (1 = enable, 0 = disable, -1 = auto)");
module_param_named(compute_multipipe, amdgpu_compute_multipipe, int, 0444);
/**
* DOC: gpu_recovery (int)
* Set to enable GPU recovery mechanism (1 = enable, 0 = disable). The default is -1 (auto, disabled except SRIOV).
*/
MODULE_PARM_DESC(gpu_recovery, "Enable GPU recovery mechanism, (1 = enable, 0 = disable, -1 = auto)");
module_param_named(gpu_recovery, amdgpu_gpu_recovery, int, 0444);
/**
* DOC: emu_mode (int)
* Set value 1 to enable emulation mode. This is only needed when running on an emulator. The default is 0 (disabled).
*/
MODULE_PARM_DESC(emu_mode, "Emulation mode, (1 = enable, 0 = disable)");
module_param_named(emu_mode, amdgpu_emu_mode, int, 0444);
/**
* DOC: ras_enable (int)
* Enable RAS features on the GPU (0 = disable, 1 = enable, -1 = auto (default))
*/
MODULE_PARM_DESC(ras_enable, "Enable RAS features on the GPU (0 = disable, 1 = enable, -1 = auto (default))");
module_param_named(ras_enable, amdgpu_ras_enable, int, 0444);
/**
* DOC: ras_mask (uint)
* Mask of RAS features to enable (default 0xffffffff), only valid when ras_enable == 1
* See the flags in drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
*/
MODULE_PARM_DESC(ras_mask, "Mask of RAS features to enable (default 0xffffffff), only valid when ras_enable == 1");
module_param_named(ras_mask, amdgpu_ras_mask, uint, 0444);
/**
* DOC: timeout_fatal_disable (bool)
* Disable Watchdog timeout fatal error event
*/
MODULE_PARM_DESC(timeout_fatal_disable, "disable watchdog timeout fatal error (false = default)");
module_param_named(timeout_fatal_disable, amdgpu_watchdog_timer.timeout_fatal_disable, bool, 0644);
/**
* DOC: timeout_period (uint)
* Modify the watchdog timeout max_cycles as (1 << period)
*/
MODULE_PARM_DESC(timeout_period, "watchdog timeout period (0 = timeout disabled, 1 ~ 0x23 = timeout maxcycles = (1 << period)");
module_param_named(timeout_period, amdgpu_watchdog_timer.period, uint, 0644);
/**
* DOC: si_support (int)
* Set SI support driver. This parameter works after set config CONFIG_DRM_AMDGPU_SI. For SI asic, when radeon driver is enabled,
* set value 0 to use radeon driver, while set value 1 to use amdgpu driver. The default is using radeon driver when it available,
* otherwise using amdgpu driver.
*/
#ifdef CONFIG_DRM_AMDGPU_SI
#if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE)
int amdgpu_si_support = 0;
MODULE_PARM_DESC(si_support, "SI support (1 = enabled, 0 = disabled (default))");
#else
int amdgpu_si_support = 1;
MODULE_PARM_DESC(si_support, "SI support (1 = enabled (default), 0 = disabled)");
#endif
module_param_named(si_support, amdgpu_si_support, int, 0444);
#endif
/**
* DOC: cik_support (int)
* Set CIK support driver. This parameter works after set config CONFIG_DRM_AMDGPU_CIK. For CIK asic, when radeon driver is enabled,
* set value 0 to use radeon driver, while set value 1 to use amdgpu driver. The default is using radeon driver when it available,
* otherwise using amdgpu driver.
*/
#ifdef CONFIG_DRM_AMDGPU_CIK
#if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE)
int amdgpu_cik_support = 0;
MODULE_PARM_DESC(cik_support, "CIK support (1 = enabled, 0 = disabled (default))");
#else
int amdgpu_cik_support = 1;
MODULE_PARM_DESC(cik_support, "CIK support (1 = enabled (default), 0 = disabled)");
#endif
module_param_named(cik_support, amdgpu_cik_support, int, 0444);
#endif
/**
* DOC: smu_memory_pool_size (uint)
* It is used to reserve gtt for smu debug usage, setting value 0 to disable it. The actual size is value * 256MiB.
* E.g. 0x1 = 256Mbyte, 0x2 = 512Mbyte, 0x4 = 1 Gbyte, 0x8 = 2GByte. The default is 0 (disabled).
*/
MODULE_PARM_DESC(smu_memory_pool_size,
"reserve gtt for smu debug usage, 0 = disable,"
"0x1 = 256Mbyte, 0x2 = 512Mbyte, 0x4 = 1 Gbyte, 0x8 = 2GByte");
module_param_named(smu_memory_pool_size, amdgpu_smu_memory_pool_size, uint, 0444);
/**
* DOC: async_gfx_ring (int)
* It is used to enable gfx rings that could be configured with different prioritites or equal priorities
*/
MODULE_PARM_DESC(async_gfx_ring,
"Asynchronous GFX rings that could be configured with either different priorities (HP3D ring and LP3D ring), or equal priorities (0 = disabled, 1 = enabled (default))");
module_param_named(async_gfx_ring, amdgpu_async_gfx_ring, int, 0444);
/**
* DOC: mcbp (int)
* It is used to enable mid command buffer preemption. (0 = disabled (default), 1 = enabled)
*/
MODULE_PARM_DESC(mcbp,
"Enable Mid-command buffer preemption (0 = disabled (default), 1 = enabled)");
module_param_named(mcbp, amdgpu_mcbp, int, 0444);
/**
* DOC: discovery (int)
* Allow driver to discover hardware IP information from IP Discovery table at the top of VRAM.
* (-1 = auto (default), 0 = disabled, 1 = enabled, 2 = use ip_discovery table from file)
*/
MODULE_PARM_DESC(discovery,
"Allow driver to discover hardware IPs from IP Discovery table at the top of VRAM");
module_param_named(discovery, amdgpu_discovery, int, 0444);
/**
* DOC: mes (int)
* Enable Micro Engine Scheduler. This is a new hw scheduling engine for gfx, sdma, and compute.
* (0 = disabled (default), 1 = enabled)
*/
MODULE_PARM_DESC(mes,
"Enable Micro Engine Scheduler (0 = disabled (default), 1 = enabled)");
module_param_named(mes, amdgpu_mes, int, 0444);
/**
* DOC: mes_kiq (int)
* Enable Micro Engine Scheduler KIQ. This is a new engine pipe for kiq.
* (0 = disabled (default), 1 = enabled)
*/
MODULE_PARM_DESC(mes_kiq,
"Enable Micro Engine Scheduler KIQ (0 = disabled (default), 1 = enabled)");
module_param_named(mes_kiq, amdgpu_mes_kiq, int, 0444);
/**
* DOC: noretry (int)
* Disable XNACK retry in the SQ by default on GFXv9 hardware. On ASICs that
* do not support per-process XNACK this also disables retry page faults.
* (0 = retry enabled, 1 = retry disabled, -1 auto (default))
*/
MODULE_PARM_DESC(noretry,
"Disable retry faults (0 = retry enabled, 1 = retry disabled, -1 auto (default))");
module_param_named(noretry, amdgpu_noretry, int, 0644);
/**
* DOC: force_asic_type (int)
* A non negative value used to specify the asic type for all supported GPUs.
*/
MODULE_PARM_DESC(force_asic_type,
"A non negative value used to specify the asic type for all supported GPUs");
module_param_named(force_asic_type, amdgpu_force_asic_type, int, 0444);
/**
* DOC: use_xgmi_p2p (int)
* Enables/disables XGMI P2P interface (0 = disable, 1 = enable).
*/
MODULE_PARM_DESC(use_xgmi_p2p,
"Enable XGMI P2P interface (0 = disable; 1 = enable (default))");
module_param_named(use_xgmi_p2p, amdgpu_use_xgmi_p2p, int, 0444);
#ifdef CONFIG_HSA_AMD
/**
* DOC: sched_policy (int)
* Set scheduling policy. Default is HWS(hardware scheduling) with over-subscription.
* Setting 1 disables over-subscription. Setting 2 disables HWS and statically
* assigns queues to HQDs.
*/
int sched_policy = KFD_SCHED_POLICY_HWS;
module_param(sched_policy, int, 0444);
MODULE_PARM_DESC(sched_policy,
"Scheduling policy (0 = HWS (Default), 1 = HWS without over-subscription, 2 = Non-HWS (Used for debugging only)");
/**
* DOC: hws_max_conc_proc (int)
* Maximum number of processes that HWS can schedule concurrently. The maximum is the
* number of VMIDs assigned to the HWS, which is also the default.
*/
int hws_max_conc_proc = -1;
module_param(hws_max_conc_proc, int, 0444);
MODULE_PARM_DESC(hws_max_conc_proc,
"Max # processes HWS can execute concurrently when sched_policy=0 (0 = no concurrency, #VMIDs for KFD = Maximum(default))");
/**
* DOC: cwsr_enable (int)
* CWSR(compute wave store and resume) allows the GPU to preempt shader execution in
* the middle of a compute wave. Default is 1 to enable this feature. Setting 0
* disables it.
*/
int cwsr_enable = 1;
module_param(cwsr_enable, int, 0444);
MODULE_PARM_DESC(cwsr_enable, "CWSR enable (0 = Off, 1 = On (Default))");
/**
* DOC: max_num_of_queues_per_device (int)
* Maximum number of queues per device. Valid setting is between 1 and 4096. Default
* is 4096.
*/
int max_num_of_queues_per_device = KFD_MAX_NUM_OF_QUEUES_PER_DEVICE_DEFAULT;
module_param(max_num_of_queues_per_device, int, 0444);
MODULE_PARM_DESC(max_num_of_queues_per_device,
"Maximum number of supported queues per device (1 = Minimum, 4096 = default)");
/**
* DOC: send_sigterm (int)
* Send sigterm to HSA process on unhandled exceptions. Default is not to send sigterm
* but just print errors on dmesg. Setting 1 enables sending sigterm.
*/
int send_sigterm;
module_param(send_sigterm, int, 0444);
MODULE_PARM_DESC(send_sigterm,
"Send sigterm to HSA process on unhandled exception (0 = disable, 1 = enable)");
/**
* DOC: debug_largebar (int)
* Set debug_largebar as 1 to enable simulating large-bar capability on non-large bar
* system. This limits the VRAM size reported to ROCm applications to the visible
* size, usually 256MB.
* Default value is 0, diabled.
*/
int debug_largebar;
module_param(debug_largebar, int, 0444);
MODULE_PARM_DESC(debug_largebar,
"Debug large-bar flag used to simulate large-bar capability on non-large bar machine (0 = disable, 1 = enable)");
/**
* DOC: ignore_crat (int)
* Ignore CRAT table during KFD initialization. By default, KFD uses the ACPI CRAT
* table to get information about AMD APUs. This option can serve as a workaround on
* systems with a broken CRAT table.
*
* Default is auto (according to asic type, iommu_v2, and crat table, to decide
* whether use CRAT)
*/
int ignore_crat;
module_param(ignore_crat, int, 0444);
MODULE_PARM_DESC(ignore_crat,
"Ignore CRAT table during KFD initialization (0 = auto (default), 1 = ignore CRAT)");
/**
* DOC: halt_if_hws_hang (int)
* Halt if HWS hang is detected. Default value, 0, disables the halt on hang.
* Setting 1 enables halt on hang.
*/
int halt_if_hws_hang;
module_param(halt_if_hws_hang, int, 0644);
MODULE_PARM_DESC(halt_if_hws_hang, "Halt if HWS hang is detected (0 = off (default), 1 = on)");
/**
* DOC: hws_gws_support(bool)
* Assume that HWS supports GWS barriers regardless of what firmware version
* check says. Default value: false (rely on MEC2 firmware version check).
*/
bool hws_gws_support;
module_param(hws_gws_support, bool, 0444);
MODULE_PARM_DESC(hws_gws_support, "Assume MEC2 FW supports GWS barriers (false = rely on FW version check (Default), true = force supported)");
/**
* DOC: queue_preemption_timeout_ms (int)
* queue preemption timeout in ms (1 = Minimum, 9000 = default)
*/
int queue_preemption_timeout_ms = 9000;
module_param(queue_preemption_timeout_ms, int, 0644);
MODULE_PARM_DESC(queue_preemption_timeout_ms, "queue preemption timeout in ms (1 = Minimum, 9000 = default)");
/**
* DOC: debug_evictions(bool)
* Enable extra debug messages to help determine the cause of evictions
*/
bool debug_evictions;
module_param(debug_evictions, bool, 0644);
MODULE_PARM_DESC(debug_evictions, "enable eviction debug messages (false = default)");
/**
* DOC: no_system_mem_limit(bool)
* Disable system memory limit, to support multiple process shared memory
*/
bool no_system_mem_limit;
module_param(no_system_mem_limit, bool, 0644);
MODULE_PARM_DESC(no_system_mem_limit, "disable system memory limit (false = default)");
/**
* DOC: no_queue_eviction_on_vm_fault (int)
* If set, process queues will not be evicted on gpuvm fault. This is to keep the wavefront context for debugging (0 = queue eviction, 1 = no queue eviction). The default is 0 (queue eviction).
*/
int amdgpu_no_queue_eviction_on_vm_fault = 0;
MODULE_PARM_DESC(no_queue_eviction_on_vm_fault, "No queue eviction on VM fault (0 = queue eviction, 1 = no queue eviction)");
module_param_named(no_queue_eviction_on_vm_fault, amdgpu_no_queue_eviction_on_vm_fault, int, 0444);
#endif
/**
* DOC: pcie_p2p (bool)
* Enable PCIe P2P (requires large-BAR). Default value: true (on)
*/
#ifdef CONFIG_HSA_AMD_P2P
bool pcie_p2p = true;
module_param(pcie_p2p, bool, 0444);
MODULE_PARM_DESC(pcie_p2p, "Enable PCIe P2P (requires large-BAR). (N = off, Y = on(default))");
#endif
/**
* DOC: dcfeaturemask (uint)
* Override display features enabled. See enum DC_FEATURE_MASK in drivers/gpu/drm/amd/include/amd_shared.h.
* The default is the current set of stable display features.
*/
MODULE_PARM_DESC(dcfeaturemask, "all stable DC features enabled (default))");
module_param_named(dcfeaturemask, amdgpu_dc_feature_mask, uint, 0444);
/**
* DOC: dcdebugmask (uint)
* Override display features enabled. See enum DC_DEBUG_MASK in drivers/gpu/drm/amd/include/amd_shared.h.
*/
MODULE_PARM_DESC(dcdebugmask, "all debug options disabled (default))");
module_param_named(dcdebugmask, amdgpu_dc_debug_mask, uint, 0444);
MODULE_PARM_DESC(visualconfirm, "Visual confirm (0 = off (default), 1 = MPO, 5 = PSR)");
module_param_named(visualconfirm, amdgpu_dc_visual_confirm, uint, 0444);
/**
* DOC: abmlevel (uint)
* Override the default ABM (Adaptive Backlight Management) level used for DC
* enabled hardware. Requires DMCU to be supported and loaded.
* Valid levels are 0-4. A value of 0 indicates that ABM should be disabled by
* default. Values 1-4 control the maximum allowable brightness reduction via
* the ABM algorithm, with 1 being the least reduction and 4 being the most
* reduction.
*
* Defaults to 0, or disabled. Userspace can still override this level later
* after boot.
*/
uint amdgpu_dm_abm_level;
MODULE_PARM_DESC(abmlevel, "ABM level (0 = off (default), 1-4 = backlight reduction level) ");
module_param_named(abmlevel, amdgpu_dm_abm_level, uint, 0444);
int amdgpu_backlight = -1;
MODULE_PARM_DESC(backlight, "Backlight control (0 = pwm, 1 = aux, -1 auto (default))");
module_param_named(backlight, amdgpu_backlight, bint, 0444);
/**
* DOC: tmz (int)
* Trusted Memory Zone (TMZ) is a method to protect data being written
* to or read from memory.
*
* The default value: 0 (off). TODO: change to auto till it is completed.
*/
MODULE_PARM_DESC(tmz, "Enable TMZ feature (-1 = auto (default), 0 = off, 1 = on)");
module_param_named(tmz, amdgpu_tmz, int, 0444);
/**
* DOC: freesync_video (uint)
* Enable the optimization to adjust front porch timing to achieve seamless
* mode change experience when setting a freesync supported mode for which full
* modeset is not needed.
*
* The Display Core will add a set of modes derived from the base FreeSync
* video mode into the corresponding connector's mode list based on commonly
* used refresh rates and VRR range of the connected display, when users enable
* this feature. From the userspace perspective, they can see a seamless mode
* change experience when the change between different refresh rates under the
* same resolution. Additionally, userspace applications such as Video playback
* can read this modeset list and change the refresh rate based on the video
* frame rate. Finally, the userspace can also derive an appropriate mode for a
* particular refresh rate based on the FreeSync Mode and add it to the
* connector's mode list.
*
* Note: This is an experimental feature.
*
* The default value: 0 (off).
*/
MODULE_PARM_DESC(
freesync_video,
"Enable freesync modesetting optimization feature (0 = off (default), 1 = on)");
module_param_named(freesync_video, amdgpu_freesync_vid_mode, uint, 0444);
/**
* DOC: reset_method (int)
* GPU reset method (-1 = auto (default), 0 = legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco)
*/
MODULE_PARM_DESC(reset_method, "GPU reset method (-1 = auto (default), 0 = legacy, 1 = mode0, 2 = mode1, 3 = mode2, 4 = baco/bamaco)");
module_param_named(reset_method, amdgpu_reset_method, int, 0444);
/**
* DOC: bad_page_threshold (int) Bad page threshold is specifies the
* threshold value of faulty pages detected by RAS ECC, which may
* result in the GPU entering bad status when the number of total
* faulty pages by ECC exceeds the threshold value.
*/
MODULE_PARM_DESC(bad_page_threshold, "Bad page threshold(-1 = auto(default value), 0 = disable bad page retirement, -2 = ignore bad page threshold)");
module_param_named(bad_page_threshold, amdgpu_bad_page_threshold, int, 0444);
MODULE_PARM_DESC(num_kcq, "number of kernel compute queue user want to setup (8 if set to greater than 8 or less than 0, only affect gfx 8+)");
module_param_named(num_kcq, amdgpu_num_kcq, int, 0444);
/**
* DOC: vcnfw_log (int)
* Enable vcnfw log output for debugging, the default is disabled.
*/
MODULE_PARM_DESC(vcnfw_log, "Enable vcnfw log(0 = disable (default value), 1 = enable)");
module_param_named(vcnfw_log, amdgpu_vcnfw_log, int, 0444);
/**
* DOC: sg_display (int)
* Disable S/G (scatter/gather) display (i.e., display from system memory).
* This option is only relevant on APUs. Set this option to 0 to disable
* S/G display if you experience flickering or other issues under memory
* pressure and report the issue.
*/
MODULE_PARM_DESC(sg_display, "S/G Display (-1 = auto (default), 0 = disable)");
module_param_named(sg_display, amdgpu_sg_display, int, 0444);
/**
* DOC: smu_pptable_id (int)
* Used to override pptable id. id = 0 use VBIOS pptable.
* id > 0 use the soft pptable with specicfied id.
*/
MODULE_PARM_DESC(smu_pptable_id,
"specify pptable id to be used (-1 = auto(default) value, 0 = use pptable from vbios, > 0 = soft pptable id)");
module_param_named(smu_pptable_id, amdgpu_smu_pptable_id, int, 0444);
/* These devices are not supported by amdgpu.
* They are supported by the mach64, r128, radeon drivers
*/
static const u16 amdgpu_unsupported_pciidlist[] = {
/* mach64 */
0x4354,
0x4358,
0x4554,
0x4742,
0x4744,
0x4749,
0x474C,
0x474D,
0x474E,
0x474F,
0x4750,
0x4751,
0x4752,
0x4753,
0x4754,
0x4755,
0x4756,
0x4757,
0x4758,
0x4759,
0x475A,
0x4C42,
0x4C44,
0x4C47,
0x4C49,
0x4C4D,
0x4C4E,
0x4C50,
0x4C51,
0x4C52,
0x4C53,
0x5654,
0x5655,
0x5656,
/* r128 */
0x4c45,
0x4c46,
0x4d46,
0x4d4c,
0x5041,
0x5042,
0x5043,
0x5044,
0x5045,
0x5046,
0x5047,
0x5048,
0x5049,
0x504A,
0x504B,
0x504C,
0x504D,
0x504E,
0x504F,
0x5050,
0x5051,
0x5052,
0x5053,
0x5054,
0x5055,
0x5056,
0x5057,
0x5058,
0x5245,
0x5246,
0x5247,
0x524b,
0x524c,
0x534d,
0x5446,
0x544C,
0x5452,
/* radeon */
0x3150,
0x3151,
0x3152,
0x3154,
0x3155,
0x3E50,
0x3E54,
0x4136,
0x4137,
0x4144,
0x4145,
0x4146,
0x4147,
0x4148,
0x4149,
0x414A,
0x414B,
0x4150,
0x4151,
0x4152,
0x4153,
0x4154,
0x4155,
0x4156,
0x4237,
0x4242,
0x4336,
0x4337,
0x4437,
0x4966,
0x4967,
0x4A48,
0x4A49,
0x4A4A,
0x4A4B,
0x4A4C,
0x4A4D,
0x4A4E,
0x4A4F,
0x4A50,
0x4A54,
0x4B48,
0x4B49,
0x4B4A,
0x4B4B,
0x4B4C,
0x4C57,
0x4C58,
0x4C59,
0x4C5A,
0x4C64,
0x4C66,
0x4C67,
0x4E44,
0x4E45,
0x4E46,
0x4E47,
0x4E48,
0x4E49,
0x4E4A,
0x4E4B,
0x4E50,
0x4E51,
0x4E52,
0x4E53,
0x4E54,
0x4E56,
0x5144,
0x5145,
0x5146,
0x5147,
0x5148,
0x514C,
0x514D,
0x5157,
0x5158,
0x5159,
0x515A,
0x515E,
0x5460,
0x5462,
0x5464,
0x5548,
0x5549,
0x554A,
0x554B,
0x554C,
0x554D,
0x554E,
0x554F,
0x5550,
0x5551,
0x5552,
0x5554,
0x564A,
0x564B,
0x564F,
0x5652,
0x5653,
0x5657,
0x5834,
0x5835,
0x5954,
0x5955,
0x5974,
0x5975,
0x5960,
0x5961,
0x5962,
0x5964,
0x5965,
0x5969,
0x5a41,
0x5a42,
0x5a61,
0x5a62,
0x5b60,
0x5b62,
0x5b63,
0x5b64,
0x5b65,
0x5c61,
0x5c63,
0x5d48,
0x5d49,
0x5d4a,
0x5d4c,
0x5d4d,
0x5d4e,
0x5d4f,
0x5d50,
0x5d52,
0x5d57,
0x5e48,
0x5e4a,
0x5e4b,
0x5e4c,
0x5e4d,
0x5e4f,
0x6700,
0x6701,
0x6702,
0x6703,
0x6704,
0x6705,
0x6706,
0x6707,
0x6708,
0x6709,
0x6718,
0x6719,
0x671c,
0x671d,
0x671f,
0x6720,
0x6721,
0x6722,
0x6723,
0x6724,
0x6725,
0x6726,
0x6727,
0x6728,
0x6729,
0x6738,
0x6739,
0x673e,
0x6740,
0x6741,
0x6742,
0x6743,
0x6744,
0x6745,
0x6746,
0x6747,
0x6748,
0x6749,
0x674A,
0x6750,
0x6751,
0x6758,
0x6759,
0x675B,
0x675D,
0x675F,
0x6760,
0x6761,
0x6762,
0x6763,
0x6764,
0x6765,
0x6766,
0x6767,
0x6768,
0x6770,
0x6771,
0x6772,
0x6778,
0x6779,
0x677B,
0x6840,
0x6841,
0x6842,
0x6843,
0x6849,
0x684C,
0x6850,
0x6858,
0x6859,
0x6880,
0x6888,
0x6889,
0x688A,
0x688C,
0x688D,
0x6898,
0x6899,
0x689b,
0x689c,
0x689d,
0x689e,
0x68a0,
0x68a1,
0x68a8,
0x68a9,
0x68b0,
0x68b8,
0x68b9,
0x68ba,
0x68be,
0x68bf,
0x68c0,
0x68c1,
0x68c7,
0x68c8,
0x68c9,
0x68d8,
0x68d9,
0x68da,
0x68de,
0x68e0,
0x68e1,
0x68e4,
0x68e5,
0x68e8,
0x68e9,
0x68f1,
0x68f2,
0x68f8,
0x68f9,
0x68fa,
0x68fe,
0x7100,
0x7101,
0x7102,
0x7103,
0x7104,
0x7105,
0x7106,
0x7108,
0x7109,
0x710A,
0x710B,
0x710C,
0x710E,
0x710F,
0x7140,
0x7141,
0x7142,
0x7143,
0x7144,
0x7145,
0x7146,
0x7147,
0x7149,
0x714A,
0x714B,
0x714C,
0x714D,
0x714E,
0x714F,
0x7151,
0x7152,
0x7153,
0x715E,
0x715F,
0x7180,
0x7181,
0x7183,
0x7186,
0x7187,
0x7188,
0x718A,
0x718B,
0x718C,
0x718D,
0x718F,
0x7193,
0x7196,
0x719B,
0x719F,
0x71C0,
0x71C1,
0x71C2,
0x71C3,
0x71C4,
0x71C5,
0x71C6,
0x71C7,
0x71CD,
0x71CE,
0x71D2,
0x71D4,
0x71D5,
0x71D6,
0x71DA,
0x71DE,
0x7200,
0x7210,
0x7211,
0x7240,
0x7243,
0x7244,
0x7245,
0x7246,
0x7247,
0x7248,
0x7249,
0x724A,
0x724B,
0x724C,
0x724D,
0x724E,
0x724F,
0x7280,
0x7281,
0x7283,
0x7284,
0x7287,
0x7288,
0x7289,
0x728B,
0x728C,
0x7290,
0x7291,
0x7293,
0x7297,
0x7834,
0x7835,
0x791e,
0x791f,
0x793f,
0x7941,
0x7942,
0x796c,
0x796d,
0x796e,
0x796f,
0x9400,
0x9401,
0x9402,
0x9403,
0x9405,
0x940A,
0x940B,
0x940F,
0x94A0,
0x94A1,
0x94A3,
0x94B1,
0x94B3,
0x94B4,
0x94B5,
0x94B9,
0x9440,
0x9441,
0x9442,
0x9443,
0x9444,
0x9446,
0x944A,
0x944B,
0x944C,
0x944E,
0x9450,
0x9452,
0x9456,
0x945A,
0x945B,
0x945E,
0x9460,
0x9462,
0x946A,
0x946B,
0x947A,
0x947B,
0x9480,
0x9487,
0x9488,
0x9489,
0x948A,
0x948F,
0x9490,
0x9491,
0x9495,
0x9498,
0x949C,
0x949E,
0x949F,
0x94C0,
0x94C1,
0x94C3,
0x94C4,
0x94C5,
0x94C6,
0x94C7,
0x94C8,
0x94C9,
0x94CB,
0x94CC,
0x94CD,
0x9500,
0x9501,
0x9504,
0x9505,
0x9506,
0x9507,
0x9508,
0x9509,
0x950F,
0x9511,
0x9515,
0x9517,
0x9519,
0x9540,
0x9541,
0x9542,
0x954E,
0x954F,
0x9552,
0x9553,
0x9555,
0x9557,
0x955f,
0x9580,
0x9581,
0x9583,
0x9586,
0x9587,
0x9588,
0x9589,
0x958A,
0x958B,
0x958C,
0x958D,
0x958E,
0x958F,
0x9590,
0x9591,
0x9593,
0x9595,
0x9596,
0x9597,
0x9598,
0x9599,
0x959B,
0x95C0,
0x95C2,
0x95C4,
0x95C5,
0x95C6,
0x95C7,
0x95C9,
0x95CC,
0x95CD,
0x95CE,
0x95CF,
0x9610,
0x9611,
0x9612,
0x9613,
0x9614,
0x9615,
0x9616,
0x9640,
0x9641,
0x9642,
0x9643,
0x9644,
0x9645,
0x9647,
0x9648,
0x9649,
0x964a,
0x964b,
0x964c,
0x964e,
0x964f,
0x9710,
0x9711,
0x9712,
0x9713,
0x9714,
0x9715,
0x9802,
0x9803,
0x9804,
0x9805,
0x9806,
0x9807,
0x9808,
0x9809,
0x980A,
0x9900,
0x9901,
0x9903,
0x9904,
0x9905,
0x9906,
0x9907,
0x9908,
0x9909,
0x990A,
0x990B,
0x990C,
0x990D,
0x990E,
0x990F,
0x9910,
0x9913,
0x9917,
0x9918,
0x9919,
0x9990,
0x9991,
0x9992,
0x9993,
0x9994,
0x9995,
0x9996,
0x9997,
0x9998,
0x9999,
0x999A,
0x999B,
0x999C,
0x999D,
0x99A0,
0x99A2,
0x99A4,
/* radeon secondary ids */
0x3171,
0x3e70,
0x4164,
0x4165,
0x4166,
0x4168,
0x4170,
0x4171,
0x4172,
0x4173,
0x496e,
0x4a69,
0x4a6a,
0x4a6b,
0x4a70,
0x4a74,
0x4b69,
0x4b6b,
0x4b6c,
0x4c6e,
0x4e64,
0x4e65,
0x4e66,
0x4e67,
0x4e68,
0x4e69,
0x4e6a,
0x4e71,
0x4f73,
0x5569,
0x556b,
0x556d,
0x556f,
0x5571,
0x5854,
0x5874,
0x5940,
0x5941,
0x5b70,
0x5b72,
0x5b73,
0x5b74,
0x5b75,
0x5d44,
0x5d45,
0x5d6d,
0x5d6f,
0x5d72,
0x5d77,
0x5e6b,
0x5e6d,
0x7120,
0x7124,
0x7129,
0x712e,
0x712f,
0x7162,
0x7163,
0x7166,
0x7167,
0x7172,
0x7173,
0x71a0,
0x71a1,
0x71a3,
0x71a7,
0x71bb,
0x71e0,
0x71e1,
0x71e2,
0x71e6,
0x71e7,
0x71f2,
0x7269,
0x726b,
0x726e,
0x72a0,
0x72a8,
0x72b1,
0x72b3,
0x793f,
};
static const struct pci_device_id pciidlist[] = {
#ifdef CONFIG_DRM_AMDGPU_SI
{0x1002, 0x6780, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
{0x1002, 0x6784, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
{0x1002, 0x6788, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
{0x1002, 0x678A, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
{0x1002, 0x6790, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
{0x1002, 0x6791, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
{0x1002, 0x6792, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
{0x1002, 0x6798, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
{0x1002, 0x6799, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
{0x1002, 0x679A, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
{0x1002, 0x679B, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
{0x1002, 0x679E, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
{0x1002, 0x679F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TAHITI},
{0x1002, 0x6800, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_PITCAIRN|AMD_IS_MOBILITY},
{0x1002, 0x6801, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_PITCAIRN|AMD_IS_MOBILITY},
{0x1002, 0x6802, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_PITCAIRN|AMD_IS_MOBILITY},
{0x1002, 0x6806, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_PITCAIRN},
{0x1002, 0x6808, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_PITCAIRN},
{0x1002, 0x6809, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_PITCAIRN},
{0x1002, 0x6810, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_PITCAIRN},
{0x1002, 0x6811, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_PITCAIRN},
{0x1002, 0x6816, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_PITCAIRN},
{0x1002, 0x6817, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_PITCAIRN},
{0x1002, 0x6818, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_PITCAIRN},
{0x1002, 0x6819, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_PITCAIRN},
{0x1002, 0x6600, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_OLAND|AMD_IS_MOBILITY},
{0x1002, 0x6601, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_OLAND|AMD_IS_MOBILITY},
{0x1002, 0x6602, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_OLAND|AMD_IS_MOBILITY},
{0x1002, 0x6603, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_OLAND|AMD_IS_MOBILITY},
{0x1002, 0x6604, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_OLAND|AMD_IS_MOBILITY},
{0x1002, 0x6605, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_OLAND|AMD_IS_MOBILITY},
{0x1002, 0x6606, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_OLAND|AMD_IS_MOBILITY},
{0x1002, 0x6607, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_OLAND|AMD_IS_MOBILITY},
{0x1002, 0x6608, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_OLAND},
{0x1002, 0x6610, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_OLAND},
{0x1002, 0x6611, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_OLAND},
{0x1002, 0x6613, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_OLAND},
{0x1002, 0x6617, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_OLAND|AMD_IS_MOBILITY},
{0x1002, 0x6620, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_OLAND|AMD_IS_MOBILITY},
{0x1002, 0x6621, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_OLAND|AMD_IS_MOBILITY},
{0x1002, 0x6623, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_OLAND|AMD_IS_MOBILITY},
{0x1002, 0x6631, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_OLAND},
{0x1002, 0x6820, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE|AMD_IS_MOBILITY},
{0x1002, 0x6821, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE|AMD_IS_MOBILITY},
{0x1002, 0x6822, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE|AMD_IS_MOBILITY},
{0x1002, 0x6823, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE|AMD_IS_MOBILITY},
{0x1002, 0x6824, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE|AMD_IS_MOBILITY},
{0x1002, 0x6825, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE|AMD_IS_MOBILITY},
{0x1002, 0x6826, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE|AMD_IS_MOBILITY},
{0x1002, 0x6827, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE|AMD_IS_MOBILITY},
{0x1002, 0x6828, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE},
{0x1002, 0x6829, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE},
{0x1002, 0x682A, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE|AMD_IS_MOBILITY},
{0x1002, 0x682B, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE|AMD_IS_MOBILITY},
{0x1002, 0x682C, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE},
{0x1002, 0x682D, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE|AMD_IS_MOBILITY},
{0x1002, 0x682F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE|AMD_IS_MOBILITY},
{0x1002, 0x6830, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE|AMD_IS_MOBILITY},
{0x1002, 0x6831, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE|AMD_IS_MOBILITY},
{0x1002, 0x6835, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE},
{0x1002, 0x6837, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE},
{0x1002, 0x6838, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE},
{0x1002, 0x6839, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE},
{0x1002, 0x683B, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE},
{0x1002, 0x683D, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE},
{0x1002, 0x683F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VERDE},
{0x1002, 0x6660, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_HAINAN|AMD_IS_MOBILITY},
{0x1002, 0x6663, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_HAINAN|AMD_IS_MOBILITY},
{0x1002, 0x6664, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_HAINAN|AMD_IS_MOBILITY},
{0x1002, 0x6665, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_HAINAN|AMD_IS_MOBILITY},
{0x1002, 0x6667, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_HAINAN|AMD_IS_MOBILITY},
{0x1002, 0x666F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_HAINAN|AMD_IS_MOBILITY},
#endif
#ifdef CONFIG_DRM_AMDGPU_CIK
/* Kaveri */
{0x1002, 0x1304, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x1305, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_APU},
{0x1002, 0x1306, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x1307, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_APU},
{0x1002, 0x1309, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x130A, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x130B, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x130C, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x130D, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x130E, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x130F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_APU},
{0x1002, 0x1310, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_APU},
{0x1002, 0x1311, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_APU},
{0x1002, 0x1312, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_APU},
{0x1002, 0x1313, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_APU},
{0x1002, 0x1315, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_APU},
{0x1002, 0x1316, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_APU},
{0x1002, 0x1317, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x1318, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x131B, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_APU},
{0x1002, 0x131C, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_APU},
{0x1002, 0x131D, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KAVERI|AMD_IS_APU},
/* Bonaire */
{0x1002, 0x6640, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BONAIRE|AMD_IS_MOBILITY},
{0x1002, 0x6641, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BONAIRE|AMD_IS_MOBILITY},
{0x1002, 0x6646, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BONAIRE|AMD_IS_MOBILITY},
{0x1002, 0x6647, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BONAIRE|AMD_IS_MOBILITY},
{0x1002, 0x6649, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BONAIRE},
{0x1002, 0x6650, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BONAIRE},
{0x1002, 0x6651, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BONAIRE},
{0x1002, 0x6658, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BONAIRE},
{0x1002, 0x665c, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BONAIRE},
{0x1002, 0x665d, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BONAIRE},
{0x1002, 0x665f, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BONAIRE},
/* Hawaii */
{0x1002, 0x67A0, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_HAWAII},
{0x1002, 0x67A1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_HAWAII},
{0x1002, 0x67A2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_HAWAII},
{0x1002, 0x67A8, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_HAWAII},
{0x1002, 0x67A9, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_HAWAII},
{0x1002, 0x67AA, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_HAWAII},
{0x1002, 0x67B0, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_HAWAII},
{0x1002, 0x67B1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_HAWAII},
{0x1002, 0x67B8, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_HAWAII},
{0x1002, 0x67B9, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_HAWAII},
{0x1002, 0x67BA, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_HAWAII},
{0x1002, 0x67BE, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_HAWAII},
/* Kabini */
{0x1002, 0x9830, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KABINI|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x9831, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KABINI|AMD_IS_APU},
{0x1002, 0x9832, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KABINI|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x9833, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KABINI|AMD_IS_APU},
{0x1002, 0x9834, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KABINI|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x9835, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KABINI|AMD_IS_APU},
{0x1002, 0x9836, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KABINI|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x9837, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KABINI|AMD_IS_APU},
{0x1002, 0x9838, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KABINI|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x9839, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KABINI|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x983a, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KABINI|AMD_IS_APU},
{0x1002, 0x983b, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KABINI|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x983c, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KABINI|AMD_IS_APU},
{0x1002, 0x983d, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KABINI|AMD_IS_APU},
{0x1002, 0x983e, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KABINI|AMD_IS_APU},
{0x1002, 0x983f, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_KABINI|AMD_IS_APU},
/* mullins */
{0x1002, 0x9850, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_MULLINS|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x9851, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_MULLINS|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x9852, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_MULLINS|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x9853, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_MULLINS|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x9854, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_MULLINS|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x9855, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_MULLINS|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x9856, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_MULLINS|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x9857, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_MULLINS|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x9858, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_MULLINS|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x9859, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_MULLINS|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x985A, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_MULLINS|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x985B, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_MULLINS|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x985C, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_MULLINS|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x985D, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_MULLINS|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x985E, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_MULLINS|AMD_IS_MOBILITY|AMD_IS_APU},
{0x1002, 0x985F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_MULLINS|AMD_IS_MOBILITY|AMD_IS_APU},
#endif
/* topaz */
{0x1002, 0x6900, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TOPAZ},
{0x1002, 0x6901, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TOPAZ},
{0x1002, 0x6902, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TOPAZ},
{0x1002, 0x6903, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TOPAZ},
{0x1002, 0x6907, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TOPAZ},
/* tonga */
{0x1002, 0x6920, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TONGA},
{0x1002, 0x6921, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TONGA},
{0x1002, 0x6928, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TONGA},
{0x1002, 0x6929, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TONGA},
{0x1002, 0x692B, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TONGA},
{0x1002, 0x692F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TONGA},
{0x1002, 0x6930, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TONGA},
{0x1002, 0x6938, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TONGA},
{0x1002, 0x6939, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_TONGA},
/* fiji */
{0x1002, 0x7300, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_FIJI},
{0x1002, 0x730F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_FIJI},
/* carrizo */
{0x1002, 0x9870, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_CARRIZO|AMD_IS_APU},
{0x1002, 0x9874, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_CARRIZO|AMD_IS_APU},
{0x1002, 0x9875, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_CARRIZO|AMD_IS_APU},
{0x1002, 0x9876, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_CARRIZO|AMD_IS_APU},
{0x1002, 0x9877, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_CARRIZO|AMD_IS_APU},
/* stoney */
{0x1002, 0x98E4, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_STONEY|AMD_IS_APU},
/* Polaris11 */
{0x1002, 0x67E0, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS11},
{0x1002, 0x67E3, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS11},
{0x1002, 0x67E8, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS11},
{0x1002, 0x67EB, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS11},
{0x1002, 0x67EF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS11},
{0x1002, 0x67FF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS11},
{0x1002, 0x67E1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS11},
{0x1002, 0x67E7, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS11},
{0x1002, 0x67E9, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS11},
/* Polaris10 */
{0x1002, 0x67C0, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS10},
{0x1002, 0x67C1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS10},
{0x1002, 0x67C2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS10},
{0x1002, 0x67C4, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS10},
{0x1002, 0x67C7, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS10},
{0x1002, 0x67D0, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS10},
{0x1002, 0x67DF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS10},
{0x1002, 0x67C8, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS10},
{0x1002, 0x67C9, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS10},
{0x1002, 0x67CA, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS10},
{0x1002, 0x67CC, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS10},
{0x1002, 0x67CF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS10},
{0x1002, 0x6FDF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS10},
/* Polaris12 */
{0x1002, 0x6980, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS12},
{0x1002, 0x6981, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS12},
{0x1002, 0x6985, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS12},
{0x1002, 0x6986, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS12},
{0x1002, 0x6987, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS12},
{0x1002, 0x6995, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS12},
{0x1002, 0x6997, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS12},
{0x1002, 0x699F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_POLARIS12},
/* VEGAM */
{0x1002, 0x694C, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGAM},
{0x1002, 0x694E, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGAM},
{0x1002, 0x694F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGAM},
/* Vega 10 */
{0x1002, 0x6860, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10},
{0x1002, 0x6861, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10},
{0x1002, 0x6862, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10},
{0x1002, 0x6863, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10},
{0x1002, 0x6864, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10},
{0x1002, 0x6867, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10},
{0x1002, 0x6868, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10},
{0x1002, 0x6869, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10},
{0x1002, 0x686a, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10},
{0x1002, 0x686b, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10},
{0x1002, 0x686c, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10},
{0x1002, 0x686d, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10},
{0x1002, 0x686e, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10},
{0x1002, 0x686f, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10},
{0x1002, 0x687f, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10},
/* Vega 12 */
{0x1002, 0x69A0, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA12},
{0x1002, 0x69A1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA12},
{0x1002, 0x69A2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA12},
{0x1002, 0x69A3, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA12},
{0x1002, 0x69AF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA12},
/* Vega 20 */
{0x1002, 0x66A0, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA20},
{0x1002, 0x66A1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA20},
{0x1002, 0x66A2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA20},
{0x1002, 0x66A3, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA20},
{0x1002, 0x66A4, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA20},
{0x1002, 0x66A7, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA20},
{0x1002, 0x66AF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA20},
/* Raven */
{0x1002, 0x15dd, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RAVEN|AMD_IS_APU},
{0x1002, 0x15d8, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RAVEN|AMD_IS_APU},
/* Arcturus */
{0x1002, 0x738C, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_ARCTURUS},
{0x1002, 0x7388, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_ARCTURUS},
{0x1002, 0x738E, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_ARCTURUS},
{0x1002, 0x7390, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_ARCTURUS},
/* Navi10 */
{0x1002, 0x7310, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI10},
{0x1002, 0x7312, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI10},
{0x1002, 0x7318, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI10},
{0x1002, 0x7319, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI10},
{0x1002, 0x731A, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI10},
{0x1002, 0x731B, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI10},
{0x1002, 0x731E, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI10},
{0x1002, 0x731F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI10},
/* Navi14 */
{0x1002, 0x7340, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI14},
{0x1002, 0x7341, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI14},
{0x1002, 0x7347, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI14},
{0x1002, 0x734F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI14},
/* Renoir */
{0x1002, 0x15E7, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RENOIR|AMD_IS_APU},
{0x1002, 0x1636, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RENOIR|AMD_IS_APU},
{0x1002, 0x1638, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RENOIR|AMD_IS_APU},
{0x1002, 0x164C, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RENOIR|AMD_IS_APU},
/* Navi12 */
{0x1002, 0x7360, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI12},
{0x1002, 0x7362, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI12},
/* Sienna_Cichlid */
{0x1002, 0x73A0, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73A1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73A2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73A3, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73A5, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73A8, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73A9, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73AB, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73AC, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73AD, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73AE, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73AF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73BF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
/* Van Gogh */
{0x1002, 0x163F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VANGOGH|AMD_IS_APU},
/* Yellow Carp */
{0x1002, 0x164D, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_YELLOW_CARP|AMD_IS_APU},
{0x1002, 0x1681, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_YELLOW_CARP|AMD_IS_APU},
/* Navy_Flounder */
{0x1002, 0x73C0, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
{0x1002, 0x73C1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
{0x1002, 0x73C3, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
{0x1002, 0x73DA, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
{0x1002, 0x73DB, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
{0x1002, 0x73DC, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
{0x1002, 0x73DD, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
{0x1002, 0x73DE, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
{0x1002, 0x73DF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVY_FLOUNDER},
/* DIMGREY_CAVEFISH */
{0x1002, 0x73E0, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
{0x1002, 0x73E1, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
{0x1002, 0x73E2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
{0x1002, 0x73E3, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
{0x1002, 0x73E8, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
{0x1002, 0x73E9, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
{0x1002, 0x73EA, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
{0x1002, 0x73EB, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
{0x1002, 0x73EC, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
{0x1002, 0x73ED, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
{0x1002, 0x73EF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
{0x1002, 0x73FF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_DIMGREY_CAVEFISH},
/* Aldebaran */
{0x1002, 0x7408, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_ALDEBARAN},
{0x1002, 0x740C, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_ALDEBARAN},
{0x1002, 0x740F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_ALDEBARAN},
{0x1002, 0x7410, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_ALDEBARAN},
/* CYAN_SKILLFISH */
{0x1002, 0x13FE, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_CYAN_SKILLFISH|AMD_IS_APU},
{0x1002, 0x143F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_CYAN_SKILLFISH|AMD_IS_APU},
/* BEIGE_GOBY */
{0x1002, 0x7420, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BEIGE_GOBY},
{0x1002, 0x7421, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BEIGE_GOBY},
{0x1002, 0x7422, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BEIGE_GOBY},
{0x1002, 0x7423, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BEIGE_GOBY},
{0x1002, 0x7424, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BEIGE_GOBY},
{0x1002, 0x743F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_BEIGE_GOBY},
{ PCI_DEVICE(0x1002, PCI_ANY_ID),
.class = PCI_CLASS_DISPLAY_VGA << 8,
.class_mask = 0xffffff,
.driver_data = CHIP_IP_DISCOVERY },
{ PCI_DEVICE(0x1002, PCI_ANY_ID),
.class = PCI_CLASS_DISPLAY_OTHER << 8,
.class_mask = 0xffffff,
.driver_data = CHIP_IP_DISCOVERY },
{0, 0, 0}
};
MODULE_DEVICE_TABLE(pci, pciidlist);
static const struct drm_driver amdgpu_kms_driver;
static void amdgpu_get_secondary_funcs(struct amdgpu_device *adev)
{
struct pci_dev *p = NULL;
int i;
/* 0 - GPU
* 1 - audio
* 2 - USB
* 3 - UCSI
*/
for (i = 1; i < 4; i++) {
p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus),
adev->pdev->bus->number, i);
if (p) {
pm_runtime_get_sync(&p->dev);
pm_runtime_mark_last_busy(&p->dev);
pm_runtime_put_autosuspend(&p->dev);
pci_dev_put(p);
}
}
}
static int amdgpu_pci_probe(struct pci_dev *pdev,
const struct pci_device_id *ent)
{
struct drm_device *ddev;
struct amdgpu_device *adev;
unsigned long flags = ent->driver_data;
int ret, retry = 0, i;
bool supports_atomic = false;
/* skip devices which are owned by radeon */
for (i = 0; i < ARRAY_SIZE(amdgpu_unsupported_pciidlist); i++) {
if (amdgpu_unsupported_pciidlist[i] == pdev->device)
return -ENODEV;
}
if (amdgpu_aspm == -1 && !pcie_aspm_enabled(pdev))
amdgpu_aspm = 0;
if (amdgpu_virtual_display ||
amdgpu_device_asic_has_dc_support(flags & AMD_ASIC_MASK))
supports_atomic = true;
if ((flags & AMD_EXP_HW_SUPPORT) && !amdgpu_exp_hw_support) {
DRM_INFO("This hardware requires experimental hardware support.\n"
"See modparam exp_hw_support\n");
return -ENODEV;
}
/* differentiate between P10 and P11 asics with the same DID */
if (pdev->device == 0x67FF &&
(pdev->revision == 0xE3 ||
pdev->revision == 0xE7 ||
pdev->revision == 0xF3 ||
pdev->revision == 0xF7)) {
flags &= ~AMD_ASIC_MASK;
flags |= CHIP_POLARIS10;
}
/* Due to hardware bugs, S/G Display on raven requires a 1:1 IOMMU mapping,
* however, SME requires an indirect IOMMU mapping because the encryption
* bit is beyond the DMA mask of the chip.
*/
if (cc_platform_has(CC_ATTR_MEM_ENCRYPT) &&
((flags & AMD_ASIC_MASK) == CHIP_RAVEN)) {
dev_info(&pdev->dev,
"SME is not compatible with RAVEN\n");
return -ENOTSUPP;
}
#ifdef CONFIG_DRM_AMDGPU_SI
if (!amdgpu_si_support) {
switch (flags & AMD_ASIC_MASK) {
case CHIP_TAHITI:
case CHIP_PITCAIRN:
case CHIP_VERDE:
case CHIP_OLAND:
case CHIP_HAINAN:
dev_info(&pdev->dev,
"SI support provided by radeon.\n");
dev_info(&pdev->dev,
"Use radeon.si_support=0 amdgpu.si_support=1 to override.\n"
);
return -ENODEV;
}
}
#endif
#ifdef CONFIG_DRM_AMDGPU_CIK
if (!amdgpu_cik_support) {
switch (flags & AMD_ASIC_MASK) {
case CHIP_KAVERI:
case CHIP_BONAIRE:
case CHIP_HAWAII:
case CHIP_KABINI:
case CHIP_MULLINS:
dev_info(&pdev->dev,
"CIK support provided by radeon.\n");
dev_info(&pdev->dev,
"Use radeon.cik_support=0 amdgpu.cik_support=1 to override.\n"
);
return -ENODEV;
}
}
#endif
adev = devm_drm_dev_alloc(&pdev->dev, &amdgpu_kms_driver, typeof(*adev), ddev);
if (IS_ERR(adev))
return PTR_ERR(adev);
adev->dev = &pdev->dev;
adev->pdev = pdev;
ddev = adev_to_drm(adev);
if (!supports_atomic)
ddev->driver_features &= ~DRIVER_ATOMIC;
ret = pci_enable_device(pdev);
if (ret)
return ret;
pci_set_drvdata(pdev, ddev);
ret = amdgpu_driver_load_kms(adev, flags);
if (ret)
goto err_pci;
retry_init:
ret = drm_dev_register(ddev, flags);
if (ret == -EAGAIN && ++retry <= 3) {
DRM_INFO("retry init %d\n", retry);
/* Don't request EX mode too frequently which is attacking */
msleep(5000);
goto retry_init;
} else if (ret) {
goto err_pci;
}
/*
* 1. don't init fbdev on hw without DCE
* 2. don't init fbdev if there are no connectors
*/
if (adev->mode_info.mode_config_initialized &&
!list_empty(&adev_to_drm(adev)->mode_config.connector_list)) {
/* select 8 bpp console on low vram cards */
if (adev->gmc.real_vram_size <= (32*1024*1024))
drm_fbdev_generic_setup(adev_to_drm(adev), 8);
else
drm_fbdev_generic_setup(adev_to_drm(adev), 32);
}
ret = amdgpu_debugfs_init(adev);
if (ret)
DRM_ERROR("Creating debugfs files failed (%d).\n", ret);
if (adev->pm.rpm_mode != AMDGPU_RUNPM_NONE) {
/* only need to skip on ATPX */
if (amdgpu_device_supports_px(ddev))
dev_pm_set_driver_flags(ddev->dev, DPM_FLAG_NO_DIRECT_COMPLETE);
/* we want direct complete for BOCO */
if (amdgpu_device_supports_boco(ddev))
dev_pm_set_driver_flags(ddev->dev, DPM_FLAG_SMART_PREPARE |
DPM_FLAG_SMART_SUSPEND |
DPM_FLAG_MAY_SKIP_RESUME);
pm_runtime_use_autosuspend(ddev->dev);
pm_runtime_set_autosuspend_delay(ddev->dev, 5000);
pm_runtime_allow(ddev->dev);
pm_runtime_mark_last_busy(ddev->dev);
pm_runtime_put_autosuspend(ddev->dev);
pci_wake_from_d3(pdev, TRUE);
/*
* For runpm implemented via BACO, PMFW will handle the
* timing for BACO in and out:
* - put ASIC into BACO state only when both video and
* audio functions are in D3 state.
* - pull ASIC out of BACO state when either video or
* audio function is in D0 state.
* Also, at startup, PMFW assumes both functions are in
* D0 state.
*
* So if snd driver was loaded prior to amdgpu driver
* and audio function was put into D3 state, there will
* be no PMFW-aware D-state transition(D0->D3) on runpm
* suspend. Thus the BACO will be not correctly kicked in.
*
* Via amdgpu_get_secondary_funcs(), the audio dev is put
* into D0 state. Then there will be a PMFW-aware D-state
* transition(D0->D3) on runpm suspend.
*/
if (amdgpu_device_supports_baco(ddev) &&
!(adev->flags & AMD_IS_APU) &&
(adev->asic_type >= CHIP_NAVI10))
amdgpu_get_secondary_funcs(adev);
}
return 0;
err_pci:
pci_disable_device(pdev);
return ret;
}
static void
amdgpu_pci_remove(struct pci_dev *pdev)
{
struct drm_device *dev = pci_get_drvdata(pdev);
struct amdgpu_device *adev = drm_to_adev(dev);
drm_dev_unplug(dev);
if (adev->pm.rpm_mode != AMDGPU_RUNPM_NONE) {
pm_runtime_get_sync(dev->dev);
pm_runtime_forbid(dev->dev);
}
if (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2) &&
!amdgpu_sriov_vf(adev)) {
bool need_to_reset_gpu = false;
if (adev->gmc.xgmi.num_physical_nodes > 1) {
struct amdgpu_hive_info *hive;
hive = amdgpu_get_xgmi_hive(adev);
if (hive->device_remove_count == 0)
need_to_reset_gpu = true;
hive->device_remove_count++;
amdgpu_put_xgmi_hive(hive);
} else {
need_to_reset_gpu = true;
}
/* Workaround for ASICs need to reset SMU.
* Called only when the first device is removed.
*/
if (need_to_reset_gpu) {
struct amdgpu_reset_context reset_context;
adev->shutdown = true;
memset(&reset_context, 0, sizeof(reset_context));
reset_context.method = AMD_RESET_METHOD_NONE;
reset_context.reset_req_dev = adev;
set_bit(AMDGPU_NEED_FULL_RESET, &reset_context.flags);
set_bit(AMDGPU_RESET_FOR_DEVICE_REMOVE, &reset_context.flags);
amdgpu_device_gpu_recover(adev, NULL, &reset_context);
}
}
amdgpu_driver_unload_kms(dev);
/*
* Flush any in flight DMA operations from device.
* Clear the Bus Master Enable bit and then wait on the PCIe Device
* StatusTransactions Pending bit.
*/
pci_disable_device(pdev);
pci_wait_for_pending_transaction(pdev);
}
static void
amdgpu_pci_shutdown(struct pci_dev *pdev)
{
struct drm_device *dev = pci_get_drvdata(pdev);
struct amdgpu_device *adev = drm_to_adev(dev);
if (amdgpu_ras_intr_triggered())
return;
/* if we are running in a VM, make sure the device
* torn down properly on reboot/shutdown.
* unfortunately we can't detect certain
* hypervisors so just do this all the time.
*/
if (!amdgpu_passthrough(adev))
adev->mp1_state = PP_MP1_STATE_UNLOAD;
amdgpu_device_ip_suspend(adev);
adev->mp1_state = PP_MP1_STATE_NONE;
}
/**
* amdgpu_drv_delayed_reset_work_handler - work handler for reset
*
* @work: work_struct.
*/
static void amdgpu_drv_delayed_reset_work_handler(struct work_struct *work)
{
struct list_head device_list;
struct amdgpu_device *adev;
int i, r;
struct amdgpu_reset_context reset_context;
memset(&reset_context, 0, sizeof(reset_context));
mutex_lock(&mgpu_info.mutex);
if (mgpu_info.pending_reset == true) {
mutex_unlock(&mgpu_info.mutex);
return;
}
mgpu_info.pending_reset = true;
mutex_unlock(&mgpu_info.mutex);
/* Use a common context, just need to make sure full reset is done */
reset_context.method = AMD_RESET_METHOD_NONE;
set_bit(AMDGPU_NEED_FULL_RESET, &reset_context.flags);
for (i = 0; i < mgpu_info.num_dgpu; i++) {
adev = mgpu_info.gpu_ins[i].adev;
reset_context.reset_req_dev = adev;
r = amdgpu_device_pre_asic_reset(adev, &reset_context);
if (r) {
dev_err(adev->dev, "GPU pre asic reset failed with err, %d for drm dev, %s ",
r, adev_to_drm(adev)->unique);
}
if (!queue_work(system_unbound_wq, &adev->xgmi_reset_work))
r = -EALREADY;
}
for (i = 0; i < mgpu_info.num_dgpu; i++) {
adev = mgpu_info.gpu_ins[i].adev;
flush_work(&adev->xgmi_reset_work);
adev->gmc.xgmi.pending_reset = false;
}
/* reset function will rebuild the xgmi hive info , clear it now */
for (i = 0; i < mgpu_info.num_dgpu; i++)
amdgpu_xgmi_remove_device(mgpu_info.gpu_ins[i].adev);
INIT_LIST_HEAD(&device_list);
for (i = 0; i < mgpu_info.num_dgpu; i++)
list_add_tail(&mgpu_info.gpu_ins[i].adev->reset_list, &device_list);
/* unregister the GPU first, reset function will add them back */
list_for_each_entry(adev, &device_list, reset_list)
amdgpu_unregister_gpu_instance(adev);
/* Use a common context, just need to make sure full reset is done */
set_bit(AMDGPU_SKIP_HW_RESET, &reset_context.flags);
r = amdgpu_do_asic_reset(&device_list, &reset_context);
if (r) {
DRM_ERROR("reinit gpus failure");
return;
}
for (i = 0; i < mgpu_info.num_dgpu; i++) {
adev = mgpu_info.gpu_ins[i].adev;
if (!adev->kfd.init_complete)
amdgpu_amdkfd_device_init(adev);
amdgpu_ttm_set_buffer_funcs_status(adev, true);
}
return;
}
static int amdgpu_pmops_prepare(struct device *dev)
{
struct drm_device *drm_dev = dev_get_drvdata(dev);
struct amdgpu_device *adev = drm_to_adev(drm_dev);
/* Return a positive number here so
* DPM_FLAG_SMART_SUSPEND works properly
*/
if (amdgpu_device_supports_boco(drm_dev) &&
pm_runtime_suspended(dev))
return 1;
/* if we will not support s3 or s2i for the device
* then skip suspend
*/
if (!amdgpu_acpi_is_s0ix_active(adev) &&
!amdgpu_acpi_is_s3_active(adev))
return 1;
return amdgpu_device_prepare(drm_dev);
}
static void amdgpu_pmops_complete(struct device *dev)
{
/* nothing to do */
}
static int amdgpu_pmops_suspend(struct device *dev)
{
struct drm_device *drm_dev = dev_get_drvdata(dev);
struct amdgpu_device *adev = drm_to_adev(drm_dev);
adev->suspend_complete = false;
if (amdgpu_acpi_is_s0ix_active(adev))
adev->in_s0ix = true;
else if (amdgpu_acpi_is_s3_active(adev))
adev->in_s3 = true;
if (!adev->in_s0ix && !adev->in_s3)
return 0;
return amdgpu_device_suspend(drm_dev, true);
}
static int amdgpu_pmops_suspend_noirq(struct device *dev)
{
struct drm_device *drm_dev = dev_get_drvdata(dev);
struct amdgpu_device *adev = drm_to_adev(drm_dev);
adev->suspend_complete = true;
if (amdgpu_acpi_should_gpu_reset(adev))
return amdgpu_asic_reset(adev);
return 0;
}
static int amdgpu_pmops_resume(struct device *dev)
{
struct drm_device *drm_dev = dev_get_drvdata(dev);
struct amdgpu_device *adev = drm_to_adev(drm_dev);
int r;
if (!adev->in_s0ix && !adev->in_s3)
return 0;
/* Avoids registers access if device is physically gone */
if (!pci_device_is_present(adev->pdev))
adev->no_hw_access = true;
r = amdgpu_device_resume(drm_dev, true);
if (amdgpu_acpi_is_s0ix_active(adev))
adev->in_s0ix = false;
else
adev->in_s3 = false;
return r;
}
static int amdgpu_pmops_freeze(struct device *dev)
{
struct drm_device *drm_dev = dev_get_drvdata(dev);
struct amdgpu_device *adev = drm_to_adev(drm_dev);
int r;
adev->in_s4 = true;
r = amdgpu_device_suspend(drm_dev, true);
adev->in_s4 = false;
if (r)
return r;
if (amdgpu_acpi_should_gpu_reset(adev))
return amdgpu_asic_reset(adev);
return 0;
}
static int amdgpu_pmops_thaw(struct device *dev)
{
struct drm_device *drm_dev = dev_get_drvdata(dev);
return amdgpu_device_resume(drm_dev, true);
}
static int amdgpu_pmops_poweroff(struct device *dev)
{
struct drm_device *drm_dev = dev_get_drvdata(dev);
return amdgpu_device_suspend(drm_dev, true);
}
static int amdgpu_pmops_restore(struct device *dev)
{
struct drm_device *drm_dev = dev_get_drvdata(dev);
return amdgpu_device_resume(drm_dev, true);
}
static int amdgpu_runtime_idle_check_display(struct device *dev)
{
struct pci_dev *pdev = to_pci_dev(dev);
struct drm_device *drm_dev = pci_get_drvdata(pdev);
struct amdgpu_device *adev = drm_to_adev(drm_dev);
if (adev->mode_info.num_crtc) {
struct drm_connector *list_connector;
struct drm_connector_list_iter iter;
int ret = 0;
/* XXX: Return busy if any displays are connected to avoid
* possible display wakeups after runtime resume due to
* hotplug events in case any displays were connected while
* the GPU was in suspend. Remove this once that is fixed.
*/
mutex_lock(&drm_dev->mode_config.mutex);
drm_connector_list_iter_begin(drm_dev, &iter);
drm_for_each_connector_iter(list_connector, &iter) {
if (list_connector->status == connector_status_connected) {
ret = -EBUSY;
break;
}
}
drm_connector_list_iter_end(&iter);
mutex_unlock(&drm_dev->mode_config.mutex);
if (ret)
return ret;
if (amdgpu_device_has_dc_support(adev)) {
struct drm_crtc *crtc;
drm_for_each_crtc(crtc, drm_dev) {
drm_modeset_lock(&crtc->mutex, NULL);
if (crtc->state->active)
ret = -EBUSY;
drm_modeset_unlock(&crtc->mutex);
if (ret < 0)
break;
}
} else {
mutex_lock(&drm_dev->mode_config.mutex);
drm_modeset_lock(&drm_dev->mode_config.connection_mutex, NULL);
drm_connector_list_iter_begin(drm_dev, &iter);
drm_for_each_connector_iter(list_connector, &iter) {
if (list_connector->dpms == DRM_MODE_DPMS_ON) {
ret = -EBUSY;
break;
}
}
drm_connector_list_iter_end(&iter);
drm_modeset_unlock(&drm_dev->mode_config.connection_mutex);
mutex_unlock(&drm_dev->mode_config.mutex);
}
if (ret)
return ret;
}
return 0;
}
static int amdgpu_pmops_runtime_suspend(struct device *dev)
{
struct pci_dev *pdev = to_pci_dev(dev);
struct drm_device *drm_dev = pci_get_drvdata(pdev);
struct amdgpu_device *adev = drm_to_adev(drm_dev);
int ret, i;
if (adev->pm.rpm_mode == AMDGPU_RUNPM_NONE) {
pm_runtime_forbid(dev);
return -EBUSY;
}
ret = amdgpu_runtime_idle_check_display(dev);
if (ret)
return ret;
/* wait for all rings to drain before suspending */
for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
struct amdgpu_ring *ring = adev->rings[i];
if (ring && ring->sched.ready) {
ret = amdgpu_fence_wait_empty(ring);
if (ret)
return -EBUSY;
}
}
adev->in_runpm = true;
if (amdgpu_device_supports_px(drm_dev))
drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING;
/*
* By setting mp1_state as PP_MP1_STATE_UNLOAD, MP1 will do some
* proper cleanups and put itself into a state ready for PNP. That
* can address some random resuming failure observed on BOCO capable
* platforms.
* TODO: this may be also needed for PX capable platform.
*/
if (amdgpu_device_supports_boco(drm_dev))
adev->mp1_state = PP_MP1_STATE_UNLOAD;
ret = amdgpu_device_prepare(drm_dev);
if (ret)
return ret;
ret = amdgpu_device_suspend(drm_dev, false);
if (ret) {
adev->in_runpm = false;
if (amdgpu_device_supports_boco(drm_dev))
adev->mp1_state = PP_MP1_STATE_NONE;
return ret;
}
if (amdgpu_device_supports_boco(drm_dev))
adev->mp1_state = PP_MP1_STATE_NONE;
if (amdgpu_device_supports_px(drm_dev)) {
/* Only need to handle PCI state in the driver for ATPX
* PCI core handles it for _PR3.
*/
amdgpu_device_cache_pci_state(pdev);
pci_disable_device(pdev);
pci_ignore_hotplug(pdev);
pci_set_power_state(pdev, PCI_D3cold);
drm_dev->switch_power_state = DRM_SWITCH_POWER_DYNAMIC_OFF;
} else if (amdgpu_device_supports_boco(drm_dev)) {
/* nothing to do */
} else if (amdgpu_device_supports_baco(drm_dev)) {
amdgpu_device_baco_enter(drm_dev);
}
return 0;
}
static int amdgpu_pmops_runtime_resume(struct device *dev)
{
struct pci_dev *pdev = to_pci_dev(dev);
struct drm_device *drm_dev = pci_get_drvdata(pdev);
struct amdgpu_device *adev = drm_to_adev(drm_dev);
int ret;
if (adev->pm.rpm_mode == AMDGPU_RUNPM_NONE)
return -EINVAL;
/* Avoids registers access if device is physically gone */
if (!pci_device_is_present(adev->pdev))
adev->no_hw_access = true;
if (amdgpu_device_supports_px(drm_dev)) {
drm_dev->switch_power_state = DRM_SWITCH_POWER_CHANGING;
/* Only need to handle PCI state in the driver for ATPX
* PCI core handles it for _PR3.
*/
pci_set_power_state(pdev, PCI_D0);
amdgpu_device_load_pci_state(pdev);
ret = pci_enable_device(pdev);
if (ret)
return ret;
pci_set_master(pdev);
} else if (amdgpu_device_supports_boco(drm_dev)) {
/* Only need to handle PCI state in the driver for ATPX
* PCI core handles it for _PR3.
*/
pci_set_master(pdev);
} else if (amdgpu_device_supports_baco(drm_dev)) {
amdgpu_device_baco_exit(drm_dev);
}
ret = amdgpu_device_resume(drm_dev, false);
if (ret) {
if (amdgpu_device_supports_px(drm_dev))
pci_disable_device(pdev);
return ret;
}
if (amdgpu_device_supports_px(drm_dev))
drm_dev->switch_power_state = DRM_SWITCH_POWER_ON;
adev->in_runpm = false;
return 0;
}
static int amdgpu_pmops_runtime_idle(struct device *dev)
{
struct drm_device *drm_dev = dev_get_drvdata(dev);
struct amdgpu_device *adev = drm_to_adev(drm_dev);
/* we don't want the main rpm_idle to call suspend - we want to autosuspend */
int ret = 1;
if (adev->pm.rpm_mode == AMDGPU_RUNPM_NONE) {
pm_runtime_forbid(dev);
return -EBUSY;
}
ret = amdgpu_runtime_idle_check_display(dev);
pm_runtime_mark_last_busy(dev);
pm_runtime_autosuspend(dev);
return ret;
}
long amdgpu_drm_ioctl(struct file *filp,
unsigned int cmd, unsigned long arg)
{
struct drm_file *file_priv = filp->private_data;
struct drm_device *dev;
long ret;
dev = file_priv->minor->dev;
ret = pm_runtime_get_sync(dev->dev);
if (ret < 0)
goto out;
ret = drm_ioctl(filp, cmd, arg);
pm_runtime_mark_last_busy(dev->dev);
out:
pm_runtime_put_autosuspend(dev->dev);
return ret;
}
static const struct dev_pm_ops amdgpu_pm_ops = {
.prepare = amdgpu_pmops_prepare,
.complete = amdgpu_pmops_complete,
.suspend = amdgpu_pmops_suspend,
.suspend_noirq = amdgpu_pmops_suspend_noirq,
.resume = amdgpu_pmops_resume,
.freeze = amdgpu_pmops_freeze,
.thaw = amdgpu_pmops_thaw,
.poweroff = amdgpu_pmops_poweroff,
.restore = amdgpu_pmops_restore,
.runtime_suspend = amdgpu_pmops_runtime_suspend,
.runtime_resume = amdgpu_pmops_runtime_resume,
.runtime_idle = amdgpu_pmops_runtime_idle,
};
static int amdgpu_flush(struct file *f, fl_owner_t id)
{
struct drm_file *file_priv = f->private_data;
struct amdgpu_fpriv *fpriv = file_priv->driver_priv;
long timeout = MAX_WAIT_SCHED_ENTITY_Q_EMPTY;
timeout = amdgpu_ctx_mgr_entity_flush(&fpriv->ctx_mgr, timeout);
timeout = amdgpu_vm_wait_idle(&fpriv->vm, timeout);
return timeout >= 0 ? 0 : timeout;
}
static const struct file_operations amdgpu_driver_kms_fops = {
.owner = THIS_MODULE,
.open = drm_open,
.flush = amdgpu_flush,
.release = drm_release,
.unlocked_ioctl = amdgpu_drm_ioctl,
.mmap = drm_gem_mmap,
.poll = drm_poll,
.read = drm_read,
#ifdef CONFIG_COMPAT
.compat_ioctl = amdgpu_kms_compat_ioctl,
#endif
#ifdef CONFIG_PROC_FS
.show_fdinfo = amdgpu_show_fdinfo
#endif
};
int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
{
struct drm_file *file;
if (!filp)
return -EINVAL;
if (filp->f_op != &amdgpu_driver_kms_fops) {
return -EINVAL;
}
file = filp->private_data;
*fpriv = file->driver_priv;
return 0;
}
const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(AMDGPU_VM, amdgpu_vm_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(AMDGPU_SCHED, amdgpu_sched_ioctl, DRM_MASTER),
DRM_IOCTL_DEF_DRV(AMDGPU_BO_LIST, amdgpu_bo_list_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(AMDGPU_FENCE_TO_HANDLE, amdgpu_cs_fence_to_handle_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
/* KMS */
DRM_IOCTL_DEF_DRV(AMDGPU_GEM_MMAP, amdgpu_gem_mmap_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(AMDGPU_GEM_WAIT_IDLE, amdgpu_gem_wait_idle_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(AMDGPU_CS, amdgpu_cs_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(AMDGPU_INFO, amdgpu_info_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(AMDGPU_WAIT_CS, amdgpu_cs_wait_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(AMDGPU_WAIT_FENCES, amdgpu_cs_wait_fences_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(AMDGPU_GEM_METADATA, amdgpu_gem_metadata_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
};
static const struct drm_driver amdgpu_kms_driver = {
.driver_features =
DRIVER_ATOMIC |
DRIVER_GEM |
DRIVER_RENDER | DRIVER_MODESET | DRIVER_SYNCOBJ |
DRIVER_SYNCOBJ_TIMELINE,
.open = amdgpu_driver_open_kms,
.postclose = amdgpu_driver_postclose_kms,
.lastclose = amdgpu_driver_lastclose_kms,
.ioctls = amdgpu_ioctls_kms,
.num_ioctls = ARRAY_SIZE(amdgpu_ioctls_kms),
.dumb_create = amdgpu_mode_dumb_create,
.dumb_map_offset = amdgpu_mode_dumb_mmap,
.fops = &amdgpu_driver_kms_fops,
.release = &amdgpu_driver_release_kms,
.gem_prime_import = amdgpu_gem_prime_import,
.gem_prime_mmap = drm_gem_prime_mmap,
.name = DRIVER_NAME,
.desc = DRIVER_DESC,
.date = DRIVER_DATE,
.major = KMS_DRIVER_MAJOR,
.minor = KMS_DRIVER_MINOR,
.patchlevel = KMS_DRIVER_PATCHLEVEL,
};
static struct pci_error_handlers amdgpu_pci_err_handler = {
.error_detected = amdgpu_pci_error_detected,
.mmio_enabled = amdgpu_pci_mmio_enabled,
.slot_reset = amdgpu_pci_slot_reset,
.resume = amdgpu_pci_resume,
};
extern const struct attribute_group amdgpu_vram_mgr_attr_group;
extern const struct attribute_group amdgpu_gtt_mgr_attr_group;
extern const struct attribute_group amdgpu_vbios_version_attr_group;
static const struct attribute_group *amdgpu_sysfs_groups[] = {
&amdgpu_vram_mgr_attr_group,
&amdgpu_gtt_mgr_attr_group,
&amdgpu_vbios_version_attr_group,
NULL,
};
static struct pci_driver amdgpu_kms_pci_driver = {
.name = DRIVER_NAME,
.id_table = pciidlist,
.probe = amdgpu_pci_probe,
.remove = amdgpu_pci_remove,
.shutdown = amdgpu_pci_shutdown,
.driver.pm = &amdgpu_pm_ops,
.err_handler = &amdgpu_pci_err_handler,
.dev_groups = amdgpu_sysfs_groups,
};
static int __init amdgpu_init(void)
{
int r;
if (drm_firmware_drivers_only())
return -EINVAL;
r = amdgpu_sync_init();
if (r)
goto error_sync;
r = amdgpu_fence_slab_init();
if (r)
goto error_fence;
DRM_INFO("amdgpu kernel modesetting enabled.\n");
amdgpu_register_atpx_handler();
amdgpu_acpi_detect();
/* Ignore KFD init failures. Normal when CONFIG_HSA_AMD is not set. */
amdgpu_amdkfd_init();
/* let modprobe override vga console setting */
return pci_register_driver(&amdgpu_kms_pci_driver);
error_fence:
amdgpu_sync_fini();
error_sync:
return r;
}
static void __exit amdgpu_exit(void)
{
amdgpu_amdkfd_fini();
pci_unregister_driver(&amdgpu_kms_pci_driver);
amdgpu_unregister_atpx_handler();
amdgpu_sync_fini();
amdgpu_fence_slab_fini();
mmu_notifier_synchronize();
}
module_init(amdgpu_init);
module_exit(amdgpu_exit);
MODULE_AUTHOR(DRIVER_AUTHOR);
MODULE_DESCRIPTION(DRIVER_DESC);
MODULE_LICENSE("GPL and additional rights");