Add debugfs entry to enable or disable job submission to
specific vcn instances. The entry is created only when
there is more than an instance and is unified queue type.
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Userspace wants to run jobs on a specific sdma ring for verification purposes.
This debugfs entry helps to disable or enable submitting jobs to a specific ring.
This entry is populated only if there are at least two or more cores in the sdma ip.
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
compute/gfx may have multiple rings on some hardware.
In some cases, userspace wants to run jobs on a specific ring for validation purposes.
This debugfs entry helps to disable or enable submitting jobs to a specific ring.
This entry is populated only if there are at least two or more cores in the gfx/compute ip.
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
JPEG_4_0_3 has up to 32 jpeg cores and a single mjpeg video decode
will use all available cores on the hardware. This debugfs entry
helps to disable or enable job submission to a cluster of cores or
one specific core in the ip for debugging. The entry is populated
only if there is at least two or more cores in the jpeg ip.
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There are some problem with existing amdgpu_reset_dump_register_list
debugfs node. It is supposed to read a list of registers but there
could be cases when the IP is not in correct power state. Register
read in such cases could lead to more problems.
We are taking care of all such power states in devcoredump and
dumping the registers of need for debugging. So cleaning this code
and we dont need this functionality via debugfs anymore.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Since the type of pg_flags is u32, adev->pg_flags >> 16 >> 16 is 0
regardless of the values of its operands.
So removing the operations upper_32_bits and lower_32_bits.
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Suggested-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Added support to capture unsch fw debug logs in debugfs.
To enable set amdgpu_umschfw_log =1 in boot args.
v1 - rename variable to umsch_mm_fwlog (Veera)
Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch changes the handling and lifecycle of vm->task_info object.
The major changes are:
- vm->task_info is a dynamically allocated ptr now, and its uasge is
reference counted.
- introducing two new helper funcs for task_info lifecycle management
- amdgpu_vm_get_task_info: reference counts up task_info before
returning this info
- amdgpu_vm_put_task_info: reference counts down task_info
- last put to task_info() frees task_info from the vm.
This patch also does logistical changes required for existing usage
of vm->task_info.
V2: Do not block all the prints when task_info not found (Felix)
V3: Fixed review comments from Felix
- Fix wrong indentation
- No debug message for -ENOMEM
- Add NULL check for task_info
- Do not duplicate the debug messages (ti vs no ti)
- Get first reference of task_info in vm_init(), put last
in vm_fini()
V4: Fixed review comments from Felix
- fix double reference increment in create_task_info
- change amdgpu_vm_get_task_info_pasid
- additional changes in amdgpu_gem.c while porting
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix the warning info below during mode1 reset.
[ +0.000004] Call Trace:
[ +0.000004] <TASK>
[ +0.000006] ? show_regs+0x6e/0x80
[ +0.000011] ? __flush_work.isra.0+0x2e8/0x390
[ +0.000005] ? __warn+0x91/0x150
[ +0.000009] ? __flush_work.isra.0+0x2e8/0x390
[ +0.000006] ? report_bug+0x19d/0x1b0
[ +0.000013] ? handle_bug+0x46/0x80
[ +0.000012] ? exc_invalid_op+0x1d/0x80
[ +0.000011] ? asm_exc_invalid_op+0x1f/0x30
[ +0.000014] ? __flush_work.isra.0+0x2e8/0x390
[ +0.000007] ? __flush_work.isra.0+0x208/0x390
[ +0.000007] ? _prb_read_valid+0x216/0x290
[ +0.000008] __cancel_work_timer+0x11d/0x1a0
[ +0.000007] ? try_to_grab_pending+0xe8/0x190
[ +0.000012] cancel_work_sync+0x14/0x20
[ +0.000008] amddrm_sched_stop+0x3c/0x1d0 [amd_sched]
[ +0.000032] amdgpu_device_gpu_recover+0x29a/0xe90 [amdgpu]
This warning info was printed after applying the patch
"drm/sched: Convert drm scheduler to use a work queue rather than kthread".
The root cause is that amdgpu driver tries to use the uninitialized
work_struct in the struct drm_gpu_scheduler
v2:
- Rename the function to amdgpu_ring_sched_ready and move it to
amdgpu_ring.c (Alex)
v3:
- Fix a few more checks based on Vitaly's patch (Alex)
v4:
- squash in fix noticed by Bert in
https://gitlab.freedesktop.org/drm/amd/-/issues/3139
Fixes: 11b3b9f461 ("drm/sched: Check scheduler ready before calling timeout handling")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For 'AMDGPU_FAMILY_SI' family cards, in 'si_common_early_init' func, init
'didt_rreg' and 'didt_wreg' to 'NULL'. But in func
'amdgpu_debugfs_regs_didt_read/write', using 'RREG32_DIDT' 'WREG32_DIDT'
lacks of relevant judgment. And other 'amdgpu_ip_block_version' that use
these two definitions won't be added for 'AMDGPU_FAMILY_SI'.
So, add null pointer judgment before calling.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Lu Yao <yaolu@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add RLCG interface support for gfx v9.4.3 and multiple XCCs.
Do not enable it yet.
v2: Fix amdgpu_rlcg_reg_access_ctrl init, add support for multiple XCCs
in amdgpu_mm_wreg_mmio_rlc
v3: Use GET_INST() when indexing amdgpu_rlcg_reg_access_ctrl
Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Reviewed-by: Zhigang Luo <zhigang.luo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There are two bugs here.
1) Drop the lock if copy_from_user() fails.
2) If the copy fails then the correct error code is -EFAULT instead of
-EINVAL.
I also broke up the long line and changed "sizeof rd->id" to
"sizeof(rd->id)".
Fixes: 553f973a0d ("drm/amd/amdgpu: Update debugfs for XCC support (v3)")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>