Dennis Li
2c960ea02f
drm/amdgpu: add RAS callback for gfx
...
Add functions for RAS error inject and query error counter
Signed-off-by: Dennis Li <Dennis.Li@amd.com >
Reviewed-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:51:08 -05:00
Dennis Li
dc23a08f03
drm/amdgpu: add define for gfx ras subblock
...
Signed-off-by: Dennis Li <Dennis.Li@amd.com >
Reviewed-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:51:01 -05:00
Dennis Li
4bb6b8c758
drm/amd/include: add define of TCP_EDC_CNT_NEW
...
Signed-off-by: Dennis Li <Dennis.Li@amd.com >
Reviewed-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:50:54 -05:00
Dennis Li
ca3f422f53
drm/amd/include: add bitfield define for EDC registers
...
Add EDC registers to support VEGA20 RAS
Signed-off-by: Dennis Li <Dennis.Li@amd.com >
Reviewed-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:50:47 -05:00
Tao Zhou
7cdc2ee300
drm/amdgpu: remove ras_reserve_vram in ras injection
...
error injection address is not in gpu address space
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:50:41 -05:00
Tao Zhou
e10634938b
drm/amdgpu: add check for ras error type
...
only ue and ce errors are supported
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:50:35 -05:00
Tao Zhou
81e02619e9
drm/amdgpu: update interrupt callback for all ras clients
...
add err_data parameter in interrupt cb for ras clients
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:50:29 -05:00
Tao Zhou
cf04dfd0e9
drm/amdgpu: allow ras interrupt callback to return error data
...
add error data as parameter for ras interrupt cb and process it
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:50:23 -05:00
Tao Zhou
8c94810357
drm/amdgpu: query umc ras error address
...
query umc ras error address, translate it to gpu 4k page view
and save it.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:50:17 -05:00
Tao Zhou
c2742aef4d
drm/amdgpu: add structures for umc error address translation
...
add related registers, callback function and channel index table
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:50:11 -05:00
Tao Zhou
6f102dba80
drm/amdgpu: add support for recording ras error address
...
more than one error address may be recorded in one query
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:50:05 -05:00
Tao Zhou
f1ed4afa13
drm/amdgpu: update algorithm of umc uncorrectable error counting
...
remove the check of ErrorCodeExt
v2: refine the if condition for ue counting
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:58 -05:00
Tao Zhou
045c021653
drm/amdgpu: switch to amdgpu_umc structure
...
create a new amdgpu_umc structure to allow for more umc
settings in the future, and switch to the new structure
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:52 -05:00
Tao Zhou
5bbfb64a17
drm/amdgpu: use 64bit operation macros for umc
...
replace some 32bit macros with 64bit operations to simplify code
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:46 -05:00
Tao Zhou
4fa1c6a679
drm/amdgpu: add RREG64/WREG64(_PCIE) operations
...
add 64 bits register access functions
v2: implement 64 bit functions in low level
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:40 -05:00
Tao Zhou
05a58345db
drm/amdgpu: add ras error count after each query (v2)
...
v1: increase ras ce/ue error count
v2: log the number of correctable and uncorrectable errors
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:33 -05:00
Hawking Zhang
939e2258ce
drm/amdgpu: querry umc error count
...
check umc error count in both the ras query function and
the ras interrupt handler
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:28 -05:00
Hawking Zhang
5b6b35aaac
drm/amdgpu: init umc v6_1 functions for vega20
...
init umc callback function for vega20 in sw early init phase
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:22 -05:00
Hawking Zhang
9884c2b1c3
drm/amdgpu: add umc v6_1 query error count support
...
Implement umc query_ras_error_count function to support querying
both correctable and uncorrectable errors
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:16 -05:00
Hawking Zhang
03c9963f47
drm/amdgpu: add umc v6_1_1 IP headers
...
the change introduces IP headers for unified memory controller (umc)
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:10 -05:00
Hawking Zhang
245219a660
drm/amdgpu: add rsmu v_0_0_2 ip headers
...
remote smu (rsmu) is a sub-block used for the ip register interface,
error handling, reset generation, etc.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:03 -05:00
Hawking Zhang
9e585a523b
drm/amdgpu: add amdgpu_umc_functions structure
...
This is a common structure for UMC callback functions
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:48:57 -05:00
Hawking Zhang
6501a77170
drm/amdgpu: init RSMU and UMC ip base address for vega20
...
the driver needs to program RSMU and UMC registers to
support vega20 RAS feature
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:48:51 -05:00
Hawking Zhang
7af25d5b7e
drm/amdgpu: move some ras data structure to amdgpu_ras.h
...
These are common structures that can be included by IP specific
source files
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:48:32 -05:00
Alex Deucher
fa1884f9d8
drm/amdgpu: drop drmP.h from vcn_v2_5.c
...
Unused.
Acked-by: Sam Ravnborg <sam@ravnborg.org >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:33:41 -05:00