9 Commits

Author SHA1 Message Date
eric mcdaniel
502749c59a Fix for async compute on platforms with memory boundary restrictions on async compute dispatch indirect arguments
*** This change will incur a full shader invalidation across all platforms ***

Issues:
  - Some platforms require async compute dispatch indirect arguments to not cross specific memory boundaries
    - This places restrictions on the valid sizes for a dispatch indirect argument set. We were not conforming to these restrictions, which could result in GPU crashes on these async passes

Fixes:
  - FRHIDispatchIndirectParameters is padded out to meet per-platform memory boundary restrictions
    - This is driven via new per-platform preprocessor define PLATFORM_DISPATCH_INDIRECT_ARGUMENT_BOUNDARY_SIZE
    - Some platforms require FRHIDispatchIndirectParameters to align with their internal structure, hence we cannot universally size it to meet all platforms' requirements

  - Introduce new FRHIDispatchIndirectParametersNoPadding for cases where we explicitly do not want the padding and avoid the memory boundary restrictions by other means

  - Revise and expand indirect argument validation code to catch further such issues in the future

  - Update shaders which write to dispatch indirect argument buffers to account for optional per-platform padding
    - New utility function WriteDispatchIndirectArgs introduced to facilitate this
    - Platforms which require other than the default nonpadded dispatch indirect arguments must define DISPATCH_INDIRECT_UINT_COUNT and their own WriteDispatchIndirectArgs in their CommonPlatform.ush

  - Move creation of the DispatchIndirectGraphicsCommandSignature command signature to be per-platform
    - DispatchIndirectGraphicsCommandSignature and DispatchIndirectComputeCommandSignature stride changed to account for additional padding on impacted platforms
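
The padding idea in the fixes above can be illustrated with a minimal C++ sketch. It is not the engine's code: the struct names, the helper functions, and the 64-byte boundary are all hypothetical, chosen only to show why a 12-byte argument set can straddle a boundary while a 16-byte padded one cannot when elements are tightly packed. `WriteDispatchArgs` is a CPU-side analogue of a shader helper that writes with a per-platform uint stride, again an assumption rather than the real `WriteDispatchIndirectArgs`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical boundary size in bytes. The commit does not state real
// per-platform values; 64 is chosen only to make the arithmetic visible.
constexpr std::size_t kBoundarySize = 64;

// Unpadded argument set: three 32-bit thread group counts (12 bytes).
struct DispatchArgsNoPadding {
    std::uint32_t ThreadGroupCountX;
    std::uint32_t ThreadGroupCountY;
    std::uint32_t ThreadGroupCountZ;
};

// Padded variant: one extra uint makes the size 16 bytes, which divides 64.
struct DispatchArgsPadded {
    std::uint32_t ThreadGroupCountX;
    std::uint32_t ThreadGroupCountY;
    std::uint32_t ThreadGroupCountZ;
    std::uint32_t Padding;
};

static_assert(sizeof(DispatchArgsNoPadding) == 12, "expected 3 uints");
static_assert(sizeof(DispatchArgsPadded) == 16, "expected 4 uints");

// True if the byte range [Offset, Offset + Size) straddles a multiple of Boundary.
inline bool CrossesBoundary(std::size_t Offset, std::size_t Size, std::size_t Boundary) {
    assert(Boundary != 0 && Size != 0);
    return (Offset / Boundary) != ((Offset + Size - 1) / Boundary);
}

// True if any of Count tightly packed elements of the given stride crosses a boundary.
inline bool AnyElementCrosses(std::size_t Stride, std::size_t Count, std::size_t Boundary) {
    for (std::size_t Index = 0; Index < Count; ++Index) {
        if (CrossesBoundary(Index * Stride, Stride, Boundary)) {
            return true;
        }
    }
    return false;
}

// Writes one argument set into a uint buffer. UintCount plays the role of a
// per-platform uint stride (3 when unpadded, 4 in the padded example above).
// Padding uints are left untouched.
inline void WriteDispatchArgs(std::uint32_t* Buffer, std::size_t Index, std::size_t UintCount,
                              std::uint32_t X, std::uint32_t Y, std::uint32_t Z) {
    assert(UintCount >= 3);
    Buffer[Index * UintCount + 0] = X;
    Buffer[Index * UintCount + 1] = Y;
    Buffer[Index * UintCount + 2] = Z;
}
```

With these assumed numbers, the unpadded element at index 5 occupies bytes 60 through 71 and crosses the 64-byte boundary, whereas every padded element starts at a multiple of 16 and stays inside one 64-byte region.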

Testing:
  - ran Lyra with and without async compute Lumen on impacted platforms as well as Win64
  - ran FN replay on impacted platforms

#rb Krzysztof.Narkowicz, Ben.Woodhouse, Benjamin.Rouveyrol
#jira UE-167950
#preflight 6359563b2e6690262a11bc06

[CL 22862498 by eric mcdaniel in ue5-main branch]
2022-10-31 10:15:11 -04:00
Marc Audy
0c3be2b6ad Merge Release-Engine-Staging to Test @ CL# 18240298
[CL 18241953 by Marc Audy in ue5-release-engine-test branch]
2021-11-18 14:37:34 -05:00
lukas hermanns
0f2c00fc92 Use explicit unsigned integer literal in ComputeGenerateMips.usf to fix compile error with HLSLcc introduced with CL 17983475.
#rb none
[FYI] Tiantian.Xie
#jira none
#rnx

#ROBOMERGE-AUTHOR: lukas.hermanns
#ROBOMERGE-SOURCE: CL 17987954 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v885-17909292)

[CL 17987974 by lukas hermanns in ue5-release-engine-test branch]
2021-10-29 17:16:24 -04:00
tiantian xie
f97ab2884f [Major] Reduce SSS performance overhead when SSS only occupies a small region (e.g., from 0.7ms to 0.2ms).
1. Reduce unnecessary UAV Clear with property velocity projection. Only the subpass one UAV is cleared, and only when separable tiles are detected.
2. Remove subsurface view copy and use tile based recombine.
3. Mipmaps are generated only when there are many Burley tiles (r.SSS.Burley.MinGenerateMipsTileCount defaults to 4000)
[Minor] Rename pass and texture names, and use proper descriptors.

#jira UE-131573
#rb jian.ru
#ushell-cherrypick of 17979880 by Tiantian.Xie
#preflight 617c0c6b5dbdbc00016ea885

#ROBOMERGE-AUTHOR: tiantian.xie
#ROBOMERGE-SOURCE: CL 17983475 in //UE5/Main/...
#ROBOMERGE-BOT: STARSHIP (Main -> Release-Engine-Test) (v885-17909292)

[CL 17983481 by tiantian xie in ue5-release-engine-test branch]
2021-10-29 12:53:33 -04:00
Marcus Wassmer
3b81cf8201 Merging using //UE5/Main_to_//UE5/Release-Engine-Staging @14384769
autoresolved files
#rb none

[CL 14384911 by Marcus Wassmer in ue5-main branch]
2020-09-24 00:43:27 -04:00
Marc Audy
360d078ca3 Second batch of remaining Engine copyright updates.
#rnx
#rb none

[CL 10871248 by Marc Audy in Main branch]
2019-12-27 09:26:59 -05:00
thomas engel
e3e09a10c5 FGenerateMips: adding code to allow for manual caching of setups to avoid circular refs & leaks; adding code to allow for render-based fallback for mip-generation on Android ES (bringing cleaned up, deadended code from 11.2x to 11.30)
#rb none
#rnx


#ROBOMERGE-OWNER: thomas.engel
#ROBOMERGE-AUTHOR: thomas.engel
#ROBOMERGE-SOURCE: CL 10291725 via CL 10292114 via CL 10292159
#ROBOMERGE-BOT: (v593-10286020)

[CL 10292507 by thomas engel in Main branch]
2019-11-18 18:19:05 -05:00
thomas engel
584867b170 Adding sRGB compatible mip-generation code to mips generator (compute case)
#rb none


#ROBOMERGE-SOURCE: CL 9976254 via CL 9976261 via CL 9976262 via CL 9976264
#ROBOMERGE-BOT: (v560-9963197)

[CL 9976305 by thomas engel in Main branch]
2019-11-04 19:42:39 -05:00
Arciel Rekman
7ef9626fe8 Copying //UE4/Dev-Console@6677439 to Dev-Main (//UE4/Dev-Main)
#rb none

[CL 6677614 by Arciel Rekman in Main branch]
2019-05-30 14:48:02 -04:00