Commit graph

16431 commits

Author SHA1 Message Date
You, Min-Hsuan
22ff26a0be amd/vpelib: fix coverity defects
\[WHY\]
The reason for making these changes is to address the defects identified
by the Coverity scan. By fixing these defects, we can ensure that any
future defects generated by our own code changes can be easily
identified and resolved.

\[HOW\]
To implement the changes/fixes, the following steps were taken:

1. CHECKED_RETURN: All cases were aligned to check the return value.
2. DC.WEAK_CRYPTO: The use of rand() to generate random numbers was
replaced with a more secure method using platofrm API.
3. DEADCODE: Useless code that always returned true was removed.
4. DIVIDE_BY_ZERO: A check was added to ensure that the divisor is not
zero before performing division.
5. HFA: An unused header was removed.
6. MISSING_BREAK: A break statement was added in switch cases where it
was missing.
7. PASS_BY_VALUE: Parameters that were being passed by value and were
too big were changed to pass by reference.

\[TESTING\]
What testings have been done (test IDs and json file):

Reviewed-by: Tomson Chang <Tomson.Chang@amd.com>
Signed-off-by: Min-Hsuan You <Min-Hsuan.You@amd.com>
Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32646>
2024-12-16 08:13:40 +00:00
Koo, Anthony
54c4accdb1 amd/vpelib: Add system event logging
\[WHY\]
System event logs are different than string logging. They are meant to
generate light weight events with ID and variable args and can be
coalesced with events generated by other IP components.

\[HOW\]
Add a callback function, which is implemented by the client (Like PAL)

VPELIB adds defines for a list of possible event IDs

The client is expected to handle the callback
And translate and emit the event through
native system infrastructure like ETW logging.

\[TESTING\]
Tested on system that triggers sys event, and viewed the event through
ETW viewer

Signed-off-by : Anthony Koo <anthony.koo@amd.com>

Reviewed-by: Roy Chan <Roy.Chan@amd.com>
Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32646>
2024-12-16 08:13:40 +00:00
Zhao, Jiali
fe58bb70e6 amd/vpelib: 420 and 422 Output Single Segment cositing support
fix the style complaint
add 709 jfif color space handling

Reviewed-by: Roy Chan <Roy.Chan@amd.com>
Reviewed-by: Jesse Agate <Jesse.Agate@amd.com>
Signed-off-by: Jiali Zhao <Jiali.Zhao@amd.com>
Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32646>
2024-12-16 08:13:40 +00:00
Visan, Tiberiu
b06ee9074d amd/vpelib: fixed file headers for Palamida scan
\[WHY\]
Some header files in VPE lib did not have the proper copyright header

\[HOW\]
Proper copyrights were put in place

Reviewed-by: Roy Chan <roy.chan@amd.com>
Co-authored-by: Tiberiu Visan <tvisan@amd.com>
Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32646>
2024-12-16 08:13:40 +00:00
Leder, Brendan Steve
4ef45d8d4e amd/vpelib: Move bg color
Refactor bg gen as it check_bg_support simply calls into other version specific function.
Move that function directly into check_bg_support call, and refactor unnecessary functions + format fix.

Co-authored-by: Brendan <breleder@amd.com>
Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Reviewed-by: Jesse Agate <Jesse.Agate@amd.com>
Reviewed-by: Navid Assadian <Navid.Assadian@amd.com>
Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32646>
2024-12-16 08:13:40 +00:00
Marek Olšák
3943ed8199 ac/lower_ngg: improve streamout code generation for gfx12/ACO to match LLVM
ACO is still not perfect:
* It generates s_wait_loadcnt 0x0-0x3 when the only required wait instruction
  is s_wait_loadcnt 0x5.
* It generates a lot of unnecessary jumps and blocks for uniform loop breaks.
  Only scc1 jumps are necessary to break the loop. This is 10x better than
  LLVM, but even ACO might consider using nir_intrinsic_ordered_add_loop_gfx12_amd
  for the best performance.

How to print the streamout asm on any GPU:
    PIGLIT_PLATFORM=gbm AMD_FORCE_FAMILY=gfx12_16pipe AMD_DEBUG=vs,mono,asm,useaco ../piglit/bin/shader-io-rate vs_out_xfb

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32570>
2024-12-16 07:35:07 +00:00
Qiang Yu
d38efee8ef aco: enable gfx12 support for radeonsi
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32570>
2024-12-16 07:35:07 +00:00
Qiang Yu
b14cc34415 ac/surf: add more modifiers to gfx12 supported list
OpenGL will export these modifiers for various sized
textures.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32570>
2024-12-16 07:35:06 +00:00
Qiang Yu
b3a218d444 ac/surface/tests: support all block sizes
We are going to add more modifiers.

GFX9 has 4K DCC and non-DCC modifiers while others only have
4K non-DCC modifiers.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32570>
2024-12-16 07:35:06 +00:00
Samuel Pitoiset
1291981ebd radv: mark HAWAII (GFX7) as Vulkan 1.3 conformant
https://www.khronos.org/conformance/adopters/conformant-products#submission_848

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32620>
2024-12-13 13:32:38 +00:00
Valentine Burley
680885d57e radv/ci: Convert Valve RADV jobs to deqp-runner suites
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32558>
2024-12-13 09:36:08 +00:00
Friedrich Vock
0c02a7e8e8 radv/rt: Remove nir_intrinsic_execute_callable instrs in monolithic mode
It's allowed to place OpExecuteCallableKHR in a SPIR-V, even if the RT
pipeline doesn't contain any callable shaders. Unreal hits this case and
crashes. We can assume the intrinsic never gets executed, so we can
simply remove it.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32613>
2024-12-13 01:50:58 +00:00
Samuel Pitoiset
c7a7f0244f radv: add radv_lower_terminate_to_discard and enable for Indiana Jones
To workaround game bug.

This fixes the rendering issue with eyes.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32606>
2024-12-12 19:54:39 +00:00
Timur Kristóf
deab81fb0d radv: Configure implicit VS primitive ID to be per-primitive.
This is beneficial to applications that rely on
the implicit primitive ID from VS.

- We don't have to disable provoking vertex reuse,
  which results in more efficient vertex processing.
- There is no LDS access needed to export the primitive ID,
  because it is already available to GS threads.
- As a consequence of not needing LDS, we can use this
  together with NGG passthrough mode.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32270>
2024-12-12 18:11:47 +00:00
Timur Kristóf
95ac0f8d76 radv: Reorder FS primitive ID input after layer and viewport.
We want to make the implicit VS primitive ID a per-primitive
output attribute, which means that this has to be last.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32270>
2024-12-12 18:11:47 +00:00
Timur Kristóf
9224b9a752 ac/nir/ngg: Add ability to store primitive ID as per-primitive.
This configuration will be enabled in RADV in a subsequent commit.

On GFX10.3:
Do this together with the primitive export, to avoid adding extra
CF, and to ensure optimal access of the export space.

On GFX11:
It's not an export but a memory store instruction, so always do
it earlier and ensure the optimal attribute ring access pattern.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32270>
2024-12-12 18:11:45 +00:00
Timur Kristóf
d670dc0c0b radv: Only set NGG_DISABLE_PROVOK_REUSE for VS.
It doesn't do anything useful for other stages.

In VS, we use this when the implicit primitive ID is needed,
so that we can export that as a per-vertex attribute of the
provoking vertex.

In TES, the patch ID (which is used as the primitive ID) is
already a per-vertex input VGPR, so it doesn't make sense to
configure this.

In GS, the primitive ID is explicitly written by the shader,
so it makes no sense to disable provoking vertex reuse in the
input.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32270>
2024-12-12 18:11:45 +00:00
Rhys Perry
9fe92689cc radv: increase maxComputeWorkGroupCount[0]
Match AMDVLK and radeonsi.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32577>
2024-12-12 17:38:47 +00:00
Rhys Perry
53d0187bab aco: decrease max_workgroup_size
Match the limit of radeonsi and RADV.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32577>
2024-12-12 17:38:46 +00:00
Rhys Perry
87f2f77960 aco: fix max_workgroup_count[0]
This is necessary for radeonsi.

fossil-db (navi21):
Totals from 292 (0.37% of 79395) affected shaders:
Instrs: 305965 -> 306182 (+0.07%); split: -0.00%, +0.07%
CodeSize: 1624816 -> 1627212 (+0.15%); split: -0.00%, +0.15%
Latency: 5244652 -> 5243587 (-0.02%); split: -0.07%, +0.05%
InvThroughput: 1221089 -> 1225285 (+0.34%); split: -0.04%, +0.38%
Copies: 22712 -> 22702 (-0.04%)
PreSGPRs: 10713 -> 10712 (-0.01%)
PreVGPRs: 10918 -> 10920 (+0.02%)
VALU: 178613 -> 178836 (+0.12%)
SALU: 43490 -> 43493 (+0.01%); split: -0.02%, +0.03%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32577>
2024-12-12 17:38:46 +00:00
Konstantin
815ca049cd vulkan: Fix the argument order of update_as
Also moves the src argument before dst which is more consistent.

Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32488>
2024-12-12 11:15:08 +00:00
Samuel Pitoiset
370886c898 Revert "radv: disable alphaToOne except for Zink"
This reverts commit 3b010a9e60.

This should be fixed properly now.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32583>
2024-12-12 10:07:25 +00:00
Samuel Pitoiset
c3a050da07 radv: fix alpha-to-coverage with alpha-to-one without MRTZ
This injects a MRTZ export with only the alpha channel to select it
with COVERAGE_TO_MASK_ENABLE for alpha-to-coverage.

Co-Authored-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32583>
2024-12-12 10:07:25 +00:00
Samuel Pitoiset
838b1cfcbd radv: simplify determining some fragment shader info with epilogs
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32583>
2024-12-12 10:07:25 +00:00
Collabora's Gfx CI Team
8085984aa2 Uprev Piglit to 4c0fd15fd956ec70c5509bedee219d602b334464
468221c722...4c0fd15fd9

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32478>
2024-12-12 09:33:59 +00:00
Samuel Pitoiset
4d1aa9a2d0 radv: fix disabling DCC for stores with drirc
Displayable DCC should also be disabled, otherwise it's asserting
somewhere in ac_surface.c

Fixes: e3d1f27b31 ("radv: add radv_disable_dcc_stores and enable for Indiana Jones: The Great Circle")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32584>
2024-12-12 09:11:37 +00:00
Daniel Schürmann
26a3038b65 aco/lower_branches: remove edges between blocks if there is no direct branch
This way, linear predecessors and successors better reflect the
actual control flow which improves wait state insertion and hazard
mitigation.

Totals from 10252 (12.91% of 79395) affected shaders: (Navi31)

Instrs: 18824540 -> 18803823 (-0.11%); split: -0.11%, +0.00%
CodeSize: 99025464 -> 98942028 (-0.08%); split: -0.08%, +0.00%
Latency: 169291854 -> 165781877 (-2.07%); split: -2.07%, +0.00%
InvThroughput: 29701086 -> 29228602 (-1.59%); split: -1.59%, +0.00%
SClause: 510587 -> 510586 (-0.00%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32389>
2024-12-12 08:46:22 +00:00
Daniel Schürmann
22ffe72022 aco: move branch lowering optimization into separate file 'aco_lower_branches.cpp'
No fossil changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32389>
2024-12-12 08:46:22 +00:00
Friedrich Vock
845660f2b7 aco/lower_to_hw_instr: Check the right instruction's opcode
instr is the branch instruction, its opcode won't ever be writelane. We
should check inst instead.

Found by inspection.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32389>
2024-12-12 08:46:21 +00:00
Daniel Schürmann
28ab7f0168 aco/jump_threading: remove branch sequence optimization
This optimization gets applied during postRA optimization, now.

No fossil changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32330>
2024-12-12 08:11:22 +00:00
Daniel Schürmann
fcd94a8ca7 aco: move try_optimize_branching_sequence() to postRA optimizations
Totals from 196 (0.25% of 79206) affected shaders: (Navi31)

Instrs: 534343 -> 534438 (+0.02%); split: -0.00%, +0.02%
CodeSize: 2774852 -> 2775420 (+0.02%); split: -0.00%, +0.02%
Latency: 7103512 -> 7103021 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 959477 -> 959447 (-0.00%)
Copies: 42646 -> 42648 (+0.00%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32330>
2024-12-12 08:11:21 +00:00
Daniel Schürmann
95d44c7ce0 aco/optimizer_postRA: set branch()->never_taken if exec is constant non-zero
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32330>
2024-12-12 08:11:21 +00:00
Daniel Schürmann
d67932f69e aco/print_ir: don't print disconnected empty blocks
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32330>
2024-12-12 08:11:21 +00:00
Daniel Schürmann
22881712c8 aco/assembler: Don't emit target basic block index when chaining branches
This could erroneously cause an assertion to fail if the
target block index was larger than UINT16_MAX.

Fixes: cab5639a09 ('aco/assembler: chain branches instead of emitting long jumps')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32599>
2024-12-11 23:28:55 +00:00
Tim Huang
ad75b9f1a6 amd: add GFX v11.5.3 support
This enables support for GFX version 11.5.3.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32567>
2024-12-11 19:14:34 +00:00
Samuel Pitoiset
167f4a87c6 radv: remove remaining discard to demote options
This is the default but the option wasn't completely removed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32590>
2024-12-11 17:59:13 +00:00
Georg Lehmann
65506e635b aco/ra: don't write to scc/ttmp with s_fmac
Fixes: 4bd229ac50 ("aco/gfx11.5: select SOP2 float instructions")

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32545>
2024-12-11 12:51:18 +00:00
Georg Lehmann
0b9e2a5427 aco/ra: disallow s_cmpk with scc operand
Fixes: 2d6b0a4177 ("aco/optimizer: Optimize SOPC with literal to SOPK.")

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32545>
2024-12-11 12:51:18 +00:00
Georg Lehmann
fe0c72caec aco/ra: don't write to exec/ttmp with mulk/addk/cmovk
ttmp sgprs are readonly outside of trap handlers, so the instructions were
probably skipped. RA should also never create additional exec writes.

Fixes: e06773281b ("aco/ra: Optimize some SOP2 instructions with literal to SOPK.")

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32545>
2024-12-11 12:51:18 +00:00
Georg Lehmann
576a2e798c aco/gfx12: don't assume memory operations complete in order
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32569>
2024-12-11 12:22:59 +00:00
Samuel Pitoiset
553eb1a3fd radv: fix alpha-to-coverage with alpha-to-one when MRTZ is also exported
On AMD hardware, it's possible to export a separate alpha channel for
applying alpha-to-one after alpha-to-coverage and not before.

On GFX11+, it's already mostly supported but alpha needs to be exported
to MRTZ.a and one to MRT0.a. The hw always uses alpha for
alpha-to-coverage from MRTZ.a.

On older generations, the driver needs the same separate alpha export
but it also needs to configure the hardware with COVERAGE_TO_MASK_ENABLE
which selects alpha from MRTZ.a.

This should fix alpha-to-coverage with alpha-to-one when either
depth, stencil or samplemask are exported but it still needs a slightly
different solution without MRTZ. I will fix that later.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32523>
2024-12-11 10:50:31 +00:00
Samuel Pitoiset
70047e6bd6 aco: export alpha to MRTZ.a and one to MRT0.a for alpha-to-one on GFX11
For FS epilogs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32523>
2024-12-11 10:50:31 +00:00
Samuel Pitoiset
1037830098 ac/nir: export alpha to MRTZ.a and one to MRT0.a for alpha-to-one on GFX11
When alpha-to-coverage and alpha-to-one are both enabled in the
fragment shader, the alpha value should be exported through MRTZ and
one to MRT0.a. Otherwise, alpha-to-one will be performed before
alpha-to-coverage.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32523>
2024-12-11 10:50:31 +00:00
Samuel Pitoiset
3b010a9e60 radv: disable alphaToOne except for Zink
It's just completely broken if alpha-to-coverage is also used. Given
this is a requirement for Zink and no known GL apps are affected, let's
disable for everything except Zink. This is going to be fixed but the
fixes aren't small enough for backporting.

This fixes a rendering issue with XPlane 12.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32576>
2024-12-11 08:28:25 +00:00
Samuel Pitoiset
310e03c8d3 radv/meta: move vk_meta_device_init() to radv_device_init_meta()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32571>
2024-12-11 07:25:29 +00:00
Samuel Pitoiset
cad6e1ea09 radv/meta: add missing vk_meta_device_finish()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32571>
2024-12-11 07:25:29 +00:00
Martin Roukala (né Peres)
b8054556a0 radv/ci: update the vkcts expectations
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32541>
2024-12-11 04:24:35 +00:00
Dave Airlie
699afb88ec radv/video: set max slice counts to 1 for h264/5 encode
Right now the driver doesn't support multi-slice encodes, so
report the correct value.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Autumn Ashton
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32580>
2024-12-11 01:11:32 +00:00
Georg Lehmann
7bb3f93fa9 radv: use nir_opt_clip_cull_const
This reduces the fixed function hardware bottleneck in vertex shading
heavy sections with d3d9 DXVK.

DXVK always declares 6 clip distances, and with
https://github.com/doitsujin/dxvk/pull/4508
it writes 0 for disabled clip planes.

But while working on this, I also discovered that other games also write
constant clip/cull distances:

Foz-DB Navi21:
Totals from 442 (0.56% of 79206) affected shaders:
MaxWaves: 10880 -> 10900 (+0.18%)
Instrs: 283569 -> 282493 (-0.38%); split: -0.38%, +0.01%
CodeSize: 1497596 -> 1491948 (-0.38%); split: -0.38%, +0.00%
VGPRs: 18928 -> 18888 (-0.21%)
Outputs: 3026 -> 2970 (-1.85%)
Latency: 1051990 -> 1048878 (-0.30%); split: -0.47%, +0.18%
InvThroughput: 184190 -> 183366 (-0.45%); split: -0.45%, +0.01%
VClause: 6071 -> 6077 (+0.10%); split: -0.38%, +0.48%
SClause: 6724 -> 6725 (+0.01%); split: -0.12%, +0.13%
Copies: 19624 -> 19160 (-2.36%); split: -2.44%, +0.08%
PreSGPRs: 18634 -> 18526 (-0.58%)
PreVGPRs: 15522 -> 15465 (-0.37%)
VALU: 156964 -> 156330 (-0.40%); split: -0.41%, +0.01%
SALU: 41802 -> 41748 (-0.13%); split: -0.13%, +0.00%

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32518>
2024-12-10 16:35:01 +00:00
Georg Lehmann
4f88dbffa4 radv: track holes in the clip/cull masks
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32518>
2024-12-10 16:35:01 +00:00