Rhys Perry
333cb92f69
ac/nir: run nir_lower_vars_to_ssa after nir_lower_task_shader
...
nir_lower_task_shader does nir_lower_returns, so we need this if the
launch_mesh_workgroups was in control flow.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13326
Backport-to: 25.1
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35411 >
(cherry picked from commit bc2edf14d8 )
2025-06-18 17:55:45 +02:00
Marek Olšák
2c122d478b
ac/nir: set X=0 for task->mesh shader dispatch when Y or Z is 0
...
The code set X=0 when Y and Z is 0, not "or".
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432 >
2025-04-16 06:08:48 +00:00
Marek Olšák
27d5be13c6
ac/nir/cull: always do frustum culling, skip only small prim culling
...
Only small prim culling uses the viewport state, so only that must be
disabled when there are multiple viewports.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016 >
2025-04-07 19:44:22 +00:00
Marek Olšák
0f97dc707d
ac/nir/cull: rename skip_viewport_culling -> skip_viewport_state_culling
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016 >
2025-04-07 19:44:22 +00:00
Marek Olšák
1d5c42528b
nir/opt_algebraic: lower 16-bit imul_high & umul_high
...
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016 >
2025-04-07 19:44:22 +00:00
Marek Olšák
ce716d009f
ac/nir/cull: cull small prims using a point-triangle intersection test
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This is based on Timur Kristof's code, but there are a lot of differences.
The idea is that it doesn't just compute an intersection between a point
and a triangle. It computes the *distance* between a point and a triangle
and it does so in screen space. It accurately takes the subpixel precision
of the rasterizer into account, so that it works optimally at all
resolutions, all MSAA modes, and all quant modes.
The distance computation is only approximated because it only considers
the infinite lines going through triangle edges. However, it seems to be
more than sufficient in practice because the existing rounding-based small
prim culling compensates for it.
The performance improvement is up to 10% in some geometry-bound tests,
though targeted microbenchmarks can show a lot more than that.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33361 >
2025-04-01 16:12:22 +00:00
Pierre-Eric Pelloux-Prayer
785df1b980
ac/nir: fix nir_metadata value of ac_nir_lower_image_opcodes
...
This pass can insert new blocks so 'nir_metadata_control_flow' is not
preserved.
Fixes: eaf98b1422 ("ac/nir: implement image opcode emulation for CDNA, enable it in radeonsi")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34241 >
2025-03-31 15:19:29 +02:00
Timur Kristóf
64c6930bfc
ac/nir/ngg: Remove cleanup_culling_shader_after_dce.
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Not needed anymore, now that the new concept is there.
No Fossil DB changes on Navi 21.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22073 >
2025-03-29 00:47:20 +00:00
Timur Kristóf
243a80be44
ac/nir/ngg: Use deferred info for compacted arguments.
...
This means we don't have to emit dead code anymore and can only
repack the sysvals that are actually used by the deferred part.
No Fossil DB changes on Navi 21.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22073 >
2025-03-29 00:47:20 +00:00
Timur Kristóf
0b71293358
ac/nir/ngg: Gather info about what the deferred shader part uses.
...
Now that the deferred shader part is prepared before emitting
the non-deferred part, we can also gather info about what sysvals
it needs.
No Fossil DB changes on Navi 21.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22073 >
2025-03-29 00:47:20 +00:00
Timur Kristóf
e4c91c01e3
ac/nir/ngg: Prepare deferred shader part before adding culling code.
...
The previous concept was to emit the non-deferred shader part
first, including the culling code, and then modify the
non-deferred part accordingly.
This caused some issues because it was really impossible to tell
which sysvals the deferred part needs after DCE, so we had to
run an additional cleanup pass afterwards.
The new concept is to prepare the deferred part first by applying
reusable variables (from the non-deferred part) and run DCE.
This opens the possibility to accurately gather info about what
the deferred part needs.
This idea is further expanded in the next commits.
Fossil DB stats on Navi 21:
Totals from 17 (0.02% of 79377) affected shaders:
Instrs: 18063 -> 18064 (+0.01%)
CodeSize: 93368 -> 93372 (+0.00%)
Latency: 49889 -> 49899 (+0.02%); split: -0.01%, +0.03%
SALU: 2416 -> 2417 (+0.04%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22073 >
2025-03-29 00:47:20 +00:00
Timur Kristóf
e9e58fa412
ac/nir/ngg: Remove inputs_needed_by_*
...
This information will be collected by NIR core better,
no need to do it here. It is also currently unused.
No functional changes.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22073 >
2025-03-29 00:47:20 +00:00
Timur Kristóf
1e7d28a82e
ac/nir/ngg: Improve reuse of position value.
...
Instead of hand-rolled code, use nir_scalar and its
helper functions to reuse the position value.
Results in more copies, which are mitigated by
copy prop from the previous commit.
This helps eliminate some instructions, especially VMEM loads
from the deferred shader part of NGG culling shaders, which
can be reused from the position values calculated by the
non-deferred part.
Fossil DB stats on Navi 21:
Totals from 2472 (3.11% of 79377) affected shaders:
MaxWaves: 78748 -> 78772 (+0.03%)
Instrs: 636342 -> 633739 (-0.41%); split: -0.45%, +0.04%
CodeSize: 3444740 -> 3427172 (-0.51%); split: -0.53%, +0.02%
VGPRs: 62552 -> 62176 (-0.60%)
Latency: 2025711 -> 2019449 (-0.31%); split: -0.73%, +0.42%
InvThroughput: 221140 -> 221946 (+0.36%); split: -0.12%, +0.49%
VClause: 5443 -> 5278 (-3.03%); split: -3.20%, +0.17%
SClause: 8369 -> 8302 (-0.80%); split: -0.82%, +0.02%
Copies: 102435 -> 101652 (-0.76%); split: -0.87%, +0.11%
PreSGPRs: 63714 -> 63533 (-0.28%)
PreVGPRs: 48555 -> 48392 (-0.34%)
VALU: 242165 -> 241457 (-0.29%); split: -0.33%, +0.04%
SALU: 197656 -> 197482 (-0.09%); split: -0.10%, +0.01%
VMEM: 7746 -> 7571 (-2.26%)
SMEM: 10822 -> 10730 (-0.85%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22073 >
2025-03-29 00:47:20 +00:00
Timur Kristóf
f7a160d501
ac/nir/ngg: Run copy propagation.
...
Helps eliminate needless copies caused by reusing variables.
Mitigates negative stats from the next commit.
Fossil DB stats on Navi 21:
Totals from 109 (0.14% of 79377) affected shaders:
Instrs: 124480 -> 124486 (+0.00%); split: -0.00%, +0.01%
CodeSize: 651444 -> 651468 (+0.00%); split: -0.00%, +0.00%
Latency: 754120 -> 754116 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 174384 -> 174383 (-0.00%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22073 >
2025-03-29 00:47:20 +00:00
Georg Lehmann
8648d7cf75
ac/nir: set has_mul24_relaxed
...
This is only used by OpenCL.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33871 >
2025-03-27 06:24:16 +00:00
Georg Lehmann
09ff1c28d8
ac/nir/lower_ps_late: consider dcc decompression for null exports
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33835 >
2025-03-07 15:00:37 +00:00
Marek Olšák
d2141e6751
ac/nir/ngg: add an option to skip viewport-based culling
...
We can do W and face culling when we have multiple viewports, but not
frustum and small prim culling because those are dependent on the viewport.
When a shader writes the viewport index, the new option allows skipping
viewport-based culling while keeping W and face culling.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33482 >
2025-03-06 21:10:48 +00:00
Marek Olšák
d429e35169
ac/nir/cull: extract a helper calling accept_func
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33482 >
2025-03-06 21:10:48 +00:00
Marek Olšák
177c9b173e
Revert "ac/nir: clamp vertex color outputs in the right place"
...
This reverts commit b3fc49686e .
It was a rebase failure.
Fixes: b3fc49686e
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33482 >
2025-03-06 21:10:47 +00:00
Marek Olšák
e99efe7164
ac,radeonsi: don't set num_slots/src/dest_type/write_mask when they're set automatically
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33482 >
2025-03-06 21:10:47 +00:00
Georg Lehmann
975be7ac5d
ac/nir/mem_access_bit_sizes: split unaligned vec3 lds access to allow more read2/write2
...
Foz-DB Navi21:
Totals from 77 (0.10% of 79377) affected shaders:
Instrs: 69787 -> 68745 (-1.49%); split: -1.51%, +0.02%
CodeSize: 367256 -> 360060 (-1.96%); split: -1.97%, +0.01%
VGPRs: 3896 -> 3880 (-0.41%)
Latency: 335403 -> 335297 (-0.03%); split: -0.11%, +0.08%
InvThroughput: 102766 -> 102931 (+0.16%); split: -0.09%, +0.25%
VClause: 1645 -> 1643 (-0.12%); split: -0.18%, +0.06%
SClause: 1434 -> 1433 (-0.07%)
Copies: 4280 -> 4283 (+0.07%); split: -0.56%, +0.63%
PreVGPRs: 2408 -> 2421 (+0.54%); split: -0.08%, +0.62%
VALU: 45557 -> 45646 (+0.20%); split: -0.10%, +0.29%
SALU: 6458 -> 6474 (+0.25%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33448 >
2025-03-01 18:26:54 +00:00
Alyssa Rosenzweig
9a58a8257e
treewide: Switch to nir_progress
...
Via the Coccinelle patch at the end of the commit message, followed by
sed -ie 's/progress = progress | /progress |=/g' $(git grep -l 'progress = prog')
ninja -C ~/mesa/build clang-format
cd ~/mesa/src/compiler/nir && clang-format -i *.c
agxfmt
@@
identifier prog;
expression impl, metadata;
@@
-if (prog) {
-nir_metadata_preserve(impl, metadata);
-} else {
-nir_metadata_preserve(impl, nir_metadata_all);
-}
-return prog;
+return nir_progress(prog, impl, metadata);
@@
expression prog_expr, impl, metadata;
@@
-if (prog_expr) {
-nir_metadata_preserve(impl, metadata);
-return true;
-} else {
-nir_metadata_preserve(impl, nir_metadata_all);
-return false;
-}
+bool progress = prog_expr;
+return nir_progress(progress, impl, metadata);
@@
identifier prog;
expression impl, metadata;
@@
-nir_metadata_preserve(impl, prog ? (metadata) : nir_metadata_all);
-return prog;
+return nir_progress(prog, impl, metadata);
@@
identifier prog;
expression impl, metadata;
@@
-nir_metadata_preserve(impl, prog ? (metadata) : nir_metadata_all);
+nir_progress(prog, impl, metadata);
@@
expression impl, metadata;
@@
-nir_metadata_preserve(impl, metadata);
-return true;
+return nir_progress(true, impl, metadata);
@@
expression impl;
@@
-nir_metadata_preserve(impl, nir_metadata_all);
-return false;
+return nir_no_progress(impl);
@@
identifier other_prog, prog;
expression impl, metadata;
@@
-if (prog) {
-nir_metadata_preserve(impl, metadata);
-} else {
-nir_metadata_preserve(impl, nir_metadata_all);
-}
-other_prog |= prog;
+other_prog = other_prog | nir_progress(prog, impl, metadata);
@@
identifier prog;
expression impl, metadata;
@@
-if (prog) {
-nir_metadata_preserve(impl, metadata);
-} else {
-nir_metadata_preserve(impl, nir_metadata_all);
-}
+nir_progress(prog, impl, metadata);
@@
identifier other_prog, prog;
expression impl, metadata;
@@
-if (prog) {
-nir_metadata_preserve(impl, metadata);
-other_prog = true;
-} else {
-nir_metadata_preserve(impl, nir_metadata_all);
-}
+other_prog = other_prog | nir_progress(prog, impl, metadata);
@@
expression prog_expr, impl, metadata;
identifier prog;
@@
-if (prog_expr) {
-nir_metadata_preserve(impl, metadata);
-prog = true;
-} else {
-nir_metadata_preserve(impl, nir_metadata_all);
-}
+bool impl_progress = prog_expr;
+prog = prog | nir_progress(impl_progress, impl, metadata);
@@
identifier other_prog, prog;
expression impl, metadata;
@@
-if (prog) {
-other_prog = true;
-nir_metadata_preserve(impl, metadata);
-} else {
-nir_metadata_preserve(impl, nir_metadata_all);
-}
+other_prog = other_prog | nir_progress(prog, impl, metadata);
@@
expression prog_expr, impl, metadata;
identifier prog;
@@
-if (prog_expr) {
-prog = true;
-nir_metadata_preserve(impl, metadata);
-} else {
-nir_metadata_preserve(impl, nir_metadata_all);
-}
+bool impl_progress = prog_expr;
+prog = prog | nir_progress(impl_progress, impl, metadata);
@@
expression prog_expr, impl, metadata;
@@
-if (prog_expr) {
-nir_metadata_preserve(impl, metadata);
-} else {
-nir_metadata_preserve(impl, nir_metadata_all);
-}
+bool impl_progress = prog_expr;
+nir_progress(impl_progress, impl, metadata);
@@
identifier prog;
expression impl, metadata;
@@
-nir_metadata_preserve(impl, metadata);
-prog = true;
+prog = nir_progress(true, impl, metadata);
@@
identifier prog;
expression impl, metadata;
@@
-if (prog) {
-nir_metadata_preserve(impl, metadata);
-}
-return prog;
+return nir_progress(prog, impl, metadata);
@@
identifier prog;
expression impl, metadata;
@@
-if (prog) {
-nir_metadata_preserve(impl, metadata);
-}
+nir_progress(prog, impl, metadata);
@@
expression impl;
@@
-nir_metadata_preserve(impl, nir_metadata_all);
+nir_no_progress(impl);
@@
expression impl, metadata;
@@
-nir_metadata_preserve(impl, metadata);
+nir_progress(true, impl, metadata);
squashme! sed -ie 's/progress = progress | /progress |=/g' $(git grep -l 'progress = prog')
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33722 >
2025-02-26 15:19:53 +00:00
Rhys Perry
2a3dce1b59
ac/nir: fix tess factor optimization when workgroup barriers are reduced
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: b49eab68a8 ("ac/nir: use s_sendmsg(HS_TESSFACTOR) to optimize writing tess factors for gfx11")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12632
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33645 >
2025-02-24 14:07:40 +00:00
Timur Kristóf
b8797180e9
ac/nir/ngg: Add bool return value to ac_nir_lower_ngg_mesh.
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33609 >
2025-02-22 08:54:17 +01:00
Timur Kristóf
cd01e17e81
ac/nir/ngg: Add bool return value to ac_nir_lower_ngg_gs.
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33609 >
2025-02-22 08:54:17 +01:00
Timur Kristóf
25adf353cc
ac/nir/ngg: Add bool return value to ac_nir_lower_ngg_nogs.
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33609 >
2025-02-22 08:54:17 +01:00
Timur Kristóf
fad58a99e8
ac/nir: Add bool return value to ac_nir_lower_legacy_gs.
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33609 >
2025-02-22 08:54:17 +01:00
Timur Kristóf
d8ad068968
ac/nir: Add bool return value to ac_nir_lower_legacy_vs.
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33609 >
2025-02-22 08:54:17 +01:00
Timur Kristóf
407aedeff8
ac/nir: Add bool return value to ac_nir_lower_mesh_inputs_to_mem.
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33609 >
2025-02-22 08:54:17 +01:00
Timur Kristóf
9e7609b0ff
ac/nir: Add bool return value to ac_nir_lower_task_outputs_to_mem.
...
And fixup its NIR counterparts too.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33609 >
2025-02-22 08:54:17 +01:00
Timur Kristóf
65645f6841
ac/nir: Add bool return value to ac_nir_lower_gs_inputs_to_mem.
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33609 >
2025-02-22 08:54:17 +01:00
Timur Kristóf
c593110f5f
ac/nir: Add bool return value to ac_nir_lower_es_outputs_to_mem.
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33609 >
2025-02-22 08:54:17 +01:00
Timur Kristóf
6e9ede61c4
ac/nir: Add bool return value to ac_nir_lower_tes_inputs_to_mem.
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33609 >
2025-02-22 08:54:17 +01:00
Timur Kristóf
6e78aef0e9
ac/nir: Add bool return value to ac_nir_lower_hs_outputs_to_mem.
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33609 >
2025-02-22 08:54:17 +01:00
Timur Kristóf
bb3f33014d
ac/nir: Add bool return value to ac_nir_lower_hs_inputs_to_mem.
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33609 >
2025-02-22 08:54:17 +01:00
Timur Kristóf
0438cc0afb
ac/nir: Add bool return value to ac_nir_lower_ls_outputs_to_mem.
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33609 >
2025-02-22 08:54:17 +01:00
Rhys Perry
d2907d271e
ac/nir: set higher alignment for some swizzled store_buffer_amd
...
No fossil-db changes (navi31, navi21, polaris10).
fossil-db (vega10):
Totals from 37 (0.06% of 62962) affected shaders:
MaxWaves: 189 -> 180 (-4.76%)
Instrs: 45607 -> 45616 (+0.02%); split: -0.16%, +0.18%
CodeSize: 241980 -> 234908 (-2.92%)
VGPRs: 2524 -> 2784 (+10.30%)
Latency: 152476 -> 151948 (-0.35%); split: -0.38%, +0.03%
InvThroughput: 74441 -> 78360 (+5.26%); split: -0.21%, +5.47%
VClause: 902 -> 1044 (+15.74%); split: -1.55%, +17.29%
Copies: 4989 -> 6745 (+35.20%)
PreVGPRs: 2044 -> 2334 (+14.19%)
VALU: 31634 -> 33389 (+5.55%)
SALU: 2601 -> 2602 (+0.04%)
VMEM: 5774 -> 3991 (-30.88%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33531 >
2025-02-18 12:31:19 +00:00
Rhys Perry
8fd862499a
ac/nir: don't cross swizzle elements when vectorizing buffer_amd intrinsic
...
This can happen for mesh shader outputs.
No fossil-db changes (navi31, navi21, polaris10).
fossil-db (vega10):
Totals from 37 (0.06% of 62962) affected shaders:
MaxWaves: 183 -> 189 (+3.28%)
Instrs: 45037 -> 45607 (+1.27%); split: -0.09%, +1.36%
CodeSize: 231472 -> 241980 (+4.54%)
VGPRs: 2656 -> 2524 (-4.97%)
Latency: 151199 -> 152476 (+0.84%); split: -0.02%, +0.87%
InvThroughput: 75148 -> 74441 (-0.94%); split: -1.44%, +0.50%
VClause: 882 -> 902 (+2.27%); split: -4.31%, +6.58%
Copies: 6465 -> 4989 (-22.83%)
PreVGPRs: 2265 -> 2044 (-9.76%)
VALU: 33109 -> 31634 (-4.45%)
SALU: 2602 -> 2601 (-0.04%)
VMEM: 3711 -> 5774 (+55.59%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: c3d27906d8 ("radv: vectorize lowered shader IO")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33531 >
2025-02-18 12:31:19 +00:00
Daniel Schürmann
067478358f
amd: switch to nir_metadata_divergence
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30814 >
2025-02-13 10:08:43 +00:00
Timur Kristóf
91c28f67e6
ac/nir: Move surface related NIR functions to separate file.
...
This is to stop including nir related stuff in places that
actually don't need that.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33439 >
2025-02-12 22:33:07 +01:00
Timur Kristóf
305944def9
ac/nir: Don't include nir.h in headers anymore.
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33439 >
2025-02-12 22:33:07 +01:00
Timur Kristóf
cccd3aa45c
nir: Move nir_tcs_info to separate file.
...
The nir_tcs_info struct is like nir_xfb_info in the sense that
it's very specialized and not often used, so it deserves its own
header too.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33439 >
2025-02-12 22:33:07 +01:00
Georg Lehmann
fd77cc7c32
ac/nir/lower_ps: move exports after packing alu
...
If ACO's wqm section ends just before the first export, this mixing alu and
exports means the alu in question can't be reordered as much by the ILP
scheduler.
Foz-DB Navi31:
Totals from 8959 (11.31% of 79188) affected shaders:
Instrs: 5977212 -> 5978494 (+0.02%); split: -0.02%, +0.04%
CodeSize: 32982732 -> 32987876 (+0.02%); split: -0.01%, +0.03%
Latency: 35218073 -> 35216277 (-0.01%); split: -0.02%, +0.02%
InvThroughput: 5149751 -> 5149696 (-0.00%); split: -0.00%, +0.00%
SClause: 220552 -> 220551 (-0.00%); split: -0.01%, +0.01%
PreVGPRs: 313203 -> 313069 (-0.04%); split: -0.06%, +0.01%
Foz-DB Navi21:
Totals from 8895 (11.21% of 79377) affected shaders:
MaxWaves: 219280 -> 219272 (-0.00%); split: +0.00%, -0.01%
Instrs: 5393330 -> 5393366 (+0.00%); split: -0.00%, +0.00%
CodeSize: 29921900 -> 29922024 (+0.00%); split: -0.00%, +0.00%
VGPRs: 406664 -> 406688 (+0.01%); split: -0.00%, +0.01%
Latency: 35653975 -> 35652220 (-0.00%); split: -0.02%, +0.02%
InvThroughput: 7992134 -> 7992032 (-0.00%); split: -0.00%, +0.00%
SClause: 223784 -> 223786 (+0.00%)
Copies: 370984 -> 370983 (-0.00%)
PreVGPRs: 314323 -> 314330 (+0.00%); split: -0.01%, +0.01%
VALU: 3800023 -> 3800022 (-0.00%)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33417 >
2025-02-08 17:31:18 +00:00
Rhys Perry
c3d27906d8
radv: vectorize lowered shader IO
...
fossil-db (navi31):
Totals from 2329 (2.93% of 79377) affected shaders:
MaxWaves: 72152 -> 72102 (-0.07%)
Instrs: 1048791 -> 1041920 (-0.66%); split: -0.72%, +0.07%
CodeSize: 5331832 -> 5285572 (-0.87%); split: -0.90%, +0.03%
VGPRs: 113844 -> 113820 (-0.02%); split: -0.14%, +0.12%
Latency: 4349524 -> 4346374 (-0.07%); split: -0.35%, +0.28%
InvThroughput: 609449 -> 609235 (-0.04%); split: -0.27%, +0.24%
VClause: 22613 -> 22451 (-0.72%); split: -1.03%, +0.31%
SClause: 21197 -> 21177 (-0.09%); split: -0.45%, +0.35%
Copies: 81900 -> 82446 (+0.67%); split: -1.51%, +2.18%
PreSGPRs: 94697 -> 93596 (-1.16%); split: -1.23%, +0.07%
PreVGPRs: 69962 -> 70080 (+0.17%); split: -0.01%, +0.18%
VALU: 625247 -> 625390 (+0.02%); split: -0.23%, +0.25%
SALU: 101692 -> 101555 (-0.13%); split: -0.24%, +0.11%
VMEM: 46459 -> 44845 (-3.47%)
fossil-db (navi21):
Totals from 17522 (22.07% of 79377) affected shaders:
MaxWaves: 425698 -> 425460 (-0.06%); split: +0.00%, -0.06%
Instrs: 11444215 -> 11428321 (-0.14%); split: -0.14%, +0.00%
CodeSize: 59227492 -> 59019376 (-0.35%); split: -0.35%, +0.00%
VGPRs: 780920 -> 781208 (+0.04%); split: -0.00%, +0.04%
Latency: 44965072 -> 44926529 (-0.09%); split: -0.12%, +0.03%
InvThroughput: 9718148 -> 9728793 (+0.11%); split: -0.01%, +0.12%
VClause: 225732 -> 225605 (-0.06%); split: -0.10%, +0.04%
SClause: 217196 -> 217160 (-0.02%); split: -0.03%, +0.01%
Copies: 1050351 -> 1065263 (+1.42%); split: -0.03%, +1.45%
PreSGPRs: 747538 -> 747223 (-0.04%); split: -0.05%, +0.01%
PreVGPRs: 626702 -> 626748 (+0.01%); split: -0.00%, +0.01%
VALU: 6629403 -> 6643822 (+0.22%); split: -0.01%, +0.23%
SALU: 1898492 -> 1898452 (-0.00%); split: -0.00%, +0.00%
VMEM: 529942 -> 528361 (-0.30%)
fossil-db (vega10):
Totals from 1791 (2.84% of 62962) affected shaders:
MaxWaves: 12270 -> 12253 (-0.14%); split: +0.01%, -0.15%
Instrs: 602026 -> 597473 (-0.76%); split: -0.83%, +0.08%
CodeSize: 3109872 -> 3071664 (-1.23%); split: -1.26%, +0.03%
SGPRs: 137826 -> 137938 (+0.08%); split: -0.10%, +0.19%
VGPRs: 70364 -> 70520 (+0.22%); split: -0.03%, +0.26%
Latency: 4757850 -> 4781905 (+0.51%); split: -0.35%, +0.86%
InvThroughput: 2296941 -> 2310685 (+0.60%); split: -0.14%, +0.74%
VClause: 14161 -> 14050 (-0.78%); split: -1.23%, +0.44%
SClause: 14058 -> 14077 (+0.14%); split: -0.57%, +0.70%
Copies: 40954 -> 42191 (+3.02%); split: -1.69%, +4.71%
PreSGPRs: 64314 -> 63214 (-1.71%); split: -1.81%, +0.10%
PreVGPRs: 53558 -> 53894 (+0.63%); split: -0.01%, +0.64%
VALU: 449920 -> 450830 (+0.20%); split: -0.19%, +0.39%
SALU: 32973 -> 32839 (-0.41%); split: -0.76%, +0.35%
VMEM: 28796 -> 25151 (-12.66%)
fossil-db (polaris10):
Totals from 1769 (2.86% of 61794) affected shaders:
MaxWaves: 12024 -> 12021 (-0.02%)
Instrs: 474761 -> 470760 (-0.84%); split: -0.94%, +0.10%
CodeSize: 2447964 -> 2420712 (-1.11%); split: -1.15%, +0.04%
SGPRs: 129664 -> 129728 (+0.05%); split: -0.14%, +0.19%
VGPRs: 65216 -> 65560 (+0.53%); split: -0.05%, +0.58%
Latency: 4304734 -> 4318319 (+0.32%); split: -0.41%, +0.72%
InvThroughput: 2114950 -> 2122580 (+0.36%); split: -0.18%, +0.54%
VClause: 10933 -> 10808 (-1.14%); split: -1.42%, +0.27%
SClause: 11430 -> 11446 (+0.14%); split: -0.70%, +0.84%
Copies: 32290 -> 31891 (-1.24%); split: -2.80%, +1.56%
PreSGPRs: 58184 -> 57096 (-1.87%); split: -1.98%, +0.11%
PreVGPRs: 48757 -> 48874 (+0.24%); split: -0.02%, +0.26%
VALU: 359097 -> 358582 (-0.14%); split: -0.25%, +0.11%
SALU: 26279 -> 25934 (-1.31%); split: -1.75%, +0.43%
VMEM: 18825 -> 17247 (-8.38%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242 >
2025-02-07 13:52:57 +00:00
Rhys Perry
f034aa9cd3
radv: don't use bit_sizes_int to skip nir_lower_bit_size
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242 >
2025-02-07 13:52:57 +00:00
Rhys Perry
19394f44df
ac/nir: set memory_modes for lowered TES input loads
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242 >
2025-02-07 13:52:57 +00:00
Rhys Perry
1fca72ddc8
ac/nir/ngg: update bit_sizes_int
...
This is used for RADV's bit size lowering.
fossil-db (navi21):
Totals from 4520 (5.69% of 79377) affected shaders:
(no stat changes)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29242 >
2025-02-07 13:52:57 +00:00
Timur Kristóf
150123349a
ac/nir/ngg: Use SALU to calculate which threads store to attribute ring in GS.
...
This trades 1 VALU (v_cmpx) instruction in GS to 2 SALU,
and removes a VALU->SALU dependency for the branch that stores
attributes.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33218 >
2025-01-30 15:26:46 +00:00
Timur Kristóf
5560763e99
ac/nir/ngg: Move GS lowering to separate file.
...
Both the VS/TES and GS lowering passes have grown a lot over time,
and therefore the C file has become unwieldy. Mitigate that by
moving the GS lowering out to a separate file.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33218 >
2025-01-30 15:26:46 +00:00
Timur Kristóf
ab8ec78c93
ac/nir/ngg: Don't call has_input_primitive in GS lowering.
...
The entire GS lowering will be moved to another file, which
won't have this function.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33218 >
2025-01-30 15:26:46 +00:00