Commit graph

397 commits

Author SHA1 Message Date
Samuel Pitoiset
876e6a3bfe radv/rt: fix memory leak in lower_rt_instructions_monolithic()
Found with ASAN.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37844>
2025-10-14 06:54:02 +00:00
Samuel Pitoiset
08dbab0600 radv: rename shader arg descriptor_sets to descriptors
It's more generic and descriptor heaps will use it too.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37786>
2025-10-10 13:22:03 +00:00
Samuel Pitoiset
609ae4e647 radv: rename indirect_descriptor_sets to indirect_descriptors
With descriptor heap the driver will also have to emit indirect
descriptor heaps in some cases.

Rename a couple of things to make them more generic.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37786>
2025-10-10 13:22:03 +00:00
Samuel Pitoiset
08ddf2f878 radv: lower embedded/immutable samplers earlier
Lowering them earlier right after VTN would allow us to implement
embedded samplers for descriptor heap properly for merged shaders.

Non-immediate samplers are still lowered in
radv_nir_apply_pipeline_layout because they require shader arguments.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37688>
2025-10-07 09:25:28 +00:00
Georg Lehmann
cf30742a66 radv,aco: don't end monolithic ray tracing with unconditional terminate
The terminate requires more code and blocks us from deallocating VGPRs early.

Foz-DB Navi31:
Totals from 63 (0.08% of 80273) affected shaders:
Instrs: 3372702 -> 3372467 (-0.01%)
CodeSize: 17441676 -> 17440736 (-0.01%)
Latency: 19763447 -> 19763288 (-0.00%)
InvThroughput: 3860502 -> 3860478 (-0.00%)
Branches: 96204 -> 96141 (-0.07%)
SALU: 406648 -> 406549 (-0.02%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37542>
2025-09-25 15:35:55 +00:00
Rhys Perry
591b498e1f radv: fix progress reporting in lower_rt_derefs
Only create nir_load_rt_arg_scratch_offset_amd if needed.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35069>
2025-09-24 08:20:27 +00:00
Marek Olšák
bbab69d343 radv: fix load_smem alignment
radv_cmd_buffer_upload_alloc_aligned is used with alignment=0, which
guarantees that the alignment is at least 4.

Fixes: 9e16ed7a13 - ac/nir: switch nir_load_smem_amd uses to ac_nir_load_smem wrapper

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37345>
2025-09-19 21:08:25 -04:00
Georg Lehmann
a2d3cbac2a radv: determine subgroup/wave size early
This means we can actually implement varying subgroup size correctly.
It also means that we implement the implicit SPIR-V 1.6 full subgroups
requirement in compute shaders with cswave32/rtwave32.

In the future it will also allow more optimizations that use the subgroup size.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>

The only somewhat complex case here is GFX10 geometry shaders, if gewave32 is
used. We then only know the subgroup size when is_ngg is decided, as legacy
GS doesn't support wave32.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37294>
2025-09-14 13:21:21 +00:00
Georg Lehmann
4143f0725a radv/nir/lower_cmat: clean up GFX11 ACC->B convert
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37213>
2025-09-09 06:08:55 +00:00
Georg Lehmann
5c0ebcdaef radv/nir/lower_cmat: clean up gfx12 transpose
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37213>
2025-09-09 06:08:55 +00:00
Georg Lehmann
2da7b4bd0a radv/nir/lower_cmat: add shuffle_xor_imm helper
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37213>
2025-09-09 06:08:54 +00:00
Christian Gmeiner
1492de1bc3 radv: re-format using clang-format
No manual changes here; this is simply the result of running
$ ninja -C build/ clang-format

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37226>
2025-09-09 05:48:56 +00:00
Samuel Pitoiset
8e4d5743d2 radv: move debug related drirc to radv_drirc::debug
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37145>
2025-09-05 05:56:17 +00:00
Georg Lehmann
83326af899 nir/builder: add nir_inverse_ballot_imm
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37178>
2025-09-04 14:03:56 +00:00
Georg Lehmann
ef8c364d3d nir: make inverse_ballot 1bit only
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37178>
2025-09-04 14:03:56 +00:00
Samuel Pitoiset
decf9af472 radv/rt: only use one user SGPR for the traversal shader addr
All shaders are allocated in the 32-bit addr space. To avoid an issue
with alignment, and also for future work, there is an unused user SGPR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37133>
2025-09-03 05:53:41 +00:00
Daniel Schürmann
fcf8899c9e radv/rt: use ACCESS_CAN_REORDER when loading SBT entries
Totals from 56 (0.07% of 79839) affected shaders: (Navi48)

Instrs: 2790220 -> 2790130 (-0.00%); split: -0.00%, +0.00%
CodeSize: 14704952 -> 14704292 (-0.00%)
Latency: 13994383 -> 13953444 (-0.29%); split: -0.29%, +0.00%
InvThroughput: 2717973 -> 2710748 (-0.27%); split: -0.27%, +0.00%
VClause: 68783 -> 68687 (-0.14%)
SClause: 51910 -> 52007 (+0.19%)
Copies: 223192 -> 223190 (-0.00%); split: -0.01%, +0.01%
VALU: 1557513 -> 1557451 (-0.00%); split: -0.00%, +0.00%
VMEM: 118789 -> 118692 (-0.08%)
SMEM: 66498 -> 66595 (+0.15%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36933>
2025-09-02 19:07:30 +00:00
Samuel Pitoiset
bc9a020dd3 radv: rename NGG culling user SGPRs
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37022>
2025-09-01 08:52:55 +00:00
Marek Olšák
9e16ed7a13 ac/nir: switch nir_load_smem_amd uses to ac_nir_load_smem wrapper
ac_nir_load_smem will use load_global_amd

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101>
2025-08-30 15:04:32 -04:00
Georg Lehmann
acd879f096 radv: set ACCESS_CAN_SPECULATE for smem buffer loads with known good descriptors
Foz-DB GFX1201:
Totals from 2872 (3.59% of 80098) affected shaders:
MaxWaves: 78208 -> 78234 (+0.03%); split: +0.21%, -0.18%
Instrs: 6214171 -> 6193701 (-0.33%); split: -0.40%, +0.07%
CodeSize: 33121244 -> 33113692 (-0.02%); split: -0.18%, +0.16%
VGPRs: 151680 -> 152016 (+0.22%); split: -0.25%, +0.47%
SpillSGPRs: 775 -> 776 (+0.13%)
Latency: 46080905 -> 45955331 (-0.27%); split: -0.55%, +0.28%
InvThroughput: 6235954 -> 6250598 (+0.23%); split: -0.25%, +0.48%
VClause: 111125 -> 110955 (-0.15%); split: -0.17%, +0.02%
SClause: 221845 -> 214761 (-3.19%); split: -3.20%, +0.01%
Copies: 501387 -> 488215 (-2.63%); split: -2.96%, +0.33%
Branches: 191455 -> 178574 (-6.73%)
PreSGPRs: 146364 -> 146923 (+0.38%); split: -0.12%, +0.50%
PreVGPRs: 120813 -> 121073 (+0.22%)
VALU: 3139282 -> 3137471 (-0.06%); split: -0.11%, +0.05%
SALU: 1079863 -> 1083158 (+0.31%); split: -0.55%, +0.86%
VMEM: 182255 -> 182247 (-0.00%)
SMEM: 293409 -> 290233 (-1.08%)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36938>
2025-08-27 09:45:19 +00:00
Georg Lehmann
5a10142a9f radv/nir/lower_cmat: split up larger nested switches
This has been annoying me for quite a while: the level of indentation
makes reviewing code changes in GitLab harder.

I think now is a good time to change this before more cmat lowering is added.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37002>
2025-08-27 08:20:47 +00:00
Samuel Pitoiset
c5a5c8818c radv/nir/lower_cmat: handle untyped pointers for load/store
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36677>
2025-08-26 13:47:07 +00:00
Samuel Pitoiset
19c712c8ef radv: rename rast_prim to vgt_outprim_type everywhere
To avoid confusion between the primitive topology and the output
rasterized primitive.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36912>
2025-08-25 12:17:38 +00:00
Samuel Pitoiset
ce83800262 radv: remove unused forwarded declarations of pipeline layout
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36792>
2025-08-18 07:25:34 +00:00
Konstantin Seurer
cc0dc4b566 radv: Store parent node IDs inside nodes on GFX12
Saves some space.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36691>
2025-08-15 13:00:32 +00:00
Konstantin Seurer
be4be884e1 radv: Rename radv_printf files to radv_debug_nir
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34392>
2025-08-15 10:32:34 +00:00
Samuel Pitoiset
0ac7f1888f radv: reduce the combined image/sampler desc size on GFX11+
From 96 to 64 due to the 32 bytes descriptor alignment.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36762>
2025-08-14 06:47:30 +00:00
Samuel Pitoiset
297cf6f1aa radv/meta: add a pass to clear HiZ surfaces
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36739>
2025-08-12 13:48:09 +00:00
Konstantin Seurer
c4b18c689f radv: Emit compressed primitive nodes on GFX12
Emits two triangles per node whenever possible. The nir code will
revisit the triangle node to handle the second triangle only if both
triangles are intersected by the ray.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35734>
2025-08-07 20:23:15 +00:00
Qiang Yu
196569b1a4 all: rename gl_shader_stage to mesa_shader_stage
It's not only for GL, change to a generic name.

Use command:
  find . -type f -not -path '*/.git/*' -exec sed -i 's/\bgl_shader_stage\b/mesa_shader_stage/g' {} +

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
2025-08-06 10:28:40 +08:00
Alyssa Rosenzweig
82ae8b1d33 treewide: simplify nir_def_rewrite_uses_after
Most of the time with nir_def_rewrite_uses_after, you want to rewrite after the
replacement. Make that the default thing to be more ergonomic and to drop
parent_instr uses.

We leave nir_def_rewrite_uses_after_instr defined if you really want the old
signature with an arbitrary after point.

Via Coccinelle patch:

    @@
    expression a, b;
    @@

    -nir_def_rewrite_uses_after(a, b, b->parent_instr)
    +nir_def_rewrite_uses_after_def(a, b)

Followed by a bunch of sed.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>
2025-08-01 15:34:24 +00:00
Alyssa Rosenzweig
cc6e3b84cb treewide: use nir_def_as_*
Via Coccinelle patch:

    @@
    expression definition;
    @@

    -nir_instr_as_alu(definition->parent_instr)
    +nir_def_as_alu(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_intrinsic(definition->parent_instr)
    +nir_def_as_intrinsic(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_phi(definition->parent_instr)
    +nir_def_as_phi(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_load_const(definition->parent_instr)
    +nir_def_as_load_const(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_deref(definition->parent_instr)
    +nir_def_as_deref(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_tex(definition->parent_instr)
    +nir_def_as_tex(definition)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>
2025-08-01 15:34:24 +00:00
Antonio Ospite
ddf2aa3a4d build: avoid redefining unreachable() which is standard in C23
In the C23 standard unreachable() is now a predefined function-like
macro in <stddef.h>

See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in

And this causes build errors when building for C23:

-----------------------------------------------------------------------
In file included from ../src/util/log.h:30,
                 from ../src/util/log.c:30:
../src/util/macros.h:123:9: warning: "unreachable" redefined
  123 | #define unreachable(str)    \
      |         ^~~~~~~~~~~
In file included from ../src/util/macros.h:31:
/usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition
  456 | #define unreachable() (__builtin_unreachable ())
      |         ^~~~~~~~~~~
-----------------------------------------------------------------------

So don't redefine it with the same name, but use the name UNREACHABLE()
to also signify it's a macro.

Using a different name also makes sense because the macro already
extended the behavior of __builtin_unreachable(), and it also had a
different signature: it accepts one argument, whereas the standard
unreachable() takes none.
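
The naming difference described above can be sketched as follows. This is only an illustration, not the definition from src/util/macros.h: the macro body, the `enum axis` type, and the `axis_name()` helper are assumptions made up for this example, and `__builtin_unreachable()` is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stddef.h> /* under C23 this header may define unreachable() */

/* Hypothetical stand-in for the renamed macro: it takes a message,
 * trips an assertion in debug builds, then tells the compiler that
 * control never gets here. The upper-case name does not collide with
 * the argument-less C23 unreachable() macro. */
#define UNREACHABLE(str)            \
   do {                             \
      assert(0 && (str));           \
      __builtin_unreachable();      \
   } while (0)

/* Hypothetical enum and helper, used only to show a call site. */
enum axis { AXIS_X, AXIS_Y, AXIS_Z };

static const char *axis_name(enum axis a)
{
   switch (a) {
   case AXIS_X: return "x";
   case AXIS_Y: return "y";
   case AXIS_Z: return "z";
   }
   /* Only reached if `a` holds a value outside the enum. */
   UNREACHABLE("invalid axis");
}
```

Because the message-taking macro no longer shares a name with the standard one, a translation unit like this should build the same way whether or not the compiler's <stddef.h> provides unreachable().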

This change improves the chances of building mesa with the C23 standard,
which for instance is the default in recent AOSP versions.

All the instances of the macro, including the definition, were updated
with the following command line:

  git grep -l '[^_]unreachable(' -- "src/**" | sort | uniq | \
  while read file; \
  do \
    sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \
  done && \
  sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>
2025-07-31 17:49:42 +00:00
Georg Lehmann
4683187f49 radv/nir/lower_cmat: load gfx11 8bit ACC using the B layout to get aligned loads
This allows us to use aligned loads that can be vectorized, without any
downside as 8bit scalar loads always write 16bits of a register.

Foz-DB Navi31:
Totals from 10 out of 14 FSR4 shaders:
MaxWaves: 71 -> 68 (-4.23%)
Instrs: 60146 -> 59781 (-0.61%); split: -0.67%, +0.06%
CodeSize: 412448 -> 413428 (+0.24%); split: -0.11%, +0.35%
VGPRs: 2112 -> 2160 (+2.27%)
SpillVGPRs: 89 -> 68 (-23.60%)
Scratch: 11776 -> 8704 (-26.09%)
Latency: 196628 -> 193770 (-1.45%); split: -2.62%, +1.17%
InvThroughput: 224944 -> 226274 (+0.59%); split: -0.02%, +0.61%
VClause: 862 -> 796 (-7.66%)
Copies: 3166 -> 3342 (+5.56%); split: -6.22%, +11.78%
Branches: 37 -> 38 (+2.70%)
PreSGPRs: 311 -> 312 (+0.32%)
PreVGPRs: 2153 -> 2214 (+2.83%); split: -1.35%, +4.18%
VALU: 51073 -> 51448 (+0.73%); split: -0.03%, +0.77%
SALU: 1072 -> 1074 (+0.19%)
VMEM: 3275 -> 2765 (-15.57%)
VOPD: 1739 -> 1783 (+2.53%); split: +7.99%, -5.46%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36117>
2025-07-30 07:25:51 +00:00
Marek Olšák
09e607c385 nir: add access to load_smem_amd (for ACCESS_CAN_SPECULATE)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36099>
2025-07-24 18:41:38 +00:00
Marek Olšák
4c8a757951 radv,radeonsi: mark VS input loads and poly stipple load speculatable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35950>
2025-07-24 06:31:17 +00:00
Alyssa Rosenzweig
8a1a410389 treewide: use SWAP macro
Via Coccinelle patch + manual clean up:

    @@
    identifier temporary, a, b;
    type T;
    @@

    -T temporary = a;
    -a = b;
    -b = temporary;
    +SWAP(a, b);
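
The effect of the semantic patch above can be illustrated with a small sketch. The macro below is not claimed to be Mesa's SWAP definition: `SWAP_SKETCH` and `sort_pair()` are hypothetical names for this example, and the macro relies on the GCC/Clang `__typeof__` extension.

```c
#include <assert.h>

/* Hypothetical swap macro: exchanges two lvalues of the same type
 * through a temporary whose type is taken from the first argument. */
#define SWAP_SKETCH(a, b)           \
   do {                             \
      __typeof__(a) tmp_ = (a);     \
      (a) = (b);                    \
      (b) = tmp_;                   \
   } while (0)

/* Hypothetical helper: orders a pair so that *lo <= *hi. Before the
 * rewrite, the body of the `if` would have been the three-statement
 * temporary/assign/assign sequence that the patch matches. */
static void sort_pair(int *lo, int *hi)
{
   if (*lo > *hi)
      SWAP_SKETCH(*lo, *hi);
   assert(*lo <= *hi);
}
```

Wrapping the body in `do { ... } while (0)` keeps the macro usable as a single statement, for example as the unbraced body of an `if`.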

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36297>
2025-07-23 19:49:47 +00:00
Alyssa Rosenzweig
6b34e2174e nir: introduce ergonomic tex builder
For intrinsics, we have these really nice builders using designated initializers
+ macros to specify optional indices. Texture instrs have even more craziness
involved, but we can do the same trick. This commit takes the existing "fixed
form" deref-centric tex builders and generalizes them to work with non-deref
textures, making them useful also for GL and late VK passes, while providing an
API that strives to be ergonomic and consistent.

This series only implements a subset of possible texture operations for now, but
more can be generalized as people need them.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36050>
2025-07-21 12:11:41 +00:00
Konstantin Seurer
d59c22b6e1 radv/rt: Implement null acceleration structure in shader code
The previous approach is broken with descriptor buffer capture/replay
because the address of the dummy VA used can change randomly.

Totals from 78 (20.58% of 379) affected shaders:

Instrs: 3837275 -> 3839653 (+0.06%); split: -0.01%, +0.07%
CodeSize: 20235104 -> 20251744 (+0.08%); split: -0.01%, +0.09%
SpillSGPRs: 997 -> 1007 (+1.00%)
Latency: 22305937 -> 22331551 (+0.11%); split: -0.03%, +0.15%
InvThroughput: 4232313 -> 4237341 (+0.12%); split: -0.03%, +0.15%
VClause: 97043 -> 97027 (-0.02%); split: -0.02%, +0.01%
SClause: 72169 -> 72416 (+0.34%); split: -0.00%, +0.35%
Copies: 321578 -> 322126 (+0.17%); split: -0.11%, +0.28%
Branches: 110163 -> 110444 (+0.26%); split: -0.00%, +0.26%
PreSGPRs: 7879 -> 7942 (+0.80%)
VALU: 2155040 -> 2156425 (+0.06%); split: -0.02%, +0.09%
SALU: 502292 -> 503078 (+0.16%); split: -0.00%, +0.16%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36034>
2025-07-19 21:02:42 +00:00
Konstantin Seurer
d28ff8050a radv/rt: Use inv_dir for software ray-triangle tests
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Reviewed-by: Autumn Ashton <misyl@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36213>
2025-07-19 16:35:37 +00:00
Konstantin Seurer
5494789e89 radv/rt: Optimize emulated ray-triangle tests
The imod instructions are lowered to 4 alu instructions each. We can do
better by packing the results with the values for kz.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Reviewed-by: Autumn Ashton <misyl@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36213>
2025-07-19 16:35:37 +00:00
Konstantin Seurer
d140f2a6a2 radv: Implement watertightness for emulated RT
Instead of using fp64 (which is broken in some cases), the new approach
only uses fp32 and implements tiebreaking for edge/vertex hits. Using
fp32 is also much faster, improving performance of q2rtx by around 40%.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Reviewed-by: Autumn Ashton <misyl@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36213>
2025-07-19 16:35:36 +00:00
Konstantin Seurer
55641f9ca0 radv: Disable pointer flags and the GFX12 WA for emulated RT
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Reviewed-by: Autumn Ashton <misyl@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36213>
2025-07-19 16:35:36 +00:00
Konstantin Seurer
df44b353ad radv: Optimize ray tracing position fetch
Gets rid of a lot of indirection when fetching triangle positions.
Storing the primitive address increases register pressure a bit, but
the traversal shader, which should have the highest register demand,
should not be affected when position fetch is not used.

Totals:
Instrs: 4021686 -> 4022435 (+0.02%); split: -0.01%, +0.03%
CodeSize: 21235812 -> 21235832 (+0.00%); split: -0.02%, +0.02%
Latency: 23402275 -> 23412110 (+0.04%); split: -0.04%, +0.09%
InvThroughput: 4352818 -> 4352206 (-0.01%); split: -0.04%, +0.02%
VClause: 101906 -> 102058 (+0.15%); split: -0.03%, +0.18%
Copies: 342210 -> 342368 (+0.05%); split: -0.09%, +0.14%
Branches: 114988 -> 114993 (+0.00%)
PreVGPRs: 26551 -> 27111 (+2.11%)
VALU: 2249366 -> 2249524 (+0.01%); split: -0.01%, +0.02%
SALU: 529828 -> 529808 (-0.00%); split: -0.01%, +0.00%

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35533>
2025-07-19 16:07:59 +00:00
Georg Lehmann
497f607c8e radv/nir/lower_cmat: vectorize GFX11 B -> ACC conversion
Foz-DB Navi31:
Totals from 7 out of 14 FSR4 shaders:
MaxWaves: 50 -> 52 (+4.00%)
Instrs: 44951 -> 44516 (-0.97%); split: -1.00%, +0.03%
CodeSize: 309176 -> 305500 (-1.19%); split: -1.23%, +0.04%
VGPRs: 1464 -> 1416 (-3.28%)
SpillVGPRs: 188 -> 92 (-51.06%)
Scratch: 24064 -> 11776 (-51.06%)
Latency: 171318 -> 163663 (-4.47%); split: -4.51%, +0.04%
InvThroughput: 178796 -> 178956 (+0.09%); split: -0.04%, +0.13%
VClause: 769 -> 730 (-5.07%); split: -6.50%, +1.43%
Copies: 3149 -> 3261 (+3.56%); split: -1.21%, +4.76%
PreVGPRs: 1607 -> 1467 (-8.71%)
VALU: 37715 -> 37744 (+0.08%); split: -0.11%, +0.18%
SALU: 754 -> 753 (-0.13%)
VMEM: 2813 -> 2621 (-6.83%)
VOPD: 1674 -> 1685 (+0.66%); split: +1.55%, -0.90%

Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>
2025-07-16 11:46:52 +00:00
Georg Lehmann
7546169e1c radv/nir/lower_cmat: vectorize GFX11 ACC -> B conversion
Foz-DB Navi31:
Totals from 10 out of 14 FSR4 shaders:
Instrs: 64204 -> 60749 (-5.38%)
CodeSize: 439052 -> 417668 (-4.87%)
SpillVGPRs: 186 -> 188 (+1.08%)
Scratch: 23808 -> 24064 (+1.08%)
Latency: 208878 -> 202903 (-2.86%)
InvThroughput: 232898 -> 225688 (-3.10%)
VClause: 902 -> 907 (+0.55%); split: -1.55%, +2.11%
Copies: 6418 -> 3762 (-41.38%)
Branches: 55 -> 37 (-32.73%)
PreSGPRs: 297 -> 298 (+0.34%)
PreVGPRs: 2299 -> 2303 (+0.17%)
VALU: 54762 -> 51489 (-5.98%)
SALU: 956 -> 938 (-1.88%)
VMEM: 3469 -> 3473 (+0.12%)
VOPD: 3895 -> 2126 (-45.42%)

Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>
2025-07-16 11:46:52 +00:00
Georg Lehmann
56d93c40ea radv/nir/lower_cmat: convert matrix use in smaller type
Fewer conversions, and less data to move around.

Foz-DB Navi31:
Totals from 10 out of 14 FSR4 shaders:
Instrs: 65443 -> 64204 (-1.89%); split: -1.93%, +0.04%
CodeSize: 441884 -> 439052 (-0.64%); split: -1.21%, +0.57%
Latency: 213374 -> 208878 (-2.11%); split: -2.17%, +0.07%
InvThroughput: 236922 -> 232898 (-1.70%); split: -1.77%, +0.08%
VClause: 935 -> 902 (-3.53%); split: -3.74%, +0.21%
Copies: 5064 -> 6418 (+26.74%); split: -13.35%, +40.09%
Branches: 54 -> 55 (+1.85%)
VALU: 55700 -> 54762 (-1.68%); split: -1.85%, +0.16%
VOPD: 3459 -> 3895 (+12.60%); split: +16.88%, -4.28%

Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>
2025-07-16 11:46:52 +00:00
Georg Lehmann
f2846b936a radv/nir/lower_cmat: use v_permlanex16_b32 instead of ds_swizzle_b32 for GFX11 ACC->B
ds_swizzle is slower than I expected.

Foz-DB Navi31:
Totals from 10 out of 14 FSR4 shaders:
Instrs: 68802 -> 65443 (-4.88%)
CodeSize: 458000 -> 441884 (-3.52%)
Latency: 218147 -> 213374 (-2.19%); split: -3.17%, +0.99%
InvThroughput: 230190 -> 236922 (+2.92%); split: -0.25%, +3.18%
VClause: 922 -> 935 (+1.41%); split: -0.98%, +2.39%
Copies: 5877 -> 5064 (-13.83%); split: -15.74%, +1.91%
Branches: 37 -> 54 (+45.95%)
VALU: 53441 -> 55700 (+4.23%); split: -0.55%, +4.77%
SALU: 872 -> 956 (+9.63%)
VOPD: 1767 -> 3459 (+95.76%)

Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>
2025-07-16 11:46:51 +00:00
Samuel Pitoiset
ea742877f6 radv: re-run clang-format
For style consistency.

$ clang-format -i $(find src/amd/vulkan/ -name "*.h" -o -name "*.c" -o -name "*.cpp")

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36118>
2025-07-16 09:10:33 +02:00
Natalie Vock
e978f6e247 radv/rt: Use ds_bvh_stack_push8_pop1_rtn_b32
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269>
2025-07-15 21:34:40 +00:00