With this patch, we compile separately
- general shaders (raygen, miss, callable)
- closest-hit shaders
- traversal shader (incl. all intersection / any-hit shaders)
Each shader uses the following scheme:
if (shader_pc == shader_va) {
<shader code>
}
next = select_next_shader(shader_va)
jump next
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22096>
We can remove the LLVM 13 Wave32 discard workaround and
SI_PROFILE_IGNORE_LLVM13_DISCARD_BUG that disabled the workaround.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23471>
Via Coccinelle patches
@@
expression a, b, c;
@@
-nir_channels(b, a, (1 << c) - 1)
+nir_trim_vector(b, a, c)
@@
expression a, b, c;
@@
-nir_channels(b, a, BITFIELD_MASK(c))
+nir_trim_vector(b, a, c)
@@
expression a, b;
@@
-nir_channels(b, a, 3)
+nir_trim_vector(b, a, 2)
@@
expression a, b;
@@
-nir_channels(b, a, 7)
+nir_trim_vector(b, a, 3)
Plus a fixup for pointless trimming an immediate in RADV and radeonsi.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23352>
It was only read by RRA which can infer it from the parenbt internal
node.
Change in average build time (Control):
84.69471 ms -> 84.25319 ms
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22400>
This avoids waiting for instance_data which can improve performance:
vk_ray_tracing_ao_KHR_app: 0.2% (The TLAS has 2 instances)
Quake II RTX: 1%
Control: 1%
We also have to shuffle around some code to avoid increasing VGPR usage.
That leaves us with the following stats:
Quake II RTX:
Totals from 7 (14.29% of 49) affected shaders:
CodeSize: 165612 -> 165716 (+0.06%)
Instrs: 31446 -> 31460 (+0.04%)
Latency: 596709 -> 554292 (-7.11%)
InvThroughput: 121998 -> 113327 (-7.11%)
VClause: 596 -> 587 (-1.51%)
Copies: 4664 -> 4646 (-0.39%)
PreVGPRs: 620 -> 639 (+3.06%)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21421>
The previous check assumed the stack starts at offset=0, which isn't
necessarily true for ray queries.
Note that this didn't cause correctness issues, just made an optimization
not apply. Found when I accidentally made this load-bearing in a
refactor.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20265>
So we don't have to do it in the traversal loop. Should 2 and
instructions and a 64-bit shift, so 4/8 cycles per instance node
visit.
Totals from 7 (0.01% of 134913) affected shaders:
CodeSize: 208460 -> 208292 (-0.08%)
Instrs: 38276 -> 38248 (-0.07%)
Latency: 803181 -> 803142 (-0.00%)
InvThroughput: 165384 -> 165376 (-0.00%)
Copies: 4912 -> 4905 (-0.14%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19706>
Culls like 99% of the triangles that are culled at all.
Reduces VALU usage in Q2RTX traversal by ~8%, though doesn't look
like VALU is a bottleneck at this point ...
For Control we get a ~5% reduction in VALU usage, but similarly it
doesn't look like a bottleneck.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18830>
Adds a helper for building the ray traversal loop to radv_rt_common.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18650>
Ray queries and acceleration structure builds
are quite stable now and so we can enable those
features for CI and more feedback and bug reports.
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16007>