Commit graph

9825 commits

Author SHA1 Message Date
Marek Olšák
aee1ebb992 nir: print interp_mode better
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31968>
2024-11-05 14:13:40 +00:00
Marek Olšák
2ca56376a4 nir: rename nir_io_glsl_lower_derefs -> nir_io_has_io_intrinsics
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31968>
2024-11-05 14:13:40 +00:00
Marek Olšák
adc40aee25 glsl: lower IO in the linker if enabled, don't lower it later
This removes the useless codepath that kept IO derefs until st_finalize_nir.
It was used before nir_opt_varyings existed.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31968>
2024-11-05 14:13:40 +00:00
Georg Lehmann
bedd6310dc nir: add nir_opt_frag_coord_to_pixel_coord
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31864>
2024-11-04 12:34:31 +00:00
Georg Lehmann
2f830f9b94 nir: add SYSTEM_VALUE_PIXEL_COORD
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31864>
2024-11-04 12:34:30 +00:00
Alyssa Rosenzweig
f31b451916 clc: add mesa_clc tool
This is a generic tool to convert OpenCL C to SPIR-V.

In the future, this will be replaced by `clang` directly using the LLVM SPIR-V
backend, but for now we need a tool in Mesa to provide this functionality with
older LLVM versions.

The important parts are that:

1. It does not depend on NIR or any real platform details. An older mesa_clc
   from a previous Mesa version can generally be used to build a newer Mesa to
   ease cross-OS builds.

2. Its output can be consumed without any LLVM dependence, which will untangle
   the LLVM mess we have now.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31923>
2024-11-01 13:25:37 -07:00
Alyssa Rosenzweig
506b9a5ff5 nir/divergence_analysis: add AGX atomics
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31909>
2024-10-30 19:04:32 +00:00
Alyssa Rosenzweig
85b3dc90e0 nir,agx: lower fmin/fmax in NIR
we want to elide flushes, doing so requires more sophisticated analysis than I'd
like in the middle of isel. also, it should be done before forming preambles for
efficiency (notice the uniform reduction here). let's do it with a NIR pass.

total instructions in shared programs: 2768481 -> 2757832 (-0.38%)
instructions in affected programs: 644084 -> 633435 (-1.65%)
helped: 2242
HURT: 18
helped stats (abs) min: 1 max: 349 x̄: 4.77 x̃: 3
helped stats (rel) min: 0.01% max: 34.91% x̄: 3.19% x̃: 2.19%
HURT stats (abs)   min: 1 max: 19 x̄: 2.89 x̃: 1
HURT stats (rel)   min: 0.24% max: 7.94% x̄: 1.27% x̃: 0.81%
95% mean confidence interval for instructions value: -5.20 -4.22
95% mean confidence interval for instructions %-change: -3.30% -3.01%
Instructions are helped.

total alu in shared programs: 2182880 -> 2172352 (-0.48%)
alu in affected programs: 513166 -> 502638 (-2.05%)
helped: 2235
HURT: 16
helped stats (abs) min: 1 max: 349 x̄: 4.73 x̃: 3
helped stats (rel) min: 0.02% max: 37.65% x̄: 3.70% x̃: 2.59%
HURT stats (abs)   min: 1 max: 19 x̄: 2.50 x̃: 1
HURT stats (rel)   min: 0.33% max: 3.74% x̄: 1.04% x̃: 0.91%
95% mean confidence interval for alu value: -5.16 -4.20
95% mean confidence interval for alu %-change: -3.83% -3.49%
Alu are helped.

total fscib in shared programs: 2178643 -> 2168059 (-0.49%)
fscib in affected programs: 514666 -> 504082 (-2.06%)
helped: 2243
HURT: 17
helped stats (abs) min: 1 max: 349 x̄: 4.74 x̃: 3
helped stats (rel) min: 0.02% max: 37.65% x̄: 3.74% x̃: 2.59%
HURT stats (abs)   min: 1 max: 19 x̄: 2.65 x̃: 1
HURT stats (rel)   min: 0.33% max: 14.71% x̄: 1.85% x̃: 0.93%
95% mean confidence interval for fscib value: -5.16 -4.20
95% mean confidence interval for fscib %-change: -3.87% -3.53%
Fscib are helped.

total bytes in shared programs: 18467348 -> 18403042 (-0.35%)
bytes in affected programs: 4403648 -> 4339342 (-1.46%)
helped: 2247
HURT: 20
helped stats (abs) min: 2 max: 2132 x̄: 28.73 x̃: 18
helped stats (rel) min: 0.01% max: 33.53% x̄: 2.80% x̃: 1.94%
HURT stats (abs)   min: 4 max: 72 x̄: 12.60 x̃: 6
HURT stats (rel)   min: 0.23% max: 6.58% x̄: 1.06% x̃: 0.75%
95% mean confidence interval for bytes value: -31.29 -25.45
95% mean confidence interval for bytes %-change: -2.90% -2.64%
Bytes are helped.

total regs in shared programs: 864605 -> 864442 (-0.02%)
regs in affected programs: 4692 -> 4529 (-3.47%)
helped: 68
HURT: 48
helped stats (abs) min: 1 max: 54 x̄: 7.25 x̃: 3
helped stats (rel) min: 4.26% max: 43.20% x̄: 13.21% x̃: 10.53%
HURT stats (abs)   min: 1 max: 36 x̄: 6.88 x̃: 6
HURT stats (rel)   min: 3.64% max: 91.67% x̄: 23.12% x̃: 24.00%
95% mean confidence interval for regs value: -3.60 0.79
95% mean confidence interval for regs %-change: -2.10% 5.75%
Inconclusive result (value mean confidence interval includes 0).

total uniforms in shared programs: 2120927 -> 2120911 (<.01%)
uniforms in affected programs: 770 -> 754 (-2.08%)
helped: 6
HURT: 0
helped stats (abs) min: 2 max: 4 x̄: 2.67 x̃: 2
helped stats (rel) min: 1.79% max: 2.70% x̄: 2.13% x̃: 1.96%
95% mean confidence interval for uniforms value: -3.75 -1.58
95% mean confidence interval for uniforms %-change: -2.50% -1.76%
Uniforms are helped.

total threads in shared programs: 27612224 -> 27613056 (<.01%)
threads in affected programs: 7168 -> 8000 (11.61%)
helped: 6
HURT: 3
helped stats (abs) min: 64 max: 192 x̄: 170.67 x̃: 192
helped stats (rel) min: 8.33% max: 23.08% x̄: 20.62% x̃: 23.08%
HURT stats (abs)   min: 64 max: 64 x̄: 64.00 x̃: 64
HURT stats (rel)   min: 8.33% max: 9.09% x̄: 8.59% x̃: 8.33%
95% mean confidence interval for threads value: -3.17 188.06
95% mean confidence interval for threads %-change: -0.92% 22.69%
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>
2024-10-30 10:14:07 -04:00
Alyssa Rosenzweig
e3f91fb13c nir/serialize: fix name
no more nir_register

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31892>
2024-10-30 12:59:11 +00:00
Alyssa Rosenzweig
b8624d5c6b nir: correct comment
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31892>
2024-10-30 12:59:11 +00:00
Alyssa Rosenzweig
33299354e0 nir/opt_algebraic: optimize patterns hit with OpenCL
These patterns were all found in the AGX quads tessellator, a medium-sized OpenCL
kernel. LLVM generates a lot of garbage around booleans which we need to chew
through. Though there's nothing AGX or really OpenCL specific here, so some of
this could help graphics shaders too.

Together, their effect is significant for that kernel instr count & occupancy:

before: 2966 inst, 2310 alu, 2310 fscib, 1216 ic, 23148 bytes, 239 regs, 384 threads
after:  2848 inst, 2246 alu, 2246 fscib, 1000 ic, 22260 bytes, 231 regs, 448 threads

No significant changes on GL shaderdb (a single godot shader regressed 1
instruction, 1344->1345).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31892>
2024-10-30 12:59:10 +00:00
Marek Olšák
ee452129c6 nir: add cull_triangles_, cull_lines_ prefixes to viewport_xy_scale_and_offset
for radeonsi

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>
2024-10-29 16:47:44 +00:00
Marek Olšák
2227f5be9d nir: rename load_cull_small_primitive_precision -> triangle, add line_precision
for radeonsi

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>
2024-10-29 16:47:44 +00:00
Marek Olšák
0914e0d02f nir: rename load_cull_small_primitives -> triangles, add load_cull_small_lines
for radeonsi

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865>
2024-10-29 16:47:44 +00:00
Georg Lehmann
d6535f2602 nir/opt_algebraic: create ubfe with non constant mask
Foz-DB Navi21:
Totals from 278 (0.35% of 79395) affected shaders:
MaxWaves: 7444 -> 7448 (+0.05%)
Instrs: 316069 -> 314584 (-0.47%); split: -0.47%, +0.00%
CodeSize: 1608064 -> 1593204 (-0.92%)
VGPRs: 11128 -> 11120 (-0.07%)
Latency: 796599 -> 797786 (+0.15%); split: -0.19%, +0.34%
InvThroughput: 141195 -> 139472 (-1.22%); split: -1.22%, +0.00%
Copies: 28565 -> 29796 (+4.31%); split: -0.15%, +4.46%
PreSGPRs: 14335 -> 14336 (+0.01%)
VALU: 161342 -> 159426 (-1.19%)
SALU: 87794 -> 88305 (+0.58%); split: -0.03%, +0.61%

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31852>
2024-10-29 10:51:10 +00:00
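The commit message for d6535f2602 does not spell out the matched pattern, so the following is only a sketch of the general identity behind turning a shift-and-mask into an unsigned bitfield extract. `toy_ubfe` is a hypothetical reference model (it assumes offset and width are taken modulo 32), and `toy_shift_and_mask` is the assumed shape of the source expression with a mask built from a non-constant width; neither is Mesa code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference model of an unsigned bitfield extract.
 * Assumption: offset and bits are reduced to their low 5 bits. */
static uint32_t
toy_ubfe(uint32_t base, uint32_t offset, uint32_t bits)
{
   offset &= 0x1f;
   bits &= 0x1f;

   if (bits == 0)
      return 0;

   if (offset + bits < 32)
      return (base << (32 - bits - offset)) >> (32 - bits);

   /* The field reaches the top bit, so a plain shift is enough. */
   return base >> offset;
}

/* The open-coded form: shift down, then mask with (1 << bits) - 1,
 * where bits does not have to be a compile-time constant. */
static uint32_t
toy_shift_and_mask(uint32_t base, uint32_t offset, uint32_t bits)
{
   assert(offset < 32 && bits < 32);
   return (base >> offset) & ((UINT32_C(1) << bits) - 1);
}
```

For offsets and widths below 32 the two functions agree, which is what makes such a rewrite valid in this toy model.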
Timur Kristóf
be68aeafdc nir/opt_algebraic: Add various bitfield extract patterns.
v2 (Georg Lehmann):
- fixed incorrect imin in ubfe_ubfe
- simplified outer_bits of ushr((ubfe, ...), ...) opt
- added is_used_once to iand(ushr(), ...) opt to improve stats

Foz-DB Navi21:
Totals from 3309 (4.18% of 79206) affected shaders:
Instrs: 5295291 -> 5282128 (-0.25%); split: -0.28%, +0.03%
CodeSize: 28299320 -> 28298456 (-0.00%); split: -0.07%, +0.06%
Latency: 51566173 -> 51521923 (-0.09%); split: -0.09%, +0.01%
InvThroughput: 13222050 -> 13204557 (-0.13%); split: -0.14%, +0.01%
VClause: 116451 -> 116458 (+0.01%); split: -0.02%, +0.02%
SClause: 160356 -> 160324 (-0.02%); split: -0.03%, +0.01%
Copies: 424152 -> 423670 (-0.11%); split: -0.20%, +0.09%
Branches: 156701 -> 156192 (-0.32%); split: -0.33%, +0.01%
PreSGPRs: 168507 -> 168500 (-0.00%); split: -0.02%, +0.01%
PreVGPRs: 151477 -> 151474 (-0.00%)
VALU: 3486077 -> 3476675 (-0.27%); split: -0.31%, +0.04%
SALU: 786467 -> 783109 (-0.43%); split: -0.45%, +0.03%
VMEM: 188035 -> 188060 (+0.01%)
SMEM: 259632 -> 259630 (-0.00%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31852>
2024-10-29 10:51:09 +00:00
Georg Lehmann
695d2414cd nir,radv: optimize shared atomic offsets
Foz-DB Navi21:
Totals from 87 (0.11% of 79395) affected shaders:
Instrs: 140877 -> 140873 (-0.00%)
CodeSize: 747760 -> 747164 (-0.08%); split: -0.09%, +0.01%
Latency: 4528171 -> 4528162 (-0.00%)
InvThroughput: 826358 -> 826349 (-0.00%)
Copies: 10888 -> 10884 (-0.04%)
VALU: 84634 -> 84630 (-0.00%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31080>
2024-10-29 09:31:08 +00:00
Patrick Lerda
5a423d2d9a glsl: fix gl_nir_validate_intrastage_interface_blocks() memory leak
For instance, this issue is triggered on radeonsi with
"piglit/bin/shader_runner tests/spec/glsl-1.50/linker/interface-blocks-multiple-vs-member-count-mismatch.shader_test -auto -fbo":
Indirect leak of 176 byte(s) in 1 object(s) allocated from:
    #0 0x7f894b5cd7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
    #1 0x7f894183aebf in ralloc_size ../src/util/ralloc.c:118
    #2 0x7f894183b36e in rzalloc_size ../src/util/ralloc.c:152
    #3 0x7f894183b36e in rzalloc_array_size ../src/util/ralloc.c:232
    #4 0x7f894182da67 in _mesa_hash_table_init ../src/util/hash_table.c:163
    #5 0x7f894182da67 in _mesa_hash_table_create ../src/util/hash_table.c:186
    #6 0x7f894169af03 in gl_nir_validate_intrastage_interface_blocks ../src/compiler/glsl/gl_nir_link_interface_blocks.c:533
    #7 0x7f89414464a4 in link_intrastage_shaders ../src/compiler/glsl/gl_nir_linker.c:2750
    #8 0x7f894144bad2 in gl_nir_link_glsl ../src/compiler/glsl/gl_nir_linker.c:3785
    #9 0x7f894128977e in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:515
    #10 0x7f894128977e in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:1008
    #11 0x7f894113c7b5 in link_program ../src/mesa/main/shaderapi.c:1317
    #12 0x7f894113c7b5 in link_program_error ../src/mesa/main/shaderapi.c:1426
    #13 0x7f8940afb1bb in _mesa_unmarshal_LinkProgram src/mapi/glapi/gen/marshal_generated2.c:1627
    #14 0x7f894063319b in glthread_unmarshal_batch ../src/mesa/main/glthread.c:141
    #15 0x7f894184e658 in util_queue_thread_func ../src/util/u_queue.c:294
    #16 0x7f89418d220a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
    #17 0x7f894a66a7c3  (/lib64/libc.so.6+0x867c3)
...
SUMMARY: AddressSanitizer: 1392 byte(s) leaked in 11 allocation(s).

Fixes: ffbd763586 ("glsl: add gl_nir_validate_intrastage_interface_blocks()")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31871>
2024-10-28 22:23:29 +00:00
Rob Clark
7f63fa34da nir/lower_amul: Fix ASAN error
We shouldn't assume the bindings are sparse when we allocate an array
indexed on the binding.  See, for example:

  dEQP-GLES31.functional.program_interface_query.buffer_variable.random.55

Fixes: 2e833b16bc ("nir/lower_amul: Use num_ubos/ssbos instead of recomputing it.")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31611>
2024-10-25 15:38:51 +00:00
Pierre-Eric Pelloux-Prayer
9434ac65f4 glsl: use nir_io_add_const_offset_to_base in gl_nir_opts
This fixes:
   KHR-GLES32.core.tessellation_shader.tessellation_shader_tessellation.max_in_out_attributes

Without this change the assert in gather_output is hit:
   assert(!nir_src_is_const(offset) || nir_src_as_uint(offset) == 0)

Because nir_opt_algebraic determines that some ssa values are constant,
but the nir_io_add_const_offset_to_base wasn't run afterwards.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31684>
2024-10-25 13:36:54 +00:00
Pierre-Eric Pelloux-Prayer
60578df33a nir: skip offset=0 in nir_io_add_const_offset_to_base
When offset=0, the pass was a no-op but still set the progress
flag, which could cause infinite loops once this pass is
added to gl_nir_opts.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31684>
2024-10-25 13:36:54 +00:00
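The progress-flag problem described in 60578df33a can be illustrated with a minimal sketch. Everything here is hypothetical: `toy_io`, `toy_add_const_offset_to_base` and `toy_opt_loop` are stand-ins rather than NIR code, and `max_iters` exists only as a safety net for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an IO instruction: a constant offset that can
 * be folded into a base index. Not the real NIR data structure. */
struct toy_io {
   unsigned base;
   unsigned const_offset;
};

/* Fold the constant offset into the base. Returns true only if something
 * actually changed; an offset of 0 is skipped so no progress is reported. */
static bool
toy_add_const_offset_to_base(struct toy_io *io)
{
   if (io->const_offset == 0)
      return false;

   io->base += io->const_offset;
   io->const_offset = 0;
   return true;
}

/* Fixed-point driver in the style of an optimization loop: rerun while the
 * pass reports progress. Returns the number of iterations executed. */
static unsigned
toy_opt_loop(struct toy_io *io, unsigned max_iters)
{
   unsigned iters = 0;
   bool progress;

   do {
      progress = toy_add_const_offset_to_base(io);
      iters++;
   } while (progress && iters < max_iters);

   return iters;
}
```

If the pass returned true for an offset of 0, a loop of this shape would never reach a fixed point on its own, which is the hazard the commit message describes.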
Rhys Perry
8efc765a3d nir/algebraic: fix shfr optimization with zero src2
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 08903bbe89 ("nir: add mqsad_4x8, shfr and nir_opt_mqsad")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31808>
2024-10-25 09:59:40 +00:00
Rhys Perry
b2abd3bdba nir: fix shfr constant folding with zero src2
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 08903bbe89 ("nir: add mqsad_4x8, shfr and nir_opt_mqsad")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31808>
2024-10-25 09:59:40 +00:00
Daniel Schürmann
87cb42f953 treewide: don't lower to LCSSA before calling nir_divergence_analysis()
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
95ed72922e nir/divergence: Don't assume that LCSSA phis are not loop-invariant
Since we check for loop-invariance, we don't have to unconditionally
flag LCSSA phis as divergent in presence of divergent breaks.
This ensures consistency, with or without LCSSA form.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
c5f142a695 nir/divergence: skip expensive nir_src_is_divergent() check in most cases
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
0eff03d385 nir/divergence: calculate divergence without requiring LCSSA form
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
d34d2f8fa8 nir: consider loop invariance in nir_src_is_divergent()
By doing so, this function does not require LCSSA form anymore
in order to provide correct results.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
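A rough model of the idea in d34d2f8fa8, under stated assumptions: a value that is uniform inside a loop may still look divergent to a use outside the loop if invocations leave the loop in different iterations, unless the value is loop-invariant. The types and field names below are hypothetical simplifications, not the actual `nir_src_is_divergent()` implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of an SSA definition. */
struct toy_def {
   bool divergent;       /* differs between invocations at its definition */
   bool loop_invariant;  /* same value in every iteration of its loop */
};

/* Hypothetical, simplified view of one use of a definition. */
struct toy_use {
   const struct toy_def *def;
   bool outside_def_loop;          /* the use is after the defining loop */
   bool loop_has_divergent_break;  /* invocations may exit at different times */
};

/* Divergence as seen from the use, without requiring LCSSA phis. */
static bool
toy_src_is_divergent(const struct toy_use *use)
{
   if (use->def->divergent)
      return true;

   /* Same loop nesting: the definition's own flag is sufficient. */
   if (!use->outside_def_loop)
      return false;

   /* Invocations that broke out in different iterations may hold values
    * from different iterations, unless the value never changes. */
   return use->loop_has_divergent_break && !use->def->loop_invariant;
}
```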
Daniel Schürmann
1a55d6c23b nir/divergence: Introduce and set nir_def::loop_invariant
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
c0b3d7a916 nir/divergence: require nir_metadata_block_index
This allows for fast checks whether some value is defined inside a loop.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
8d1abd4996 treewide: use nir_src_is_divergent() rather than checking the divergence of the SSA
Without LCSSA, divergence between src and def might differ.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
c8348139fd nir: change signature of nir_src_is_divergent()
Now, it takes nir_src * instead of nir_src.
Also move the implementation to nir_divergence_analysis.c.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
421b42637d nir: remove nir_update_instr_divergence()
This function has obscure limitations.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
ce0a3fe645 nir/opt_uniform_atomics: don't preserve divergence information
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
c25c63ebc0 nir/divergence: separately indicate whether loops have divergent continues or breaks
bool nir_loop_is_divergent(nir_loop *)
 replaces the previous loop->divergent indicator.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Georg Lehmann
1f9b82bb2a nir/opt_algebraic: optimize -0.0 + a
Foz-DB Navi21:
Totals from 428 (0.54% of 79395) affected shaders:
MaxWaves: 8510 -> 8512 (+0.02%)
Instrs: 731062 -> 729665 (-0.19%); split: -0.19%, +0.00%
CodeSize: 3735788 -> 3728324 (-0.20%); split: -0.20%, +0.00%
VGPRs: 27328 -> 27336 (+0.03%); split: -0.03%, +0.06%
SpillSGPRs: 315 -> 314 (-0.32%)
Latency: 3872986 -> 3873236 (+0.01%); split: -0.08%, +0.09%
InvThroughput: 971001 -> 970056 (-0.10%); split: -0.17%, +0.08%
VClause: 11954 -> 11956 (+0.02%); split: -0.02%, +0.03%
SClause: 17361 -> 17358 (-0.02%)
Copies: 59038 -> 59045 (+0.01%); split: -0.22%, +0.24%
Branches: 17685 -> 17656 (-0.16%)
PreSGPRs: 26103 -> 26102 (-0.00%)
PreVGPRs: 23220 -> 23206 (-0.06%)
VALU: 515293 -> 513963 (-0.26%); split: -0.26%, +0.00%
SALU: 91591 -> 91544 (-0.05%)

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31770>
2024-10-23 08:58:34 +00:00
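Why `-0.0 + a` in 1f9b82bb2a is a candidate for simplification can be checked with a small IEEE 754 experiment. This sketch assumes round-to-nearest and ignores NaN payloads and denormal flushing; the helper names are illustrative, and the commit message does not state which conditions the actual pattern uses.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* True if x and y compare equal and have the same sign bit, so that
 * +0.0 and -0.0 are told apart. NaNs are not handled here. */
static bool
same_float(float x, float y)
{
   return x == y && !signbit(x) == !signbit(y);
}

/* volatile keeps the compiler from folding the addition away. */
static float
add_neg_zero(float a)
{
   volatile float nz = -0.0f;
   return nz + a;
}

static float
add_pos_zero(float a)
{
   volatile float pz = 0.0f;
   return pz + a;
}
```

Adding `-0.0` preserves the sign of a zero operand, so the result always matches `a`. Adding `+0.0` turns `-0.0` into `+0.0`, which is why that case is not an exact identity.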
Marek Olšák
0226922384 nir: add nir_gather_tcs_info, new gathering/analysis pass
This does shader analysis that is more niche than regular shader info.

It's planned to be used by nir_restructure_tcs_flow as discussed here:
    https://gitlab.freedesktop.org/mesa/mesa/-/issues/11910

It's also useful for driver-specific passes.

The code for gathering "all_invocations_define_tess_levels" is copied
from radeonsi. The rest is new.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31447>
2024-10-23 03:17:16 +00:00
Amber
a3afe22dc9 nir: add pass to lower atomic arithmetic to a loop with cmpxchg.
Signed-off-by: Amber Harmonia <amber@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27776>
2024-10-21 21:47:44 +00:00
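The general technique named in a3afe22dc9, expressing an atomic arithmetic operation as a compare-and-exchange retry loop, looks roughly like this in C11. It is a CPU-side analogy using `<stdatomic.h>`, not the NIR pass itself; `toy_atomic_umax_via_cmpxchg` is a hypothetical name and unsigned max is just one example operation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Atomic unsigned max built from compare-exchange only.
 * Returns the value that was stored before the update. */
static uint32_t
toy_atomic_umax_via_cmpxchg(_Atomic uint32_t *ptr, uint32_t operand)
{
   uint32_t expected = atomic_load(ptr);

   for (;;) {
      /* Compute the ALU result from the value we believe is in memory. */
      uint32_t desired = expected > operand ? expected : operand;

      /* Store it only if memory still holds 'expected'. */
      if (atomic_compare_exchange_weak(ptr, &expected, desired))
         return expected;

      /* On failure, 'expected' now holds the current value; retry. */
   }
}
```

Any operation that can be computed from the old value and the operand fits the same loop, which is why sharing an atomic-op-to-ALU helper (as in 84d57e1fb1) pairs naturally with it.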
Mary Guillemard
84d57e1fb1 nir: Move atomic_op_to_alu to common code
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27776>
2024-10-21 21:47:44 +00:00
Marek Olšák
fb6184f89c nir: add shader_info::tess::tcs_same_invocation_inputs_read(_indirect)
We need both the same-invocation usage mask and cross-invocation usage
mask. The AMD reason is below.

Cross-invocation TCS input access doesn't prevent the same-invocation
fast path in AMD hw because it's just a different way to load the same
data, and we want to use both paths for the same TCS input based on
the load instruction. The fast path can't be used for indirect access,
which is gathered separately for same-invocation access.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31645>
2024-10-21 18:53:51 +00:00
Christian Gmeiner
5fa4c1a191 compiler/rust: Copy NirInstrPrinter from NAK
Switch NAK to it.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31706>
2024-10-18 12:43:52 +00:00
Pavel Ondračka
33c8dc4f18 nir/nir_group_loads: reduce chance of max_distance check overflow
Helps for the case when max_distance is set to ~0, where the pass currently
creates only groups of two loads due to overflow. Found while
experimenting with this pass on r300, however the only driver currently
affected is i915.

With i915 this change gains around 20 shaders in my small shader-db
(most notably some GLMark2, Unigine Tropics, Tesseract, Amnesia) at
the expense of increased register pressure in a few other cases.
I'm assuming this is a good deal for such old HW, and this seems like what
was intended when the pass was introduced to i915, but anyway this
could be tweaked further driver side with a more optimized max_distance
value. Only shader-db tested.

Relevant i915 shader-db stats (lpt):
total tex_indirect in shared programs: 1529 -> 1493 (-2.35%)
tex_indirect in affected programs: 96 -> 60 (-37.50%)
helped: 29
HURT: 2
total temps in shared programs: 3015 -> 3200 (6.14%)
temps in affected programs: 465 -> 650 (39.78%)
helped: 1
HURT: 91

GAINED: 20

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: GKraats <vd.kraats@hccnet.nl>
Fixes: 33b4eb149e
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31529>
2024-10-18 09:21:22 +00:00
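The kind of overflow described in 33c8dc4f18 can be reproduced with a small sketch. This is not the actual change to the pass: both functions and their parameter names are hypothetical, and they only show how a distance check written as an addition wraps when `max_distance` is `~0`, while the subtraction form does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Naive check: first + max_distance wraps around for large max_distance,
 * so nearby candidates can be rejected. */
static bool
within_distance_naive(uint32_t first, uint32_t candidate, uint32_t max_distance)
{
   return candidate <= first + max_distance;
}

/* Overflow-free form: compare the difference instead of the sum.
 * Assumes candidate comes at or after first. */
static bool
within_distance_safe(uint32_t first, uint32_t candidate, uint32_t max_distance)
{
   assert(candidate >= first);
   return candidate - first <= max_distance;
}
```

With `first = 10`, `candidate = 12` and `max_distance = UINT32_MAX`, the naive sum wraps to 9 and the check fails, whereas the subtraction form accepts the candidate.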
Job Noorman
509606e56d nir/lower_subgroups: scan/reduce for multiple ballot components
lower_scan_reduce only worked when ballot_components equals one. This
commit adds support for arbitrary ballot_components.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31587>
2024-10-18 06:57:52 +00:00
Job Noorman
58b199f7ed nir/lower_subgroups: add build_cluster_mask helper
This functionality will become more complex in the next commit so
separate it into a helper function.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31587>
2024-10-18 06:57:52 +00:00
Job Noorman
e0cb4a94a3 nir/lower_subgroups: move up some helper functions
build_subgroup_mask and build_ballot_imm_ishl will be needed by other
functions higher up in the file.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31587>
2024-10-18 06:57:52 +00:00
Lionel Landwerlin
97b17aa0b1 brw/nir: rework inline_data_intel to work with compute
This intrinsic was initially dedicated to mesh/task shaders, but the
mechanism it exposes also exists in the compute shaders on Gfx12.5+.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31508>
2024-10-17 19:35:59 +00:00
Faith Ekstrand
4cc9730307 compiler/rust: Fix a bad cast in the memstream abstraction
If you just do ref.cast(), it will cast the thing it's a reference to.
If you want to turn a reference into a pointer, you need to explicitly
use "as".

Fixes: 279f38918f ("nak: memstream: move into common code")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31718>
2024-10-17 18:59:02 +00:00
Faith Ekstrand
212e07a70e compiler/rust: Add a unit test for the memstream abstraction
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31718>
2024-10-17 18:59:02 +00:00
Faith Ekstrand
ec24156b31 compiler/rust: Enable unit tests
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31718>
2024-10-17 18:59:01 +00:00
Georg Lehmann
dbf63a0788 nir: remove nir_op_is_derivative
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>
2024-10-17 09:50:19 +00:00