Commit graph

10094 commits

Author SHA1 Message Date
Marek Olšák
14956aa0f2 mesa: remove unused PROGRAM_SYSTEM_VALUE
ARB_vp/fp don't have system values.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>
2025-01-06 19:09:17 +00:00
Marek Olšák
ee8916c414 nir: use IO intrinsics in nir_lower_drawpixels
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>
2025-01-06 19:09:17 +00:00
Marek Olšák
0de28a9fd0 nir: use IO intrinsics in nir_lower_bitmap
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>
2025-01-06 19:09:17 +00:00
Marek Olšák
a7ad1b302b nir: remove redundant option linker_ignore_precision
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>
2025-01-06 19:09:17 +00:00
Marek Olšák
730c8d506f nir: flip the early exit condition in nir_lower_io_temporaries
no change in behavior other than skipping COMPUTE as well.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>
2025-01-06 19:09:17 +00:00
Marek Olšák
7b55ee999d nir: don't set num_slots/src/dest_type/write_mask when they're set automatically
to those values

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>
2025-01-06 19:09:17 +00:00
Marek Olšák
55a4a8a2a8 nir: set src_type and dest_type to float implicitly for IO build helpers
If you want to set it to int/uint, set .src_type or .dest_type. If you want
to set it to float, you don't need to set the type at all. It's implicitly
set to float.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>
2025-01-06 19:09:17 +00:00
Marek Olšák
b9f9d001d7 nir: set nir_io_semantics::num_slots to at least 1 in build helpers
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32779>
2025-01-06 19:09:17 +00:00
Benjamin Lee
081438ad39 panfrost: add nir pass to lower noperspective varyings
Mali only supports perspective-correct varying interpolation in
hardware, so we have to emulate noperspective with lowering in both the
VS and FS.

Both vulkan and opengl allow mismatched interpolation qualifiers between
stages. Because we need all varyings that are noperspective in the FS to
be lowered in the VS, we cannot rely on the interpolation qualifiers in
the VS. Loading the set of noperspective varyings as a sysval allows the
implementation to pass them as a compile-time constant when known
statically, or a runtime push constant when not. Passing noperspective
varyings dynamically has a performance cost with unnecessary branches
and fmuls.

This sysval is not hooked up yet in either panfrost or panvk, so shader
compilation will fail.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32127>
2025-01-03 07:04:05 +00:00
Benjamin Lee
6f541e2016 panfrost: add intrinsic to load frag coord at a barycentric
This is needed for noperspective lowering, where we need to multiply the
varying value by gl_FragCoord.w at the same barycentric as the varying.
Normal nir_load_frag_coord_zw instructions are lowered to the new
intrinsic on bifrost with the pan_lower_frag_coord_zw pass.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32127>
2025-01-03 07:04:05 +00:00
Mel Henning
c273ada502 compiler/rust/bitset: Test next_unset()
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32812>
2025-01-02 20:52:48 +00:00
Mel Henning
2bcb950865 compiler/rust/bitset: Don't expose words
This encapsulates the bitset's word size and word count, which means
consumers no longer need to be careful about word count. Users of the old
apis for writing expressions on bit sets should migrate to the new expression
API.

Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32812>
2025-01-02 20:52:48 +00:00
Mel Henning
86e5cb7c2d compiler/rust/bitset: Take a stream in union_with
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32812>
2025-01-02 20:52:48 +00:00
Mel Henning
47da213e19 compiler/rust/bitset: Add a lazy expression API
The new api doesn't require allocations for intermediate values in
expressions. It also has tests, which is nice because eg. the previous
implementation of the `&` operator was broken.

Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32812>
2025-01-02 20:52:48 +00:00
Mel Henning
e52b2ee4b9 compiler/rust/bitset: Remove impl Not
This is extremely difficult to use correctly for bitsets of
different sizes. Also, nobody uses it. Remove the footgun.

Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32812>
2025-01-02 20:52:48 +00:00
Mel Henning
8ec885da1d compiler/rust/bitset: impl FromIterator
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32812>
2025-01-02 20:52:48 +00:00
Mel Henning
de47702dde compiler/rust/bitset: Make BitSetIter private
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32812>
2025-01-02 20:52:47 +00:00
Mel Henning
06cd3c7fa3 compiler/rust/bitset: Removed unused start param
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32812>
2025-01-02 20:52:47 +00:00
Mel Henning
6ba317bd8c compiler/rust/bitset: Add a basic test
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32812>
2025-01-02 20:52:47 +00:00
Mel Henning
3b341366a6 compiler/rust: Fix running tests
`ninja test` wasn't actually running these tests, I guess because the
target name was duplicated in meson. Fix this so the tests actually run.

Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32812>
2025-01-02 20:52:47 +00:00
Mel Henning
639211dea8 compiler/rust/bitset: Fix the bitset iterator
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32812>
2025-01-02 20:52:47 +00:00
Timur Kristóf
ec548fd37b Revert "nir/opt_varyings: Add workaround for RADV mesh shader multiview."
The workaround is not needed anymore, because RADV now implements
the FS layer ID input as a sysval.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32641>
2025-01-02 14:07:51 +00:00
Marek Olšák
c21bc65ba7 nir/opt_load_store_vectorize: make hole_size signed to indicate overlapping loads
A negative hole size means the loads overlap. This will be used by drivers
to handle overlapping loads in the callback easily.

Reviewed-by: Mel Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32699>
2025-01-01 00:03:55 +00:00
Georg Lehmann
e112e2b047 nir,amd: optimize front_face ? a : -a
Foz-DB Navi31:
Totals from 3345 (4.21% of 79395) affected shaders:
MaxWaves: 96182 -> 96174 (-0.01%)
Instrs: 3135439 -> 3129508 (-0.19%); split: -0.24%, +0.05%
CodeSize: 16776088 -> 16718048 (-0.35%); split: -0.38%, +0.03%
VGPRs: 190884 -> 190848 (-0.02%); split: -0.03%, +0.01%
Latency: 32624132 -> 32621734 (-0.01%); split: -0.16%, +0.16%
InvThroughput: 5759987 -> 5749957 (-0.17%); split: -0.23%, +0.05%
VClause: 51044 -> 51086 (+0.08%); split: -0.12%, +0.20%
SClause: 103415 -> 103223 (-0.19%); split: -0.64%, +0.45%
Copies: 170398 -> 170555 (+0.09%); split: -0.64%, +0.74%
PreSGPRs: 135567 -> 133887 (-1.24%)
PreVGPRs: 140569 -> 141317 (+0.53%)
VALU: 1959144 -> 1953839 (-0.27%); split: -0.30%, +0.03%
SALU: 217956 -> 217676 (-0.13%); split: -0.20%, +0.07%

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32791>
2024-12-30 22:31:35 +00:00
Georg Lehmann
9bd4296845 nir: add nir_alu_srcs_negative_equal_typed
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32791>
2024-12-30 22:31:35 +00:00
Georg Lehmann
15d754fefa nir: add load_front_face_fsign
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32791>
2024-12-30 22:31:34 +00:00
Georg Lehmann
b8fa9daf0c nir: sink/move alu with two identical, non constant sources.
Foz-DB Navi21:
Totals from 32363 (40.76% of 79395) affected shaders:
MaxWaves: 787499 -> 787675 (+0.02%); split: +0.02%, -0.00%
Instrs: 28783404 -> 28783464 (+0.00%); split: -0.01%, +0.01%
CodeSize: 156763536 -> 156765148 (+0.00%); split: -0.01%, +0.02%
VGPRs: 1493304 -> 1492848 (-0.03%); split: -0.04%, +0.01%
Latency: 243022511 -> 243051994 (+0.01%); split: -0.08%, +0.09%
InvThroughput: 57827398 -> 57828129 (+0.00%); split: -0.05%, +0.05%
VClause: 582208 -> 582298 (+0.02%); split: -0.07%, +0.08%
SClause: 959634 -> 959312 (-0.03%); split: -0.07%, +0.04%
Copies: 1965821 -> 1965826 (+0.00%); split: -0.17%, +0.17%
Branches: 710593 -> 710596 (+0.00%); split: -0.00%, +0.01%
PreSGPRs: 1313513 -> 1313632 (+0.01%); split: -0.00%, +0.01%
PreVGPRs: 1210596 -> 1209103 (-0.12%); split: -0.12%, +0.00%
VALU: 19463445 -> 19463497 (+0.00%); split: -0.02%, +0.02%
SALU: 3319529 -> 3319500 (-0.00%); split: -0.01%, +0.01%

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32783>
2024-12-30 13:28:30 +00:00
Georg Lehmann
5b4b195f1b nir: optimize unpacking 8bit values from a 64bit source
Useful for load vectorization.

Foz-DB Navi21:
Totals from 299 (0.38% of 79395) affected shaders:
Instrs: 287818 -> 284333 (-1.21%); split: -1.21%, +0.00%
CodeSize: 1557124 -> 1540544 (-1.06%); split: -1.07%, +0.00%
Latency: 4009407 -> 4012389 (+0.07%); split: -0.05%, +0.12%
InvThroughput: 1260613 -> 1262530 (+0.15%); split: -0.01%, +0.17%
VClause: 5472 -> 5369 (-1.88%); split: -1.92%, +0.04%
SClause: 5419 -> 5305 (-2.10%); split: -2.58%, +0.48%
Copies: 36709 -> 36060 (-1.77%); split: -1.81%, +0.04%
PreSGPRs: 11861 -> 11655 (-1.74%)
SALU: 66920 -> 64310 (-3.90%)

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32778>
2024-12-26 17:50:32 +00:00
Marek Olšák
58132d6fc8 radeonsi: implement nir_opt_frag_depth using kill_z instead of the NIR pass
This uses si_shader_info to store whether gl_FragDepth can be removed,
and it uses the kill_z epilog flag to do the removal without recompilation.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32713>
2024-12-24 12:02:20 +00:00
Marek Olšák
dae57e184a glsl,st/mesa: always lower IO for GLSL, unlower IO for drivers
This enables nir_opt_varyings for all gallium drivers.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31942>
2024-12-24 05:54:07 -05:00
Mary Guillemard
13fe5a597b meson: Add mesa-clc and install-mesa-clc options
Due to the cross build issues in current meson, we adds new options to
allow mesa_clc and vtn_bindgen to be installed or searched on the
system.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32719>
2024-12-23 15:09:40 +00:00
Marek Olšák
a50d069d1c nir/opt_varyings: clear info->clip/cull_distance_array_size if relocated
svga breaks if shader_info declares these, but the shader is missing
the outputs.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32684>
2024-12-20 02:32:08 +00:00
Marek Olšák
9d129505b5 nir/opt_varyings: set all IO types to float to facilitate full vectorization
If types differ between components of a vec4 slot, IO vectorization can't
be done.

This also helps drivers like d3d12 that require matching types between
shaders.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32684>
2024-12-20 02:32:08 +00:00
Caterina Shablia
f4fcfa8016 pan,nir: introduce load_attribute_pan
load_attribute_pan is a panfrost-specific intrinsic for loading
vertex attributes. Takes explicit vertex and instance IDs which
we need in order to implement vertex attribute divisor with
non-zero base instance on v9+.

Passes which are used by panvk are modified to be aware of
load_attribute_pan.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32039>
2024-12-18 08:33:16 +00:00
Georg Lehmann
c695043e81 nir/opt_algebraic: optimize min(max(a, b), a)
Foz-DB Navi21:
Totals from 105 (0.13% of 79395) affected shaders:
MaxWaves: 2638 -> 2646 (+0.30%)
Instrs: 76531 -> 75077 (-1.90%)
CodeSize: 413668 -> 406484 (-1.74%)
VGPRs: 4856 -> 4848 (-0.16%)
Latency: 333684 -> 328438 (-1.57%); split: -1.57%, +0.00%
InvThroughput: 80417 -> 78579 (-2.29%)
VClause: 1818 -> 1768 (-2.75%)
SClause: 3028 -> 2964 (-2.11%)
Copies: 4708 -> 4513 (-4.14%); split: -4.50%, +0.36%
PreVGPRs: 3792 -> 3715 (-2.03%); split: -2.08%, +0.05%
VALU: 54734 -> 53528 (-2.20%)
SALU: 6195 -> 6137 (-0.94%)
VMEM: 2363 -> 2313 (-2.12%)
SMEM: 5219 -> 5119 (-1.92%)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32634>
2024-12-16 22:29:21 +00:00
Georg Lehmann
0e6d32777f nir/opt_remove_phis: rematerialize equal alu
Foz-DB Navi31:
Totals from 943 (1.19% of 79395) affected shaders:
MaxWaves: 24672 -> 24722 (+0.20%)
Instrs: 1541665 -> 1544956 (+0.21%); split: -0.23%, +0.44%
CodeSize: 8085180 -> 8109212 (+0.30%); split: -0.16%, +0.46%
VGPRs: 57768 -> 57624 (-0.25%)
Latency: 18043743 -> 17948245 (-0.53%); split: -1.28%, +0.75%
InvThroughput: 2692605 -> 2677049 (-0.58%); split: -2.07%, +1.49%
VClause: 25321 -> 25343 (+0.09%); split: -0.48%, +0.57%
SClause: 38473 -> 38614 (+0.37%); split: -0.00%, +0.37%
Copies: 86089 -> 86236 (+0.17%); split: -0.46%, +0.63%
Branches: 36719 -> 36777 (+0.16%); split: -0.60%, +0.76%
PreSGPRs: 44138 -> 44303 (+0.37%); split: -0.05%, +0.42%
PreVGPRs: 43319 -> 43009 (-0.72%)
VALU: 893684 -> 894272 (+0.07%); split: -0.42%, +0.48%
SALU: 189561 -> 191358 (+0.95%); split: -0.05%, +1.00%
VMEM: 42294 -> 42313 (+0.04%); split: -0.44%, +0.49%
SMEM: 72916 -> 73144 (+0.31%)

Instruction count regressions are largly caused by additional
loop unrolling.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31028>
2024-12-16 20:38:38 +00:00
Qiang Yu
129e37bab6 nir: do not generate b2i64 when driver want to lower it
This is found on GFX12 by:
  KHR-GL43.shader_ballot_tests.ShaderBallotBitmasks

ACO does not support it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32570>
2024-12-16 07:35:07 +00:00
Alyssa Rosenzweig
923e6361d1 compiler/glsl_types: add glsl_get_word_size_align_bytes
this alignment matches what nir_lower_scratch_to_var wants. this is not
correctness bearing but it mitigates stats regressions.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:13 +00:00
Alyssa Rosenzweig
bd89279dd4 nir: add lower_scratch_to_var pass
to ease opencl pain.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:13 +00:00
Alyssa Rosenzweig
8abb043c19 compiler: add mesa_prim_has_adjacency helper
hk will use this, it's a pretty obvious thing to want.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:12 +00:00
Alyssa Rosenzweig
e4f61771d8 compiler: use libcl.h for CL
instead of redefining BITFIELD_BIT.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:12 +00:00
Alyssa Rosenzweig
d64caf4161 libcl: add VkDraw(Indexed)IndirectCommand definitions
this is helpful to indirect draw munging code, which applies to at least 3
stacks using driver CL stuff (current Intel, shortterm Asahi, mediumterm
Panfrost)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:12 +00:00
Alyssa Rosenzweig
12e27497b3 libcl: add a common header for CPU/GPU stuff
In an attempt to make OpenCL shaders more "batteries included", start building
up a standard library. Based on libagx.h.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:12 +00:00
Alyssa Rosenzweig
13b8af95fb clc: plumb cl_khr_subgroup_ballot
although rusticl isn't lighting it up yet, it's helpful to get
sub_group_ballot for driver CL, which is all standard Vulkan-compatible spirv.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:12 +00:00
Samuel Pitoiset
4d4418dbb3 spirv: add an options to lower SpvOpTerminateInvocation to OpKill
To workaround game bugs like Indiana Jones.

Original workaround found by Hans-Kristian.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32606>
2024-12-12 19:54:39 +00:00
Rhys Perry
26790e90d3 nir: make ballot ALU and mbcnt_amd operations reorderable
These can be lowered to ALU and load_subgroup_invocation, all of which are
reorderable.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32512>
2024-12-11 14:47:12 +00:00
Rhys Perry
650468fbdf nir/move_discards_to_top: don't move across more intrinsics
This missed dpp16_shift_amd, lane_permute_16_amd, last_invocation and
ballot_relaxed.

Instead, list the non-reorderable intrinsics which are allowed to be moved
after discards.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32512>
2024-12-11 14:47:12 +00:00
Rhys Perry
5368569d06 nir: make load_helper_invocation non-reorderable
This can't be moved to after demote, so it's not reorderable.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32512>
2024-12-11 14:47:12 +00:00
Georg Lehmann
e8b29abb25 nir: add unsigned upper bound support for fsat
Foz-DB Navi21:
Totals from 89 (0.11% of 79395) affected shaders:
Instrs: 97018 -> 96995 (-0.02%)
CodeSize: 492996 -> 492488 (-0.10%)
Latency: 504649 -> 504555 (-0.02%)
InvThroughput: 121968 -> 121875 (-0.08%)
VALU: 67427 -> 67404 (-0.03%)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32565>
2024-12-10 20:53:53 +00:00
Georg Lehmann
e78e63e3fe nir: add unsigned upper bound support for f2i32
Foz-DB Navi21:
Totals from 649 (0.82% of 79395) affected shaders:
CodeSize: 2330592 -> 2314112 (-0.71%)
Latency: 2068161 -> 2053370 (-0.72%)
InvThroughput: 346583 -> 329425 (-4.95%)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32565>
2024-12-10 20:53:53 +00:00