Commit graph

6733 commits

Author SHA1 Message Date
Georg Lehmann
41e82b8b8e nir: sink is_subgroup_invocation_lt_amd
Having it closer to the branches means we can eliminate an exec copy.

Foz-DB Navi31:
Totals from 11615 (14.63% of 79395) affected shaders:
Instrs: 6804372 -> 6804903 (+0.01%); split: -0.04%, +0.05%
CodeSize: 33684672 -> 33680584 (-0.01%); split: -0.07%, +0.05%
VGPRs: 578616 -> 578604 (-0.00%)
SpillSGPRs: 1506 -> 1304 (-13.41%)
Latency: 29817034 -> 29821320 (+0.01%); split: -0.03%, +0.05%
InvThroughput: 3581587 -> 3581217 (-0.01%); split: -0.02%, +0.01%
VClause: 124826 -> 124782 (-0.04%); split: -0.04%, +0.00%
SClause: 187916 -> 187645 (-0.14%); split: -0.27%, +0.13%
Copies: 520969 -> 510027 (-2.10%); split: -2.20%, +0.10%
PreSGPRs: 442584 -> 421344 (-4.80%)
VALU: 3810755 -> 3810267 (-0.01%); split: -0.01%, +0.00%
SALU: 763402 -> 752650 (-1.41%); split: -1.48%, +0.07%

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>
2024-09-26 14:29:14 +00:00
Georg Lehmann
bcfc5c09fa amd: add offset to is_subgroup_invocation_lt_amd
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>
2024-09-26 14:29:13 +00:00
Marek Olšák
09e64e3682 nir/opt_shrink_vectors: shrink memory loads, not just IO
The problem with radeonsi+ACO is that UBO loads from vec4 uniforms using
only 1 component always load all 4 components. This fixes that.

We are only interested in shrinking UBO and SSBO loads, but I added more
intrinsics because why not.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29384>
2024-09-26 03:01:38 +00:00
Timothy Arceri
f6e7520b13 glsl: remove now unused linker code
This has all be replaced by a nir based linker implementation.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
2024-09-25 09:39:44 +00:00
Timothy Arceri
fe9b93fc1c nir: handle wildcard array deref
Here we add handling of wildcard array derefs when attempting to mark
an io as partially used rather than hitting an assert.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
2024-09-25 09:39:44 +00:00
Timothy Arceri
6bb6b0e5ad nir: add nir_intrinsic_deref_implicit_array_length intrinsic
This will be used to handle .length() calls on unsized arrays

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
2024-09-25 09:39:44 +00:00
Timothy Arceri
60937b5286 nir: add implicit_conversion_prohibited field to nir_parameter
Will be used in link time validation in following patches.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
2024-09-25 09:39:44 +00:00
Timothy Arceri
5645495156 nir: store variable mode in nir_parameter
This will be used by the nir glsl linker in following patches.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
2024-09-25 09:39:44 +00:00
Timothy Arceri
89a2411c54 nir: serialize nir_parameter type
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
2024-09-25 09:39:44 +00:00
Timothy Arceri
6ff3e87e5f nir: add function in/outs to variable modes
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
2024-09-25 09:39:44 +00:00
Timothy Arceri
1cb115abd2 nir: add nir_function_impl_clone_remap_globals()
This will be use by the glsl nir linker when we are combining
different shaders from the same shader stage that might have multiple
declarations of global variables across the different shaders.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
2024-09-25 09:39:43 +00:00
Timothy Arceri
7a1061e0dd nir: add max_ifc_array_access field to vars
This will be used in following patches by the nir based glsl
linker code.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
2024-09-25 09:39:43 +00:00
Timothy Arceri
7c5b21c032 glsl: add support for converting global instructions to NIR
NIR doesn't really support global instructions such as global val
initilisation. So here we add functionality to glsl_to_nir() to
put these instructions into a temporary function that will be
later inlined into main.

We give the function a name starting with gl_mesa_tmp_ as functions
starting with gl_ are reserved and will not have any clashes with
user functions, we finish the name with the blake3 of the shader
source to avoid conflicts with multiple shaders attached to a single
stage.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
2024-09-25 09:39:43 +00:00
Georg Lehmann
e0bcab953d nir: add amd shared append/consume
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31075>
2024-09-19 16:21:47 +00:00
Boris Brezillon
eeb3512498 nir/lower_ssbo: Extend the load_ssbo_address intrinsic to pass an offset
On Mali(Valhall), the bounds checking can be done when in hardware, but
for this to work properly, we need to pass the offset to the
nir_load_ssbo_address() intrinsic.

Add an offset source to the intrinsic, and adjust the lowering pass
to conditionally lower the offset addition.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31164>
2024-09-18 13:45:57 +00:00
Boris Brezillon
adadb097a3 nir/lower_ssbo: Add an option to conditionally lower loads
On Mali(Valhall), we have a way to load SSBO data without going through
an SSBO index -> global address translation, so let's provide a way
to tell nir_lower_ssbo() when it shouldn't lower loads.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31164>
2024-09-18 13:45:57 +00:00
Georg Lehmann
a3d6a770c0 nir/instr_set: fix fp_fast_math
We can't just ignore the flags of the match, we need the union.

Fixes: 666647acae ("nir: track some float controls bits per instruction")

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31195>
2024-09-17 20:00:03 +00:00
Ian Romanick
6a09d33549 nir: Add a pass to generate BFI instructions from logical operations
Inspired by a commit message in !30934, I set about optimizing the code
generated for nir_copysign. It would be possible to just implement an
opt_algebraic pattern for the specific values used by nir_copysign, but
this casts a slightly larger net.

As noted in a comment in the code, there may be variations of the
pattern that this pass misses. The opt_algebraic pattern would miss them
too.

v2: Use nir_def_replace. Suggested by Alyssa. Allow more "root"
instruction types. Suggested by Georg.

v3: Treat extract_u16(x, 0) as (x & 0x0000ffff), and treat extract_u8(x,
0) as (x & 0x000000ff).

v4: Use nir_scalar. Suggested by Georg.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31006>
2024-09-13 00:21:00 +00:00
Ian Romanick
057c7c9f53 nir/algebraic: Recognize open-coded bitfield_reverse in XCOM 2
The XCOM 2 shaders in my shader-db use iadd instead of ior.

No fossil-db changes on any Intel platform.

shader-db:

All Intel platforms had similar results. (Meteor Lake shown)
total instructions in shared programs: 19787210 -> 19787034 (<.01%)
instructions in affected programs: 1187 -> 1011 (-14.83%)
helped: 6 / HURT: 0

total cycles in shared programs: 906024436 -> 906012612 (<.01%)
cycles in affected programs: 72978 -> 61154 (-16.20%)
helped: 6 / HURT: 0

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31006>
2024-09-13 00:21:00 +00:00
Rhys Perry
97f4250a7c nir: skip opt_loop_peel_initial_break if continue block only has phis
Doing that optimization wouldn't do anything useful in this case.

nir_block_has_non_copy() is used by opt_loop_peel_initial_break().

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31002>
2024-09-12 23:36:58 +00:00
Rhys Perry
8410b4cdd6 nir/tests: add some loop peeling tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31002>
2024-09-12 23:36:58 +00:00
Rhys Perry
64ac601049 nir/opt_loop: skip peeling if the loop ends with any kind of jump
Any kind of jump prevents us from moving it to the top of the loop, not
just breaks.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 6b4b044739 ("nir/opt_loop: add loop peeling optimization")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31002>
2024-09-12 23:36:58 +00:00
Rhys Perry
af3b099e0a nir/opt_loop: skip peeling if the break is non-trivial
If this nir_if contains continues or other breaks, we can't move it
outside the loop.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 6b4b044739 ("nir/opt_loop: add loop peeling optimization")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31002>
2024-09-12 23:36:57 +00:00
Rhys Perry
4f44a944bb nir/opt_if: fix fighting between split_alu_of_phi and peel_initial_break
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 6b4b044739 ("nir/opt_loop: add loop peeling optimization")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11822
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31002>
2024-09-12 23:36:57 +00:00
Georg Lehmann
7fa7812219 nir: merge out of loop decision with nir_can_move_instr logic
One place to modify instead of two when adding new intrinsics here.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30906>
2024-09-12 21:49:34 +00:00
Georg Lehmann
91f8e32a85 nir/opt_sink: do not sink inverse_ballot out of loops
Inverse_ballot result is undefined if the input is not dynamically uniform.
And sinking out of loops might make the input divergent.

Fixes: 18a0ff137f ("nir: sink/move inverse_ballot like moves")

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30906>
2024-09-12 21:49:34 +00:00
Georg Lehmann
1ec3cc2aed nir/opt_sink: do not sink load_ubo_vec4 out of loops
Same reason as for load_ubo.

Fixes: d199d65c3a ("nir/nir_opt_move,sink: Include load_ubo_vec4 as a load_ubo instr.")

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30906>
2024-09-12 21:49:34 +00:00
Caio Oliveira
1e7f1c2039 nir: Allow Mesh/Task to use implicit LOD when DERIVATIVE_GROUP is set
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30956>
2024-09-10 18:22:42 +00:00
David Heidelberg
6bf7b5bcd8 nir_lower_mem_access_bit_sizes: Assert when 0 components or bits are requested
Prevent the accidental passing of 0 components or bits, as it makes no sense.

Cc: mesa-stable
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Suggested-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31103>
2024-09-10 11:17:48 +00:00
Ian Romanick
a780305818 nir/algebraic: Optimize more comparisons with b2f
shader-db:

All Intel platforms had similar results. (Meteor Lake shown)
total instructions in shared programs: 19781108 -> 19772614 (-0.04%)
instructions in affected programs: 372638 -> 364144 (-2.28%)
helped: 2915 / HURT: 0

total cycles in shared programs: 905907644 -> 905822682 (<.01%)
cycles in affected programs: 5573453 -> 5488491 (-1.52%)
helped: 2363 / HURT: 234

LOST:   42
GAINED: 16

fossil-db:

All Intel platforms had similar results. (Meteor Lake shown)
Totals:
Instrs: 152519634 -> 152519610 (-0.00%)
Cycle count: 17122707642 -> 17122710974 (+0.00%); split: -0.00%, +0.00%

Totals from 5 (0.00% of 633222) affected shaders:
Instrs: 2827 -> 2803 (-0.85%)
Cycle count: 83089 -> 86421 (+4.01%); split: -0.12%, +4.13%

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31068>
2024-09-10 04:15:58 +00:00
Alyssa Rosenzweig
b7542c4390 nir: CSE comparisons in atan2
Same code generated on AGX but simplified NIR.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>
2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig
7546ae96a7 nir: drop NaN fixup for atan
this existed due to the min/max, per the comment. now that we don't do min/max,
the whole routine is NaN correct so the fixup is pointless.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>
2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig
ab8547a002 nir: push up abs in atan2 calculation
everybody has abs on fmul, not everyone has abs on bcsel. should help agx and
bifrost.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>
2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig
398e1ad46c nir: fuse ffma for atan range fixup
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>
2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig
47e7cd268c nir: negate an expression in atan
we're going to fix up the sign immediately anyway.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>
2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig
5318b8868b nir: simplify atan range reduction fixup
the original version sure is creative.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>
2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig
87b99d5797 nir: use copysign for atan
this does two things:

* ignores sign of negative numbers which let us play fast and loose later in th
  series
* avoids an expensive fsign instruction in favour of a cheap bitwise op

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>
2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig
95215a094a nir: extend copysign for no-integer hw
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>
2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig
0a4a0df283 nir: push down fabs for atan
worse in terms of NIR instruction count but lets the fabs fold easier. (on agx,
which has fabs on comparisons and fmul but not on bcsel. should be no worse if
ISA has fabs on all 3.)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>
2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig
8579375777 nir: simplify atan range reduction
just implement what the comment says, don't be clever. the clever thing is worse
on all architectures i'm familiar with, because the fdiv will turn into
fmul+frcp.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>
2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig
a32b1a975d nir: correct comment for atan range reduction
the code did not match the comment, blew a sign.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>
2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig
4fc3e34f2f nir: use Horner's method for atan
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>
2024-09-07 00:54:35 +00:00
Georg Lehmann
6378bbaa82 nir/opt_algebraic: reassociate constants in ior(iand) chains
Mostly affects one F1_23 shader that packs bitfields bit by bit.

Totals from 3 (0.00% of 79395) affected shaders:
Instrs: 5004 -> 4202 (-16.03%)
CodeSize: 30992 -> 23952 (-22.72%)
Latency: 28894 -> 28464 (-1.49%)
InvThroughput: 4095 -> 3934 (-3.93%)
Copies: 363 -> 376 (+3.58%)
PreVGPRs: 110 -> 109 (-0.91%)
VALU: 3035 -> 2504 (-17.50%)
SALU: 463 -> 459 (-0.86%)

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31009>
2024-09-05 22:04:05 +00:00
Caio Oliveira
74be809237 compiler: Allow derivative_group to be used for all stages in shader_info
These will now also be used by stages that have workgroups.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30950>
2024-09-03 20:03:18 +00:00
Alyssa Rosenzweig
f977c52b84 ail: swallow up formats
ail is a more sensible place for the format tables to live. this does create a
bit of dependency soup but hey.

nfc

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30981>
2024-09-02 23:27:14 +00:00
Alyssa Rosenzweig
afc7557cb6 nir,agx: make block image store an image() intrinsic
so we can do a bindless version

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30981>
2024-09-02 23:27:14 +00:00
Alyssa Rosenzweig
4941d71846 nir/divergence_analysis: handle load_agx
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30981>
2024-09-02 23:27:14 +00:00
Qiang Yu
d43c5003fc nir: add skip_lower_packing_ops shader compile option
Drivers like radeonsi and radv prefer to not lowering
some packing ops.

Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30885>
2024-08-30 05:46:51 +00:00
Ian Romanick
c160ed212e nir/divergence: resource_intel is less divergent than you thought
When the non_uniform flag is not set, the result is never divergent.

v2: Remove redundant assignment to is_divergent. Suggested by Caio.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
2024-08-30 03:39:30 +00:00
Ian Romanick
f11a414645 nir/algebraic: Remove incorrect bfi of iand pattern
The comment says, "This expands to (b & 3) & ~0xc which is (b & 3) &
3." This is not correct. ~0xc is actually 0xfffffff3.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: #11695
Fixes: 1c7e35d4e0 ("nir/algebraic: Optimize some bit operation nonsense observed in some shaders")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30913>
2024-08-29 22:21:55 +00:00