It's unused and if we ever want to use it again we should make it an alu
opcode instead.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21445>
Hoping that I didn't miss any, this *should* add assertions
to all functions and passes which explicitly handle 'nir_loop'.
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>
If we pass a physical 2D texture descriptor we should also pass 2
coords. Otherwise it just uses the random content in the second
register which ends up funny sometimes.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20696>
Driver workarounds for game bugs can be easily broken. This one
shouldn't be applied to meta shaders and this restores previous logic.
Fixes: da32cbb5c6 ("aco: fix missing uses of MRT output flags")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20637>
If a shader doesn't export any color targets and instead only exports
mrtz, the discard early exit block should match.
Fixes artifacts on Lara in Rise of the Tomb Raider benchmark and hair in
The Witcher 3 (classic).
https://reviews.llvm.org/D128185
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: bc8da20dda ("aco: export MRT0 instead of NULL on GFX11")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20345>
Similar to emit_gfx10_wave64_bpermute, but uses the new
v_permlane64_b32 instruction to swap data between wave halves.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20293>
Different sequences are emitted for these, so it makes sense to
have different opcodes too.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20293>
There are a lot of optimizations in opt_algebraic that match ('ine', a,
0), but there are almost none that match i2b. Instead of adding a huge
pile of additional patterns (including variations that include both ine
and i2b), always lower i2b to a != 0.
At this point in the series, it should be impossible for anything to
generate i2b, so there /should not/ be any changes.
The failing test on d3d12 is a pre-existing bug that is triggered by
this change. I talked to Jesse about it, and, after some analysis, he
suggested just adding it to the list of known failures.
v2: Don't rematerialize i2b instructions in dxil_nir_lower_x2b.
v3: Don't rematerialize i2b instructions in zink_nir_algebraic.py.
v4: Fix zink-on-TGL CI failures by calling nir_opt_algebraic after
nir_lower_doubles makes progress. The latter can generate b2i
instructions, but nir_lower_int64 can't handle them (anymore).
v5: Add back most of the hunk at line 2125 of nir_opt_algebraic.py. I
had accidentally removed the f2b(bf2(x)) optimization.
v6: Just eliminate the i2b instruction.
v7: Remove missed i2b32 in midgard_compile.c. Remove (now unused)
emit_alu_i2orf2_b1 function from sfn_instr_alu.cpp. Previously this
function was still used. 🤷
No shader-db changes on any Intel platform.
All Intel platforms had similar results. (Ice Lake shown)
Instructions in all programs: 141165875 -> 141165873 (-0.0%)
Instructions helped: 2
Cycles in all programs: 9098956382 -> 9098956350 (-0.0%)
Cycles helped: 2
The two Vulkan shaders are helped because of the "new" (('b2i32',
('ine', ('ubfe', a, b, 1), 0)), ('ubfe', a, b, 1)) algebraic pattern.
Acked-by: Jesse Natalie <jenatali@microsoft.com> [earlier version]
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Daniel Schürmann <daniel@schuermann.dev> [earlier version]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>
Fixes regressions on GFX6 and the RAGE2 workaround.
Fixes: a297ac10a4 ("radv,aco: stop lowering FS outputs in NIR")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20154>
This was a bad idea because:
- it diverges too much with the fragment shader epilog
- it doesn't allow to implement alpha-to-coverage via MRTZ correctly
- it was supposed to be used by LLVM but this never happened
Reverting this back allows us to fix alpha-to-coverage via MRTZ
on GFX11 easily, including for fragment shader epilogs.
fossils-db (NAVI21):
Totals from 20411 (15.13% of 134913) affected shaders:
VGPRs: 972056 -> 971400 (-0.07%); split: -0.08%, +0.01%
CodeSize: 92284804 -> 92295392 (+0.01%); split: -0.05%, +0.06%
MaxWaves: 465010 -> 465166 (+0.03%); split: +0.03%, -0.00%
Instrs: 17034162 -> 17034963 (+0.00%); split: -0.00%, +0.01%
Latency: 252013190 -> 251971764 (-0.02%); split: -0.03%, +0.02%
InvThroughput: 45859625 -> 45842556 (-0.04%); split: -0.04%, +0.01%
VClause: 324627 -> 324629 (+0.00%); split: -0.03%, +0.03%
SClause: 672918 -> 672826 (-0.01%); split: -0.05%, +0.04%
Copies: 1172126 -> 1158152 (-1.19%); split: -1.20%, +0.01%
Branches: 420602 -> 420604 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1025441 -> 1025481 (+0.00%)
PreVGPRs: 861787 -> 860650 (-0.13%); split: -0.17%, +0.03%
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20126>
16-bit isn't possible. Note that this is currently style broken for
compressed formats because the w channel is never written to.
Ported from RadeonSI ('radeonsi/gfx11: fix alpha-to-coverage with
stencil or samplemask export')
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20126>
This new behaviour will let us insert exports in GS copy shader control
flow.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18898>
Fixes crucible test func.shader.dualsrc_mrt0_undef on polaris10.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: 22.3 mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19806>