Commit graph

5257 commits

Author SHA1 Message Date
Alyssa Rosenzweig
140616d26a brw: scalarize even 64-bit scratch access
No, I don't know how this worked before, thanks for asking.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>
2026-04-09 21:02:16 +00:00
Alyssa Rosenzweig
15b11635a2 brw: Move intel_nir_opt_peephole_imul32x16 later in compilation
(Split by Ken out of a patch authored by Alyssa.)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>
2026-04-09 21:02:16 +00:00
Kenneth Graunke
e5598166b0 brw: Have brw_nir_apply_key call brw_nir_lower_simd for all stages
brw_nir_apply_key typically knows the dispatch width (it's fixed for
geometry stages, and we clone the NIR for compute and mesh shaders).
For compute/mesh, this was the very next thing called.  For the others,
if we know the width, there's no reason not to lower it.

Scratch lowering will start using load_simd_width_intel soon, so we
need it to work in all stages.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>
2026-04-09 21:02:16 +00:00
Kenneth Graunke
765d74eebe brw: Set nir->info.{min,max}_subgroup_size in brw_nir_apply_key
This records the actual SIMD width we selected for the shader, in
all cases except fragment shaders, where we don't know it yet.

MR 37258 notes that "Backends can update [these fields] when they make
new decisions about the subgroup size" - which is what we now do.

Note that nir->info.api_subgroup_size may be different than min/max
subgroup size on Vulkan prior to SPV1.6/VK_EXT_subgroup_size_control,
so we do not alter that.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>
2026-04-09 21:02:16 +00:00
Kenneth Graunke
d7d2d7aceb brw: Support load_simd_width_intel for fragment shaders
This lets us emit NIR code based on the SIMD size.  For non-fragment
stages, we'll replace it with a constant and optimize, but for FS,
we delay it until the backend.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>
2026-04-09 21:02:16 +00:00
Kenneth Graunke
cac9f670d1 intel/compiler: Use nir_static_workgroup_size helper
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40843>
2026-04-09 21:02:16 +00:00
Tapani Pälli
3ab9145393 intel/compiler: implement dummy mov for Wa_18035690555
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37804>
2026-04-09 07:30:01 +00:00
Tapani Pälli
4bb68d7474 intel/compiler: expose inferred_exec_pipe from scoreboarding
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37804>
2026-04-09 07:30:01 +00:00
Sagar Ghuge
2bf520340d intel/compiler: Remove unused brw_nir_memclear_global helper
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This is a dead code, we can remvoe it for now.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenz.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40801>
2026-04-09 05:06:05 +00:00
José Roberto de Souza
1e052f0bb5 intel/brw: Remove unsed functions to get data port message type
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40832>
2026-04-08 17:44:52 +00:00
Alyssa Rosenzweig
73701c305e brw: wire up MACL
New on Xe2, this instruction enables faster 32x32 integer multiply at the cost
of extra accumulator usage. Add it to the opcode list for future use.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40833>
2026-04-08 16:07:35 +00:00
Rhys Perry
463e3643f2 nir: add and use block predecessor helpers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40242>
2026-04-08 15:06:32 +00:00
Ian Romanick
cfdb3ddb93 brw: brw_reg::nr for an accumulator is not part of the offset
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Without this, reg_offset will return 1024 for acc0. This causes
has_invalid_dst_region to decide that the destination region is invalid
(because 1024 != 0), and the lowering code tries to treat the floating
point accumulators as integers. It's a mess.

v2: Add and use set_gfx_platform. Suggested by Caio.

Fixes: 937373eb25 ("i965/fs: Handle fixed HW GRF subnr in reg_offset().")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40716>
2026-04-08 00:36:39 +00:00
Ian Romanick
ffdc310bf1 brw/const: Don't allow type changes when accumulators are involved
Integer accumulators and float accumulators do not occupy the same bits,
so the types cannot be arbitrarily changed.

No shader-db or fossil-db changes on any Intel platform.

v2: Use is_accumulator() instead if brw_reg_is_arf(). Add an extra test
to show the desired behavior when an accumulator is not
involved. Suggested by Caio.

Fixes: 64c251bb3a ("intel/fs: Combine constants for SEL instructions too")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40638>
2026-04-07 23:37:26 +00:00
Caio Oliveira
3b4a7f2d1a brw: In "Clear Accumulator" workaround, never set predicate_inverse
Since there's no predicate, the inverse bit is not relevant, so always
set it to false instead of using whatever was set by the previous
instruction.  Hardware already ignores this but will make verifying
later changes easier.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40800>
2026-04-07 20:33:46 +00:00
Alyssa Rosenzweig
959ec01ac8 brw/nir_lower_fs_load_output: optimize pixel coord
this saves a conversion or two.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40829>
2026-04-07 19:32:15 +00:00
Alyssa Rosenzweig
1d0f42c264 brw/eu_emit: relax assertion to allow ARF NULL
On new platforms, it's valid to use a NULL destination in conjunction with a
cmod, where you care about the implicit flag write but you don't need to clobber
any GRF. Something like:

   if (x * y > z) {

compiling (with fast-math) to

        mad.gt.f0 _, -z, x, y
   (f0) if

This patch allows us to emit that instruction.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40829>
2026-04-07 19:32:15 +00:00
Alyssa Rosenzweig
2ed6ff728a brw: explicitly pad tgl_swsb
This lets us treat it as a packed data structure without worrying about garbage.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40829>
2026-04-07 19:32:15 +00:00
Sagar Ghuge
f0ae58df12 intel/compiler: Handle TerminateOnFirstHit in ray query execution
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Once commited and have AABB or triangle intersection found, terminate
the traversal if TerminateOnFirstHit ray flag is present.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40773>
2026-04-06 10:00:05 -07:00
Arkady Shlykov
7f7ba20cca brw: Implement divergent atomics fusion optimization (single message approach)
For an atomic with a divergent addr generates a CFG grouping the same addrs
values together and emits a single atomic with fused data covering
the subgroup. Lanes with other addr values perform a default atomic.

Co-authored-by: Jhanani Thiagarajan <jhanani.thiagarajan@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40631>
2026-04-03 12:17:01 +00:00
Lionel Landwerlin
fab6f84126 brw: make the program key available on pass_tracker
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40631>
2026-04-03 12:17:01 +00:00
Caio Oliveira
0bf3aaedb1 brw: Always use split send in generator
Instead of generating special single source send in some cases, always
use the split send (called SENDS pre-Xe, and the only option in Xe).
Having code-path for single source was relevant for old Gfx versions,
but for Gfx9+ split send is always available.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40755>
2026-04-02 18:31:02 +00:00
Kenneth Graunke
ca3cabd2f8 brw: Use nir_texop_resinfo_intel for query_levels and txs
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This eliminates the need to special case query_levels.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40451>
2026-03-29 12:53:10 +00:00
Lionel Landwerlin
fa523aedd0 brw: fence SLM writes between workgroups
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
On LSC platforms the SLM writes are unfenced between workgroups. This
means a workgroup W1 finishing might have uncompleted SLM writes.
Another workgroup W2 dispatched after W1 which gets allocated an
overlapping SLM location might have writes that race with the previous
W1 operations.

The solution to this is fence all write operations (store & atomics)
of a workgroup before ending the threads. We do this by emitting a
single SLM fence either at the end of the shader or if there is only a
single unfenced right, at the end of that block.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13924
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40430>
2026-03-26 22:38:55 +00:00
Georg Lehmann
eef0fa22e0 brw: preserve fp_math_ctrl when lowering cmat alu
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40630>
2026-03-26 13:15:50 +00:00
Kenneth Graunke
204af7e09f intel/nir: Replace tg4 with txl/txb/tex when splitting texture residency
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
textureGather() returns the four taps that would have been filtered
together to produce the value that ordinary texturing operations
would return.  As such, it should access the same data, so we can use
either interchangeably when we're only checking for residency and not
returning the actual data.

This allows us to mask out some unneeded registers.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40590>
2026-03-24 16:06:29 +00:00
Kenneth Graunke
605ef577b3 intel/nir: Generalize lower_tex_compare to split_tex_residency
This splits a single texture-with-residency operation into two halves,
one which returns texture data, and another which queries residency.

We're currently using this only for a shadow sampling workaround,
but the technique is more broadly applicable, if we ever wanted.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40590>
2026-03-24 16:06:29 +00:00
Kenneth Graunke
dc760104ba intel/nir: Set new image intrinsic parameters via builder helpers
A bit less code.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40590>
2026-03-24 16:06:28 +00:00
Kenneth Graunke
9d07e85287 intel/nir: Use txf builder in intel_nir_lower_sparse
Newer helpers make NIR easier to write.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40590>
2026-03-24 16:06:28 +00:00
Tapani Pälli
c75256b2ab intel/compiler: move validation assert after brw_shader_debug_log
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
When validation fails we print instructions to use INTEL_DEBUG=shaders
but that will not help if we assert before dumping shader debug log.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40529>
2026-03-24 04:54:31 +00:00
Ian Romanick
b5e023777c brw: Change the flags written by some CMP
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
One frustrating thing about the CMP and CMPN instructions is that they
always write the flags. Sometimes, however, it is desirable to generate
the comparison result without modifying the flags. This would,
theoretically, reduce false dependencies that restrict the scheduler's
ability to rearrange code, create more opportunities for cmod
propagation, save a kitten from a tree, and make a rainbow.

Consider this sequence:

           cmp.ge.f0.0(8)  g103<1>F        g101<8,8,1>F    g39<8,8,1>F
           cmp.nz.f0.0(8)  null<1>D        g81<8,8,1>D     0D
   (+f0.0) if(8)   JIP:  LABEL19         UIP:  LABEL19

It would be advantageous to put the first CMP between the second CMP and
the IF, but this cannot be done since the IF depends on the flags generated
by the second CMP.

This pass enables this rescheduling by changing the first CMP to write
to a different flags register.

           cmp.ge.f1.0(8)  g103<1>F        g101<8,8,1>F    g39<8,8,1>F
           cmp.nz.f0.0(8)  null<1>D        g81<8,8,1>D     0D
   (+f0.0) if(8)   JIP:  LABEL19         UIP:  LABEL19

Sometimes this is also possible by using a different instruction.  For
example, consider

           cmp.l.f0.0(8)   g103<1>D        g101<8,8,1>D    0D

This produces 0xffffffff when g101 negative and zero otherwise. This
instruction, which does not modifiy the flag, also produces these results:

           asr(8)          g103<1>D        g101<8,8,1>D    31D

Gfx9 platforms take a hit on instructions due to the instruction added
at the end of short shaders by brw_workaround_source_arf_before_eot.

shader-db:

Lunar Lake, Meteor Lake, DG2, Tiger Lake, and Ice Lake had similar results. (Lunar Lake shown)
total instructions in shared programs: 17089451 -> 17088766 (<.01%)
instructions in affected programs: 766613 -> 765928 (-0.09%)
helped: 653 / HURT: 0

total cycles in shared programs: 888832986 -> 887873068 (-0.11%)
cycles in affected programs: 549441852 -> 548481934 (-0.17%)
helped: 10474 / HURT: 130

LOST:   9
GAINED: 0

Skylake
total instructions in shared programs: 19037976 -> 19049719 (0.06%)
instructions in affected programs: 3979914 -> 3991657 (0.30%)
helped: 503 / HURT: 12303

total cycles in shared programs: 867918242 -> 866930801 (-0.11%)
cycles in affected programs: 512773919 -> 511786478 (-0.19%)
helped: 13858 / HURT: 66

LOST:   32
GAINED: 0

fossil-db:

Lunar Lake
Totals:
Instrs: 925023504 -> 924950382 (-0.01%); split: -0.01%, +0.00%
Cycle count: 106348432916 -> 106116809009 (-0.22%); split: -0.22%, +0.00%
Spill count: 3423988 -> 3423930 (-0.00%); split: -0.00%, +0.00%
Fill count: 4877087 -> 4876960 (-0.00%); split: -0.01%, +0.00%
Max dispatch width: 49087552 -> 49078448 (-0.02%); split: +0.00%, -0.02%

Totals from 1099332 (54.44% of 2019443) affected shaders:
Instrs: 742670473 -> 742597351 (-0.01%); split: -0.01%, +0.00%
Cycle count: 100455549635 -> 100223925728 (-0.23%); split: -0.23%, +0.00%
Spill count: 3384366 -> 3384308 (-0.00%); split: -0.00%, +0.00%
Fill count: 4837434 -> 4837307 (-0.00%); split: -0.01%, +0.00%
Max dispatch width: 26725152 -> 26716048 (-0.03%); split: +0.00%, -0.03%

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 997603774 -> 997529238 (-0.01%); split: -0.01%, +0.00%
Cycle count: 93904012762 -> 93646730006 (-0.27%); split: -0.28%, +0.00%
Spill count: 3710155 -> 3710125 (-0.00%); split: -0.00%, +0.00%
Fill count: 5032908 -> 5032819 (-0.00%); split: -0.01%, +0.00%
Max dispatch width: 37929640 -> 37811560 (-0.31%)

Totals from 1334920 (58.52% of 2281134) affected shaders:
Instrs: 817377787 -> 817303251 (-0.01%); split: -0.01%, +0.00%
Cycle count: 88468851658 -> 88211568902 (-0.29%); split: -0.29%, +0.00%
Spill count: 3663353 -> 3663323 (-0.00%); split: -0.00%, +0.00%
Fill count: 4991629 -> 4991540 (-0.00%); split: -0.01%, +0.00%
Max dispatch width: 20245832 -> 20127752 (-0.58%)

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
Totals:
Instrs: 1013433769 -> 1013363273 (-0.01%); split: -0.01%, +0.00%
Cycle count: 85766921182 -> 85509316620 (-0.30%); split: -0.31%, +0.00%
Spill count: 3903923 -> 3903944 (+0.00%); split: -0.00%, +0.00%
Fill count: 6801983 -> 6801948 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 37896320 -> 37805320 (-0.24%); split: +0.00%, -0.24%

Totals from 1333814 (58.54% of 2278396) affected shaders:
Instrs: 830200531 -> 830130035 (-0.01%); split: -0.01%, +0.00%
Cycle count: 80746184101 -> 80488579539 (-0.32%); split: -0.32%, +0.01%
Spill count: 3855771 -> 3855792 (+0.00%); split: -0.00%, +0.00%
Fill count: 6755513 -> 6755478 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 20301456 -> 20210456 (-0.45%); split: +0.00%, -0.45%

Skylake
Totals:
Instrs: 519389758 -> 519874108 (+0.09%); split: -0.00%, +0.10%
Cycle count: 57932316132 -> 57789433956 (-0.25%); split: -0.25%, +0.00%
Spill count: 636741 -> 636715 (-0.00%); split: -0.01%, +0.00%
Fill count: 860470 -> 860357 (-0.01%); split: -0.02%, +0.00%
Max dispatch width: 32527800 -> 32481792 (-0.14%); split: +0.00%, -0.14%

Totals from 1080380 (62.25% of 1735462) affected shaders:
Instrs: 411976399 -> 412460749 (+0.12%); split: -0.00%, +0.12%
Cycle count: 54291447615 -> 54148565439 (-0.26%); split: -0.27%, +0.00%
Spill count: 602993 -> 602967 (-0.00%); split: -0.01%, +0.00%
Fill count: 734459 -> 734346 (-0.02%); split: -0.02%, +0.00%
Max dispatch width: 18626096 -> 18580088 (-0.25%); split: +0.00%, -0.25%

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38978>
2026-03-24 01:31:26 +00:00
Ian Romanick
31de96d321 brw/lower_regioning: Allow integer conversions in SEL
The Bspec says that SEL sources and destination can be any mix of *B,
*W, and *D. We should allow those. Specifically, without this change,
this instruction

    sel.sat.l(8) v548:UD, v899:D, 255d

gets unnecessarily split into two instructions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38978>
2026-03-24 01:31:26 +00:00
Ian Romanick
dff1e8ae28 brw: Handle scalars and swizzles correctly in is_const_zero
v2: Massive simplification based on feedback from Ken.

Fixes: 96cde9cc01 ("intel/fs: Emit better code for bfi(..., 0)")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38978>
2026-03-24 01:31:25 +00:00
Ian Romanick
985ace332b brw/algebraic: Allow mixed types in saturate constant folding
Prevents assertion failures in func.shader-ballot.basic.q0 and other
tests starting with "nir/algebraic: Optimize some b2f of integer
comparison".

Vector immediates, bfloat, and 8-bit floats are still not supported.

v2: Almost complete re-write based on suggestions from Ken.

v3: Don't retype() on a brw_imm_f value.

Fixes: f8e54d02f7 ("intel/compiler: Relax mixed type restriction for saturating immediates")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38978>
2026-03-24 01:31:25 +00:00
Marek Olšák
fa5175023b Final rename of sha1 names to blake3
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:28 +00:00
Marek Olšák
ae9ea27e0d Rename *_sha1 names to *_blake3
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:28 +00:00
Marek Olšák
102d41799b Rename more sha and sha1 names to blake3
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:28 +00:00
Marek Olšák
d4831aaf5f Rename sha1_* and sha_* names to blake3_*
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:28 +00:00
Marek Olšák
c0ac992a2a Remove mesa-sha1.h
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:27 +00:00
Marek Olšák
53c64973e8 Inline _mesa_sha1_compute/format, remove the other unused ones
_mesa_sha1_format has a few remaining uses, so it's moved to build_id.c,
which is its last user.

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:27 +00:00
Marek Olšák
699f9d7066 Inline _mesa_sha1_init/update/final functions
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:27 +00:00
Marek Olšák
a965ada6ee Inline mesa_sha1, SHA1_CTX
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:27 +00:00
Marek Olšák
0da88d237a Inline SHA1_DIGEST_STRING_LENGTH
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:27 +00:00
Marek Olšák
110632f702 Inline SHA1_DIGEST_LENGTH
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:27 +00:00
Georg Lehmann
ec331cc48a nir: replace lower_ldexp with has_ldexp
I can be bothered to fix all the backends that don't set lower_ldexp,
and only two backends have ldexp anyway.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33900>
2026-03-20 08:15:08 +00:00
Iván Briano
fd556e54f6 brw: do not omit RT writes if dual_src_blend is on
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Dual source blending when one of the sources is not written to leaves
those values undefined, but the other should still be valid.
By omitting unwritten outputs, we ended up not writing anything at all
for the case that OUT1 is written to but OUT0 is undefined.

Fixes new CTS tests: dEQP-VK.pipeline.*.blend.dual_source.undefined_output.first*

Cc: mesa-stable
Signed-off-by: Iván Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40357>
2026-03-19 23:38:40 +00:00
Caio Oliveira
dcba49d7ef intel/compiler: Handle shuffle_*_intel intrinsics in bit size lowering
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40376>
2026-03-17 17:21:52 +00:00
Kenneth Graunke
9f77991751 brw: Simplify mark_last_urb_write_with_eot()
Just tag the last instruction, drop useless dead code elimination.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40328>
2026-03-12 21:40:37 +00:00
Kenneth Graunke
4bfa7a602c brw: Don't emit HALT_TARGET for VS/TCS/TES/GS
This isn't needed and will allow simplifications in the next patch.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40328>
2026-03-12 21:40:37 +00:00
Kenneth Graunke
2b6c6f8130 brw: Lower TCS single patch invocation ID calculations in NIR
This is a bit less code and also drops one more TCS-specific thing
from the "run" function.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40328>
2026-03-12 21:40:37 +00:00