Commit graph

4712 commits

Author SHA1 Message Date
Lionel Landwerlin
ff57c31696 brw: avoid invalid URB messages
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Some new CTS tests have geometry shader looking like this :

   void main()
   {
      gl_Position = gl_in[0].gl_Position;
      EmitVertex();
      EndPrimitive();
      // <-- some storage buffer write
   }

The generate shader has :
   - a message to write the position
   - a message to write to the storage buffer
   - a final message to end the thread

This generates an empty EOT URB messages which is apparently not legal
(simulation complains, HW hangs) :

send(8)         nullUD          g126UD          nullUD          0x04088007                0x00000000
                urb MsgDesc: offset 0 SIMD8 write masked  mlen 2 ex_mlen 0 rlen 0 { align1 1Q A@1 EOT };

Instead emit a write with actual data and the mask set at 0 to discard
the effect :

mov(8)          g127<1>UD       0x00000000UD                    { align1 WE_all 1Q };
mov(8)          g125<1>UD       0x00000000UD                    { align1 1Q };
send(8)         nullUD          g126UD          g125UD          0x04088007                0x00000040
                urb MsgDesc: offset 0 SIMD8 write masked  mlen 2 ex_mlen 1 rlen 0 { align1 1Q A@1 EOT };

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38243>
2025-11-05 17:18:09 +00:00
Ian Romanick
34fe598b39 brw: Correctly generate conditional modifier for BFN
Fixes: 4193895145 ("brw/cmod: Enable limited cmod propagation for BFN")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38251>
2025-11-05 16:52:56 +00:00
Kenneth Graunke
96b739b449 elk: Disable IO semantic validation when remapping patch offsets
Marek disabled this for brw in 2f6b4803ab
but elk also needs the fix.  Fixes issues in shader-db/open-subdiv/7 on
crocus targeting Haswell.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38231>
2025-11-05 10:58:00 +00:00
Alyssa Rosenzweig
17355f716b treewide: use UTIL_DYNARRAY_INIT
Instead of util_dynarray_init(&dynarray, NULL), just use
UTIL_DYNARRAY_INIT instead. This is more ergonomic.

Via Coccinelle patch:

    @@
    identifier dynarray;
    @@

    -struct util_dynarray dynarray = {0};
    -util_dynarray_init(&dynarray, NULL);
    +struct util_dynarray dynarray = UTIL_DYNARRAY_INIT;

    @@
    identifier dynarray;
    @@

    -struct util_dynarray dynarray;
    -util_dynarray_init(&dynarray, NULL);
    +struct util_dynarray dynarray = UTIL_DYNARRAY_INIT;

    @@
    expression dynarray;
    @@

    -util_dynarray_init(&(dynarray), NULL);
    +dynarray = UTIL_DYNARRAY_INIT;

    @@
    expression dynarray;
    @@

    -util_dynarray_init(dynarray, NULL);
    +(*dynarray) = UTIL_DYNARRAY_INIT;

Followed by sed:

    bash -c "find . -type f -exec sed -i -e 's/util_dynarray_init(&\(.*\), NULL)/\1 = UTIL_DYNARRAY_INIT/g' \{} \;"
    bash -c "find . -type f -exec sed -i -e 's/util_dynarray_init( &\(.*\), NULL )/\1 = UTIL_DYNARRAY_INIT/g' \{} \;"
    bash -c "find . -type f -exec sed -i -e 's/util_dynarray_init(\(.*\), NULL)/*\1 = UTIL_DYNARRAY_INIT/g' \{} \;"

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38189>
2025-11-04 13:39:48 +00:00
Lionel Landwerlin
53834ccb6a brw: disable io_semantic validation for mesh intrinsics
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2f6b4803ab ("nir/validate: expand IO intrinsic validation with nir_io_semantics")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38222>
2025-11-03 21:28:22 +00:00
Marek Olšák
2f6b4803ab nir/validate: expand IO intrinsic validation with nir_io_semantics
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
There are many workarounds.

v2: add more validation

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38113>
2025-11-02 02:21:46 +00:00
Ian Romanick
2e8b89ec60 elk: Apply vgrf127 workaround in more cases
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
No shader-db changes on Broadwell. Older platforms were not tested.

Fixes: e7b7d572b3 ("intel/fs/ra: Re-arrange interference setup")
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38122>
2025-10-31 22:55:53 +00:00
Ian Romanick
3e6af6c5bb brw: Apply Gfx9 vgrf127 workaround in more cases
No shader-db changes on any Intel platform.

fossil-db:

Skylake
Intel(R) HD Graphics 530 (SKL GT2)
Totals:
Cycle count: 57669758527 -> 57669757913 (-0.00%); split: -0.00%, +0.00%

Totals from 10 (0.00% of 1736875) affected shaders:
Cycle count: 274949 -> 274335 (-0.22%); split: -0.36%, +0.14%

This change is likely due to subtle differences of different registers
being allocated.

In addition, fossils/google-meet-clvk/BgBlur.1f58fdf742c27594.1.foz and
fossils/google-meet-clvk/Relight.1f58fdf742c27594.1.foz stopped failing
EU validation on Gfx9 platforms.

Closes: #14171
Fixes: e7b7d572b3 ("intel/fs/ra: Re-arrange interference setup")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38122>
2025-10-31 22:55:53 +00:00
Alyssa Rosenzweig
5f53e6edc0 intel: use util_is_aligned more
Coccinelle + filtering hunks manually.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38169>
2025-10-31 15:03:58 +00:00
Daniel Schürmann
10be538851 tree-wide: don't call nir_opt_constant_folding after nir_lower_flrp
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37195>
2025-10-30 19:28:07 +00:00
Caio Oliveira
3334284845 brw: Don't set destination of branch instructions
In Gfx9+ the destination should be set to ARF null in all those cases, the
use of IP was a requirement of old versions only.  The already zeroed
bits will encode ARF null, so no need to set.

Skipping the helper avoids setting unwanted bits (like hstride), which
in Gfx12+ are MBZ.

This patch adjust the expectations of the asm tests to remove the dst
type and dst stride fields -- will expect them all zeroed.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36454>
2025-10-30 17:18:15 +00:00
Caio Oliveira
8c45ff9acb brw: Set relevant immediate bits for Gfx9-11 in JIP and UIP helpers
This is better than using the generic helper since will not set unwanted
bits (e.g. hstride) and it is already handling their case for Gfx12+
anyway.

There's an extra helper now for the case where src1 is not used.  In
Gfx9-11 it needs to be set to ARF but with a matching type of src0.

Assembler was updated to follow the same approach.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36454>
2025-10-30 17:18:15 +00:00
Caio Oliveira
adc353da3c brw: Fix MOV_INDIRECT lowering for various platforms
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Even though some platforms support int64 they don't support indirect
movs with 64-bit values.  Effectively this is only supported for non-LP
Gfx9.

This fixes various tests in dEQP-VK.spirv_assembly.instruction.compute.untyped_pointers.*.push_constant.*64*
on BMG.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38125>
2025-10-30 16:06:42 +00:00
Caio Oliveira
538fd7266e brw: Fix EU validation of VxH and Vx1 region
Use same approach as the other code checking for this vstride.  Argument
could be made we want to reuse the same enum value for both the encoded
and decoded version, but for now follow the existing practice.

This will cause
dEQP-VK.spirv_assembly.instruction.compute.untyped_pointers.vulkan_memory_model.type_punning.load.push_constant.int64_to_uint64
and similar tests to fail validation on BMG.  Later patch will fix that.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38125>
2025-10-30 16:06:42 +00:00
Iván Briano
473119ab91 brw: plug some holes in brw_wm_prog_data
Remove two unused fields, and move a lonely boolean a bit up to plug the
remaining hole.

Because I was looking around and it bothered me.

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38116>
2025-10-28 20:24:23 +00:00
Sagar Ghuge
89fbcc8c34 brw/rt: fix ray_object_(direction|origin) for closest-hit shaders
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
We were returning world BVH level for origin/direction, this commit
fixes by retuning correct object BVH level origin/direction.

Fixes: aaff191356 ("brw/rt: fix ray_object_(direction|origin) for closest-hit shaders")
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36853>
2025-10-27 01:42:22 +00:00
Sagar Ghuge
3edeb1e191 brw/rt: Move nir_build_vec3_mat_mult_col_major helper to header
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36853>
2025-10-27 01:42:22 +00:00
Alyssa Rosenzweig
b824ef83ab util/dynarray: infer type in append
Most of the time, we can infer the type to append in
util_dynarray_append using __typeof__, which is standardized in C23 and
support in Jesse's MSMSVCV. This patch drops the type argument most of
the time, making util_dynarray a little more ergonomic to use.

This is done in four steps.

First, rename util_dynarray_append -> util_dynarray_append_typed

    bash -c "find . -type f -exec sed -i -e 's/util_dynarray_append(/util_dynarray_append_typed(/g' \{} \;"

Then, add a new append that infers the type. This is much more ergonomic
for what you want most of the time.

Next, use type-inferred append as much as possible, via Coccinelle
patch (plus manual fixup):

    @@
    expression dynarray, element;
    type type;
    @@

    -util_dynarray_append_typed(dynarray, type, element);
    +util_dynarray_append(dynarray, element);

Finally, hand fixup cases that Coccinelle missed or incorrectly
translated, of which there were several because we can't used the
untyped append with a literal (since the sizeof won't do what you want).

All four steps are squashed to produce a single patch changing every
util_dynarray_append call site in tree to either drop a type parameter
(if possible) or insert a _typed suffix (if we can't infer). As such,
the final patch is best reviewed by hand even though it was
tool-assisted.

No Long Linguine Meals were involved in the making of this patch.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38038>
2025-10-24 18:32:07 +00:00
Caio Oliveira
4f628c9e8c brw: Consolidate late lowering of int64 operations
Instead of doing selectively and with different supporting passes, just
run the complete set (special algebraic before and cleanup optimizations
after) at the end of brw_postprocess_nir_opts().

No changes to fossil-db on ICL, TGL, ACM and BMG.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35844>
2025-10-24 16:41:29 +00:00
Dylan Baker
a5b9f428f9 intel/compiler/brw: Add assert that we don't have a negative value
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Coverity notices that `nir_get_io_index_src_number` could return -1, and
that we use it to index an array. It cannot understand that -1 only
happens for unhandled enum values, but all of these are handled. Add an
assert to help it out.

CID: 1667234
Fixes: 37a9c5411f ("brw: serialize messages on Gfx12.x if required")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38007>
2025-10-24 15:13:10 +00:00
Dylan Baker
83c52f75d0 intel/compiler/brw: fix potential unsigned overflow
Coverity notices that if `util_last_bit()` returns 0, and we subtract 1,
then the unsigned will overflow before being converted. We could cast to
eliminate that error, but the entire optimization function would do
nothing if tex->required_params == 0 (the way that we would get here),
so let's just not do work if we know we don't need to *and* avoid this
overflow.

CID: 1667241
Fixes: efcba73b49 ("brw: switch to new sampler payload description scheme")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38009>
2025-10-24 07:52:09 -07:00
Lionel Landwerlin
e450297ea9 anv/brw: fix output tcs vertices
brw_prog_tcs_data::instances can be divided by vertices per threads on
earlier generations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a91e0e0d61 ("brw: add support for separate tessellation shader compilation")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38036>
2025-10-23 18:54:05 +00:00
Lionel Landwerlin
f3df267735 brw: handle GLSL/GLSL tessellation parameters
Apparently various tessellation parameters come specified from
TESS_EVAL stage in GLSL while they come from the TESS_CTRL stage in
HLSL.

We switch to store the tesselation params more like shader_info with 0
values for unspecified fields. That let's us merge it with a simple OR
with values from from tcs/tes and the resulting merge can be used for
state programming.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a91e0e0d61 ("brw: add support for separate tessellation shader compilation")
Fixes: 50fd669294 ("anv: prep work for separate tessellation shaders")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37979>
2025-10-22 20:48:59 +00:00
Alyssa Rosenzweig
05481f56a0 brw: use the right int8/int16 division lowering
lowering bitsize before lowering idiv is silly, since then it forces us
down the software int32 division path instead of the much faster
int8/int16 lowered path. Relevant CTS tests:

dEQP-VK.spirv_assembly.type.scalar.i16.div_comp,
dEQP-VK.spirv_assembly.type.scalar.i8.rem_comp,

Go from:

SIMD8 shader: 46 instructions. 1 loops. 4716 cycles. 0:0 spills:fills
SIMD8 shader: 1008 instructions. 0 loops. 3600 cycles. 0:0 spills:fills, 8 sends

to:

SIMD8 shader: 17 instructions. 1 loops. 2556 cycles. 0:0 spills:fills
SIMD8 shader: 464 instructions. 0 loops. 1394 cycles. 0:0 spills:fills, 8 sends

No stats change on fossil-db (which has very little int8/int16 and even
less integer division, apparently).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37966>
2025-10-22 10:00:36 -04:00
Georg Lehmann
cf4ab485ea nir: remove manual nir_load_global_constant
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37959>
2025-10-21 12:39:53 +02:00
Georg Lehmann
654bd74c60 treewide: use nir_store_global alias of nir_build_store_global
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37959>
2025-10-21 12:37:58 +02:00
Georg Lehmann
2306cba65b nir: remove manual nir_store_global
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37959>
2025-10-21 12:37:58 +02:00
Georg Lehmann
9e41a7c139 treewide: use nir_load_global alias of nir_build_load_global
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37959>
2025-10-21 12:37:58 +02:00
Georg Lehmann
77540cac8c nir: remove manual nir_load_global
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37959>
2025-10-21 12:37:58 +02:00
Lionel Landwerlin
c5d313a2a8 brw: handling dynamic programmable offsets pre-Xe2
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37929>
2025-10-21 06:13:10 +00:00
Lionel Landwerlin
d37c6ff4ed brw: mark divergence data as valid for debug purposes
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37929>
2025-10-21 06:13:10 +00:00
Lionel Landwerlin
e2918ad82c brw: fix missing generation requirement on sampler opcode
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: bcffd839aa ("brw: new Xe2 sampler opcodes")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37929>
2025-10-21 06:13:10 +00:00
Lionel Landwerlin
757c042e39 brw: fix ballot() type operations in shaders with HALT instructions
Fixes dEQP-VK.reconvergence.terminate_invocation.bit_count

LNL fossildb stats:

 Totals from 16489 (3.36% of 490184) affected shaders:
 Instrs: 3710499 -> 3710500 (+0.00%)
 Cycle count: 91601018 -> 90305642 (-1.41%); split: -1.81%, +0.40%
 Max dispatch width: 523936 -> 523952 (+0.00%); split: +0.02%, -0.01%

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37939>
2025-10-21 05:55:04 +00:00
Lionel Landwerlin
70aa028f27 brw: only consider cross lane access on non scalar VGRFs
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 1bff4f93ca ("brw: Basic infrastructure to store convergent values as scalars")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37939>
2025-10-21 05:55:04 +00:00
Lionel Landwerlin
f48c9c3a37 brw: constant fold u2u16 conversion on MCS messages
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: bddfbe7fb1 ("brw/blorp: lower MCS fetching in NIR")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37963>
2025-10-21 08:27:07 +03:00
Lionel Landwerlin
f8745b3af3 brw: add missing offset to MCS fetching messages
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37963>
2025-10-21 08:27:05 +03:00
Lionel Landwerlin
c20e2733bf Revert "brw: add serialize send stats"
This reverts commit b8ae4ede60 now that
we have a cycle estimation accounting.

Reviewed-by: Alyssa Anne Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37816>
2025-10-16 18:55:06 +00:00
Lionel Landwerlin
14683a045b brw: account for disabled SEND fused message in cycle computation
This is an alternative Curro proposed to counting the number of
serialized messages.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Anne Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37816>
2025-10-16 18:55:06 +00:00
Lionel Landwerlin
b722e17203 brw: get rid of GET_BUFFER_SIZE opcode
Rely on RESINFO which is what was used already.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37171>
2025-10-16 12:08:16 +00:00
Lionel Landwerlin
bcffd839aa brw: new Xe2 sampler opcodes
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37171>
2025-10-16 12:08:16 +00:00
Lionel Landwerlin
efcba73b49 brw: switch to new sampler payload description scheme
Instead of having abstracted opcodes, we target directly the HW format
at the NIR translation.

The payload description gives us the order of the payload sources (we
can use that for pretty printing) and we don't have to have a
complicated scheme in the logical send lowering for the ordering. All
we have to do is build the header if needed as well as the descriptors.

PTL Fossil-db stats:
 Totals from 66759 (13.54% of 492917) affected shaders:
 Instrs: 44289221 -> 43957404 (-0.75%); split: -0.81%, +0.06%
 Send messages: 2050378 -> 2042607 (-0.38%)
 Cycle count: 3878874713 -> 3712848434 (-4.28%); split: -4.44%, +0.16%
 Max live registers: 8773179 -> 8770104 (-0.04%); split: -0.06%, +0.03%
 Max dispatch width: 1677408 -> 1707952 (+1.82%); split: +1.85%, -0.03%
 Non SSA regs after NIR: 11407821 -> 11421041 (+0.12%); split: -0.03%, +0.15%
 GRF registers: 5686983 -> 5838785 (+2.67%); split: -0.24%, +2.91%

LNL Fossil-db stats:

 Totals from 57911 (15.72% of 368381) affected shaders:
 Instrs: 39448036 -> 38923650 (-1.33%); split: -1.41%, +0.08%
 Subgroup size: 1241360 -> 1241392 (+0.00%)
 Send messages: 1846696 -> 1845137 (-0.08%)
 Cycle count: 3834818910 -> 3784003027 (-1.33%); split: -2.33%, +1.00%
 Spill count: 21866 -> 22168 (+1.38%); split: -0.07%, +1.45%
 Fill count: 59324 -> 60339 (+1.71%); split: -0.00%, +1.71%
 Scratch Memory Size: 1479680 -> 1483776 (+0.28%)
 Max live registers: 7521376 -> 7447841 (-0.98%); split: -1.04%, +0.06%
 Non SSA regs after NIR: 9744605 -> 10113728 (+3.79%); split: -0.01%, +3.80%

Only 2 titles negatively impacted (spilling) :
  - Shadow of the Tomb Raider
  - Red Dead Redemption 2

All impacted shaders were already spilling.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37171>
2025-10-16 12:08:15 +00:00
Lionel Landwerlin
232697a0a3 brw: port some NIR lowering to the sampler payload description
We start by assigning a backend opcode to all tex instructions, use
that to figure out if we have packed sources and apply the lowering
accordingly.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37171>
2025-10-16 12:08:15 +00:00
Lionel Landwerlin
7c77c4768a brw: add a new sampler payload parameter description
Centralize all the information in one place and also make the mapping
decision from nir_tex_instr -> HW opcode much earlier.

This will help knowning exactly what the payload looks like early in
the backend IR and when it needs to lowered to a smaller SIMD size due
to HW limits. It will also allow NIR lowering to know when to combine
parameters into a single packed component.

Finally, this also reduces the amount of LOAD_PAYLOAD we need to carry
in the backend IR, because we don't have to generate VEC()
LOAD_PAYLOAD() for coordinates etc... Those are useless if there is
any other parameter in the payload and we need need to add one more
LOAD_PAYLOAD() when doing the logical send lowering.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37171>
2025-10-16 12:08:14 +00:00
Ian Romanick
1dea86f773 brw: Don't do non-obvious things with BFN parameter ordering
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Somehow dEQP-VK.spirv_assembly.instruction.graphics.float16.arithmetic_1.atan_frag
was able to generate a bitfield_select with a constant first
parameter. That makes the big comment here completely false.

Don't be clever. If the constant is in the wrong place,
commute_immediates during copy propagation will fix it.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37891>
2025-10-16 00:37:30 +00:00
Ian Romanick
85db960e37 brw: Mark src3 of BFN as is_control_source
This prevents lower_regioning from doing bad things when the destination
and all the other sources are UW.

Other solutions considered:

- Have the type of src[3] match the destination type. This also required
  changes in combine_constants to allow the type be UD or UW.
- Make a new subclass brw_bfn_inst, and store the Boolean function
  selector outside the src[] array. This was a lot more code and a lot
  more churn (+47,-27 vs +4).

Fixes: b948e6d503 ("brw: Use BFN to implement nir_opt_bitfield_select")
Suggested-by: Curro
Suggested-by: Ken
Closes: #14095
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37891>
2025-10-16 00:37:30 +00:00
Alyssa Rosenzweig
84d8e6824b treewide: don't check before free
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This was something that came up in the slop MR. Not sure it's actually a
good idea or not but kind of curious what people think, given we have a
sound tool (Coccinelle) to do the transform. Saves a redundant branch
but means extra noninlined function calls.. likely no actual perf impact
but saves some code.

Via Coccinelle patches:

    @@
    expression ptr;
    @@

    -if (ptr) {
    -free(ptr);
    -}
    +free(ptr);

    @@
    expression ptr;
    @@

    -if (ptr) {
    -FREE(ptr);
    -}
    +FREE(ptr);

    @@
    expression ptr;
    @@

    -if (ptr) {
    -ralloc_free(ptr);
    -}
    +ralloc_free(ptr);

    @@
    expression ptr;
    @@

    -if (ptr != NULL) {
    -free(ptr);
    -}
    -
    +free(ptr);

    @@
    expression ptr;
    @@

    -if (ptr != NULL) {
    -FREE(ptr);
    -}
    -
    +FREE(ptr);

    @@
    expression ptr;
    @@

    -if (ptr != NULL) {
    -ralloc_free(ptr);
    -}
    -
    +ralloc_free(ptr);

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v3d]
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> [venus]
Reviewed-by: Frank Binns <frank.binns@imgtec.com> [powervr]
Reviewed-by: Janne Grunau <j@jannau.net> [asahi]
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> [radv]
Reviewed-by: Job Noorman <jnoorman@igalia.com> [ir3]
Acked-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Job Noorman <jnoorman@igalia.com>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37892>
2025-10-15 23:01:33 +00:00
Caio Oliveira
f861cd47d6 brw: Add variable for opcode in the brw_set_* high-level helpers
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37896>
2025-10-15 17:22:04 +00:00
Lionel Landwerlin
49226692e5 brw: fix invalid sparse bitfield offset computation
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
dest_size is the number of outputs to be provided into the IR, but the
location of the sparse bitfield in the dst temporary SEND destination
might be different (shorter due to masking of unused components
computed above).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14094
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37876>
2025-10-15 14:42:51 +00:00
José Roberto de Souza
19de4b82f9 intel/brw: Store and set sfid in memory fences
sfid is another field that is not preserved after brw_transform_inst_to_send()
so we need to store it before transform and retore it to preserve the sfid value.

Fixes: 0fcce2722f ("brw: Add brw_send_inst")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37823>
2025-10-15 13:38:08 +00:00
José Roberto de Souza
a259f64595 intel/brw: Call lower_hdc_memory_fence_and_interlock() with brw_send_inst
With that we can avoid some as_send() calls.

Fixes: 0fcce2722f ("brw: Add brw_send_inst")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37823>
2025-10-15 13:38:08 +00:00