Commit graph

4458 commits

Author SHA1 Message Date
Qiang Yu
4847e0b380 all: rename gl_shader_stage_uses_workgroup to mesa_shader_stage_uses_workgroup
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
2025-08-06 10:28:41 +08:00
Qiang Yu
b27c8c9eb8 all: rename gl_shader_stage_is_mesh to mesa_shader_stage_is_mesh
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
2025-08-06 10:28:41 +08:00
Qiang Yu
7a91473192 all: rename gl_shader_stage_is_compute to mesa_shader_stage_is_compute
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
2025-08-06 10:28:41 +08:00
Qiang Yu
196569b1a4 all: rename gl_shader_stage to mesa_shader_stage
It's not only for GL, change to a generic name.

Use command:
  find . -type f -not -path '*/.git/*' -exec sed -i 's/\bgl_shader_stage\b/mesa_shader_stage/g' {} +

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
2025-08-06 10:28:40 +08:00
Qiang Yu
07a3a54d37 all: rename PIPE_SHADER_TYPES to MESA_SHADER_STAGES
Use command:
  find . -type f -not -path '*/.git/*' -exec sed -i 's/\bPIPE_SHADER_TYPES\b/MESA_SHADER_STAGES/g' {} +

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
2025-08-06 10:28:39 +08:00
Qiang Yu
11027dd3f8 all: rename PIPE_SHADER_FRAGMENT to MESA_SHADER_FRAGMENT
Use command:
  find . -type f -not -path '*/.git/*' -exec sed -i 's/PIPE_SHADER_FRAGMENT/MESA_SHADER_FRAGMENT/g' {} +

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
2025-08-06 10:28:39 +08:00
Qiang Yu
197c183d2d all: rename PIPE_SHADER_TESS_EVAL to MESA_SHADER_TESS_EVAL
Use command:
  find . -type f -not -path '*/.git/*' -exec sed -i 's/PIPE_SHADER_TESS_EVAL/MESA_SHADER_TESS_EVAL/g' {} +

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
2025-08-06 10:28:39 +08:00
Qiang Yu
6cb38f9418 all: rename PIPE_SHADER_TESS_CTRL to MESA_SHADER_TESS_CTRL
Use command:
  find . -type f -not -path '*/.git/*' -exec sed -i 's/PIPE_SHADER_TESS_CTRL/MESA_SHADER_TESS_CTRL/g' {} +

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
2025-08-06 10:28:39 +08:00
Kenneth Graunke
c12497f943 brw: Update copy propagation into EOT sends handling for Xe2 units
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
We're counting in REG_SIZE units here, but g112-127 is twice as large on
Xe2.  Check against 15 * reg_unit() to avoid missing out on propagation.

fossil-db results on Arc B580:

   Totals:
   Instrs: 233779396 -> 233779098 (-0.00%)
   Cycle count: 32601212742 -> 32601187382 (-0.00%); split: -0.00%, +0.00%
   Max live registers: 72695253 -> 72694326 (-0.00%); split: -0.00%, +0.00%

   Totals from 232 (0.03% of 789301) affected shaders:
   Instrs: 41071 -> 40773 (-0.73%)
   Cycle count: 1756714 -> 1731354 (-1.44%); split: -2.01%, +0.57%
   Max live registers: 22092 -> 21165 (-4.20%); split: -4.48%, +0.28%

Fixes: ec2e8bc33f ("intel/compiler: Avoid copy propagating large registers into EOT messages")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36577>
2025-08-05 23:57:25 +00:00
Kenneth Graunke
946f768359 brw: Fix units in copy propagation EOT restriction size calculation
size_read() counts in bytes.  s.alloc.sizes[] counts in REG_SIZE units.

(Affects 4 raytracing shaders in Cyberpunk 2077.)

Fixes: ec2e8bc33f ("intel/compiler: Avoid copy propagating large registers into EOT messages")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36577>
2025-08-05 23:57:25 +00:00
Kenneth Graunke
4151a39b8a brw: Refactor copy propagation checks for EOT send restrictions
These are identical, pull them into a helper so we only have one place
to fix bugs.

(Marked fixes because the next two patches depend on the refactor.)

Fixes: ec2e8bc33f ("intel/compiler: Avoid copy propagating large registers into EOT messages")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36577>
2025-08-05 23:57:25 +00:00
Marek Olšák
b769d5dcde nir: don't use variables as ralloc parents, use the shader instead
so that we can switch variables to gc_ctx

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>
2025-08-05 22:55:13 +00:00
Marek Olšák
96ffc24e4e nir: add nir_variable_{set,append,steal}_name{f}() to modify nir_variable names
Setting variable names currently always uses ralloc, but the new
nir_variable_* helpers will mostly eliminate ralloc/malloc in a later
commit.

This just updates all places that touch nir_variable names to use the new
helpers.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36538>
2025-08-05 22:55:12 +00:00
Marek Olšák
ae5b168051 ralloc/linalloc: allow adding custom code to LINEAR_ALLOC new operator
for GLSL IR

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36539>
2025-08-04 02:07:00 +00:00
Alyssa Rosenzweig
3719983edf brw: replace lower_fs_msaa with nir_inline_sysval
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516>
2025-08-03 21:27:47 +00:00
Paulo Zanoni
257e1515e3 brw: null-tile sends don't need to skip L3 on Xe2 and newer
Despite the information in "Overview of Memory Access" (57046), the L3
seems to be smarter on Xe2+. See 4aa3b2d3ad ("anv: LNL+ doesn't need
the special flush for sparse").

The behavior is the same both with vm_bind and TR-TT.

v2: Add some comments (Caio).

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36150>
2025-08-01 18:47:37 +00:00
Paulo Zanoni
80f01c03ba brw: remove unnecessary casts to unsigned after calling LSC_CACHE()
The macro already casts the values to unsigned.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36150>
2025-08-01 18:47:37 +00:00
Paulo Zanoni
c845b30a21 brw: adjust comment pasted from a commit message
The comment was pasted from the commit message that added it. Remove
the parts that only make sense in the commit message, not in the final
code.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36150>
2025-08-01 18:47:37 +00:00
Paulo Zanoni
4bb41156b9 brw: mark 'volatile' sends as uncached on LSC messages
The residencyNonResidentStrict property requires that writes to
unbound memory be ignored and reads return zero. We need this
property, otherwise vkd3d will claim we don't support DX12.

If a shader writes to a variable associated with an unbound memory
region (i.e., mapped to a null tile), reads it back (in the same
shader) and expects the value be 0 instead of what is wrote, it has to
use the 'volatile' access qualifier to the variable associated with
the access, otherwise the compiler will be allowed to optmize things
and use the non-zero value.  This is explained in the "Accessing
Unbound Regions" section of the Vulkan spec.

Our hardware adds an extra problem on top of the above. BSpec page
"Overview of Memory Access" (47630, 57046) says:

  "If a read from a Null tile gets a cache-hit in a
   virtually-addressed GPU cache, then the read may not return
   zeroes."

So, when we detect this type of access, we have to turn off the
caching.

There's a proposed Vulkan CTS test that does exactly the above.

No shaders on shader_db seem to be using 'volatile'.

v2:
 - Reorder commit order
 - Rewrite commit message

v3:
 - Rework the patch after Caio pointed out the interaction with
   'coherent'.
 - Remove previous R-B tags due to the patch differences.

v4:
 - Rework the patch and commit message again after further
   discussions.

v5:
 - Check for atomic first so we don't regress DG2 atomic tests.

Fixes future test: dEQP-VK.sparse_resources.buffer.ssbo.read_write.sparse_residency_non_resident_strict

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36150>
2025-08-01 18:47:37 +00:00
Paulo Zanoni
f7581e4a38 brw: consider 'volatile' memory access when doing CSE
The GLSL spec says (among other things):

  "When a volatile variable is read, its value must be re-fetched from
   the underlying memory, even if the shader invocation performing the
   read had previously fetched its value from the same memory. When a
   volatile variable is written, its value must be written to the
   underlying memory, even if the compiler can conclusively determine
   that its value will be overwritten by a subsequent write."

The SPIR-V spec says (among other things):

  "Accesses to volatile memory cannot be eliminated, duplicated, or
   combined with other accesses."

So in this commit we make sure that both writes and reads marked as
volatile can't be affected by CSE.

v2: Reorder patches in the series.

Credits-to: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Reviewed-by: Iván Briano <ivan.briano@intel.com> (v1)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36150>
2025-08-01 18:47:36 +00:00
Paulo Zanoni
8e1e3ba152 brw: store 'volatile' GLSL/SPIR-V access in MEMORY_LOGICAL_FLAGS
We seem to be ignoring the 'volatile' keyword coming from the shaders.
Record this in MEMORY_LOGICAL_FLAGS so we can use it later.

Credits-to: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36150>
2025-08-01 18:47:36 +00:00
Paulo Zanoni
670cd08c68 brw: remove unnecessary <vector> inclusions
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36150>
2025-08-01 18:47:35 +00:00
Alyssa Rosenzweig
bcf1a1c20b treewide: use nir_def_block
Via Coccinelle patch:

    @@
    expression definition;
    @@

    -definition->parent_instr->block
    +nir_def_block(definition)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>
2025-08-01 15:34:24 +00:00
Alyssa Rosenzweig
82ae8b1d33 treewide: simplify nir_def_rewrite_uses_after
Most of the time with nir_def_rewrite_uses_after, you want to rewrite after the
replacement. Make that the default thing to be more ergonomic and to drop
parent_instr uses.

We leave nir_def_rewrite_uses_after_instr defined if you really want the old
signature with an arbitrary after point.

Via Coccinelle patch:

    @@
    expression a, b;
    @@

    -nir_def_rewrite_uses_after(a, b, b->parent_instr)
    +nir_def_rewrite_uses_after_def(a, b)

Followed by a bunch of sed.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>
2025-08-01 15:34:24 +00:00
Alyssa Rosenzweig
cc6e3b84cb treewide: use nir_def_as_*
Via Coccinelle patch:

    @@
    expression definition;
    @@

    -nir_instr_as_alu(definition->parent_instr)
    +nir_def_as_alu(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_intrinsic(definition->parent_instr)
    +nir_def_as_intrinsic(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_phi(definition->parent_instr)
    +nir_def_as_phi(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_load_const(definition->parent_instr)
    +nir_def_as_load_const(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_deref(definition->parent_instr)
    +nir_def_as_deref(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_tex(definition->parent_instr)
    +nir_def_as_tex(definition)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>
2025-08-01 15:34:24 +00:00
Lionel Landwerlin
cea714329c brw: make more passes printable through NIR_DEBUG
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36512>
2025-08-01 11:35:00 +00:00
Marek Olšák
db26597f8d intel: fork exec_node/list -> brw_exec_node/list as a private Intel utility
NIR is going to use exec_node/list without the C++ code, and may switch to
a different linked list implementation in the future.

GLSL is going to use ir_exec_node/list, which we want to keep private
for GLSL, so that we can change it easily.

Thus, it's better to fork the C++ version of list.h for Intel.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36425>
2025-07-31 20:23:02 +00:00
Caio Oliveira
f222b16f92 brw: Remove extra iteration on instructions from brw_opt_address_reg_load
The helper function already iterate instructions.

Fixes: 8ac7802ac8 ("brw: move final send lowering up into the IR")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36478>
2025-07-31 19:45:16 +00:00
Antonio Ospite
ddf2aa3a4d build: avoid redefining unreachable() which is standard in C23
In the C23 standard unreachable() is now a predefined function-like
macro in <stddef.h>

See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in

And this causes build errors when building for C23:

-----------------------------------------------------------------------
In file included from ../src/util/log.h:30,
                 from ../src/util/log.c:30:
../src/util/macros.h:123:9: warning: "unreachable" redefined
  123 | #define unreachable(str)    \
      |         ^~~~~~~~~~~
In file included from ../src/util/macros.h:31:
/usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition
  456 | #define unreachable() (__builtin_unreachable ())
      |         ^~~~~~~~~~~
-----------------------------------------------------------------------

So don't redefine it with the same name, but use the name UNREACHABLE()
to also signify it's a macro.

Using a different name also makes sense because the behavior of the
macro was extending the one of __builtin_unreachable() anyway, and it
also had a different signature, accepting one argument, compared to the
standard unreachable() with no arguments.

This change improves the chances of building mesa with the C23 standard,
which for instance is the default in recent AOSP versions.

All the instances of the macro, including the definition, were updated
with the following command line:

  git grep -l '[^_]unreachable(' -- "src/**" | sort | uniq | \
  while read file; \
  do \
    sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \
  done && \
  sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>
2025-07-31 17:49:42 +00:00
Caio Oliveira
f2a49081de brw: Use ralloc helpers for string handling in brw_eu_validate
Acked-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36339>
2025-07-30 17:59:26 +00:00
Lionel Landwerlin
60932e8fae brw: always ensure coarse pixel is disabled on Gfx9
No HW support there.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36457>
2025-07-30 07:57:19 +00:00
Lionel Landwerlin
aa6810b706 brw: consider LOAD_PAYLOAD fully defined
It's mostly used for SEND messages and fully defines the register data
(that's its purpose after all).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36457>
2025-07-30 07:57:19 +00:00
Lionel Landwerlin
9371e8d370 brw: fixup coarse_z computation
The delivered values in the coarse pixel size are 0 when coarse pixel
dispatch is disabled and that is screwing up our half pixel offset
adjustment.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36457>
2025-07-30 07:57:19 +00:00
Lionel Landwerlin
9dac7dda87 brw: fixup source depth enabling with coarse pixel shading
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36457>
2025-07-30 07:57:18 +00:00
Lionel Landwerlin
68c50d129e brw: fix NIR metadata invalidation with closest-hit shaders
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36457>
2025-07-30 07:57:18 +00:00
José Roberto de Souza
07f5b53dd7 intel/brw: Remove duplicated implementation of brw_imm_uq/brw_imm_u64()
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36448>
2025-07-29 16:05:54 +00:00
José Roberto de Souza
14386eb7e5 intel/brw: Add comment to reg_unit()
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36448>
2025-07-29 16:05:54 +00:00
José Roberto de Souza
7981a18df2 intel/brw: Nuke unused brw_message_desc_header_present()
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36448>
2025-07-29 16:05:53 +00:00
Ian Romanick
fa74c31b22 brw: Allow additional flags registers on Xe2+
Xe2 adds two more flags registers. We barely use the second flags
register on previous platforms, so the omission was not previously
noticed.

There are several efforts in progress that will add using of more flags
registers.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35415>
2025-07-24 23:08:08 +00:00
Ian Romanick
1279f12c84 brw: Implement Wa_22012725308 for flags via SWSB too
At this point, using the per-register granularity will only help in
conjuction with fragment shader discard (which is implemented using f1).

v2: Loop restructuring and code cleanups. Suggested by Curro.

v3: Only apply Wa on Gfx12.5+. Suggested by Curro.

v4: Also apply to implicit flag reads. Suggested by Curro. This version
affects a *lot* more shaders (10,936 on Meteor Lake shader-db versus
4,482 before). The results are still very much in the 🤷 territory.

v5: Add missing dependency. I thought I got them all the previous
time. :( Noticed by Curro.

shader-db:

Lunar Lake
total cycles in shared programs: 886315282 -> 886391040 (<.01%)
cycles in affected programs: 204907250 -> 204983008 (0.04%)
helped: 1 / HURT: 6716

LOST:   0
GAINED: 1

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
total cycles in shared programs: 883774789 -> 883921507 (0.02%)
cycles in affected programs: 481836784 -> 481983502 (0.03%)
helped: 4 / HURT: 10936

LOST:   3
GAINED: 7

fossil-db:

Lunar Lake
Totals:
Cycle count: 32600441334 -> 32601862658 (+0.00%); split: -0.00%, +0.00%

Totals from 90283 (11.44% of 789260) affected shaders:
Cycle count: 17265933202 -> 17267354526 (+0.01%); split: -0.00%, +0.01%

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Cycle count: 26477292677 -> 26480321805 (+0.01%); split: -0.00%, +0.01%
Max dispatch width: 8010440 -> 8010984 (+0.01%)

Totals from 132952 (14.71% of 903925) affected shaders:
Cycle count: 15349555348 -> 15352584476 (+0.02%); split: -0.00%, +0.02%
Max dispatch width: 1085416 -> 1085960 (+0.05%)

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35415>
2025-07-24 23:08:07 +00:00
Ian Romanick
1fdcc9039b brw: Add and use brw_reg_is_arf to test for a specific ARF
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35415>
2025-07-24 23:08:07 +00:00
Alyssa Rosenzweig
ecfca8ec6f util: crib SWAP macro from freedreno
we have a bunch of copies across the tree, unify them.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:18 +00:00
Caio Oliveira
3c7dd0ccf1 brw: Make brw_builder() shader constructor use CFG if available
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Properly pick the end of the last block as a cursor.  Also remove the
default constructor since is not needed anymore.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34681>
2025-07-19 17:49:48 +00:00
Caio Oliveira
ab8af62745 brw: Use a builder to track position in lower_simd
Removes brw_builder::at() since it is now unused, replaced by various
other helpers.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34681>
2025-07-19 17:49:48 +00:00
Caio Oliveira
8826b1e680 brw: Use a more specific builder helper in combine constants
Also remove commentary about older Gfx versions that don't apply
anymore.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34681>
2025-07-19 17:49:47 +00:00
Caio Oliveira
ac2b072312 brw: Add more specific brw_builder helpers
Replace uses of brw_builder::at() with various more descriptive
variants.  Use block pointer from instruction when possible.

A couple of special cases remained and will be handled in separate patches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34681>
2025-07-19 17:49:47 +00:00
Caio Oliveira
6c5132ec9a brw: Move insert/remove code to the block
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34681>
2025-07-19 17:49:46 +00:00
Caio Oliveira
2dfd4dcbc5 brw: Fix cmat conversion between bfloat16 and non-float32
The HW only supports converting BRW_TYPE_BF values to/from BRW_TYPE_F,
so intermediate conversion is needed.  Move the intermediate conversion
to the implementation of `@convert_cmat_intel` and simplify the
brw_nir_lower_cooperative_matrix pass.  This has two positive effects

- Fixes conversion between BF and integer type cooperative matrices,
  that was still using the old emit_alu1 approach instead of the new
  code for `@convert_cmat_intel`.

- Guarantee the intermediate conversion will result in a valid layout
  for conversions involved USE_B matrices.  If we instead used the
  intrinsic twice in brw_nir_lower_cooperative_matrix.c, a matrix with
  invalid layout would be visible at NIR level and we wouldn't be able
  to keep the current assertion for USE_B case.

Due to the configurations we have exposed, we still don't need to
write a more complex USE_B conversion -- they are all between same
size types (and, consequently, packing factors), so no shuffling of
data is needed to respect the USE_B layout.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36185>
2025-07-18 21:55:43 +00:00
Ian Romanick
2594fcadd4 brw: Split virtual GRFs again at the end of optimizations
Logical sends and load_payload can have large VGRFs that cannot be
split. Once all of the lowering passes and optimization passes that
might eliminate any of those instructions have completed, try to split
larger VGRFs one last time.

Register allocation can only handle VGRFs up to a certain size, so this
is the last opportunity to prevent later failures due to VGRFs that are
too large.

Closes: #13239

shader-db:

Lunar Lake, Meteor Lake, DG2, and Tiger Lake had similar results. (Lunar Lake shown)
total instructions in shared programs: 17114494 -> 17114496 (<.01%)
instructions in affected programs: 2790 -> 2792 (0.07%)
helped: 2 / HURT: 4

total cycles in shared programs: 886617364 -> 886315282 (-0.03%)
cycles in affected programs: 4067540 -> 3765458 (-7.43%)
helped: 48 / HURT: 9

Ice Lake and Skylake had similar restuls. (Ice Lake shown)
total instructions in shared programs: 20799801 -> 20799691 (<.01%)
instructions in affected programs: 1210 -> 1100 (-9.09%)
helped: 1 / HURT: 0

total cycles in shared programs: 865495386 -> 865498990 (<.01%)
cycles in affected programs: 60132 -> 63736 (5.99%)
helped: 2 / HURT: 1

total spills in shared programs: 3987 -> 3981 (-0.15%)
spills in affected programs: 24 -> 18 (-25.00%)
helped: 1 / HURT: 0

total fills in shared programs: 3535 -> 3519 (-0.45%)
fills in affected programs: 36 -> 20 (-44.44%)
helped: 1 / HURT: 0

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 208647246 -> 208646499 (-0.00%); split: -0.00%, +0.00%
Cycle count: 31257819536 -> 31263957016 (+0.02%); split: -0.02%, +0.04%
Max live registers: 66160877 -> 66155728 (-0.01%)

Totals from 34703 (4.91% of 707053) affected shaders:
Instrs: 13766639 -> 13765892 (-0.01%); split: -0.02%, +0.01%
Cycle count: 3693572086 -> 3699709566 (+0.17%); split: -0.15%, +0.32%
Max live registers: 4843852 -> 4838703 (-0.11%)

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36202>
2025-07-18 19:04:01 +00:00
Ian Romanick
f6da6399d7 brw/reg_allocate: Don't access out of bounds in non-debug builds
In debug builds, the assertion should be preferred as it will highlight
the actual problem. In non-debug builds, it is possible to fail register
allocation more gracefully. If the problem only occurs in, for example,
a SIMD32 version of a shader, the application may even continue to
function.

Closes: #13239
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36202>
2025-07-18 19:04:01 +00:00