Commit graph

222933 commits

Author SHA1 Message Date
Lakshman Chandu Kondreddy
9ff0cd7b4d zink: Query external memory handle type compatibility
Zink sets multiple external memory handle types (like Opaque FD, DMA-BUF)
without confirming if the Vulkan driver actually supports them. This may lead
to failures when attempting to allocate external memory with vkAllocateMemory.

This patch introduces query_external_memory_compatibility() to verify
handle type support via VkPhysicalDeviceImageFormatProperties2.
Combine handle types only if they are compatible; otherwise, use a single
supported type as a fallback.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40212>
2026-05-22 18:39:19 +00:00
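
The fallback described in 9ff0cd7b4d can be sketched roughly as below. This is a minimal illustration and not zink's code: the handle-type bits, pick_handle_types(), and the compat_fn callback are hypothetical stand-ins for the query made through VkPhysicalDeviceImageFormatProperties2, and preferring DMA-BUF as the single-type fallback is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical handle-type bits; real code would use the Vulkan
 * external memory handle type flags. */
enum {
   HANDLE_OPAQUE_FD = 1u << 0,
   HANDLE_DMA_BUF   = 1u << 1,
};

/* Stand-in for the driver query: for one handle type, returns the mask of
 * handle types it may be combined with (0 means unsupported). */
typedef uint32_t (*compat_fn)(uint32_t handle_type);

/* Request both types only if each reports the other as compatible;
 * otherwise fall back to a single supported type, or 0 if none. */
static uint32_t
pick_handle_types(compat_fn query)
{
   assert(query);

   uint32_t fd = query(HANDLE_OPAQUE_FD);
   uint32_t dmabuf = query(HANDLE_DMA_BUF);

   if ((fd & HANDLE_DMA_BUF) && (dmabuf & HANDLE_OPAQUE_FD))
      return HANDLE_OPAQUE_FD | HANDLE_DMA_BUF;
   if (dmabuf)
      return HANDLE_DMA_BUF;
   if (fd)
      return HANDLE_OPAQUE_FD;
   return 0;
}

/* Two made-up "drivers" for illustration. */
static uint32_t
driver_compatible(uint32_t type)
{
   (void)type;
   return HANDLE_OPAQUE_FD | HANDLE_DMA_BUF;
}

static uint32_t
driver_dmabuf_only(uint32_t type)
{
   return type == HANDLE_DMA_BUF ? HANDLE_DMA_BUF : 0;
}
```

With driver_compatible the sketch requests both types together; with driver_dmabuf_only it falls back to DMA-BUF alone instead of passing an unsupported combination on to allocation.
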
Mike Blumenkrantz
cafa22142b zink: create views for samplers lazily
pipe_context::create_sampler_view can be called from different threads,
so the vulkan object must not be accessed in order to avoid conflicts
with driver thread operations

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15337

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41759>
2026-05-22 18:19:13 +00:00
Adam Jackson
319011d690 zink: stop find_good_mod from mutating ici in place
find_good_mod was accumulating DISJOINT_BIT across iterations and
setting ici->usage on success. Change it to return results via
out-parameters and save/restore ici->flags around each modifier
attempt. The caller (negotiate_image_config) now explicitly sets
ici->usage and ici->flags after find_good_mod returns.

Also save/restore flags in the LINEAR modifier fallback path.

Assisted-by: Claude
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41734>
2026-05-22 18:01:17 +00:00
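
The pattern 319011d690 describes (out-parameters plus save/restore of the flags around each attempt) might look something like the sketch below. The struct layouts, FLAG_DISJOINT, and the "supported" field are hypothetical simplifications and do not reflect zink's actual types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLAG_DISJOINT (1u << 0) /* hypothetical stand-in for DISJOINT_BIT */

struct image_info {
   uint32_t usage;
   uint32_t flags;
};

/* Hypothetical per-modifier description, for illustration only. */
struct modifier {
   bool needs_disjoint;
   bool supported;
   uint32_t usage;
};

/* Returns the index of the last supported modifier, or -1 if none.
 * Results come back through out_usage/out_flags; info->flags is restored
 * after every attempt, so nothing accumulates across iterations. */
static int
find_good_mod(struct image_info *info, const struct modifier *mods,
              size_t count, uint32_t *out_usage, uint32_t *out_flags)
{
   assert(out_usage && out_flags);

   int best = -1;
   for (size_t i = 0; i < count; i++) {
      const uint32_t saved_flags = info->flags;

      if (mods[i].needs_disjoint)
         info->flags |= FLAG_DISJOINT;

      if (mods[i].supported) {
         best = (int)i;
         *out_usage = mods[i].usage;
         *out_flags = info->flags;
      }

      info->flags = saved_flags; /* undo this attempt's changes */
   }
   return best;
}
```

The caller then assigns usage and flags explicitly from the out-parameters, which keeps the mutation visible at one site.
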
Adam Jackson
d305e6d7b1 zink: replace image negotiation with candidate-based approach
Replace the nested retry loops (eval_ici, set_image_usage,
double_check_ici, suboptimal_check_ici, try_set_image_usage_or_EXTENDED)
with a flat candidate array that encodes the same fallback order.

Instead of mutating a shared VkImageCreateInfo through deeply nested
function calls and retrying with toggled flags, we now:
1. build_usage_candidates() generates an array of (tiling, usage,
   flags, has_format_list) tuples in preference order
2. try_image_config() applies each candidate and calls check_ici
3. negotiate_image_config() iterates tiling/extended combos, builds
   candidates for each, and takes the first passing one

The modifier path (find_good_mod) is kept separate since it iterates
modifiers and takes the last good one (max-by-position, matching
the GBM worst-to-best convention), which is fundamentally different
from the candidate model's first-match-from-fallback-chain.

Duplicate candidates from the old code's redundant retry paths are
eliminated via dedup_configs(). The pNext chain surgery in
double_check_ici (manually unlinking VkImageFormatListCreateInfo) is
replaced by try_image_config's explicit format list chain/unchain.

The cube-compatible post-pass is simplified to a single check_ici
call instead of re-running the full negotiation.

Assisted-by: Claude
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41734>
2026-05-22 18:01:17 +00:00
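
As a rough sketch of the candidate model in d305e6d7b1: build a flat array in preference order, drop duplicates, and take the first entry that passes the check. The modifier path's "last good one" rule is shown next to it for contrast. The struct below is a simplified, hypothetical version of the (tiling, usage, flags, has_format_list) tuple, and check_fn stands in for check_ici.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified candidate tuple. */
struct candidate {
   int tiling;
   uint32_t usage;
   uint32_t flags;
   bool has_format_list;
};

static bool
candidate_equal(const struct candidate *a, const struct candidate *b)
{
   return a->tiling == b->tiling && a->usage == b->usage &&
          a->flags == b->flags && a->has_format_list == b->has_format_list;
}

/* Remove later duplicates in place, keeping preference order.
 * Returns the new count. */
static size_t
dedup_candidates(struct candidate *c, size_t count)
{
   size_t out = 0;
   for (size_t i = 0; i < count; i++) {
      bool seen = false;
      for (size_t j = 0; j < out; j++) {
         if (candidate_equal(&c[i], &c[j])) {
            seen = true;
            break;
         }
      }
      if (!seen)
         c[out++] = c[i];
   }
   return out;
}

/* Stand-in for the driver check. */
typedef bool (*check_fn)(const struct candidate *c);

/* Candidate model: first match from the fallback chain, or -1. */
static int
first_passing(const struct candidate *c, size_t count, check_fn check)
{
   for (size_t i = 0; i < count; i++) {
      if (check(&c[i]))
         return (int)i;
   }
   return -1;
}

/* Modifier model: last match wins (worst-to-best ordering), or -1. */
static int
last_passing(const struct candidate *c, size_t count, check_fn check)
{
   int best = -1;
   for (size_t i = 0; i < count; i++) {
      if (check(&c[i]))
         best = (int)i;
   }
   return best;
}

/* Example check: pretend the driver rejects any candidate with flags set. */
static bool
accepts_no_flags(const struct candidate *c)
{
   return c->flags == 0;
}
```
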
Adam Jackson
f5943e9dbb zink: extract memory binding from create_image
Move disjoint/non-disjoint BindImageMemory into bind_image_memory().
create_image now reads as a clear sequence: format list, init_ici,
negotiate, pNext chain, CreateImage, allocate, bind. No functional
change.

Assisted-by: Claude
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41734>
2026-05-22 18:01:16 +00:00
Adam Jackson
d0311ff971 zink: extract pNext chain construction from create_image
Move VkExternalMemoryImageCreateInfo, DRM modifier explicit/list
create info, and user memory pNext chain building into
setup_image_pnext(). The Vk*Info structs now live in
image_pnext_state on the caller's stack. No functional change.

Assisted-by: Claude
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41734>
2026-05-22 18:01:16 +00:00
Adam Jackson
9716630b41 zink: extract format list setup from create_image
Move sRGB pair computation, video plane format enumeration, and
VkImageFormatListCreateInfo population into setup_format_list().
No functional change.

Assisted-by: Claude
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41734>
2026-05-22 18:01:16 +00:00
Adam Jackson
b0d80d9b70 zink: consolidate resource_create error paths
Replace four copy-pasted cleanup sequences with goto fail/fail_obj
labels. No functional change.

Assisted-by: Claude
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41734>
2026-05-22 18:01:16 +00:00
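
For readers unfamiliar with the goto fail/fail_obj idiom mentioned in b0d80d9b70, here is a minimal standalone sketch. The structs, the allocation counter, and the fail_step injection parameter are all invented for illustration and are not taken from zink.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int live_allocs; /* outstanding allocations, tracked for illustration */

static void *
xalloc(size_t size)
{
   void *p = malloc(size);
   if (p)
      live_allocs++;
   return p;
}

static void
xfree(void *p)
{
   if (p) {
      live_allocs--;
      free(p);
   }
}

struct object { int *storage; };
struct resource { struct object *obj; };

/* fail_step injects a failure at that step (0 = no injected failure). */
static struct resource *
resource_create(int fail_step)
{
   struct resource *res = xalloc(sizeof(*res));
   if (!res)
      return NULL;
   res->obj = NULL;

   if (fail_step == 1)
      goto fail;

   res->obj = xalloc(sizeof(*res->obj));
   if (!res->obj)
      goto fail;
   res->obj->storage = NULL;

   if (fail_step == 2)
      goto fail_obj;

   res->obj->storage = xalloc(16 * sizeof(int));
   if (!res->obj->storage)
      goto fail_obj;

   return res;

   /* Labels unwind in reverse order of construction. */
fail_obj:
   xfree(res->obj);
fail:
   xfree(res);
   return NULL;
}

static void
resource_destroy(struct resource *res)
{
   assert(res && res->obj);
   xfree(res->obj->storage);
   xfree(res->obj);
   xfree(res);
}
```

Each error site jumps to the label matching how far construction got, so the cleanup sequence exists once instead of being copied at every failure point.
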
Rob Clark
a573e25b6d freedreno/registers: Gen8 perfcntr fixes
Correct # of UCHE counters, and fix pipe for CMP counters.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41757>
2026-05-22 16:47:04 +00:00
Rob Clark
9260c8b145 freedreno/registers: Add a6xx CMP counter group
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41757>
2026-05-22 16:47:04 +00:00
Rob Clark
96c5179c02 freedreno/registers: Skip deprecated warns for kernel
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41757>
2026-05-22 16:47:04 +00:00
Lionel Landwerlin
fd11e4b4d3 intel: switch shader hash to 64bit value
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41748>
2026-05-22 15:05:30 +00:00
Lionel Landwerlin
c09f00d339 anv: use shader source hash rather than cmd_buffer fields
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41748>
2026-05-22 15:05:28 +00:00
Lionel Landwerlin
88418718a9 spirv: fixup infinite recursion with shader replacement
While trying to use that feature on RADV I ran into an infinite
recursion.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 97b4a6d0e3 ("compiler: SPIR-V shader replacement")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41751>
2026-05-22 14:31:22 +00:00
Daniel Stone
5b07e5e8e4 ci/panfrost: Add two T860 OpenCL fails
I've seen these for at least a couple of days now.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41749>
2026-05-22 13:31:37 +00:00
Daniel Stone
404a4f4d22 ci/panfrost: Switch T860 jobs to another RK3399 device type
The kevins are increasingly creaky and unreliable after a decade of
excellent service, so it's time to send them off to the farm and move
our T860 jobs to a device type which can actually run jobs.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41749>
2026-05-22 13:31:37 +00:00
David Rosca
998e2a70e7 radeonsi: Add RADEON_FLUSH_FORCE and use it to force flush
Forcing a flush by setting initial_gfx_cs_size to zero requires that
packets are always emitted when starting a new gfx IB.
But this is not the case with userq, as there is no preamble.
Add a new flag to be used with si_flush_gfx_cs to force flush.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41530>
2026-05-22 10:49:22 +00:00
Lionel Landwerlin
294644643e brw: avoid requiring a valid render target for empty fragment shaders
Dishonored 2 or DXVK is creating pipelines with empty fragment
shaders. With alpha-to-coverage as a dynamic state, we currently assume
a render target is needed, but if the shader is not writing
anything, it isn't.

This change only considers the color output writes as it's the alpha
channel there that is used for coverage computation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41711>
2026-05-22 09:53:33 +00:00
Lionel Landwerlin
f34dd96ab5 anv: fix render target remapping tracking at the beginning of render passes
At the beginning of render passes we need to consider all entries as
unknown because it's all new color outputs.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d2f7b6d5a7 ("anv: implement VK_KHR_dynamic_rendering_local_read")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15475
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41711>
2026-05-22 09:53:33 +00:00
Lionel Landwerlin
f35a0f3ba5 anv: fix missing bindless flag hashing
It got dropped in a rebase it seems...

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41711>
2026-05-22 09:53:33 +00:00
Caio Oliveira
e46b43080b iris: Simplify code that calls brw/jay
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41633>
2026-05-22 00:57:20 -07:00
Caio Oliveira
ffa4bc7d6a anv: Simplify code that calls brw/jay
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41633>
2026-05-22 00:57:20 -07:00
Caio Oliveira
33475c0cce brw: Move key and prog_data to base compile params
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41633>
2026-05-22 00:57:20 -07:00
Caio Oliveira
7893eefa3b brw: Use a single brw_compile entrypoint
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41633>
2026-05-22 00:57:20 -07:00
Valentine Burley
190ce8280f meson: Add Soong compatibility compiler flags to Vulkan drivers
Suggested by @gurchetansingh.

Android's Soong build system treats several compiler warnings as errors
by default: https://android.googlesource.com/platform/build/soong/+/27f57506/cc/config/global.go/#218

To catch these issues in Mesa, introduce `soong_compat_c_args`
and `soong_compat_cpp_args` with the following flags treated as errors:
 -D_LIBCPP_ENABLE_THREAD_SAFETY_ANNOTATIONS
 -Werror=date-time
 -Werror=gnu-alignof-expression
 -Werror=ignored-qualifiers
 -Werror=implicit-fallthrough
 -Werror=int-conversion
 -Werror=missing-prototypes
 -Werror=pragma-pack
 -Werror=pragma-pack-suspicious-include
 -Werror=sizeof-array-div
 -Werror=string-plus-int
 -Werror=unreachable-code-loop-increment

These compatibility flags are added to the meson configurations
for ANV, Gfxstream, Lavapipe, PanVK, Turnip, and Venus.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Gurchetan Singh <gurchetan.singh.foss@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41644>
2026-05-22 07:09:49 +00:00
Valentine Burley
64afecc4f9 panvk: Fix ignored qualifier warnings
Fixes:

src/panfrost/lib/pan_image.h:133:15: error: 'const' type qualifier on return type has no effect [-Werror,-Wignored-qualifiers]
  133 | static inline const struct pan_image_plane_ref
      |               ^~~~~

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Gurchetan Singh <gurchetan.singh.foss@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41644>
2026-05-22 07:09:49 +00:00
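
The warning fixed in 64afecc4f9 comes from a const qualifier on a by-value return type, which has no effect. A tiny sketch of the corrected shape follows; struct plane_ref and make_ref() are made-up names, not the real pan_image.h definitions.

```c
/* Hypothetical small struct returned by value. */
struct plane_ref {
   int image;
   int plane;
};

/* Declaring this as "static inline const struct plane_ref" would trigger
 * -Wignored-qualifiers: the caller receives a copy, so const on the
 * return type changes nothing. Dropping it keeps behavior identical. */
static inline struct plane_ref
make_ref(int image, int plane)
{
   return (struct plane_ref){ .image = image, .plane = plane };
}
```
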
Valentine Burley
8cd6e3ac08 tu: Disable -Wmisleading-indentation when compiling with GCC
Based on the approach in e0eea5ea4e.

When a file is too large, -Wmisleading-indentation will give the warning
below, which we can't suppress with a #pragma:

../src/freedreno/vulkan/tu_perfetto.cc: In function 'void setup_incremental_state(MesaRenderpassDataSource<TuRenderpassDataSource, TuRenderpassTraits>::TraceContext&, tu_device*)':
../src/freedreno/vulkan/tu_perfetto.cc:162: note: '-Wmisleading-indentation' is disabled from this point onwards, since column-tracking was disabled due to the size of the code/headers
  162 |    if (!state->was_cleared)
../src/freedreno/vulkan/tu_perfetto.cc:162: note: adding '-flarge-source-files' will allow for more column-tracking support, at the expense of compilation time and memory

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89549 for details.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41644>
2026-05-22 07:09:49 +00:00
Lionel Landwerlin
dd41fde91d anv: use the new generation script for drirc
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41664>
2026-05-22 06:32:39 +00:00
Lionel Landwerlin
d8ab38e5e3 drirc: remove non Anv option in the Anv section
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41664>
2026-05-22 06:32:39 +00:00
Lionel Landwerlin
83ed74b5df hasvk: add a driver section for drirc
Only adding the workarounds that have an actual effect on that driver.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41664>
2026-05-22 06:32:39 +00:00
Lionel Landwerlin
af88ba317d hasvk: rename a couple of drirc options
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41664>
2026-05-22 06:32:39 +00:00
Lionel Landwerlin
61267c69db util/drirc_gen: enable validation for a specific driver
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41664>
2026-05-22 06:32:39 +00:00
Sagar Ghuge
73382c8126 brw/rt: Update committed hit leaf type properly
We want to extract the leaf type from the potential hit and assign it
to the committed hit.

Instead, we were simply assigning leaf type 0x7 to the committed hit.

This patch masks out the leaf type with nir_iand_imm and also updates the
incorrect field comment.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41667>
2026-05-22 00:47:39 +00:00
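
In plain C terms, the difference 73382c8126 describes is between storing the mask constant and storing the masked value. The sketch below assumes, purely for illustration, that the leaf type sits in the low 3 bits of a dword; the real field position is not stated in the commit message.

```c
#include <stdint.h>

/* Assumption for illustration: leaf type occupies the low 3 bits. */
#define LEAF_TYPE_MASK 0x7u

/* Old behavior as described: every committed hit ends up with type 0x7. */
static uint32_t
committed_leaf_type_old(uint32_t potential_hit_dw)
{
   (void)potential_hit_dw;
   return LEAF_TYPE_MASK;
}

/* New behavior: AND the potential hit's dword with the mask, which is the
 * plain-C equivalent of what nir_iand_imm emits in the shader. */
static uint32_t
committed_leaf_type(uint32_t potential_hit_dw)
{
   return potential_hit_dw & LEAF_TYPE_MASK;
}
```
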
Caio Oliveira
2c64e12462 intel/executor: Add performance counter support
Add optional OA performance counter collection around each execute()
call. Examples:

```
  # List all profiles and counters, with descriptions.
  $ executor --oa list

  # Collect all counters from a profile.
  $ executor --oa ComputeBasic file.lua

  # Collect a subset of counters from a profile, separated by comma.
  $ executor --oa ComputeBasic:GpuTime,AvgGpuCoreFrequency file.lua

  # By default use ComputeBasic profile, so counter names only also work.
  $ executor --oa GpuTime file.lua
```

The selected counters are printed to stdout after the script finishes,
or written to a file specified by --oa-csv FILENAME.

Assisted-by: Pi coding agent (GPT-5.5)
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41610>
2026-05-21 16:46:35 -07:00
Caio Oliveira
8d237b5408 intel/executor: Add an overflow check for alloc function
Assisted-by: Pi coding agent (GPT-5.5)
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41610>
2026-05-21 16:46:35 -07:00
Icenowy Zheng
3522f0f24c llvmpipe: stub other functions inside compute shaders for ORCJIT
ORCJIT expects every function's prototype to be present even when using
object caches. Code for adding stubs for entry point functions was added
previously when implementing shader cache for ORCJIT, but when using
OpenCL, extra functions could be present in compute shaders which need
stubs too.

Reuse the code for constructing references for extra functions to
generate function stubs for them.

This fixes function calls with Rusticl on llvmpipe with ORCJIT.

Fixes: bb0efdd4d8 ("llvmpipe: add shader cache support for ORCJIT implementation")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41532>
2026-05-21 22:36:35 +00:00
Caio Oliveira
0dda43819e intel/compiler: Move bison command to shared meson.build
It is used by both brw and elk.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41738>
2026-05-21 22:15:00 +00:00
Caio Oliveira
c8914985c4 compiler: Support more than 255 cols/rows in cmat descriptions
This struct was initially packed to fit in a slot in NIR intrinsics
indices.  Nowadays NIR supports larger indices and cooperative matrix
has extensions that allow it to go beyond the existing limit.  This
patch changes the struct to be larger and removes the manual bit packing.

The hash table change is to use the specialized version for u64 keys
that's available in src/util.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41691>
2026-05-21 21:47:03 +00:00
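
To see why 8-bit row/column fields stop working past 255, and how a wider description can still serve as a 64-bit hash key, consider the sketch below. The field names, widths, and bit positions are assumptions made for this example and are not the layout chosen in c8914985c4.

```c
#include <stdint.h>

/* Hypothetical widened description: rows/cols no longer fit in 8 bits. */
struct cmat_desc {
   uint8_t element_type;
   uint8_t scope;
   uint8_t use;
   uint16_t rows;
   uint16_t cols;
};

/* Build a 64-bit key from the fields, suitable for a hash table keyed
 * by u64. Bit positions here are arbitrary but non-overlapping. */
static uint64_t
desc_key(struct cmat_desc d)
{
   return ((uint64_t)d.element_type) |
          ((uint64_t)d.scope << 8) |
          ((uint64_t)d.use << 16) |
          ((uint64_t)d.rows << 32) |
          ((uint64_t)d.cols << 48);
}
```

With an 8-bit field, a 300-row matrix would be stored as 300 mod 256 = 44 and become indistinguishable from a genuine 44-row one; with the wider fields the two produce different keys.
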
Rob Clark
952b984eca freedreno/common: Fix X2-90, add X2-85
Rename X2-90 (4 slice), and add the real X2-85 (3 slice).

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41693>
2026-05-21 21:23:24 +00:00
Sagar Ghuge
7f1defa5ef brw/rt: Commit hit even if we are skipping closest hit shader
It's not about the memory traffic but updating the Tmax value/distance
so that on next intersection, we would be comparing the updated Tmax
value/distance instead of original distance.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41709>
2026-05-21 20:45:39 +00:00
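
The point made in 7f1defa5ef, that committing a hit is what shrinks Tmax for later intersection tests, can be illustrated with a toy traversal loop. This is a simplified model with invented names, not the brw ray tracing lowering.

```c
#include <stddef.h>

/* Walk candidate hit distances in traversal order and return the index of
 * the hit that ends up accepted, or -1. When update_tmax is zero, the
 * accepted hit never tightens the bound, so farther hits still pass. */
static int
trace(const float *t, size_t count, float t_max, int update_tmax)
{
   int hit = -1;
   for (size_t i = 0; i < count; i++) {
      if (t[i] < t_max) {
         hit = (int)i;
         if (update_tmax)
            t_max = t[i]; /* later candidates compare against this */
      }
   }
   return hit;
}
```

With distances {5, 2, 8} and an initial Tmax of 10, updating Tmax selects the hit at distance 2, while skipping the update lets the hit at distance 8 overwrite it.
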
Sagar Ghuge
17f7e7f96b anv: Set execution mask based on SIMD size
The execution mask gets applied to the last thread in the threadgroup to mask
off SIMD lanes. But with BTD enabled, we are seeing that only the last 4
components have valid stack IDs and the upper 4 components of the register
are zero.

Changing execution mask somehow populates the stack IDs properly.

This is on simulator, before changing the execution mask:
00000000 00000000 00000000 00000000  000F000E 000D000C 000B000A 00090008  00000000 00000000 00000000 00000000  000F000E 000D000C 000B000A 00090008  r1

After changing execution mask:
000F000E 000D000C 000B000A 00090008  00070006 00050004 00030002 00010000  000F000E 000D000C 000B000A 00090008  00070006 00050004 00030002 00010000  r1

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41409>
2026-05-21 20:25:46 +00:00
Caio Oliveira
7b286abe33 nir: Add print for other cmat_description slots
Fixes: 102d7409ef ("nir: Add convert_cmat_intel intrinsic")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41690>
2026-05-21 19:23:12 +00:00
Caio Oliveira
e2402f6a07 brw: Bound register coalesce rewrites by live range
When updating a register after successfully finding a pair to coalesce,
use the live range of the source register to walk only the instructions
that might use it.  Depending on the shader this allows skipping a bunch
of blocks -- and also terminating early.

Below are fossil compilation times on an MTL machine compiling shaders
for a BMG GPU. The big win here was for Cyberpunk 2077.

```
// Differences at 95.0% confidence.

// Rise of the Tomb Raider (n=20)
   -0.0095 +/- 0.00706877
   -1.90572% +/- 1.40609%

// Alan Wake (n=20)
   -0.031 +/- 0.0172806
   -0.93599% +/- 0.51952%

// Borderlands 3 (n=15)
   -0.353333 +/- 0.118679
   -2.44307% +/- 0.80787%

// Oblivion Remastered (n=15)
   -0.134 +/- 0.026008
   -2.76898% +/- 0.531637%

// Baldur's Gate 3 (n=15)
   -0.954286 +/- 0.163625
   -2.21713% +/- 0.377562%

// Cyberpunk 2077 (n=20)
   -2.8665 +/- 0.228489
   -8.08661% +/- 0.621779%
```

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41495>
2026-05-21 18:32:36 +00:00
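
The idea in e2402f6a07 of bounding the rewrite by the source register's live range can be sketched on a flat instruction array. The real pass walks blocks and uses the compiler's liveness data; struct inst and rewrite_uses() here are hypothetical and the live range is passed in as a plain index interval.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-source instruction. */
struct inst {
   int src;
};

/* Rewrite uses of old_reg to new_reg in insts[first..last] (inclusive),
 * where [first, last] is the live range of old_reg. Returns the number
 * of instructions visited. */
static size_t
rewrite_uses(struct inst *insts, size_t first, size_t last,
             int old_reg, int new_reg)
{
   assert(first <= last);

   size_t visited = 0;
   for (size_t ip = first; ip <= last; ip++) {
      visited++;
      if (insts[ip].src == old_reg)
         insts[ip].src = new_reg;
   }
   return visited;
}
```

If a register is live only across two instructions of an eight-instruction program, the bounded walk visits two instructions instead of eight and stops as soon as the range ends.
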
Caio Oliveira
821a812c7d brw: Don't directly use regs_read/regs_written/size_read as bound for non-trivial loops
Instead save to a local variable and use that.  In various cases the
compiler is not able to pull it out of the loop, since there are other
not inlined function calls as part of the loop's body, resulting in
repeated unnecessary calls to either size_read() or its pieces that
get inlined.

Below are fossil compilation times on an MTL machine compiling shaders
for a BMG GPU:

```
// Differences at 95.0% confidence.

// Rise of the Tomb Raider (n=20)
   -0.017 +/- 0.00724575
   -3.45177665% +/- 1.45084%

// Alan Wake (n=20)
   -0.153 +/- 0.00960067
   -4.99265786% +/- 0.303695%

// Borderlands 3 (n=14)
   -0.486428571 +/- 0.15354
   -3.51248195% +/- 1.0835%

// Oblivion Remastered (n=14)
   -0.143571429 +/- 0.0357991
   -3.05749924% +/- 0.747872%

// Baldur's Gate 3 (n=14)
   -1.68928571 +/- 0.151598
   -4.12128605% +/- 0.364259%
```

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41496>
2026-05-21 18:04:14 +00:00
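
The effect described in 821a812c7d is easy to reproduce in isolation. In the sketch below, size_read() is a stand-in with a call counter (the real function is not shown in the commit message); when it is used directly as the loop bound it is re-evaluated on every iteration, while the hoisted form calls it once.

```c
static int calls; /* counts size_read() invocations, for illustration */

/* Stand-in for a function the compiler cannot hoist on its own. */
static unsigned
size_read(unsigned n)
{
   calls++;
   return n;
}

/* Bound evaluated on every loop test. */
static unsigned
sum_direct(unsigned n)
{
   unsigned sum = 0;
   for (unsigned i = 0; i < size_read(n); i++)
      sum += i;
   return sum;
}

/* Bound saved to a local first, as the commit does. */
static unsigned
sum_hoisted(unsigned n)
{
   const unsigned count = size_read(n);
   unsigned sum = 0;
   for (unsigned i = 0; i < count; i++)
      sum += i;
   return sum;
}
```

For n = 4 the direct form calls size_read() five times (four passing tests plus the final failing one) and the hoisted form calls it once; both return the same sum.
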
Caio Oliveira
3f71aab327 brw: Pass VGRF numbers to liveness helpers
Compute var_from_reg() once in setup_def_use() and pass the variable
number to setup_one_read() and setup_one_write().  This lets the loops walk
consecutive variable numbers directly instead of mutating a brw_reg offset.

Also: setup_one_write() is only called for VGRFs, so remove the check
for VGRF there.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41496>
2026-05-21 18:04:14 +00:00
Caio Oliveira
9975a35f43 brw: Avoid unnecessary calls to size_read() in flags_read()
Only ARF sources are relevant in this case, so check the file
before calling size_read().

Below are fossil compilation times on an MTL machine compiling shaders
for a BMG GPU:

```
// Differences at 95.0% confidence.

// Rise of the Tomb Raider (n=20)
   No difference proven

// Alan Wake (n=20)
   -0.0725 +/- 0.0139437
   -2.30965276% +/- 0.438787%

// Borderlands 3 (n=14)
   -0.248571429 +/- 0.135107
   -1.76946153% +/- 0.954171%

// Oblivion Remastered (n=14)
   -0.0735714286 +/- 0.0235712
   -1.54770849% +/- 0.492117%

// Baldur's Gate 3 (n=14)
   -0.832142857 +/- 0.23095
   -1.98028217% +/- 0.545648%
```

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41496>
2026-05-21 18:04:13 +00:00
Caio Oliveira
bb8d8a2141 brw: Call size_read() once in regs_read()
regs_read() itself gets inlined, but size_read() does not.  In GCC
release builds this results in three calls to size_read() at each site,
one of them due to how MIN2 is expanded.  Use a local variable to store
the result.

Below are fossil compilation times on an MTL machine compiling shaders
for a BMG GPU:

```
// Differences at 95.0% confidence.

// Rise of the Tomb Raider (n=20)
   -0.013 +/- 0.00596452
   -2.56410256% +/- 1.15623%

// Alan Wake (n=20)
   -0.1755 +/- 0.0144896
   -5.29491628% +/- 0.425556%

// Borderlands 3 (n=14)
   -0.562142857 +/- 0.129678
   -3.84765816% +/- 0.870239%

// Oblivion Remastered (n=14)
   -0.0821428571 +/- 0.0262485
   -1.69867061% +/- 0.537247%

// Baldur's Gate 3 (n=14)
   -1.61357143 +/- 0.21693
   -3.69788342% +/- 0.486462%
```

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41496>
2026-05-21 18:04:13 +00:00
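
The MIN2 part of bb8d8a2141 follows from how a ternary-based macro expands: its first argument appears twice, so a function call passed as that argument can run twice. The sketch below defines its own MIN2-style macro and a counting size_read() stand-in; both regs_read_* bodies are invented for illustration and only model the macro-related duplicate call, not all three calls mentioned in the commit.

```c
/* A typical ternary-based minimum macro, defined locally for this sketch. */
#define MIN2(a, b) ((a) < (b) ? (a) : (b))

static int calls; /* counts size_read() invocations, for illustration */

static unsigned
size_read(unsigned bytes)
{
   calls++;
   return bytes;
}

/* The call is the macro argument, so it is expanded twice and runs
 * twice whenever the first operand is the smaller one. */
static unsigned
regs_read_macro(unsigned bytes, unsigned limit)
{
   return MIN2(size_read(bytes), limit);
}

/* The result is stored in a local, so the call runs exactly once. */
static unsigned
regs_read_local(unsigned bytes, unsigned limit)
{
   const unsigned size = size_read(bytes);
   return MIN2(size, limit);
}
```
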
Caio Oliveira
3850922b78 brw: Save original regs_written() value in register coalesce
The instruction may get transformed, modifying the destination before
the loop index gets incremented.  So save the original regs_written
value to be used in the loop increment.

While we are here, assert that all the slots in mov[] are filled
at this point in the code.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41496>
2026-05-21 18:04:13 +00:00
Michael Cheng
ec778a297f brw: Fix ordered dependency exec_all handling on Xe2+
On Xe2+ the Wa_1407528679 NoMask workaround is disabled, so
baked_ordered_dependency_mode() should treat all instructions as
exec_all, matching the logic in gather_inst_dependencies() and
emit_inst_dependencies().

Without this, ordered RegDist dependencies from uniform/WE_all
producers (e.g. 'mov s0, imm') are not found during baking and
fall through as separate WE_all SYNC NOPs. Real shaders pile up
dozens of these in front of masked sends.

v2(Caio): Fix existing scalar_register test expectations

Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Fixes: 47a6ef3fef ("brw/scoreboard: Use a predicate helper for the nomask workaround")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41713>
2026-05-21 16:50:50 +00:00
Caio Oliveira
26e832d069 brw/scoreboard: Add disabled tests for RegDist baking on Xe2+
Add two tests verifying that ordered RegDist dependencies from
uniform/WE_all producers are baked into the consumer's SWSB on Xe2+.
Disabled for now since they fail on current main.

Reviewed-by: Michael Cheng <michael.cheng@intel.com>
Assisted-by: Pi coding agent (Opus-4.7)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41713>
2026-05-21 16:50:50 +00:00