Commit graph

15202 commits

Author SHA1 Message Date
Hans-Kristian Arntzen
11195eb8de vulkan: Add KHR_swapchain_maintenance1 promotions.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37157>
2025-11-30 10:30:53 +01:00
Calder Young
c0d809820f intel: Fix calculation of max_scratch_ids on fused devices
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The subslice IDs provided by the SR0.0 EU register are not adjusted to account
for fusing, so the upper bound max_scratch_ids can vary from device to device
depending on what specific slices were fused during manufacturing.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38689>
2025-11-29 15:10:29 +00:00
Faith Ekstrand
4711e5954e nir: Always use sysvals in lower_input_attachments()
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The last holdouts of the var options are gone so we can just emit the
system values.  This is overall simpler as it confines all the sysval to
var logic to nir_lower_sysvals_to_varyings().

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38562>
2025-11-29 00:50:34 +00:00
Marek Olšák
fa0bea5ff8 nir: remove nir_io_add_const_offset_to_base
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
nir_opt_constant_folding does it now.

Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38277>
2025-11-29 00:16:38 +00:00
Marek Olšák
9a56672f56 nir: add shader_info::disable_input/output_offset_src_constant_folding
and set it where needed to prevent nir_opt_constant_folding from breaking
those drivers.

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38277>
2025-11-29 00:16:38 +00:00
Valentine Burley
f9ef7e0f64 Revert "anv/ci: Run vkd3d job in parallel"
With the new vkd3d-proton uprev, a random crash has appeared when
running in parallel.

This reverts commit 45c9c61ad3.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38652>
2025-11-28 11:44:28 +00:00
Samuel Pitoiset
92a468f8f2 ci: uprev vkd3d
vkd3d-proton had an issue with its runner and few tests were excluded
by accident.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38652>
2025-11-28 11:44:28 +00:00
Tapani Pälli
ba89826b75 anv: add furmark workaround layer
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14274
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38410>
2025-11-28 09:26:41 +00:00
Yonggang Luo
0a32d5e6fd treewide: Use regexp to replace usage of setenv with os_set_option.
setenv\((.*), 1\);
=>
os_set_option($1, true);

setenv\((.*), 0\);
=>
os_set_option($1, false);

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Antonio Ospite <antonio.ospite@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38640>
2025-11-27 18:22:34 +00:00
Lionel Landwerlin
515d8f8e3a brw: fix sample mask flag emission
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
It's also used for testing helper invocations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e3328dfa2f ("brw: only initialize sample mask flag if needed")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38699>
2025-11-27 15:59:35 +00:00
Calder Young
09e8a54087 anv: Fix ray query shadow stack buffer size
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38685>
2025-11-26 22:49:52 +00:00
Ian Romanick
0c089a5c32 brw: Eliminate duplicate fills
When the register allocator decides to spill a value, all reads of that
value are filled. This can result in cases where the same value is
filled many times in a single block. In those cases, the result of an
earlier fill may still be available when a later fill occurs.

This optimization replaces the later fill with a move from the result of
the earlier fill.

v2: Use FIXED_GRF for register overlap tests. Since this is after
register allocation, the VGRF values will not tell the whole truth.

v3: Use brw_transform_inst. Suggested by Caio. Add
brw_scratch_inst::offset instead of storing it as a source. Suggested by
Lionel.

v4: In intervening spill to the same location also invalidates the
value. 🤦

v5: Don't eliminate a fill if its destination partially overlaps the
preceeding fill destination. Fixes failures in cooperative matrix CTS.

shader-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
total instructions in shared programs: 17249903 -> 17249653 (<.01%)
instructions in affected programs: 35550 -> 35300 (-0.70%)
helped: 20 / HURT: 0

total cycles in shared programs: 893092398 -> 893101836 (<.01%)
cycles in affected programs: 2501720 -> 2511158 (0.38%)
helped: 6 / HURT: 14

total fills in shared programs: 1901 -> 1776 (-6.58%)
fills in affected programs: 1757 -> 1632 (-7.11%)
helped: 20 / HURT: 0

fossil-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
Totals:
Instrs: 929949528 -> 926770338 (-0.34%)
Cycle count: 105126671329 -> 104851299099 (-0.26%); split: -0.28%, +0.02%
Fill count: 6520785 -> 5021518 (-22.99%)

Totals from 54281 (2.69% of 2018922) affected shaders:
Instrs: 239616289 -> 236437099 (-1.33%)
Cycle count: 22051883404 -> 21776511174 (-1.25%); split: -1.33%, +0.08%
Fill count: 6406295 -> 4907028 (-23.40%)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>
2025-11-26 17:20:13 +00:00
Ian Romanick
d2e3707ecc brw: Eliminate redundant fills and spills
When the register allocator decides to spill a value, all writes to that
value are spilled and all reads are filled. In regions where there is
not high register pressure, a spill of a value may be followed by a fill
of that same file while the spilled register is still live. This
optimization pass finds these cases, and it converts the fill to a move
from the still-live register.

The restriction that the spill and the fill must have matching NoMask
really hampers this optimization. With the restriction removed, the pass
was more than 2x helpful.

v2: Require force_writemask_all to be the same for the spill and the fill.

v3: Use FIXED_GRF for register overlap tests. Since this is after
register allocation, the VGRF values will not tell the whole truth.

v4: Use brw_transform_inst. Suggested by Caio. The allows two of the
loops to be merged. Add brw_scratch_inst::offset instead of storing it
as a source. Suggested by Lionel.

v5: Add no-fill-opt debug option to disable optimizations. Suggested by
Lionel.

v6: Move a calculation outside a loop. Suggested by Lionel.

v7: Check that spill ranges overlap instead of just checking initial
offset. Zero shaders in fossil-db were affected, but some CTS with
spill_fs were fixed (e.g.,
dEQP-VK.subgroups.arithmetic.compute.subgroupmin_uint64_t_requiredsubgroupsize).
Suggested by Lionel.

v8: Add DEBUG_NO_FILL_OPT to debug_bits in
brw_get_compiler_config_value(). Noticed by Lionel.

shader-db:

Lunar Lake
total instructions in shared programs: 17249907 -> 17249903 (<.01%)
instructions in affected programs: 10684 -> 10680 (-0.04%)
helped: 2 / HURT: 0

total cycles in shared programs: 893092630 -> 893092398 (<.01%)
cycles in affected programs: 237320 -> 237088 (-0.10%)
helped: 2 / HURT: 0

total fills in shared programs: 1903 -> 1901 (-0.11%)
fills in affected programs: 110 -> 108 (-1.82%)
helped: 2 / HURT: 0

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
total instructions in shared programs: 19968898 -> 19968778 (<.01%)
instructions in affected programs: 33020 -> 32900 (-0.36%)
helped: 10 / HURT: 0

total cycles in shared programs: 885157211 -> 884925015 (-0.03%)
cycles in affected programs: 39944544 -> 39712348 (-0.58%)
helped: 8 / HURT: 2

total fills in shared programs: 4454 -> 4394 (-1.35%)
fills in affected programs: 2678 -> 2618 (-2.24%)
helped: 10 / HURT: 0

fossil-db:

Lunar Lake
Totals:
Instrs: 930445228 -> 929949528 (-0.05%)
Cycle count: 105195579417 -> 105126671329 (-0.07%); split: -0.07%, +0.00%
Spill count: 3495279 -> 3494400 (-0.03%)
Fill count: 6767063 -> 6520785 (-3.64%)

Totals from 43844 (2.17% of 2018922) affected shaders:
Instrs: 212614840 -> 212119140 (-0.23%)
Cycle count: 19151130510 -> 19082222422 (-0.36%); split: -0.39%, +0.03%
Spill count: 2831100 -> 2830221 (-0.03%)
Fill count: 6128316 -> 5882038 (-4.02%)

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 1001375893 -> 1001113407 (-0.03%)
Cycle count: 92746180943 -> 92679877883 (-0.07%); split: -0.08%, +0.01%
Spill count: 3729157 -> 3728585 (-0.02%)
Fill count: 6697296 -> 6566874 (-1.95%)

Totals from 35062 (1.53% of 2284674) affected shaders:
Instrs: 179819265 -> 179556779 (-0.15%)
Cycle count: 18111194752 -> 18044891692 (-0.37%); split: -0.41%, +0.04%
Spill count: 2453752 -> 2453180 (-0.02%)
Fill count: 5279259 -> 5148837 (-2.47%)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>
2025-11-26 17:20:13 +00:00
Ian Romanick
b7f5285ad3 brw: Add fill and spill opcodes for LSC platforms
These opcodes are emitted during register allocation instead of the
scratch reads and writes that were previously emitted. These
instructions contain additional information (i.e., the instruction
encodes the scratch offset) that enable optimizations to be added
later.

The fill and spill opcodes are lowered to scratch reads and writes
shortly after register allocation. Eventually this lower may have some
optimizations (e.g., reuse previous address calculations for successive
spills).

v2: Add brw_scratch_inst::offset instead of storing it as a
source. Suggested by Lionel.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>
2025-11-26 17:20:12 +00:00
Ian Romanick
2215003d95 brw: Add OPT macro to brw_shader.cpp like brw_opt.cpp
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>
2025-11-26 17:20:11 +00:00
Ian Romanick
1f42ff530c brw: Return the new register from brw_lower_vgrf_to_fixed_grf
...and make the function public.

v2: s/struct brw_reg/brw_reg/. Suggested by Lionel.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>
2025-11-26 17:20:11 +00:00
Ian Romanick
243a3a4ca7 brw: Don't pass compressed to brw_lower_vgrf_to_fixed_grf
The parameter is never used. It's recalculated in the function.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>
2025-11-26 17:20:10 +00:00
Ian Romanick
1fc2f52d36 brw: Force allow_spilling when spill_all is set
This ensures that g0 is reserved for spilling since there is going to be
spilling.

Fixes: 8bca7e520c ("intel/brw: Only force g0's liveness to be the whole program if spilling")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>
2025-11-26 17:20:09 +00:00
Ian Romanick
042417a72e brw: Don't spill_all on internal shaders
Basically all of the internal shaders (e.g., from blorp) will fail
assertions if there is any scratch space used.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>
2025-11-26 17:20:09 +00:00
Alyssa Rosenzweig
e3328dfa2f brw: only initialize sample mask flag if needed
This is a refinement of 7c129d9365 ("intel/brw/xe2+: Keep PS sample mask in the
f1.0 register whether or not kill is used."). Rather than always insert this
move, do so only when we'll actually read the register: for memory writes and
for discards. This deletes an instruction from piles of fragment shaders.

shader-db on LNL:

total instructions in shared programs: 17134031 -> 17042706 (-0.53%)
instructions in affected programs: 9065743 -> 8974418 (-1.01%)
helped: 65045
HURT: 0
helped stats (abs) min: 1.0 max: 3.0 x̄: 1.40 x̃: 1
helped stats (rel) min: <.01% max: 50.00% x̄: 3.06% x̃: 1.64%
95% mean confidence interval for instructions value: -1.41 -1.40
95% mean confidence interval for instructions %-change: -3.10% -3.03%
Instructions are helped.

total cycles in shared programs: 885172098 -> 884835306 (-0.04%)
cycles in affected programs: 590294230 -> 589957438 (-0.06%)
helped: 53636
HURT: 4500
helped stats (abs) min: 2.0 max: 1126.0 x̄: 8.02 x̃: 4
helped stats (rel) min: <.01% max: 50.00% x̄: 1.24% x̃: 0.24%
HURT stats (abs)   min: 2.0 max: 7706.0 x̄: 20.77 x̃: 6
HURT stats (rel)   min: <.01% max: 82.06% x̄: 1.09% x̃: 0.54%
95% mean confidence interval for cycles value: -6.15 -5.43
95% mean confidence interval for cycles %-change: -1.10% -1.02%
Cycles are helped.

LOST:   385
GAINED: 47

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38665>
2025-11-26 16:53:36 +00:00
Lionel Landwerlin
5324712952 anv: remove errors on format queries
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
It's pretty spammy and since the whole purpose of queries is to report
support, why bother with errors?

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38661>
2025-11-26 16:06:57 +00:00
Kenneth Graunke
3182deaae1 brw: Combine output stores for TCS outputs even when unlinked
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Otherwise we get a lot of individual x/y/z stores to tesslevels when
we should really just be storing the whole thing at once.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:44:03 +00:00
Kenneth Graunke
7e02738b63 brw: Drop check for legacy tess levels from remap_patch_urb_offsets
The newly rewritten remap_tess_levels_legacy will have already lowered
anything it cares about to URB intrinsics.  So the generic remapping
pass won't see them, as it operates on generic input/output intrinsics.

This also drops some of the callback boilerplate we needed temporarily.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:44:03 +00:00
Kenneth Graunke
d95a9714c2 brw: Rewrite legacy tess level remapping
This unifies the dynamic (SSO) and fixed (linked together) versions.
We emit piles of NIR as if we were doing the dynamic version, but
replace the tess config field access with constant values.  It all
should optimize away back to something reasonable.  We lower these
directly to URB read/write intrinsics.

It also rewrites the dynamic version to directly read/write the URB
rather than going through temporaries.  The old version was broken
in that tessellation control shader invocations can technically use
the shared output area for cross-invocation data sharing with barriers,
although doing so using the built-in tesslevel patch outputs is very
unlikely.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:44:03 +00:00
Kenneth Graunke
ee407481c2 brw: Switch to URB intrinsics for TCS inputs
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:44:02 +00:00
Kenneth Graunke
943b2acf02 brw: Switch to NIR URB intrinsics for TES inputs
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:44:01 +00:00
Kenneth Graunke
c0d69b2faf brw: Switch to NIR URB intrinsics for TCS outputs
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:44:01 +00:00
Kenneth Graunke
9aff3cac3c brw: Add infrastructure for lowering to URB intrinsics
Based on earlier code by Lionel Landwerlin.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:44:00 +00:00
Kenneth Graunke
13acc889af brw: Use io_sem.location instead of base to get varying slots
Alyssa noted we can be using semantic IO here rather than relying on
bases not having been remapped.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:43:59 +00:00
Kenneth Graunke
96d331766a brw: Generalize read_attribute_payload_intel to handle more cases
We were using this for indirect loads of the shader input thread
payload, but there's no reason we can't use it for constant access
too.  In this case we can just MOV from the ATTR file directly
without a special opcode that turns into MOV_INDIRECT later.

We also allow it to load multiple components now.  This is useful
for say, returning vec4 pushed inputs.  And, we allow it in more
stages than just the fragment stage.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:43:59 +00:00
Kenneth Graunke
792762617a brw: Rename read_attribute_payload_intel to load_attribute_payload_intel
We're going to change the intrinsic to a load(...) which puts "load" in
the name.  Also, it's just more consistent with our usual terminology.

We also rename the corresponding backend opcode so they remain matched.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:43:58 +00:00
Kenneth Graunke
0f7590af81 brw, anv, iris: Switch to reversed patch header layouts
These are a ton more convenient.  When the TCS and TES were linked
together, the legacy layouts were a hassle, but didn't impose any
significant cost.  With unlinked TCS and TES, the legacy layouts
involve significant runtime code for scrambling the data, whereas
the reversed layouts are substantially less overhead.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:43:57 +00:00
Kenneth Graunke
7d1dfc3468 brw: Lower tesslevel vars to vectors even for unlinked TCS/TES
st/nir lowers this for iris, and brw_link_shaders lowers this for anv,
but for unlinked tessellation control / evaluation shaders, the lowering
was not happening for TCS.

Just do it unconditionally when lowering TCS outputs and TES inputs.
This lets the remapping code just assume vectors all the time, rather
than getting single component stores with nir_intrinsic_component set
(which came from nir_lower_io lowering compact arrays).

This also requires changes to the dynamic unlinked TCS/TES lowering to
temporaries, which needs to use vectors rather than arrays with this
change.  That code is going away in future patches anyway, but this
keeps it going for now to avoid interim breakage.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:43:56 +00:00
Kenneth Graunke
7736e693b1 brw: Pass devinfo into remap_patch_urb_offsets
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:43:56 +00:00
Kenneth Graunke
4dc6413de8 brw: Rename remap_non_header_patch_values to remap_patch_values
See rationale in the previous patch.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:43:56 +00:00
Kenneth Graunke
2b51963b8c brw: Remap tesslevels before other patch remapping
We now call remap_tess_levels before remap_non_header_patch_urb_offsets.
The latter already excludes tess levels anyway, so the order shouldn't
matter.

This paves the way for remap_tess_levels to skip handling some header
values in certain cases, because with reversed layouts, many of them no
longer need any special handling and we can just let the generic pass
handle them.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:43:56 +00:00
Kenneth Graunke
e8669a8333 brw: Rework the tess level remapping interface
Just have a single remap_tess_levels that does either the
statically-known-primitive or the dynamic (unlinked) mode.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:43:56 +00:00
Kenneth Graunke
1995c879a9 brw: Flip the TESS_LEVEL_INNER/OUTER vue map slot assignments
Our current legacy patch header layout handling doesn't actually care
which is which slot, and remaps everything to its correct spot anyway.

For using the newer "reversed" patch header layouts, it will be more
convenient to have outer as slot 0, and inner as slot 1, as that just
works with no special remapping needed for both quads and triangles
(but unfortunately isolines are still a pain).

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:43:56 +00:00
Kenneth Graunke
e5c1d00faf brw: Pass devinfo to brw_nir_lower_tes_inputs
This will be useful for using reversed patch header layouts.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:43:56 +00:00
Kenneth Graunke
a1c7ae9d15 brw: Implement URB handle intrinsics for TCS and TES stages
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:43:56 +00:00
Lionel Landwerlin
e290f9641d brw: Implement load/store URB intrinsics
These work the same regardless of stage.

v2 (Ken): Rebase, move from mesh to all stages, add reorderable load
          variant, allow channel masks to be non-constant even on Xe2.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:43:55 +00:00
Lionel Landwerlin
0d8ee4ed23 brw: use default builder for urb handle adjustment
Be consistent with lowering that happens after, so that it gets a full
vector register and can stride into it.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38482>
2025-11-25 22:43:55 +00:00
Felix DeGrood
406e6e094a anv/rt: avoid out of bound access by clamping global id
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Felix DeGrood <felix.j.degrood@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: cff9d82c66 ("anv/rt: rewrite encode.comp for better performance")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38636>
2025-11-25 19:59:42 +00:00
Lionel Landwerlin
b1e74a1bb1 anv: shrink image opaque data
Noticed renderdoc complaining about our size :

RDOC 692028: [18:08:18]          vk_core.cpp(2272) - Warning -
VkPhysicalDeviceDescriptorBufferPropertiesEXT.imageCaptureReplayDescriptorDataSizeis too large at 32
(must be <= 16), can't support capture of VK_EXT_descriptor_buffer

Since we only need 2 pointers (main + private), we can shrink this to
16bytes. The 1/2 planes have a relative offset from the base.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38625>
2025-11-25 19:38:53 +00:00
Lionel Landwerlin
7e72d392d7 brw: switch to load_(pixel_coord|frag_coord_z|frag_coord_w) intrinsics
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Allows us to better determine if we need Z/W payload delivery.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36392>
2025-11-25 15:50:48 +00:00
Lionel Landwerlin
6d3be477ab anv: enable application shader printfs with debug option
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38638>
2025-11-25 14:18:42 +00:00
Lionel Landwerlin
4c3bf04dd0 anv: enable mesh/task shader hashes
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38638>
2025-11-25 14:18:42 +00:00
Kenneth Graunke
e49418744a brw: Set extended_bindless_surface_offset to true for Gfx12.5+
anv sets device->uses_ex_bso on verx10 >= 125 and then sets the
compiler->extended_bindless_surface_offset to that.

iris was not setting anything.  However, LSC_ADDR_SURFTYPE_SS used for
scratch on Gfx12.5 is bindless, and Xe2 uses ExBSO for all UGM access,
so we need to be setting this.

Just set it in the compiler so both drivers have it set.

Fixes piglit arb_tessellation_shader-tes-gs-max-output -small -scan 1 50
on iris.

Fixes: 80c89909f3 ("brw: fixup immediate bindless surface handling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38645>
2025-11-25 08:21:30 +00:00
Simon McVittie
b860ae309a vulkan: Optionally share one JSON manifest per driver between architectures
If the library_path is just a basename like `libvulkan_lvp.so`, then we
can share the same JSON manifest like `lvp_icd.json` between all of the
architectures, like we already do for Vulkan layers. The library will
be looked up in the dynamic linker's default search path in this case,
and in practice will be found in `${libdir}`. This is how the Mesa's
EGL driver and Vulkan layers work, how Mesa is packaged in Debian 13,
and also how the Nvidia proprietary driver works; it makes installation
simpler for distros, especially on multiarch systems like Debian and
the freedesktop.org SDK.

However, if we want a separate manifest per architecture in order to
be able to write the full path into it, we still need per-architecture
filename disambiguation like `lvp_icd.x86_64.json`.

We presumably still want a separate per architecture on Windows, because
the concept of a single monolithic `${libdir}` is less common there, and
it can also be helpful during development when setting `$VK_DRIVER_FILES`
to force the use of a specific driver installed in a non-default location.

Use the following parameter to passed to vk_icd_gen:
'--icd-lib-path', vulkan_icd_lib_path,
'--icd-filename', icd_file_name,
output : 'virtio_icd.' + vulkan_manifest_suffix,

and the output is passed by '--out', '@OUTPUT@',
so we can detect vulkan_manifest_per_architecture from the --out parameter in script.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13745
Signed-off-by: Simon McVittie <smcv@collabora.com>
Co-authored-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37314>
2025-11-24 19:05:57 +00:00
Lionel Landwerlin
d51c0b8988 brw: fix SS surfaces usage
In 80c89909f3 ("brw: fixup immediate bindless surface handling") I
forgot that we have a special usage for the only _SS surface (the
scratch surface).

Because it's only delivered in the 31:10 bits of R0 and because we
want to minimize the amount of shader instructions for scratch
messages, the surface offset in shifted right by the driver to align
things properly for the 31:6 extended descriptor format.

This is unfortunately incompatible with the full 32bit format of
ExBSO. So this surface type currently cannot be considered bindless.

We might revisit later if we start using _SS surfaces for other
things.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 80c89909f3 ("brw: fixup immediate bindless surface handling")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38618>
2025-11-24 16:12:27 +00:00