Commit graph

15202 commits

Author SHA1 Message Date
Calder Young
646977348b anv: Fix typo when checking format's extended usage flag
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Fixes: f4c1753c1a ("anv: report color/storage features on YCbCr images with EXTENDED_USAGE")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35703>
2025-06-28 20:39:18 +00:00
Lionel Landwerlin
a742b859bd anv: add support for handling wa_18019110168 with gfx-libs
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:35 +00:00
Lionel Landwerlin
fcf4401824 brw: handle wa_18019110168 with independent shader compilation
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:35 +00:00
Lionel Landwerlin
bc8d18aee2 brw: make a helper for vertex attribute offset computation
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:34 +00:00
Lionel Landwerlin
8fabcd754f brw: move primitive_id_index field in fs_msaa
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:34 +00:00
Lionel Landwerlin
6336cf0ea2 brw: store the remapping table for wa_18019110168 in constant data
That way it can be accessed at runtime.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:33 +00:00
Lionel Landwerlin
e1a7eb1718 brw: extract out attribute register remapping
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:33 +00:00
Lionel Landwerlin
5cc66e2c8d anv/brw: move Wa_18019110168 handling to backend
We simplify the implementation by assuming the worse case, copying
entire per-vertex regions if necessary.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:32 +00:00
Lionel Landwerlin
8e7e0ef75a anv: make Wa_18019110168 deal with dynamic provoking vertex
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:32 +00:00
Lionel Landwerlin
f0f4f9c566 brw: fix vertex attribute offset computation
The formula uses scalar indices (4bytes), not slots (16bytes).

We also incorrectly passed a scalar (vertex case) & slot (mesh case)
offset in the push constants. Use slots instead so that the value is
smaller and we can pack more stuff into fs_msaa_flags.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 18bbcf9a63 ("intel: introduce new VUE layout for separate compiled shader with mesh")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:31 +00:00
Lionel Landwerlin
4b5539a0cb brw: fix set_range on load_per_primitive_output
load intrinsics don't have range

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 18bbcf9a63 ("intel: introduce new VUE layout for separate compiled shader with mesh")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:31 +00:00
Matt Turner
6842a8179f intel: Add support for float16 as cooperative matrix accumulator
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The number of passing tests in ./deqp-vk -n '*cooperative_matrix.khr*'
increases

- on PTL from 787 -> 914
- on RPL from 799 -> 926

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13304
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35616>
2025-06-27 01:26:22 +00:00
Matt Turner
6d786a0e4b brw: Use convert_cmat_intel intrinsic
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35616>
2025-06-27 01:26:22 +00:00
Matt Turner
41cd196886 brw: Implement convert_cmat_intel intrinsic
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35616>
2025-06-27 01:26:22 +00:00
Matt Turner
1215845b5b intel: Increase size of cooperative_matrix_configurations[] to 16
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35616>
2025-06-27 01:26:21 +00:00
Marek Olšák
1754507d49 nir: rename nir_lower_io_to_temporaries -> nir_lower_io_vars_to_temporaries
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:54 +00:00
Marek Olšák
1e03827c77 nir: rename nir_lower_io_arrays_to_elements -> nir_lower_io_array_vars_to_elements
same for *_no_indirects

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:54 +00:00
Marek Olšák
12df9b3def nir: rename nir_vectorize_tess_levels -> nir_lower_tess_level_array_vars_to_vec
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:50 +00:00
Marek Olšák
2aa94caf82 nir: rename nir_lower_io_to_vector -> nir_opt_vectorize_io_vars
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:50 +00:00
Marek Olšák
439d805291 nir: rename nir_lower_io_to_scalar_early -> nir_lower_io_vars_to_scalar
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:49 +00:00
Ian Romanick
b83f618fb2 brw: Fully write temporary destinations
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Consider an innocuous instruction like:

    and(1) v250:UD, g0.3<0,1,0>:UD, 4294967264u NoMask group0

If register allocation decides to spill v250, it will see this
instruction and say, "Oh no! The other components of v250 aren't set, so
I'd better add a fill before that instruction!"

But it gets even worse than that... if register coalesce decided to
merge two of these, the live range gets massively extended because the
writes don't fully initialize the value. This causes the need to spill
these registers in the first place.

Changing that instruction to SIMD16 on Xe2 or SIMD8 on other platforms
alleviates these issues.

shader-db:

Lunar Lake
total instructions in shared programs: 17118324 -> 17113191 (-0.03%)
instructions in affected programs: 93701 -> 88568 (-5.48%)
helped: 42 / HURT: 6

total cycles in shared programs: 895422566 -> 895079488 (-0.04%)
cycles in affected programs: 30111338 -> 29768260 (-1.14%)
helped: 35 / HURT: 40

total spills in shared programs: 3588 -> 3304 (-7.92%)
spills in affected programs: 285 -> 1 (-99.65%)
helped: 10 / HURT: 0

total fills in shared programs: 2218 -> 1663 (-25.02%)
fills in affected programs: 556 -> 1 (-99.82%)
helped: 10 / HURT: 0

Meteor Lake, DG2, Tiger Lake, and Ice Lake had similar results.  (Meteor Lake shown)
total instructions in shared programs: 20059218 -> 20053563 (-0.03%)
instructions in affected programs: 96938 -> 91283 (-5.83%)
helped: 43 / HURT: 6

total cycles in shared programs: 884174588 -> 883536475 (-0.07%)
cycles in affected programs: 22105268 -> 21467155 (-2.89%)
helped: 35 / HURT: 27

total spills in shared programs: 5032 -> 4679 (-7.02%)
spills in affected programs: 355 -> 2 (-99.44%)
helped: 12 / HURT: 0

total fills in shared programs: 4782 -> 4113 (-13.99%)
fills in affected programs: 671 -> 2 (-99.70%)
helped: 12 / HURT: 0

Skylake
total instructions in shared programs: 19097658 -> 19097665 (<.01%)
instructions in affected programs: 14202 -> 14209 (0.05%)
helped: 0 / HURT: 5

total cycles in shared programs: 862058109 -> 862058267 (<.01%)
cycles in affected programs: 3450244 -> 3450402 (<.01%)
helped: 7 / HURT: 11

fossil-db:

Lunar Lake
Totals:
Cycle count: 31439652246 -> 31439652272 (+0.00%)

Totals from 2 (0.00% of 707091) affected shaders:
Cycle count: 2602 -> 2628 (+1.00%)

No other Intel platforms had any fossil-db changes.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35721>
2025-06-26 17:59:47 +00:00
Caio Oliveira
30490de24a intel/executor: allow single line comments in macro lines
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Assembler supports them, so allow them on @-macro lines.  For now
we don't bother with multiline comments, if becomes a thing we
can add them later.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35699>
2025-06-26 00:58:02 +00:00
Caio Oliveira
d14fa6683b intel/executor: update SFID names in macros to match recent changes
After commit 88309a9818, SFID names were renamed

- "dp data 1" became "hdc1"
- "thread_spawner" became "ts/btd"

Update macros in executor to use the new SFID names so the
generated assembly can be parsed correctly.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35701>
2025-06-25 17:31:00 -07:00
Iván Briano
d964b8d5fa anv: don't report custom sample locations for sample count 1
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
We can't actually enable MSAA for images with sample count 1, and
without MSAA active, the sample location machinery does not get used.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35504>
2025-06-24 19:44:34 +00:00
Matt Turner
6a47531440 intel: Generate files with newline at end
This generator scripts uses the `write` function that, unlike `print`,
doesn't print a trailing newline. So let's add one to the template.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35697>
2025-06-24 14:01:04 +00:00
Dave Airlie
29c599ffea anv: only expose VK_KHR_cooperative_matrix on devices with hw instructions.
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Currently anv exposes this on lots of devices, with the intent to be better
than apps can give, but I think this is wrong for a couple of reasons.

Apps want to know if hw exposes the fast path, Vulkan is meant to be explicit,
and telling llama.cpp if the fast path exists lets it make smarter decisions.

It seems unless someone heavily optimises the slow path, that CPU is usually
faster than GPU with llama-bench unless the hw path exists.

v2: added INTEL_LOWER_DPAS support

Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35564>
2025-06-23 21:06:51 +00:00
Konstantin Seurer
4cbbdc0a50 vulkan: Pass a structure to most BVH build callbacks
It is annoying to change all function signatures when a driver needs
more information. There are also some callbacks that have a lot of
parameters and there have already been bugs related to that.

This patch tries to clean the interface by adding a struct that contains
all information that might be relevant for the driver and passing that
to most callbacks.

radv changes are:
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>

anv changes are:
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

turnip changes are:
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>

vulkan runtime changes are:

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35385>
2025-06-23 20:43:43 +00:00
Konstantin Seurer
28713789ad vulkan: Replace get_*_key with get_build_config
It is a cleaner API and gives more control about the build to the
driver.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35385>
2025-06-23 20:43:43 +00:00
José Roberto de Souza
bdd20457ed anv: Emit STATE_COMPUTE_MODE before COMPUTE_WALKER when new async compute limits are needed
Cc: stable
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35563>
2025-06-23 18:57:25 +00:00
José Roberto de Souza
b37747ce68 blorp: Emit STATE_COMPUTE_MODE before COMPUTE_WALKER
Cc: stable
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35563>
2025-06-23 18:57:25 +00:00
José Roberto de Souza
59d361043e intel/common: Use as much as possible spec recommended values for compute engine async thread limits
Spec recommended values should give us a good balance between progress
in render and compute engines, also with less possibility of values
it will reduce the number of times that we need to emit
STATE_COMPUTE_MODE reducing the number of stalls in the compute engine.

Cc: stable
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35563>
2025-06-23 18:57:25 +00:00
José Roberto de Souza
080b9a165c intel/common: Add function to compute optimal compute engine async thread limits
Spec has several restrictions to the values we program to compute
engine async thread limits.
Without those we risk hit deadlocks, so here adding a function to
return the optimal value based on those restrictions.

Cc: stable
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35563>
2025-06-23 18:57:24 +00:00
Eric Engestrom
99e8d804bf intel/compiler tests: fix variable type for getopt_long() return value
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
`getopt_long()` returns an `int`, not a `char`; putting the value in
a `char` before comparing it to `-1` was making the comparison always
fail, resulting in the invalid codepath taken that then fails with:

    option `-' is invalid: ignored

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34756>
2025-06-23 08:26:29 +00:00
Eric Engestrom
f545f9eed4 intel/compiler tests: fix "is there something after the options" check
cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34756>
2025-06-23 08:26:29 +00:00
Eric Engestrom
729922cdae intel/compiler tests: fix path-to-string conversion
cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34756>
2025-06-23 08:26:29 +00:00
Eric Engestrom
de6ab1beda intel/compiler tests: rewrite subprocess handling in run-test.py
`subprocess.Popen()` returns immediately, and the subprocess might not
have finished by the time `stdout` is read on the next line, spuriously
failing the tests.

`subprocess.check_output()` makes sure the output is available before
returning, solving this issue; it additionally raises an error if the
subprocess failed, giving a better error than a failed diff later in the
script.

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34756>
2025-06-23 08:26:29 +00:00
Georg Lehmann
9da23499ff compiler: add float8 glsl types
e4m3fn: 8bit floating point format with 4bit exponent, 3bit mantissa
        and no infinities (finite only)
e5m2:   8bit floating point format with 5bit exponent, 2bit mantissa
        and with infinities.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>
2025-06-23 07:59:24 +00:00
Lionel Landwerlin
4e25a4ce1e intel/ci: document a couple of vkd3d failures
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35679>
2025-06-23 05:45:25 +00:00
Lionel Landwerlin
786bace191 anv: fix sampler hashing in set layouts
The logic needs to handle embedded samplers without conversion state.

Fixes vkd3d-proton's test_sampler_border_color

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35679>
2025-06-23 05:45:25 +00:00
Lionel Landwerlin
32b53a7c6a anv: fix clears on single aspect of YCbCr images
Fixes vkd3d-proton's test_planar_video_formats

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35679>
2025-06-23 05:45:25 +00:00
Lionel Landwerlin
691ac65000 isl: handle DISABLE_AUX in get_mcs_surf
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35679>
2025-06-23 05:45:24 +00:00
Yiwei Zhang
64326d0be5 anv: use common vk_android_get_front_buffer_usage helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35568>
2025-06-23 00:46:42 +00:00
Rohan Garg
e103afe7be brw: run the nir_opt_offsets pass and set the maximum offset size
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Perf A/B testing on DG2: no changes
Perf A/B testing on BMG: +2.1% Blackops3, +1.5% Cyberpunk

DG2 stats (mostly insignificant):
Assassins Creed Valhalla:
 Totals from 1169 (55.67% of 2100) affected shaders:
 Instrs: 509237 -> 509215 (-0.00%)
 Cycle count: 30614325 -> 30607419 (-0.02%); split: -0.03%, +0.00%
 Non SSA regs after NIR: 83434 -> 85909 (+2.97%)

Blackops 3:
 Totals from 1045 (64.63% of 1617) affected shaders:
 Instrs: 527312 -> 527310 (-0.00%)
 Cycle count: 496912222 -> 496902846 (-0.00%); split: -0.00%, +0.00%
 Non SSA regs after NIR: 106883 -> 109095 (+2.07%)

Cyberpunk:
 Totals from 706 (56.03% of 1260) affected shaders:
 Instrs: 345976 -> 345974 (-0.00%); split: -0.00%, +0.00%
 Cycle count: 9775138 -> 9775472 (+0.00%); split: -0.00%, +0.00%
 Max live registers: 40295 -> 40297 (+0.00%)
 Non SSA regs after NIR: 93245 -> 94718 (+1.58%)

Fortnite:
 Totals from 4210 (55.98% of 7521) affected shaders:
 Instrs: 2205471 -> 2205469 (-0.00%)
 Cycle count: 91451040 -> 91450956 (-0.00%); split: -0.00%, +0.00%
 Non SSA regs after NIR: 952354 -> 961664 (+0.98%)

LNL stats (notable changes):
Assassins Creed Valhalla:
 Totals from 1684 (83.57% of 2015) affected shaders:
 Instrs: 774305 -> 764501 (-1.27%); split: -1.27%, +0.01%
 Cycle count: 58845842 -> 58699250 (-0.25%); split: -0.98%, +0.73%
 Spill count: 625 -> 638 (+2.08%)
 Fill count: 1490 -> 1503 (+0.87%)
 Scratch Memory Size: 41984 -> 44032 (+4.88%)
 Max live registers: 196424 -> 197561 (+0.58%); split: -0.10%, +0.68%

Blackops 3:
 Totals from 1125 (76.53% of 1470) affected shaders:
 Instrs: 781749 -> 773275 (-1.08%); split: -1.08%, +0.00%
 Subgroup size: 22896 -> 22912 (+0.07%)
 Cycle count: 659864454 -> 654641032 (-0.79%); split: -1.10%, +0.31%
 Max live registers: 116772 -> 116854 (+0.07%); split: -0.01%, +0.08%
 Non SSA regs after NIR: 172648 -> 168260 (-2.54%); split: -2.55%, +0.01%

Control:
 Totals from 378 (51.50% of 734) affected shaders:
 Instrs: 148184 -> 147544 (-0.43%)
 Cycle count: 6905200 -> 6913366 (+0.12%); split: -0.30%, +0.42%
 Max live registers: 41271 -> 41281 (+0.02%)
 Non SSA regs after NIR: 44964 -> 43868 (-2.44%); split: -2.45%, +0.01%

Cyberpunk:
 Totals from 1141 (92.46% of 1234) affected shaders:
 Instrs: 636744 -> 629333 (-1.16%)
 Subgroup size: 24256 -> 24272 (+0.07%)
 Cycle count: 24952258 -> 24801298 (-0.60%); split: -1.39%, +0.78%
 Max live registers: 125848 -> 126855 (+0.80%); split: -0.00%, +0.80%
 Non SSA regs after NIR: 127399 -> 119837 (-5.94%); split: -5.95%, +0.02%

Fortnite:
 Totals from 5497 (83.52% of 6582) affected shaders:
 Instrs: 4072831 -> 4041852 (-0.76%); split: -0.77%, +0.01%
 Subgroup size: 103296 -> 103312 (+0.02%)
 Cycle count: 133046874 -> 132789242 (-0.19%); split: -0.67%, +0.48%
 Spill count: 7218 -> 7254 (+0.50%); split: -0.33%, +0.83%
 Fill count: 11724 -> 11749 (+0.21%); split: -0.34%, +0.55%
 Scratch Memory Size: 591872 -> 599040 (+1.21%)
 Max live registers: 816530 -> 818522 (+0.24%); split: -0.01%, +0.26%
 Non SSA regs after NIR: 1610296 -> 1560284 (-3.11%); split: -3.11%, +0.00%

Hitman3:
 Totals from 4713 (92.39% of 5101) affected shaders:
 Instrs: 2731598 -> 2698224 (-1.22%)
 Cycle count: 186422098 -> 185472640 (-0.51%); split: -1.12%, +0.61%
 Spill count: 3244 -> 3242 (-0.06%)
 Fill count: 9937 -> 9933 (-0.04%)
 Max live registers: 585035 -> 589801 (+0.81%); split: -0.00%, +0.82%
 Non SSA regs after NIR: 347681 -> 324314 (-6.72%); split: -6.73%, +0.01%

Hogwarts Legacy:
 Totals from 930 (59.81% of 1555) affected shaders:
 Instrs: 464146 -> 459526 (-1.00%); split: -1.00%, +0.01%
 Subgroup size: 19104 -> 19120 (+0.08%)
 Cycle count: 24062460 -> 24078964 (+0.07%); split: -0.49%, +0.56%
 Spill count: 2068 -> 1964 (-5.03%); split: -5.22%, +0.19%
 Fill count: 2342 -> 2205 (-5.85%); split: -6.40%, +0.56%
 Scratch Memory Size: 147456 -> 141312 (-4.17%)
 Max live registers: 112384 -> 112787 (+0.36%); split: -0.08%, +0.44%
 Non SSA regs after NIR: 80293 -> 79161 (-1.41%); split: -1.72%, +0.32%

Metro Exodus:
 Totals from 29755 (78.62% of 37846) affected shaders:
 Instrs: 11495578 -> 11492951 (-0.02%); split: -0.02%, +0.00%
 Subgroup size: 644688 -> 644704 (+0.00%)
 Cycle count: 301572068 -> 301548054 (-0.01%); split: -0.03%, +0.02%
 Max live registers: 3369504 -> 3370454 (+0.03%); split: -0.00%, +0.03%
 Non SSA regs after NIR: 2476561 -> 2396090 (-3.25%); split: -3.27%, +0.02%

Red Dead Redemption 2:
 Totals from 4161 (78.61% of 5293) affected shaders:
 Instrs: 2428782 -> 2409032 (-0.81%); split: -0.82%, +0.00%
 Subgroup size: 85344 -> 85360 (+0.02%)
 Cycle count: 8514984142 -> 8533415324 (+0.22%); split: -0.02%, +0.23%
 Spill count: 4659 -> 4674 (+0.32%); split: -0.02%, +0.34%
 Fill count: 11236 -> 11231 (-0.04%); split: -0.19%, +0.14%
 Scratch Memory Size: 398336 -> 397312 (-0.26%)
 Max live registers: 473946 -> 475798 (+0.39%); split: -0.08%, +0.47%
 Non SSA regs after NIR: 616820 -> 567706 (-7.96%); split: -8.09%, +0.12%

Rise Of The Tomb Raider:
 Totals from 68 (46.58% of 146) affected shaders:
 Instrs: 28209 -> 27801 (-1.45%)
 Subgroup size: 1584 -> 1600 (+1.01%)
 Cycle count: 16182992 -> 16249364 (+0.41%); split: -0.97%, +1.38%
 Max live registers: 7320 -> 7296 (-0.33%); split: -0.38%, +0.05%
 Non SSA regs after NIR: 8438 -> 8207 (-2.74%); split: -2.82%, +0.08%

Spiderman Remastered:
 Totals from 6403 (93.87% of 6821) affected shaders:
 Instrs: 5662713 -> 5597949 (-1.14%); split: -1.28%, +0.14%
 Cycle count: 282861519016 -> 279806958122 (-1.08%); split: -1.26%, +0.18%
 Spill count: 61150 -> 60754 (-0.65%); split: -1.13%, +0.48%
 Fill count: 162597 -> 163190 (+0.36%); split: -0.84%, +1.21%
 Scratch Memory Size: 5834752 -> 5804032 (-0.53%); split: -0.70%, +0.18%
 Max live registers: 901926 -> 903820 (+0.21%); split: -0.01%, +0.22%
 Non SSA regs after NIR: 555053 -> 521016 (-6.13%); split: -6.14%, +0.01%

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:24 +00:00
Rohan Garg
8a5e062e5e brw: store the buffer offset for load/store intrinsics
This will later be encoded by the backend into the
LSC extended descriptor message.

Reworks:
  * Sagar: Add nir_intrinsic_ssbo_atomic_swap

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:24 +00:00
Rohan Garg
0186113640 brw: encode the offset into the message descriptor for Xe2
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:24 +00:00
Rohan Garg
937d37f0b1 brw: introduce MEMORY_LOGICAL_ADDRESS_OFFSET to encode address offsets
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:24 +00:00
Lionel Landwerlin
d5a58364b1 brw: add new helper for immediate integer register with type
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:24 +00:00
Lionel Landwerlin
16fca611d7 nir: add new intel ssbo intrinsics
Similar to ir3 ones, to optimize offsets in the backend.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:23 +00:00
Lionel Landwerlin
ba119c73c6 intel: replace RANGE_BASE by BASE for uniform block loads
We're not currently using RANGE_BASE and we'll use BASE for offset
optimizations on Xe2+.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:23 +00:00
Lionel Landwerlin
098249ba66 brw: print descriptor & extended descriptors
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:22 +00:00