Commit graph

13907 commits

Author SHA1 Message Date
Lionel Landwerlin
a3293eb26c anv/brw: stop turning load_push_constants into load_uniform
Those intrinsics have different semantics in particular with regards
to divergence. Turning one into the other without invalidating the
divergence information breaks NIR validation. But also the conversion
means we get artificially less convergent values in the shaders.

So just handle load_push_constants in the backend and stop changing
things in Anv.

Fixes a bunch of tests in
   dEQP-VK.descriptor_indexing.*
   dEQP-VK.pipeline.*.push_constant.graphics_pipeline.dynamic_index_*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34546>
(cherry picked from commit df15968813)
2025-06-04 15:52:44 +02:00
Tapani Pälli
9ff1596f67 anv: use internal rt-null-ahs when any_hit is null
Tested on BMG and PTL using both settings for RT_CTRL.

Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35044>
(cherry picked from commit 5828612da2)
2025-05-20 20:18:08 +02:00
Tapani Pälli
ad159bbd93 intel/compiler: provide a helper for null any-hit shader
Xe driver will be disabling the HW functionality for null any-hit
shaders, drivers need to take care of it instead. This commit brings
back parts of older workaround (see b0624e414f) we used to have to
handle the null any-hit case.

Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35044>
(cherry picked from commit 0f591425c9)
2025-05-20 20:18:08 +02:00
Lionel Landwerlin
0b141f56a3 anv: enable preemption setting on command/batch correctly
The 2 helpers we're using for doing internal operations (copies,
command generation, etc...) can work on command buffers or lower level
batches.

When working with command buffers, the helpers should set the
preemption using genX(cmd_buffer_set_preemption) so that whatever
operation comes after toggles the state back to what it needs and we
minimize the toggles.

When working with batchs, the helpers should disable preemption using
genX(batch_set_preemption) and turn it back on when done.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35030>
(cherry picked from commit c570740272)
2025-05-20 20:18:08 +02:00
José Roberto de Souza
12ddaa6b8b anv: Enable preemption due 3DPRIMITIVE in GFX 12
The issues preventing it to be enabled were fixed so now we can enable
it but we need also to enable workaround 16013994831 back again.

Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34988>
(cherry picked from commit 3cd972a2d3)
2025-05-20 20:18:07 +02:00
José Roberto de Souza
5064fad403 anv: Implement missing part of Wa_1604061319
Description of this workaround are not clear but looking at Iris
implementation we need to emit all 3DSTATE_PUSH_CONSTANT_ALLOC_XS if
any 3DSTATE_PUSH_CONSTANT_ALLOC_XS is emitted.

Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34988>
(cherry picked from commit 2432d6677e)
2025-05-20 20:18:07 +02:00
Lionel Landwerlin
a4c042b67a brw: fix brw_nir_move_interpolation_to_top
In a case like this :
   block_0:
      %5 = ...
      %6 = ...
      block_1:
         %7 = load_interpolated_input %5, %6

The current logic would move load_interpolated_input to block_0 before
%5 but not move %5 & %6 which are sources of that instruction.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
(cherry picked from commit 6230f3029f)
2025-05-20 20:18:04 +02:00
Sagar Ghuge
36d45e2c5a anv: Fix untyped data port cache pipe control dump output
Fixes: 845ab3d627
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34855>
(cherry picked from commit bb61a78911)
2025-05-20 20:18:04 +02:00
Lionel Landwerlin
0ce1adf683 brw: add pre ray trace intrinsic moves
Some intrinsics are implemented by reading memory location that could
be rewritten by a further tracing calls. So we need to move those
reads prior to tracing operations in the shaders.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8979
Tested-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34214>
(cherry picked from commit c434050a00)
2025-05-06 17:24:10 +02:00
José Roberto de Souza
293aaa43b9 intel/tools: Fix batch buffer decoder
Some checks failed
macOS-CI / macOS-CI (dri) (push) Has been cancelled
macOS-CI / macOS-CI (xlib) (push) Has been cancelled
intel_decoder_init() initializes intel_batch_decode_ctx so later
we can call decode functions but it depends on data stored in
brw/elk_isa_info but that was being allocated in stack
of intel_decoder_init() then when the decode functions were executed
it was accessing garbage at the brw/elk_isa_info memory.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ec2d20a70d ("intel/tools: Add helpers for decoder_init/disasm")
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34776>
(cherry picked from commit 3e5a735d01)
2025-05-03 12:48:02 +02:00
Lionel Landwerlin
d8cf36fbb1 intel: fix null render target setup logic
Or current render target cache setting is to key on the binding table
index, meaning the HW associates a number in the range [0, 7] to a
RENDER_SURFACE_STATE description. If you want change the render target
0 between 2 draw calls, you need to insert a PIPE_CONTROL in between
the 2 draw calls with pb-stall + rt-flush in order to flush an writes
to a previous RENDER_SURFACE_STATE that has now becomed disassociated
with the [0, 7] number.

This PIPE_CONTROL taking care of the flush is dealt with in
cmd_buffer_maybe_flush_rt_writes(). This function diffs the current
BTI setup for render targets (first 0 to 7 BTIs) with what the next
fragment shader wants.

The issue here is we might have a render pass with 0 color attachments
and yet in 98cdb9349a we added one pointing to the render target 0,
but in the emit_binding_table() when we finally program the BTI, we
check the render pass color count and program a null surface state
instead of an actual surface state. And this leads to hangs because
the render target cache will end up with inconsistent state data.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 98cdb9349a ("anv: ensure null-rt bit in compiler isn't used when there is ds attachment")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12955
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34603>
(cherry picked from commit 63f633557f)
2025-05-03 12:48:02 +02:00
Lionel Landwerlin
a17da10518 anv: force fragment shader execution when occlusion queries are active
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34732>
(cherry picked from commit f7bc22e0d7)
2025-05-01 09:38:49 +02:00
Tapani Pälli
5838a36951 intel/dev: update mesa_defs.json from internal database
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34753>
(cherry picked from commit eeffb4e674)
2025-04-30 14:16:00 +02:00
Iván Briano
b939de025d brw: make HALT instruction act as barrier in new CSE pass
This brings back c9e33e5cbf ("intel/fs/cse: Make HALT instruction act
as CSE barrier."), from the old CSE pass into the new one.

Fixes new CTS test: dEQP-VK.subgroups.shader_quad_control.terminated_invocation

Fixes: 9690bd369d ("intel/brw: Delete old local common subexpression elimination pass")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34643>
(cherry picked from commit 29d7b90cfc)
2025-04-30 14:15:58 +02:00
Sagar Ghuge
fd80d0027b intel/compiler: Fix stackIDs on Xe2+
For Xe2+, from Bspec 64643, bit field "StackID": The maximum number of
StackIDs can be 2^12- 1.

Cc: mesa-stable
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34709>
(cherry picked from commit 821c1bfa7e)
2025-04-30 14:15:57 +02:00
Caio Oliveira
bb56867a1b intel/executor: Fix bfloat example for converting F to packed BF
In float pointing rules adding +0.0f preserves all values except
for -0.0f, so what we want here is to add -0.0f.  In the future
we should add proper support for float immediates in the assembler.

Fixes: fafdd24285 ("intel/executor: Update bfloat example")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
(cherry picked from commit 3e0418ba02)
2025-04-30 14:15:56 +02:00
Ian Romanick
f2a54f5244 brw/cmod: Don't propagate from CMP to possible Inf + (-Inf)
Most of the churn in this commit is changing unit tests that were
testing things that are now invalid.

shader-db:

All Intel platforms had similar results. (Lunar Lake shown)
total instructions in shared programs: 17122204 -> 17122669 (<.01%)
instructions in affected programs: 120669 -> 121134 (0.39%)
helped: 0 / HURT: 124

total cycles in shared programs: 895602370 -> 895613210 (<.01%)
cycles in affected programs: 17868974 -> 17879814 (0.06%)
helped: 35 / HURT: 85

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 210736518 -> 210743769 (+0.00%)
Cycle count: 30377733040 -> 30377699060 (-0.00%); split: -0.00%, +0.00%
Max live registers: 66056852 -> 66056966 (+0.00%)

Totals from 1505 (0.21% of 706776) affected shaders:
Instrs: 1890151 -> 1897402 (+0.38%)
Cycle count: 48397408 -> 48363428 (-0.07%); split: -0.11%, +0.04%
Max live registers: 256821 -> 256935 (+0.04%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34509>
(cherry picked from commit e26270249b)
2025-04-30 14:15:52 +02:00
Ian Romanick
aa08dfbad4 brw/cmod: Fix some errors when propagating from CMP to ADD.SAT
When I originally wrote that code, I didn't understand what a jerk NaN
can be.

v2: Remove the brw_type_is_uint stuff. This function is currently only
called for float types. In a later commit, integer types will be
supported but only for NZ and Z conditions. Noticed by Matt.

shader-db:

All Intel platforms had similar results. (Lunar Lake shown)
total instructions in shared programs: 17122197 -> 17122204 (<.01%)
instructions in affected programs: 1691 -> 1698 (0.41%)
helped: 0 / HURT: 4

total cycles in shared programs: 895602484 -> 895602370 (<.01%)
cycles in affected programs: 912964 -> 912850 (-0.01%)
helped: 2 / HURT: 2

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 210736388 -> 210736518 (+0.00%)
Cycle count: 30377728900 -> 30377733040 (+0.00%); split: -0.00%, +0.00%

Totals from 130 (0.02% of 706776) affected shaders:
Instrs: 169911 -> 170041 (+0.08%)
Cycle count: 18021210 -> 18025350 (+0.02%); split: -0.00%, +0.02%

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34509>
(cherry picked from commit 0dab520a19)
2025-04-30 14:15:51 +02:00
Tapani Pälli
a900b3f39d anv: put parenthesis to the set_sampler_size equation
This fixes errors seen with some renderdoc captures failing to allocate
descriptor sets.

Fixes: 76096d04bb ("anv: relax restriction on variable count descriptors")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34671>
(cherry picked from commit ed9f135936)
2025-04-30 14:15:45 +02:00
Lionel Landwerlin
dfc807a303 anv: use companion batch for operations with HIZ/STC_CCS destination
We're currently crashing a couple of tests :
   dEQP-VK.pipeline.monolithic.depth.xfer_queue_layout.*

   deqp-vk: ../src/intel/blorp/blorp_blit.c:2935:
     blorp_copy: Assertion `blorp_copy_supports_blitter(batch->blorp, src_surf->surf, dst_surf->surf, src_surf->aux_usage, dst_surf->aux_usage)' failed.

Tested on:
  dEQP-VK.api.copy_and_blit.copy_commands2.image_to_image_transfer_queue.all_formats.depth_stencil.*
  dEQP-VK.api.copy_and_blit.multiplanar_xfer.*
  dEQP-VK.pipeline.monolithic.depth.xfer_queue_layout.*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 31eeb72e45 ("blorp: Add support for blorp_copy via XY_BLOCK_COPY_BLT")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34023>
(cherry picked from commit e60416b4e4)
2025-04-27 11:45:21 +02:00
José Roberto de Souza
c912c746c5 intel: Fix the MOCS values in XY_BLOCK_COPY_BLT for Xe2+
One more instruction were the MOCS value was splited into two
registes.

Cc: mesa-stable
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34592>
(cherry picked from commit fcb6dfb29c)
2025-04-23 12:21:56 +02:00
José Roberto de Souza
14a045df1c intel: Fix the MOCS values in XY_FAST_COLOR_BLT for Xe2+
Xe2 changed the MOCS field in few instructions, those now have a field
for the MOCS index and other the encryption enable bit but ISL returns
the combination of both aka MEMORY_OBJECT_CONTROL_STATE.

To minimize changes I have added 2 macros to extract the values
from the value returned by isl.

From all the instructions changed Mesa only make use of two, so the
other instruction will be handled in the next patch.

Cc: mesa-stable
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34592>
(cherry picked from commit 161c412a82)
2025-04-23 12:21:56 +02:00
José Roberto de Souza
6fdcc55f6d intel: Program XY_FAST_COLOR_BLT::Destination Mocs for gfx12
Copy engine is not used in gfx12 platforms on ANV but that is possible
in Iris.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34560>
(cherry picked from commit a96e280dfe)
2025-04-23 12:21:56 +02:00
Rohan Garg
e106478551 anv: re enable compression for CPS surfaces on platforms other than Xe
I accidentally disabled compression on CPS surfaces marked as storage or
color attachment for all platforms, when this should only be limited to
Xe.

Fixes: 80f9b6 ('anv: CPB surfaces that are used as color attachments or for stores cannot be compressed')
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34297>
(cherry picked from commit cbc1ec4f73)
2025-04-22 01:24:32 +02:00
Ian Romanick
e783930b10 elk/algebraic: Don't optimize float SEL.CMOD to MOV
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Floating point SEL.CMOD may flush denorms to zero. We don't have enough
information at this point in compilation to know whether or not it is
safe to remove that.

Integer SEL or SEL without a conditional modifier is just a fancy
MOV. Those are always safe to eliminate.

See also 3f782cdd25.

Fixes: fab92fa1cb ("i965/fs: Optimize SEL with the same sources into a MOV.")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34192>
2025-04-15 23:59:31 +00:00
Ian Romanick
f4ede9c10a elk/algebraic: Clear condition modifier on optimized SEL instruction
The condition modifier on SEL means something completely different than
it means on MOV.  On MOV it means to modify the flags based on the value
written to the destination. On SEL it means to compare the sources using
that mode and pick the result (i.e., as min() or max()) without
modifying the flags.

The resulting MOV should not have a condition modifier for the same
reason it (already) doesn't have a predicate. This bug was found by
inspection, so I added a unit test.

Fixes: fab92fa1cb ("i965/fs: Optimize SEL with the same sources into a MOV.")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34192>
2025-04-15 23:59:31 +00:00
Ian Romanick
6a19d8915f brw/algebraic: Don't optimize float SEL.CMOD to MOV
Floating point SEL.CMOD may flush denorms to zero. We don't have enough
information at this point in compilation to know whether or not it is
safe to remove that.

Integer SEL or SEL without a conditional modifier is just a fancy
MOV. Those are always safe to eliminate.

See also 3f782cdd25.

Fixes: fab92fa1cb ("i965/fs: Optimize SEL with the same sources into a MOV.")

No shader-db changes on any Intel platform.

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 209903490 -> 209903492 (+0.00%)
Cycle count: 30546025224 -> 30546021980 (-0.00%); split: -0.00%, +0.00%
Max live registers: 65516231 -> 65516235 (+0.00%)

Totals from 2 (0.00% of 706657) affected shaders:
Instrs: 3197 -> 3199 (+0.06%)
Cycle count: 361650 -> 358406 (-0.90%); split: -10.05%, +9.15%
Max live registers: 300 -> 304 (+1.33%)

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34192>
2025-04-15 23:59:31 +00:00
Ian Romanick
07dc1d4043 brw/algebraic: Clear condition modifier on optimized SEL instruction
The condition modifier on SEL means something completely different than
it means on MOV.  On MOV it means to modify the flags based on the value
written to the destination. On SEL it means to compare the sources using
that mode and pick the result (i.e., as min() or max()) without
modifying the flags.

The resulting MOV should not have a condition modifier for the same
reason it (already) doesn't have a predicate. This bug was found by
inspection, so I added a unit test.

No shader-db or shader-db changes on any Intel platform.

Fixes: fab92fa1cb ("i965/fs: Optimize SEL with the same sources into a MOV.")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34192>
2025-04-15 23:59:31 +00:00
Caio Oliveira
fafdd24285 intel/executor: Update bfloat example
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Elaborate on the packed/unpack restrictions, use ADD(x, 0.0f)
as a workaround for F->BF conversion.

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34506>
2025-04-14 18:23:43 +00:00
Caio Oliveira
fbe5d559bd brw: Update EU validation to allow packed BF mixed with packed F
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34506>
2025-04-14 18:23:43 +00:00
Caio Oliveira
d1dd088ede brw: Allow DPAS with BF on Gfx125
MTL doesn't support, but both ACM and ARL-H do.

Fixes: e384ccde28 ("brw: Expand EU validation for DPAS")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34506>
2025-04-14 18:23:43 +00:00
Caio Oliveira
050acb9def intel: Disable has_bfloat16 for MTL
Not supported.  Some operations *do* work, but proper support
was removed since it also doesn't support DPAS.

Fixes: 9916cc1050 ("brw: Add BRW_TYPE_BF for bfloat16")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34506>
2025-04-14 18:23:43 +00:00
Caio Oliveira
adfab666a4 intel: Add intel_device_info::has_systolic
Gfx125+ has systolic, with exception for MTL and some ARL
variants.  Update code and tests to use it.

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34506>
2025-04-14 18:23:43 +00:00
Konstantin Seurer
cb31b5a958 clc,libcl: Clean up CL includes
This patch does a couple of things to make CL integration with drivers
as seamless as possible:
- We pull in opencl-c.h and opencl-c-base.h to stop relying on system
  headers.
- Parts of libcl.h are moved to new headers that are incomplete CL-safe
  variants of libc headers.
- A couple of util headers are changed to remove now unnecessary
  __OPENCL_VERSION__ guards and make more headers CL safe.
- Drivers now include src/compiler/libcl and use headers like
  macros.h,u_math.h instead of libcl.h.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33576>
2025-04-11 21:27:37 +00:00
Kenneth Graunke
eb1ec9cf8e brw: Don't assert about MAX_VGRF_SIZE in brw_opt_split_virtual_grfs()
This allows us to create temporary VGRFs that are larger than
MAX_VGRF_SIZE(devinfo), which will be split eventually.  They may not
be split on the initial pass, because we may need LOAD_PAYLOAD lowering,
copy propagation, and so on to occur first.  So we allow registers to
exceed that size initially.

The "Register allocation relies on split_virtual_grfs()" assertion in
brw_reg_allocate.cpp still asserts that all VGRFs which reach the
register allocator have been properly split.

One case where this is useful is for vectorizing convergent block loads.
We create temporaries to splat the SIMD1 values out to SIMD(N), which
can lead to some very large temporaries.  However, copy propagation and
so on ultimately eliminate these and they'll get split down to proper
sizes or elided entirely in the end.

(Note: both this and the prior commits from this merge request are
 needed to close the linked issue.)

Cc: mesa-stable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12324
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>
2025-04-11 20:34:51 +00:00
Kenneth Graunke
a45583f078 brw: Use live->max_vgrf_size in pre-RA scheduling
Post-RA scheduling doesn't use liveness analysis, so we continue using
MAX_VGRF_SIZE(devinfo).  But for pre-RA scheduling, we now use
live->max_vgrf_size.

This helps get us to a place where we can emit arbitrarily large VGRFs
early on in compilation, but which will be split and cleaned up prior to
register allocation.  It may also allocate smaller arrays in practice
since MAX_VGRF_SIZE(devinfo) assumes the worst case scenario for things
we actually could need to allocate.

Cc: mesa-stable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>
2025-04-11 20:34:51 +00:00
Kenneth Graunke
4b27b5895c brw: Use live->max_vgrf_size in register coalescing
We already require liveness, so just use the actual maximum size we saw
instead of a hardcoded pessimal size.

Cc: mesa-stable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>
2025-04-11 20:34:51 +00:00
Kenneth Graunke
ea468412f6 brw: Track the largest VGRF size in liveness analysis
We're already looking at this data to calculate the per-component
vars_from_vgrf[] and vgrf_from_vars[] mappings, so just record the
largest VGRF size while we're here.  This will allow passes to size
arrays based on the actual size needed, rather than hardcoding some
fixed size.  In many cases, MAX_VGRF_SIZE(devinfo) is larger than
necessary, because e.g. vec5 sparse sampling results aren't used.
Not hardcoding this means we can also temporarily handle very large
VGRFs which we know will be split eventually, without having to
increase the maximum which is ultimately used for RA classes.

Cc: mesa-stable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>
2025-04-11 20:34:51 +00:00
José Roberto de Souza
68a617076d intel/perf: Update intel_perf to match xe_drm.h
There was a mismatch between drm-next version of xe_drm.h and the one
in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30142.
So this does the necessary changes to build with current and new
xe_drm.h

Fixes: 2a828c35a1 ("intel/perf: add eu stall sampling support")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34457>
2025-04-11 18:35:49 +00:00
Lionel Landwerlin
243c01c703 anv/iris: implement Wa_18040903259
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34433>
2025-04-11 13:54:35 +00:00
Lionel Landwerlin
d123aedfc7 anv: remove ALWAYS_INLINE from globally visible functions
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34433>
2025-04-11 13:54:35 +00:00
Lionel Landwerlin
bcaf08b47c intel/dev: remove ADLN references
Not used anymore, just use the existing ADL definitions.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34433>
2025-04-11 13:54:35 +00:00
Lionel Landwerlin
938f79ed82 anv: update Wa_1607156449 to use WA infrastructure
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34433>
2025-04-11 13:54:35 +00:00
Valentine Burley
b49eaf0966 ci/lava: Consolidate piglit trace job definitions
Clean up LAVA job definitions.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34424>
2025-04-11 07:05:07 +00:00
Valentine Burley
1aeedddbb6 ci/piglit: Drop redundant PIGLIT_PROFILES variable
PIGLIT_PROFILES was only used with the piglit-runner.sh script, which no
jobs were using anymore.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34424>
2025-04-11 07:05:06 +00:00
Valentine Burley
09f86df938 intel/ci: Convert iris-kbl-piglit to deqp-runner suite
This was the last job using the piglit-runner.sh script.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34424>
2025-04-11 07:05:06 +00:00
Lionel Landwerlin
06ad9a25e5 brw: fix Wa_22013689345 emission
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
2 problems :
  - not detecting null destination correctly
  - applied too late using SHADER_OPCODE_MEMORY_FENCE, when lowering
    already happened

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34319>
2025-04-10 16:44:28 +00:00
Lionel Landwerlin
e321c438dc anv: fix self dependency computation
Some upcoming changes in the runtime will make it impossible to rely
on the pipeline or runtime information to know whether a fragment
shader has input attachments.

Instead we gather that information at compile time and store it in our
shader bind_map.

At runtime we check whether the fragment shader has input attachments
and whether those map to the runtime depth/stencil input attachments
to set the 3DSTATE_PS_EXTRA::PixelShaderKillsPixel.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d2f7b6d5a7 ("anv: implement VK_KHR_dynamic_rendering_local_read")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>
2025-04-10 13:17:53 +00:00
Paulo Zanoni
fdbdfaed01 anv: add ANV_SYS_MEM_LIMIT for debugging system memory restrictions
If you suspect a workload is failing because it needs more memory, you
can set ANV_SYS_MEM_LIMIT=100 to give it all the memory available.
This could make, for example, certain games start working (it really
depends on how much RAM you have and how much the game wants).

If you suspect a workload is too resource hungry, you can try to limit
it with ANV_SYS_MEM_LIMIT=30 (or some other value) to see if it can
deal with the more restricted environment and behave accordingly.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28513>
2025-04-09 22:48:18 +00:00
Paulo Zanoni
ec4b2ce664 anv: restore the old behavior of up to 75% of RAM for the system heap
"We paid for sixteen gigs of RAM, so we gonna use the whole damn
   sixteen gigs of RAM!"
     - My Mom

First, some history:

The Anv 50%-or-75% rule was originally added in 2017 by 060a6434ec
("anv: Advertise larger heap sizes"). When i915.ko started reporting
memory sizes in its ioctls, it didn't impose any restrictions: 100% of
SRAM was reported as available, so the restriction was in Mesa. When
xe.ko was introduced, it only reported 50% of the SRAM as available
through its ioctls, so commit b571ae6e7a ("intel: Make memory heaps
consistent between KMDs") adapted the code to not take an extra 25% of
the 50% that was already cut, and restricted i915.ko to 50% instead of
the 50%-or-75%. In Kernel commit d2d5f6d57884 ("drm/xe: Increase the
XE_PL_TT watermark"), xe.ko changed to reporting 100% of SRAM through
its ioctls, so we adapted Mesa to do the right thing depending on
which Kernel version was running.

While this was all happening, we were discussing about which behavior
was actually the best: restrict everything to 50% in order to avoid
issues when many things are running in parallel, or keep the
restriction only at 75% in order to allow high demanding workloads to
make full use of the hardware.

The way I see, if parallel applications are causing the system to run
out of resources, the user always has the option to kill applications
and use one thing at a time. On the other hand, if a single
application needs more than 50% of the SRAM and we don't allow it in
our heaps, the application will never work (unless, of course, the
user patches Mesa). So in this commit we go back to allowing
high-demanding applications to work by restoring the 50%-or-75% rule.

This commit is especially useful in systems with integrated graphics,
like LNL, where the option to upgrade RAM is not present.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28513>
2025-04-09 22:48:18 +00:00