Commit graph

645 commits

Author SHA1 Message Date
Alyssa Rosenzweig
0d7083d5bc brw: drop indirection on compiler options
I see no point, we allocate for every shader stage anyway. This is a bit
simpler.

I'm not a fan of the brw_compiler singleton at all but torching that is not on
today's agenda. Flattening it a little bit very much is.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37447>
2025-09-18 14:14:08 +00:00
Georg Lehmann
79d02047b8 intel: switch to new subgroup size info
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37258>
2025-09-12 21:05:17 +00:00
Dylan Baker
ecfce9f9ad blorp: Fix potential read of uninitaized elk fields in debug paths
The intel_vue_map is only partially initialized before being used. All
used fields are initialized, but in debug paths the unitialzed fields
will also be read. To fix this initialize the struct to 0. In the brw
path this struct is part of the prog_data, and is rzalloc'd.

CID: 1665308
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37261>
2025-09-10 17:51:34 +00:00
Sagar Ghuge
ebbc358db5 blorp: Emit state cache invalidation after every compute dispatch
Implement HSD 16028171704/14025112257:
   LSC state cache livelock:- Once state cache entries are full,
   subsequent walker dispatches with two threads per thread group maybe
   gets stuck infinitely because of state cache live lock.

   One thread continuously stuck in loop doing UGM fence + evict and UGM
   read is waiting on UGM read to have certain value. while other thread
   supposed to update the value that first thread is waiting for. But
   since entries are full in state cache, there is second thread never
   make progress.

Closes: #12352
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37128>
2025-09-04 00:14:48 +00:00
Tapani Pälli
42088cd602 isl/blorp: handle failing 96bpp linear blit case
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Fix the aux usage assert in blorp for 96bpp linear blit and provide CMF
values for RGB formats supported by isl_format_rgb_to_rgba.

CC: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13670
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36709>
2025-08-13 16:09:12 +00:00
Qiang Yu
196569b1a4 all: rename gl_shader_stage to mesa_shader_stage
It's not only for GL, change to a generic name.

Use command:
  find . -type f -not -path '*/.git/*' -exec sed -i 's/\bgl_shader_stage\b/mesa_shader_stage/g' {} +

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
2025-08-06 10:28:40 +08:00
Lionel Landwerlin
be16985c82 intel: move deref_block_size to intel_urb_config
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36512>
2025-08-01 11:35:05 +00:00
Antonio Ospite
ddf2aa3a4d build: avoid redefining unreachable() which is standard in C23
In the C23 standard unreachable() is now a predefined function-like
macro in <stddef.h>

See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in

And this causes build errors when building for C23:

-----------------------------------------------------------------------
In file included from ../src/util/log.h:30,
                 from ../src/util/log.c:30:
../src/util/macros.h:123:9: warning: "unreachable" redefined
  123 | #define unreachable(str)    \
      |         ^~~~~~~~~~~
In file included from ../src/util/macros.h:31:
/usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition
  456 | #define unreachable() (__builtin_unreachable ())
      |         ^~~~~~~~~~~
-----------------------------------------------------------------------

So don't redefine it with the same name, but use the name UNREACHABLE()
to also signify it's a macro.

Using a different name also makes sense because the behavior of the
macro was extending the one of __builtin_unreachable() anyway, and it
also had a different signature, accepting one argument, compared to the
standard unreachable() with no arguments.

This change improves the chances of building mesa with the C23 standard,
which for instance is the default in recent AOSP versions.

All the instances of the macro, including the definition, were updated
with the following command line:

  git grep -l '[^_]unreachable(' -- "src/**" | sort | uniq | \
  while read file; \
  do \
    sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \
  done && \
  sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>
2025-07-31 17:49:42 +00:00
Nanley Chery
4de638ae1e intel: Enable CCS_E on linear surfaces on Xe2+
Allow CCS for non-display linear surfaces in isl_surf_supports_ccs().
We're going to rely more on the helper to determine CCS-enabling for Xe2
on iris.

Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32120>
2025-07-21 18:36:31 +00:00
jhananit
debd903a00 intel: Update all NIR_PASS_V to NIR_PASS
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35889>
2025-07-14 19:25:52 +00:00
Sagar Ghuge
5f1f67358c blorp: Set TG size based on number of threads
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35904>
2025-07-10 22:08:36 +00:00
José Roberto de Souza
aea519cbc2 intel/blorp: Program DispatchWalkOrder and ThreadGroupBatchSize with optimized values for regular computer walkers
It was only added to indirect compute walkers while HSD don't say
anything about this optimization be specific to indirect compute
walkers.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36058>
2025-07-10 20:54:30 +00:00
José Roberto de Souza
b37747ce68 blorp: Emit STATE_COMPUTE_MODE before COMPUTE_WALKER
Cc: stable
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35563>
2025-06-23 18:57:25 +00:00
Nanley Chery
42ef23ecd1 intel/blorp: Don't redescribe some Tile64 clears
We don't support redescribing Tile64 and 3D due to interleaved depth
planes.

Fixes: 312952048b ("intel/blorp: Redescribe gfx12.5 surfaces for CCS fast clears")
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35619>
2025-06-20 13:39:20 +00:00
Nanley Chery
69d91ae975 intel/blorp: Use get_copy_format_for_bpb more for gfx12.5
Use get_copy_format_for_bpb() instead of
get_ccs_compatible_uint_format() when performing blorp_copy(). This
matches the code path taken on gfx20 and increases the testing of cases
which would impact gfx12.0 in isl_get_sampler_clear_field_offset().

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35329>
2025-06-09 17:40:20 +00:00
Nanley Chery
27a5d84632 intel/isl: Fix isl_get_sampler_clear_field_offset()
Through testing, I've found that the sampler will fetch the clear color
pixel from the converted clear color field in more cases. So, stop
reporting the raw dword offset for them:

* On gfx12.5, for 32-bpc color images.
* On gfx11-12.0, for 64-bpp color images.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35329>
2025-06-09 17:40:20 +00:00
Rohan Garg
db8b07f88d anv: use the float qualifier to denote the right value
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34824>
2025-06-05 20:26:54 +02:00
Dylan Baker
2a3cf70db8 blorp: cast uint32_t -> int64_t to avoid potential overflow
In practice, I don't think it's actually going to overflow, but it could
in theory, which coverity is pointing out.

CID: 1647010
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35114>
2025-05-22 21:14:26 +00:00
Nanley Chery
67d60f4325 intel/blorp: Simplify get_fast_clear_rect() for gfx12.5
Refactor the scale factors to highlight the 16-tile width requirement on
Tile4. The fast-clear simulator code associated with HSD 1407682962
also contains a 16-tile requirement for Tile4 surfaces (for the pitch).

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33776>
2025-05-13 15:13:05 +00:00
Nanley Chery
312952048b intel/blorp: Redescribe gfx12.5 surfaces for CCS fast clears
According to HSD 1407682962 and the associated simulator code,
fast-clear performance can be affected by: image alignment, tiling,
dimensionality, and row pitch. Redescribe surfaces in order avoid
fast-clearing at a slower rate.

Also, benchmarking the main patch in the performance CI (hw=A750)
shows that some traces are helped significantly:
* TotalWarWarhammer3 +5.58% (n=2)
* Factorio +3.75% (n=1)
* TerminatorResistance +3.3% (n=2)
* Borderlands3 +3.23% (n=2)

We could additionally increase the alignment requirements of surfaces in
order to deterministically increase fast-clear performance. That's left
out of this patch in order to avoid any functional pitfalls that can
arise with increased memory consumption. As a result, performance will
continue to be affected by how ISL/drivers/apps configure main surface
memory alignments (directly or indirectly).

Thanks to Lionel Landwerlin for pointing me to the relevant simulator
code.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11168
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11418
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33776>
2025-05-13 15:13:05 +00:00
Nanley Chery
169e22f962 intel/blorp: Drop clear color assignment prior to Xe2
This hasn't been used since the responsibility of clear color updates
moved to the drivers.

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33776>
2025-05-13 15:13:05 +00:00
Nanley Chery
e353244553 intel/blorp: Disable repclear for gfx12 fast-clear
Docs indicate that this shouldn't be used.

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33776>
2025-05-13 15:13:05 +00:00
Nanley Chery
fcdae4d4c0 intel: Add and use isl_surf_from_mem()
Unify code which creates surfaces from buffers. The behavior is slightly
changed to use array layers to enable arrayed buffer clears (as needed).

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33776>
2025-05-13 15:13:04 +00:00
Lionel Landwerlin
2d396f6085 intel: prepare VUE layout for more than 2 layouts
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:35 +00:00
José Roberto de Souza
fcb6dfb29c intel: Fix the MOCS values in XY_BLOCK_COPY_BLT for Xe2+
One more instruction were the MOCS value was splited into two
registes.

Cc: mesa-stable
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34592>
2025-04-22 20:42:25 +00:00
José Roberto de Souza
161c412a82 intel: Fix the MOCS values in XY_FAST_COLOR_BLT for Xe2+
Xe2 changed the MOCS field in few instructions, those now have a field
for the MOCS index and other the encryption enable bit but ISL returns
the combination of both aka MEMORY_OBJECT_CONTROL_STATE.

To minimize changes I have added 2 macros to extract the values
from the value returned by isl.

From all the instructions changed Mesa only make use of two, so the
other instruction will be handled in the next patch.

Cc: mesa-stable
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34592>
2025-04-22 20:42:25 +00:00
José Roberto de Souza
a96e280dfe intel: Program XY_FAST_COLOR_BLT::Destination Mocs for gfx12
Copy engine is not used in gfx12 platforms on ANV but that is possible
in Iris.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34560>
2025-04-17 18:11:44 +00:00
Rohan Garg
ceba312ebd anv,blorp,isl: handle compressed CPS surfaces through the depth stencil hw
Compressed CPS surfaces operations such as copies and clears need to be
handled through the depth stencil hw to ensure that the aux data is
handled correctly.

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20741>
2025-03-28 04:38:09 +00:00
Lionel Landwerlin
e18431273a blorp: relax depth/stencil<->color copy restriction
Currently blorp assumes that copies of depth/stencil is restricted
to/from depth/stencil formats. We want to allow color<->depth copies.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31983>
2025-03-25 08:01:15 +00:00
Lionel Landwerlin
fe2f173413 blorp: assert that shaders don't spill
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31983>
2025-03-25 08:01:14 +00:00
Lionel Landwerlin
02341733df anv/iris: add drirc keys to disable VF/TE distribution
This is a request from debug engineers to be able to trace the HW
better when analyzing hangs.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33795>
2025-02-27 21:10:59 +00:00
Lionel Landwerlin
ca66f22e90 blorp: emit 3DSTATE_VF
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32418>
2025-02-13 14:36:15 +00:00
Sagar Ghuge
76bd7f9265 blorp: Enable SimpleFloatBlendEnable on Xe3+
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32739>
2025-02-05 22:27:54 -08:00
Caio Oliveira
fbacf3761f intel: Add meson option -Dintel-elk
Defaults to true.  When set to false Iris and various tools can be
built without ELK support.  In both cases this means supporting
only Gfx9+.  This option must be true to build Crocus or Hasvk.

This allows skipping re-building ELK when developing for newer platforms
with tools/tests enabled.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11575
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33054>
2025-01-30 00:45:59 +00:00
Francisco Jerez
7537f8edee intel/blorp/xe3+: Set RegistersPerThread during shader state setup based on prog_data.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32664>
2025-01-29 23:39:32 +00:00
Francisco Jerez
935f60c13c intel/blorp: Specify a subgroup size requirement of 16 for fast clear or repclear shaders.
Request a fixed subgroup size for pixel shaders that require it due to
the hardware restrictions of fast clears and repeated data clears.
This requires plumbing the "is_fast_clear" boolean across several
callers since blorp_compile_fs_brw() currently has no information
regarding whether the kernel is intended for a fast clear operation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32664>
2025-01-29 23:39:32 +00:00
Lionel Landwerlin
10a4dc529f blorp: disable PS shaders with depth/stencil HiZ ops
Found on simulation, complaining about SIMD32 shaders enabled when
using MSAA 16x.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30753>
2025-01-18 17:52:19 +00:00
Sagar Ghuge
604a384e97 blorp: Use 3DSTATE_URB_ALLOC_* instructions
Use 3DSTATE_URB_ALLOC_* instruction to program URB for multislice device
config.

In case only one slice is available in the device, SliceN fields will be
ignored by HW.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32736>
2025-01-09 21:26:40 +00:00
Lionel Landwerlin
e0b5179869 blorp: use 2D dimension for 1D tiled images
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 31eeb72e45 ("blorp: Add support for blorp_copy via XY_BLOCK_COPY_BLT")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32608>
2024-12-12 17:10:45 +00:00
Lionel Landwerlin
bfcb9bf276 brw: rename brw_sometimes to intel_sometimes
Moving it to intel_shader_enums.h

The plan is to make it visible to OpenCL shaders.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32329>
2024-11-26 13:05:30 +00:00
Nanley Chery
385080fb92 intel: Allow CCS on 3D surfaces for gfx120
According to HSD 1406738321, full resolves and fast-clears don't work
properly on 3D textures. Up until now, we've disabled CCS for this case.
Instead, redescribe the surface as 2-dimensional to perform auxiliary
surface operations.

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31880>
2024-11-22 20:11:43 +00:00
Nanley Chery
e32203827a intel/blorp: Assert 3D Ys fast-clear restriction
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31880>
2024-11-22 20:11:43 +00:00
Sagar Ghuge
e5776bcb39 blorp: Use the calculated execution mask
Instead of setting execution mask to 0xFFFFFFFF, use the previously
calculated execution mask.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30474>
2024-11-18 04:42:52 +00:00
Sagar Ghuge
17096f87c1 intel: Switch to COMPUTE_WALKER_BODY
Stuff COMPUTE_WALKER_BODY in COMPUTER_WALKER in both iris and anv.

This also fixes the tracepoint for ray dispatches. Stuffing
COMPUTE_WALKER_BODY allow us to set the
cmd_buffer->state.last_compute_walker.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31822>
2024-10-29 15:54:43 +00:00
Nanley Chery
edfb33efdd intel/blorp: Use original surface format for some copies
In iris, this should avoid some partial resolves when copying between
images. In anv, this will reduce restrictions on dmabufs which have
clear color support in the next patch.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31136>
2024-10-03 19:41:31 +00:00
Nanley Chery
73637dbce4 intel/blorp: Choose some copy formats independently
blorp_copy_get_formats() tries to make the source and destination view
formats match as much as possible. This avoids some casting in the copy
shader, but it makes determining the format that will be used for a
surface impossible without having the ISL surface for both that surface
and a source or destination.

We'd like to enable the Vulkan driver to know as early as possible what
format an image may be reinterpreted as for correctness. So, determine
the copy formats more independently and expose a helper which does so
for drivers.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31136>
2024-10-03 19:41:31 +00:00
Lionel Landwerlin
50cc738a6d blorp: convert fast clear color for unsupported formats
This tests is asserting on LNL like :

  dEQP-VK.pipeline.monolithic.sampler.border_swizzle.r8_srgb.gbar.custom.gather_1.no_swizzle_hint
  dEQP-VK.api.image_clearing.core.clear_color_image.2d.optimal.single_layer.e5b9g9r9_ufloat_pack32

Because blorp tries, for example, to setup a render target with
L8_UNORM_SRGB (which is mapped to the R8_UNORM_SRGB of Vulkan) but is
not supported for rendering.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 1c7fe9ad1b ("anv: Support fast clears in anv_CmdClearColorImage")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31357>
2024-09-27 00:37:25 +00:00
Nanley Chery
b3882c4488 intel: Avoid no-op calls to anv_image_clear_color
Whenever we execute a fast-clear due to LOAD_OP_CLEAR, we decrease the
number of layers to clear by one. We then enter the slow clear function
and possibly exit without clearing if the layer count is zero.
Unfortunately, we've already compiled the shader for slow clears by the
time we exit. Skip the slow clear function if there are no layers to
clear.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31167>
2024-09-20 16:34:37 +00:00
Tapani Pälli
b01d76027d blorp: assert that color depth is not 96 for Wa_16021021469
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31263>
2024-09-19 22:44:49 +00:00
Nanley Chery
4a8f3181ba intel: Support any depth fast-clear value on Xe2
Remove the restriction that a depth fast-clear must have a clear value
which matches an image-dependent heuristic.

Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30767>
2024-08-27 06:15:36 +00:00