Commit graph

92 commits

Author SHA1 Message Date
Caio Oliveira
df4042371f anv: Set PIPELINE_SELECT systolic mode based on shader usage
For Gfx125 workloads that use systolic mode, this might mean
an extra PIPELINE_SELECT when flipping between a compute shader
that use the mode and another that doesn't use the mode
(or vice-versa).

Reviewed-by: Iván Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40014>
2026-02-26 19:05:56 +00:00
Lionel Landwerlin
9f4309cb8a anv: program HW to gather push constants at 3DSTATE_CONSTANT parsing time on Gfx9
Removes the need for emitting 3DSTATE_BINDING_TABLE_POINTER* commands
to make the HW gather push constants.

According to internal pointers, this been the default behavior on
Gfx11+.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>
2026-02-25 10:44:03 +00:00
Tapani Pälli
9aaed82543 anv: set DisableAnyMCTRresponsefix to zero on init
This is to make sure early culling related Wa_16020518922 is enabled
properly.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14204
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39712>
2026-02-05 15:09:02 +00:00
Tapani Pälli
f66ff97d58 drirc/anv: implement steps to disable RHWO for Wa_14024015672
Disable RHWO by default for singlesample draws and for MSAA
draws if a drirc key is set (avoid perf hit if not needed).

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39404>
2026-01-23 11:10:07 +00:00
Rohan Garg
e55a7bc83a anv: program STATE_COMPUTE_MODE to flush the L1 cache
This is required for upcoming resource barrier work to implement HDC
flush's.

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38707>
2025-12-15 08:25:39 +00:00
Lionel Landwerlin
20f320b7c7 anv: program STATE_BASE_ADDRESS instruction ptr using pdevice address
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Michael Cheng <michael.cheng@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38869>
2025-12-10 20:32:10 +00:00
Sagar Ghuge
aeaf1cbc2b anv: Replay mode is only available on Gfx < 20
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38416>
2025-11-13 23:05:01 +00:00
Lionel Landwerlin
4d9dd5c3a2 anv: store a few default instructions
We will use those where no associated shaders is active but we still
need some default values programmed.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34872>
2025-09-05 07:46:20 +00:00
Tapani Pälli
ad2ef16198 iris/anv: toggle on CACHE_MODE_0::MsaaFastClearEnabled on BMG G31
This increases rate of depth fast clear rate on BMG G31
per HSD 22020044224.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35966>
2025-08-26 19:35:34 +00:00
Lionel Landwerlin
1bab95551a anv: fix uninitialized return value
We don't go through the loop when there are no queues.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 884df891d7 ("anv: allow device creation with no queue")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36910>
2025-08-21 16:07:56 +00:00
Lionel Landwerlin
99016a893a anv: avoid storing L3 config on the pipeline
On Gfx9 we only use 2 L3 config depending on SLM use or not. So it's
the same config for all Gfx pipelines.

On Gfx11+ there is only one config (since SLM is allocated from
somewhere else).

So avoid store this on the pipeline, pick the config when flushing the
pipeline.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36512>
2025-08-01 11:35:05 +00:00
Antonio Ospite
ddf2aa3a4d build: avoid redefining unreachable() which is standard in C23
In the C23 standard unreachable() is now a predefined function-like
macro in <stddef.h>

See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in

And this causes build errors when building for C23:

-----------------------------------------------------------------------
In file included from ../src/util/log.h:30,
                 from ../src/util/log.c:30:
../src/util/macros.h:123:9: warning: "unreachable" redefined
  123 | #define unreachable(str)    \
      |         ^~~~~~~~~~~
In file included from ../src/util/macros.h:31:
/usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition
  456 | #define unreachable() (__builtin_unreachable ())
      |         ^~~~~~~~~~~
-----------------------------------------------------------------------

So don't redefine it with the same name, but use the name UNREACHABLE()
to also signify it's a macro.

Using a different name also makes sense because the behavior of the
macro was extending the one of __builtin_unreachable() anyway, and it
also had a different signature, accepting one argument, compared to the
standard unreachable() with no arguments.

This change improves the chances of building mesa with the C23 standard,
which for instance is the default in recent AOSP versions.

All the instances of the macro, including the definition, were updated
with the following command line:

  git grep -l '[^_]unreachable(' -- "src/**" | sort | uniq | \
  while read file; \
  do \
    sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \
  done && \
  sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>
2025-07-31 17:49:42 +00:00
Sagar Ghuge
3a9157a10b anv: Use thread group preemption granularity
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36337>
2025-07-29 22:47:56 +00:00
Caio Oliveira
887642b0f2 intel: Add INTEL_DEBUG=no-vrt
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Add support for disabling the VRT (Variable Register Thread) feature.
The strategy here is to force the old BRW_MAX_GRF limit for the
register allocator (locks the upper limit) and make sure
ptl_register_blocks() always return that amount of blocks (locks
the lower limit).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35781>
2025-07-13 21:11:02 +00:00
Lionel Landwerlin
98bc185376 anv: rework embedded sampler hashing
Create a hashing key on all samplers so we can just copy that anywhere
we need it. That key already contains the needed parameters for
embedded samplers, so the sha1 stuff can go away.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35955>
2025-07-07 18:53:53 +00:00
Iván Briano
d964b8d5fa anv: don't report custom sample locations for sample count 1
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
We can't actually enable MSAA for images with sample count 1, and
without MSAA active, the sample location machinery does not get used.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35504>
2025-06-24 19:44:34 +00:00
José Roberto de Souza
37f4182ac3 Revert "anv: Enable preemption due 3DPRIMITIVE in GFX 12"
Enabling preemption in 3DPRIMITIVE is causing glitches on Dota 2,
so reverting this until the issue with preemption is fixed.

This reverts commit 3cd972a2d3.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13289
Fixes: 12ddaa6b8b ("anv: Enable preemption due 3DPRIMITIVE in GFX 12")
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35586>
2025-06-18 18:52:19 +00:00
Lionel Landwerlin
f0e18c475b intel: remove GRL/intel-clc
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35227>
2025-05-29 20:17:13 +00:00
José Roberto de Souza
3cd972a2d3 anv: Enable preemption due 3DPRIMITIVE in GFX 12
The issues preventing it to be enabled were fixed so now we can enable
it but we need also to enable workaround 16013994831 back again.

Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34988>
2025-05-15 15:25:12 +00:00
Sagar Ghuge
0463e14b94 anv: Enable 64bit memory structure mode for RT
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33047>
2025-04-21 20:10:45 +00:00
Sagar Ghuge
2c8148a76e anv: CPS LOD Compensation Enable is deprecated on Xe2+
On Xe2+, Hardware will always have scale.x and scale.y as 1.0.
This is not fixing any issues.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33726>
2025-02-27 19:49:02 +00:00
Lionel Landwerlin
26347b4876 anv: use heap size to program generate state heap
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32895>
2025-02-05 09:56:03 +00:00
Francisco Jerez
d455d5d86c anv/xe3+: Enable VRT.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32664>
2025-01-29 23:39:32 +00:00
Lionel Landwerlin
c2c3f19e88 anv: pass physical device to format helpers
So that we can have special behavior based on drirc configuration.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33194>
2025-01-29 13:57:26 +00:00
Sagar Ghuge
710624fcc0 anv: Use 3DSTATE_URB_ALLOC_* instructions
Use 3DSTATE_URB_ALLOC_* instruction to program URB for multislice device
config.

In case only one slice is available in the device, SliceN fields will be
ignored by HW.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32736>
2025-01-09 21:26:40 +00:00
Sagar Ghuge
33d9a685a5 anv: Add pipelined coarse pixel state
3DSTATE_CPS_POINTERS is deprecated on PTL, so let's switch to
3DSTATE_COARSE_PIXEL to deliver CPS state as pipelined state.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32737>
2025-01-07 23:53:44 +00:00
Sagar Ghuge
76e85df2d2 anv: Switch to ANISOTROPIC_FAST filter mode
Same thing as ANISOTROPIC including all restrictions except HW is
allowed to take liberties with precision to speed things up, Currently
only has an affect on formats of type *_sRGB.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32738>
2024-12-31 21:49:41 +00:00
José Roberto de Souza
2bd3df75e5 anv: Emit STATE_SYSTEM_MEM_FENCE_ADDRESS
According to HAS it is necessary to emit this instruction once per
context so MI_MEM_FENCE works properly.

Fixes: 86813c60a4 ("mi-builder: add read/write memory fencing support on Gfx20+")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32680>
2024-12-18 17:16:05 +00:00
José Roberto de Souza
b8f93bfd38 anv: Always create anv_async_submit in init_copy_video_queue_state()
A next patch will emit more instructions in video and copy queues
for Gfx 200 and newer but the current code only creates anv_async_submit
if device has aux_map.
Instead we can always create anv_async_submit and only submit it to
hardware if any instruction was emited.

Fixes: 86813c60a4 ("mi-builder: add read/write memory fencing support on Gfx20+")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32680>
2024-12-18 17:16:05 +00:00
Sagar Ghuge
41baeb3810 anv: Implement acceleration structure API
Rework: (Kevin)
- Properly setup bvh_layout
   Our bvh resides in contiguous memory and can be divided into two sections:
      1. anv_accel_struct_header, tightly followed by
      2. actual bvh, which starts with root node, followed by interleaving
         leaves or internal nodes.
- Update comments for some fields for BVH and nodes.
- Properly populate the UUIDs in serialization header
- separate header func into completely two paths based on compaction bit
- Encode rt_uuid at second VK_UUID_SIZE.
- Write query result at correct slot
- add assertion for a 4B alignment
- move bvh_layout to anv_bvh
- Use meson option to decide which files to compile
- The alignment of serialization size is not needed
- Change static_assert to STATIC_ASSERT and move them inside functions

Rework (Sagar)
- Use anv_cmd_buffer_update_buffer instead of MI to copy data

Rework (Lionel)
- Remove flush after builds, and add flush in copy before dispatch
- Handle the flushes in CmdWriteAccelerationStructuresPropertiesKHR properly

Co-authored-by: Kevin Chuang <kaiwenjon23@gmail.com>
Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31588>
2024-12-04 10:41:45 +00:00
Tapani Pälli
fbe5d41b58 anv: extend Wa_14017794102 with lineage Wa_14023061436
This workaround is applicable for Xe3 with new lineage.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31963>
2024-11-13 04:54:32 +00:00
José Roberto de Souza
aa5b2c4165 anv: Set recommended values for gfx20 async compute registers in STATE_COMPUTE_MODE
This recommended values should improve the performance of async
compute in gfx20, we may want to tweek this for Linux but at least
this values should give us a better baseline than default values.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30796>
2024-10-22 15:24:32 +00:00
José Roberto de Souza
3efba707bf anv: Set all async compute registers in STATE_COMPUTE_MODE
Setting the missing registers to specification recommended values that
is also the default value, so it is not expected any changes in
behavior or performance here.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30796>
2024-10-22 15:24:32 +00:00
Lionel Landwerlin
1f9c40a8d1 anv: explicitly disable BT pool allocations at device init
The default state doesn't seem well defined (or kernel driver bug
maybe?). Let's just set it to disabled on platforms where we're not
using it.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Found-by: Chuansheng Liu <chuansheng.liu@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30841>
2024-08-26 10:34:31 +00:00
Lionel Landwerlin
78ae7ab856 anv/hasvk: add indirect tracepoint arguments
Gives visibility on some indirect parameter dispatches :
  - draw count
  - compute dispatch size

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29944>
2024-08-03 16:03:04 +03:00
Paulo Zanoni
3ab8ff99fa anv/trtt: fix the process of picking device->trtt.queue
We want to use actual sparse-capable queues as the default
trtt->queue, not copy queues that may have a companion_rcs_batch.
Before this patch, if we expose more than one queue *and* the
application creates a copy queue first, we'll end up setting
trtt->queue as the copy queue, which will GPU hang when we submit the
TR-TT batches as they don't support the pipe_control commands we
issue.

The trtt->queue queue is used for binding/unbinding buffers in code
paths where there's no specific queue coming from user space, such as
when we're creating or destroying a sparse resource.

This is not a problem yet on i915.ko since we are exposing
only a single queue, and it is not a problem for xe.ko since TR-TT is
not the default there. This is also not a problem in applications
that create the render or compute queue first. We plan to expose more
queues when using TR-TT, so this would become a problem without this
patch.

None of VK-GL-CTS seems to exercise that, and none of the Steam games
I tested exercise that as well. I was able to reproduce this issue
using our internal tracing tool.

v2: New implementation that doesn't break when we only have a compute
    queue (Lionel).

Fixes: 04bfe828db ("anv/sparse: allow sparse resouces to use TR-TT as its backend")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>
2024-07-22 10:04:34 -07:00
Paulo Zanoni
5ca224aa0c anv/trtt: make all contexts have the same TR-TT programming
On Gen12 (the oldest we support on Mesa right now for TR-TT) we
started having per-engine TR-TT registers and we are supposed to make
all contexts share the same TR-TT programming.

On LNL+, this is documented in the BSpec page for the TRTT_CNTRL
register (68417), with more details in HSDs 14020454786 and
16022013154.

On Gen12 platforms this information is a little harder to find and
there's a whole trail of HSDs leading up to 1209977595, which links to
the documents that describe the programming. BSpec for TR-TT on Gen12
is very confusing as it still contains registers and other information
from Gen11 that were not removed.

Regarding the additional BLT and COMP registers, please notice that on
the BSpec pages for the TR-TT registers, the "Register Instance"
section only lists the GFX registers as non-privileged. However, the
"User Mode Privileged Commands" lists the other instances of the TR-TT
Regsiters as non-privileged, which matches what we see: there's no
need to put these addresses in the FORCE_TO_NONPRIV registers.

Notice that for now, when TR-TT is being used we only expose a single
queue, so this change effectively does nothing until we start exposing
extra queues. I left that part for later to help bisectability.

v2:
 - s/trtt_init_context_state/trtt_init_queues_state/ (José)
 - pass device as the argument to init_queues_state (José)
v3:
 - use async_submit_end (José)

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>
2024-07-22 10:04:34 -07:00
Paulo Zanoni
fb9d94f4ed anv/trtt: make genX(init_trtt_context_state) a little more compact
In this series we're going to further change this function, adding a
lot more lines, so this patch should make the next diffs a little
easier to comprehend and review.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>
2024-07-22 10:04:34 -07:00
Lionel Landwerlin
692e1ab2c1 anv: get rid of the second dynamic state heap
Pretty big change... Sorry for that.

I can't exactly remember why I created 2 heaps. I think it's because I
mistakenly thought the samplers in the binding sampler pointers needed
to be indexed from the binding table. But that's not the case, they
just need to be in the dynamic state heap.

In the future, this change will allow to also allocate buffers for
push constant data in the newly created dynamic_visible_pool which
will be useful on < Gfx12.0 where this is the only place push constant
data can live for compute shaders.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30047>
2024-07-19 12:21:46 +00:00
José Roberto de Souza
19a8abde5f anv: Implement Wa_14019857787
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29619>
2024-06-20 21:47:59 +00:00
Lionel Landwerlin
0147908a89 anv: predicate emission of STATE_BASE_ADDRESS
Completely skip the stall & programming if the bindless address has
not changed. Only on Gfx12.5+ since previous generations also program
the binding table pool base address through STATE_BASE_ADDRESS.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29595>
2024-06-18 20:44:51 +00:00
Francisco Jerez
8e61d32db8 iris,anv/xe2+: Use pipelined variant of 3DSTATE_DRAWING_RECTANGLE.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29562>
2024-06-17 16:19:17 -07:00
Francisco Jerez
2aa4652a68 iris,anv/xe2+: Enable the DX10/OGL border mode for YCrCb as per Wa_14014226147.
Hardware defaults to DX9 YCrCb border color mode instead of the
behavior expected for DX10/OGL.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29562>
2024-06-17 16:19:17 -07:00
Lionel Landwerlin
49d2d25e24 anv: make device initialization more asynchronous
With this change, the engine initialization batches are build and
submitted at vkCreateDevice() but the function doesn't wait for them
to complete. Instead we wait at vkDestroyDevice() or whenever another
submission happens on the queue, we check whether the initialization
batch has completed (without waiting) and free it if completed.

Seems to be about 25% reduction time of vkCreateDevice()

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28975>
2024-06-13 08:29:25 +00:00
Lionel Landwerlin
729c0b54b6 anv: use reserved array pool for legacy custom border colors
The array pool does a single allocation and then splits it out. The
downside is that the pool is not lockless, but for border colors it
likely doesn't matter much as there is a max border colors for 4k.

Seems to be a 30% time reduction for vkCreateDevice()

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28975>
2024-06-13 08:29:25 +00:00
Lionel Landwerlin
7da5b1caef anv: move trtt submissions over to the anv_async_submit
We can remove a bunch of TRTT specific code from the backends as well
as manual submission tracking.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28975>
2024-06-13 08:29:25 +00:00
José Roberto de Souza
62a25f0649 anv/xe2: Add STATE_COMPUTE_MODE individual masks
So we can enable each mask individually when programming registers.
Also setting Mask2/mask of the second double word so all registers in
it are also zeored during state init.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29616>
2024-06-10 14:08:03 +00:00
José Roberto de Souza
a472d415bc anv/xe2: Enable compute walker and BTD thread preemption
GFX versions older than GFX 20 have 'Thread Preemption disable' while
GFX 20 has 'Thread Preemption' with value flipped in compute walker
instruction.
So here by default enabling thread preemption, only disabling it
when BTD mode is enabled as instructed in Wa_14017794102.

Similar for 3DSTATE_BTD, enabling preemption by default and
only disabling when platform is affected by Wa_14017794102.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29616>
2024-06-10 14:08:02 +00:00
Jordan Justen
410ca6a3e9 Revert "anv: Disable Ray Tracing on xe2 until our compiler supports Xe2 RT"
This reverts commit 65684b0c7f.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29273>
2024-05-28 18:45:49 +00:00
Rohan Garg
6fc6f95e90 intel/genxml: Update STATE_COMPUTE_MODE for Xe2
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29264>
2024-05-28 14:42:19 +00:00