Commit graph

14002 commits

Author SHA1 Message Date
Lionel Landwerlin
5c7c1eceb5 anv/brw: handle pipeline libraries with mesh
I always thought there was a massive issue with pipeline libraries &
mesh shaders. Indeed recent CTS tests have exposed a number of issues.

Some values delivered to the fragment shader are coming from different
places depending on whether the preceding shader is Mesh or not. For
example PrimitiveID is delivered in the per-primitive block in Mesh
pipelines whereas for other pipelines it's coming as a VUE slot (which
is per-vertex). Those are 2 different locations in the payload.

We have to find a layout for fragment shaders that is compatible with
everything. Leaving gaps here and there in the thread payload.

Fixes the following test pattern :

  dEQP-VK.mesh_shader.ext.smoke.fast_lib.shared_*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:35 +00:00
Lionel Landwerlin
18bbcf9a63 intel: introduce new VUE layout for separate compiled shader with mesh
Mesh shaders have per vertex block in URB pretty much identical to the
VUE format. Let's just reuse that concept to do all of our layout in
the payload attribute registers. This will ensure that we have
consistent VUE layout between Mesh & non-Mesh pipelines.

We need a new way of laying out the VUE though as we have to
accomodate a HW constraint of maximum (per-primitive + per-vertex) of
32 varying. This means we cannot have 2 locations in the payload for
things like PrimitiveID which can come from either the per-primitive
or the per-vertex block. The new layout places the PrimitiveID at the
end of the per-vertex attributes and shrinks the delivery dynamically
if the mesh stage is active. The shader is compiled with a
MOV_INDIRECT to read the PrimitiveID from the right location in the
attributes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:35 +00:00
Lionel Landwerlin
2d396f6085 intel: prepare VUE layout for more than 2 layouts
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:35 +00:00
Lionel Landwerlin
95efdca00b brw: add documentation pointers to FS attribute layout
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:35 +00:00
Lionel Landwerlin
9d342081e7 brw/nir: add intrinsics to read attribute payload register indirectly
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:35 +00:00
Lionel Landwerlin
ef17fbf8e5 anv/brw: use separate_shader to deduced MUE compaction
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:35 +00:00
Lionel Landwerlin
6230f3029f brw: fix brw_nir_move_interpolation_to_top
In a case like this :
   block_0:
      %5 = ...
      %6 = ...
      block_1:
         %7 = load_interpolated_input %5, %6

The current logic would move load_interpolated_input to block_0 before
%5 but not move %5 & %6 which are sources of that instruction.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
5ff1b31c3f brw: document some brw_wm_prog_data fields
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
2f654ddd03 brw: use VARYING_BIT_* macros more
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
75b2d000fc anv: tidy up (CLIP|SBE)_MESH emission
Moving it to is related functions.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
62d2e323ba anv/brw: shrink FS varying payload
We're currently allocating payload spots for 3 fields already
delivered somewhere else in the payload.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
c467444670 brw/nir: use a new intrinsic for fs_msaa_flag
Avoid NIR code doing offset computations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
dd1ef73aae brw: use newer NIR constructs
nir_shader_intrinsics_pass() & NIR_PASS()

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
b64f237dc4 brw: move helper to brw_nir.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
cbbe7ff66e brw: add new helper to print out FS URB setup
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
b8a80c88cb brw: improve VUE printout
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
4f10a1f618 anv: switch to brw helpers to figure out if a fragment is dynamic
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
cb461fa287 anv: switch to use the tcs_prog_data for dynamic input vertices
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
7f500cc6e4 brw: store input_vertices on tcs_prog_data
Will allow the driver to know if the vertices count is dynamic.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
a9ee498347 brw: add helpers to check if a fragment shader execution is dynamic
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
4717382f84 anv: lower input vertices for TCS unconditionally
Take the opportunity to reuse the backend pass.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
119ef792c5 anv: remove tbimr workaround check
Already handled by having a special range created

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Emma Anholt
7c130e5dcf intel/ds: Fix formatting of stage index.
draws had been bumped to stage #10, so they ended up appearing nested
between frame (#1) and cmdbuf (#2), instead of nested under them.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22350>
2025-05-08 01:21:25 +00:00
Emma Anholt
4cc66123ec anv/ds: Forward VkDebugUtilsObjectNameInfoEXT to perfetto.
This gets us names on zink/wsi command buffers in perfetto, but may also
be useful some day for getting app names onto framebuffers and non-dynamic
renderpasses.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22350>
2025-05-08 01:21:25 +00:00
Emma Anholt
55d788f434 anv/ds: Associate the VkCommandBuffer some anv-only renderstage events.
This means the perfetto UI will have a non-zero/NULL handle/name in the UI
on these renderstages.  Unfortunately, intel/ds is outside of vulkan so
unless we pull in anv headers, we can't just pass in the anv_cmd_buffer.
This also means it would be much more painful to pass the cmd buffer to
the rest of the events, so they'll still have unset command buffers.
Still, being able to see the name of the command buffer in at least one of
the events should be useful once that's glued together.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22350>
2025-05-08 01:21:25 +00:00
Emma Anholt
546a100f26 intel/ds: Move "have we already sent initial state?" into the helper.
I'm going to have to send initial state from another function too, shortly.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22350>
2025-05-08 01:21:25 +00:00
Emma Anholt
dd81420ef1 perfetto: Create a common MesaRenderpassIncrementalState.
... and explain what its role is.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22350>
2025-05-08 01:21:25 +00:00
Sagar Ghuge
bb61a78911 anv: Fix untyped data port cache pipe control dump output
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Fixes: 845ab3d627
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34855>
2025-05-07 19:29:50 +00:00
Lionel Landwerlin
c434050a00 brw: add pre ray trace intrinsic moves
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Some intrinsics are implemented by reading memory location that could
be rewritten by a further tracing calls. So we need to move those
reads prior to tracing operations in the shaders.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8979
Tested-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34214>
2025-05-06 13:34:53 +00:00
Lionel Landwerlin
37608c075f anv: promote VK_EXT_robustness2 to VK_KHR_robustness2
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34821>
2025-05-06 13:16:13 +00:00
Hyunjun Ko
86d21fd2cf anv: Set tc/beta offset according to the flag from PPS.
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Consider the flag from PPS when setting tc/beta offset.

This fixes some artifacts when decoding a hevc video,
hevc_scaling_list4.mkv from Lynne.

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34782>
2025-05-06 04:24:22 +00:00
José Roberto de Souza
a82b569649 anv: Reduce memory pool usage in MTL and ARL
Those platforms requires aux map with 1MB alignment, for slab that
means that any buffer needs to have size of multiple of 1MB what
causes a lot of memory to be wasted causing it to run out of memory
when running multiple GPU applications.

Fixes: ea18572ff2 ("anv: Add support for ANV_BO_ALLOC_AUX_CCS in anv_slab_bo")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34817>
2025-05-05 16:42:14 +00:00
Valentine Burley
0f5ab7af3d anv/ci: Update expectations
These Vulkan Video tests were fixed in f7ff9b240d ("anv: Do not support the tiling of DRM modifier if DECODE_DST")

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34792>
2025-05-05 13:28:40 +00:00
Valentine Burley
9ac2b73cf4 iris/ci: Update trace checksums
These traces have been failing for a while now.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34792>
2025-05-05 13:28:40 +00:00
Iván Briano
cf9b0dd589 anv, hasvk: ignore QFOT if both src and dst queue families are equal
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34779>
2025-05-02 17:38:56 +00:00
José Roberto de Souza
3e5a735d01 intel/tools: Fix batch buffer decoder
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
intel_decoder_init() initializes intel_batch_decode_ctx so later
we can call decode functions but it depends on data stored in
brw/elk_isa_info but that was being allocated in stack
of intel_decoder_init() then when the decode functions were executed
it was accessing garbage at the brw/elk_isa_info memory.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ec2d20a70d ("intel/tools: Add helpers for decoder_init/disasm")
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34776>
2025-05-01 13:27:44 +00:00
Lionel Landwerlin
63f633557f intel: fix null render target setup logic
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Or current render target cache setting is to key on the binding table
index, meaning the HW associates a number in the range [0, 7] to a
RENDER_SURFACE_STATE description. If you want change the render target
0 between 2 draw calls, you need to insert a PIPE_CONTROL in between
the 2 draw calls with pb-stall + rt-flush in order to flush an writes
to a previous RENDER_SURFACE_STATE that has now becomed disassociated
with the [0, 7] number.

This PIPE_CONTROL taking care of the flush is dealt with in
cmd_buffer_maybe_flush_rt_writes(). This function diffs the current
BTI setup for render targets (first 0 to 7 BTIs) with what the next
fragment shader wants.

The issue here is we might have a render pass with 0 color attachments
and yet in 98cdb9349a we added one pointing to the render target 0,
but in the emit_binding_table() when we finally program the BTI, we
check the render pass color count and program a null surface state
instead of an actual surface state. And this leads to hangs because
the render target cache will end up with inconsistent state data.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 98cdb9349a ("anv: ensure null-rt bit in compiler isn't used when there is ds attachment")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12955
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34603>
2025-05-01 11:25:18 +00:00
José Roberto de Souza
615d0c9669 anv: Remove ANV_BO_ALLOC_HOST_CACHED from ANV_BO_ALLOC_MAPPED assert() on anv_device_alloc_bo()
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
ANV_BO_ALLOC_MAPPED are internal allocated bos that need mmap() but as
internally we don't do any cflush() we need to make sure those are also
ANV_BO_ALLOC_HOST_COHERENT.

Checking for ANV_BO_ALLOC_HOST_CACHED could lead a cached+uncoherent
bo being allocated internally with ANV_BO_ALLOC_MAPPED.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34778>
2025-05-01 02:44:03 +00:00
José Roberto de Souza
57bf646685 anv: Fix assert failure in discrete GPUs when allocating a LMEM+SMEM slab parent
It was failing in the first assert of anv_device_alloc_bo() because
it has ANV_BO_ALLOC_MAPPED but it don't have ANV_BO_ALLOC_HOST_COHERENT or
ANV_BO_ALLOC_HOST_CACHED(this second one is wrong and fixed in the next
patch).

LMEM is always write-combine, even SMEM on discrete GPU is always
write-back + coherent because the PCI bus protocol snooping at CPU
caches and that behavior can't be disabled.
So we can add this coherent flag without any side effects.

The ANV_BO_ALLOC_MAPPED is needed for ANV_BO_SLAB_HEAP_LMEM_SMEM
because to trigger SMEM+LMEM in anv_device_alloc_bo() we need
ANV_BO_ALLOC_MAPPED or ANV_BO_ALLOC_LOCAL_MEM_CPU_VISIBLE but the
second one is mostly used with small PCI bar discrete GPUs.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: dabb012423 ("anv: Implement anv_slab_bo and enable memory pool")
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34778>
2025-05-01 02:44:03 +00:00
Karmjit Mahil
9d01b318a3 anv,tu: Bypass RMV pcie_family_id check
Since RMV 1.9 pcie_family_id is checked to verify whether a
capture is supported.

Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34763>
2025-04-30 16:12:11 +00:00
Lionel Landwerlin
f7bc22e0d7 anv: force fragment shader execution when occlusion queries are active
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34732>
2025-04-30 13:37:14 +00:00
Felix DeGrood
4f0aa96d26 anv: Do conservative oversubscription of pages to 2MB
Round up allocations to nearest 2MB interval if this increases
the allocation by no more than 1.33x. This reduces page count but
at the cost of extra memory consumption. Optimization only applied
to MTL(Xe KMD only)/LNL platforms, which are particularly impacted by
page misses.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33558>
2025-04-30 12:56:40 +00:00
José Roberto de Souza
2c05488be1 anv: Align size of bos larger than 1MB to 64k to enable 64k pages
BOs larger than 1MB don't go memory pool due the size but applications
tend to use a lot of VkMemory with size larger than 1MB so to reduce
the number of pages and improve performance here I'm aligning the size
of BOs larger than 1MB to 64kb, this allows 64kb pages to be used at
least on Xe KMD.
This bring substantial perfomance benefit in exchange of a small
memory waste.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33558>
2025-04-30 12:56:40 +00:00
José Roberto de Souza
dde91cf9cb anv: Always grow fixed address pools by 2MB in platforms that there is a performance gain
MTL and newer integrated platforms has a performance gain when using
transparent huge pages, because of the fixed address requirement
we can't use slab for this case but we can change the initial pool
size to 2MB so all allocations get the transparent huge page
optimization.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33558>
2025-04-30 12:56:40 +00:00
José Roberto de Souza
7361b3287f anv: Remove useless if block
I can't think in any case where that would be false, so lets drop it.
While at it, also making some variables const.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33558>
2025-04-30 12:56:40 +00:00
José Roberto de Souza
6f7a32ec92 anv: Add support for batch buffers in anv_slab_bo in i915
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33558>
2025-04-30 12:56:40 +00:00
José Roberto de Souza
39bb51ab27 anv: Add support for batch buffers in anv_slab_bo in Xe KMD
Because of the ANV_BO_ALLOC_CAPTURE flag, batch buffers were not
allowed to use memory pool.
So to workaround that here adding a new anv_bo_slab_heap heap for
cached+coherent+capture buffers with the main goal to get batch
buffer to memory pool but other buffers will as well.

For now that will only work in Xe KMD as i915 requires more changes
to support it.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33558>
2025-04-30 12:56:40 +00:00
José Roberto de Souza
a0a600ca5f anv: Skip anv_bo_pool if memory pool is enabled
The whole purpose of anv_bo_pool is to reduce the number of
gem_create/destroy calls in command buffers that is something with
a short life span.
But slab_bo/memory pool does the same with even other benefits like
doing 2MB allocations to enable THP.

So here skipping the meat of anv_bo_pool_free() to directly return
the bo to slab_bo. This change is also necessary because the way
anv_bo_pool stores freed buffers it requires that all bos has a unique
gem handle, what not true of buffer allocated by anv_slab.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33558>
2025-04-30 12:56:40 +00:00
José Roberto de Souza
0b561f691b anv: Add support for ANV_BO_ALLOC_DYNAMIC_VISIBLE_POOL in anv_slab_bo
This flag was not supported in anv_slab_bo because it is set together
with ANV_BO_ALLOC_CAPTURE and more important it has a specific VMA
range.

We can support it by adding a custom heap and allocating all bos in
the heap with all necessary flags, but because application can also
allocate those with vkAllocateMemory() here the ANV_BO_ALLOC_CAPTURE
is appended to the vkAllocateMemory() path for integrated gpu and
anv_slab_bo check if all the alloc_flags matches, because application
could choose to allocate it in a cached but not coherent memory type
for example.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33558>
2025-04-30 12:56:40 +00:00
José Roberto de Souza
8fd4423d99 anv: Add support for ANV_BO_ALLOC_DESCRIPTOR_POOL in anv_slab_bo
This flag was not supported in anv_slab_bo because it is set together
with ANV_BO_ALLOC_CAPTURE and more important it has a specific VMA
range.

But we can easily support it by adding a custom heap with it and
allocating all bos in the heap with all necessary flags.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33558>
2025-04-30 12:56:39 +00:00