Commit graph

61 commits

Author SHA1 Message Date
Paulo Zanoni
4c366ef67b anv/trtt: set every entry to NULL when we create an L2 table
When we create sparse resources the first thing we do is a NULL bind
on them, as the Vulkan spec mandates certain behavior even for unbound
sparse resources. We do this with the minimal effort possible: if we
can get away with marking an L2 pointer as NULL in the L3 table, we
just do it and return, instead of going all the way to creating L1
tables and marking all the final entries as NULL.

The strategy we were using had a bug that could lead to previously
created NULL entries not being marked as NULL anymore. Let's give an
example:

 (before proceeding, keep in mind that a NULL entry in the L3 and L2
  tables has bit 1 set, it does *not* have the value 0)

 - Create a 64mb buffer that uses an entire L1 table (needs to be
   properly aligned), which triggers a NULL bind.
     - Our algorithm will just set the L3 entry (pointing to the L2
       table) as NULL.
 - Create a 64kb buffer that uses the same L2 table (but a different
   L1 table).
     - The NULL bind triggered won't do anything as the L2 table is
       already NULL.
 - Bind the first buffer to actual memory. This will end up creating
   the L2 table and the L1 table. The only entry we will set in the L2
   table will be the one pointing to the L1 table. All the other
   values will be 0 (so they won't have neither the NULL or Invalid
   bits set: access to them will lead to page faults).
 - Try to use the second buffer, which is still unbound. It was
   relying on the fact that its L2 table pointer was NULL, but now
   it's not anymore, so the page walker will fetch the L1 entries in
   the L2 table and they will all be zero instead of having the NULL
   bit set.

The fix is pretty simple: whenever we create a new L2 table, set every
entry to NULL (except the one we're about to set to non-NULL). This
preserves behavior for every other NULL resource relying on the L3
entry being set to NULL.

We don't need to do this for the L1 table because its entries are
different and instead of having bits to signal NULL entries we have
a special TR-TT register that we can set that gets compared to check
if an entry is NULL, and we conveniently program it to 0: see
ANV_TRTT_L1_NULL_TILE_VAL.

I am not aware of any real workloads that are triggering this
behavior, I found this issue while investigating something else,
running a custom sparse program in our pre-silicon environment, and it
told us about the page faults.

Cc: mesa-stable
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30953>
2024-10-15 23:05:30 +00:00
Paulo Zanoni
fe59044f47 anv/trtt: mark vk_sync_get_value()'s value as defined for Valgrind
Valgrind doesn't seem to know that drmSyncobjQuery() writes to the
variable that we pass as 'last_value'. This gets rid of:

==6275== Conditional jump or move depends on uninitialised value(s)
==6275==    at 0x5308370: anv_sparse_trtt_garbage_collect_batches (anv_sparse.c:540)
==6275==    by 0x53091E2: anv_sparse_bind_trtt (anv_sparse.c:825)
==6275==    by 0x5309771: anv_sparse_bind (anv_sparse.c:953)
==6275==    by 0x5309A3B: anv_free_sparse_bindings (anv_sparse.c:1041)
==6275==    by 0x529FF21: anv_DestroyBuffer (anv_buffer.c:248)
==6275==    by 0x932ADBD: ??? (in /usr/lib/x86_64-linux-gnu/libVkLayer_khronos_validation.so)
==6275==    by 0x127AA2: MyVkBuffer::~MyVkBuffer() (sparse.cpp:364)
==6275==    by 0x12B2D4: MyApp::test1_trivial_sparse() (sparse.cpp:1421)
==6275==    by 0x13E01A: MyApp::run_test(int) (sparse.cpp:6594)
==6275==    by 0x13E3B0: main (sparse.cpp:6656)
==6275==  Uninitialised value was created by a stack allocation
==6275==    at 0x53082D3: anv_sparse_trtt_garbage_collect_batches (anv_sparse.c:525)

An alternative to these Valgrind macros would simply have been to
zero-intialize last_value.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31332>
2024-09-27 04:10:12 +00:00
Dylan Baker
ed8d1d3c9b anv: if queue is NULL in vm_bind return early
In the error handling path we end up creating a vk_sync and then later
we vk_sync_wait() on it. If that wait fails somehow we'll end up calling
vk_queue_set_lost(&queue->vk, ...) which would segfault if queue is
NULL.

If we end up in this situation (no queue), return directly whatever the
backend's vm_bind function returned, propagating the error up if
necessary.

Fixes: dd5362c78a ("anv/xe: try harder when the vm_bind ioctl fails")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31048>
2024-09-13 20:17:40 +00:00
Paulo Zanoni
dd5362c78a anv/xe: try harder when the vm_bind ioctl fails
From all the many possible errors returned by the vm_bind ioctl, some
can actually happen in the wild when the system is under memory
pressure. Thomas Hellström pointed to us that, due to its asynchronous
nature, the vm_bind ioctl itself has to pin some memory, so if the
number of bind operations passed is too big, there is a probability
that it may run out of memory.

Previously the Kernel would return ENOMEM when this condition
happened.  Since commit e8babb280b5e ("drm/xe: Convert multiple bind
ops into single job") the Kernel has started returning ENOBUFS when it
doesn't have enough memory to do what it wants but thinks we'd succeed
if we tried to do one bind operation at a time (instead of doing
multiple operations in the same ioctl), and ENOMEM in some other
situations. Still-uncommitted commit "drm/xe: Return -ENOBUFS if a
kmalloc fails which is tied to an array of binds" proposes converting
a few more ENOMEM cases no ENOBUFS.

Still, even ENOMEM situations could in theory be possible to recover
from, because if we wait some amount of time, resources that may have
been consuming memory could end up being freed by other threads or
processes, allowing the operations to succeed. So our main idea in
this patch is that we treat both ENOMEM and ENOBUFS in the same way,
so our implementation can work with any xe.ko driver regardless of
having or not having the commits mentioned above.

So in this patch, when we detect the system is under memory pressure
(i.e., the vm_bind() function returns VK_ERROR_OUT_OF_HOST_MEMORY), we
throw away our performance expectations and try to go slowly and
steady. First we wait everything we're supposed to wait (hoping that
this alone could also help to alleviate the memory pressure), and then
we synchronously bind one piece at a time (as this will ensure ENOBUFS
can't be returned), hoping that this won't cause the Kernel to try to
reserve too much memory. All this while also hoping that whatever
thing that may be eating all the memory goes away in the meantime. If
even this fails, we give up and hope the upper layer will be able to
figure out what to do.

This fixes a bunch of LNL failures and flaky tests (as LNL is our
first officially supported xe.ko platform). This can be seen in dEQP
but only if multiple tests are being run parallel. Happens in multiple
tests, some of which may include:

  - dEQP-VK.sparse_resources.image_sparse_binding.2d_array.rgba8_snorm.1024_128_8
  - dEQP-VK.sparse_resources.image_sparse_binding.3d.rgba16_snorm.1024_128_8
  - dEQP-VK.sparse_resources.image_sparse_binding.3d.rgba16ui.512_256_6

I don't ever see these errors when running Alchemist/DG2 with xe.ko.

Fixes: e9f63df2f2 ("intel/dev: Enable LNL PCI IDs without INTEL_FORCE_PROBE")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30276>
2024-07-24 23:18:36 +00:00
Paulo Zanoni
c65a76db85 anv/trtt: don't just crash when we can't find device->trtt.queue
Please refer to the big comment this patch introduces.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>
2024-07-22 10:04:34 -07:00
Paulo Zanoni
3ab8ff99fa anv/trtt: fix the process of picking device->trtt.queue
We want to use actual sparse-capable queues as the default
trtt->queue, not copy queues that may have a companion_rcs_batch.
Before this patch, if we expose more than one queue *and* the
application creates a copy queue first, we'll end up setting
trtt->queue as the copy queue, which will GPU hang when we submit the
TR-TT batches as they don't support the pipe_control commands we
issue.

The trtt->queue queue is used for binding/unbinding buffers in code
paths where there's no specific queue coming from user space, such as
when we're creating or destroying a sparse resource.

This is not a problem yet on i915.ko since we are exposing
only a single queue, and it is not a problem for xe.ko since TR-TT is
not the default there. This is also not a problem in applications
that create the render or compute queue first. We plan to expose more
queues when using TR-TT, so this would become a problem without this
patch.

None of VK-GL-CTS seems to exercise that, and none of the Steam games
I tested exercise that as well. I was able to reproduce this issue
using our internal tracing tool.

v2: New implementation that doesn't break when we only have a compute
    queue (Lionel).

Fixes: 04bfe828db ("anv/sparse: allow sparse resouces to use TR-TT as its backend")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>
2024-07-22 10:04:34 -07:00
Paulo Zanoni
5ca224aa0c anv/trtt: make all contexts have the same TR-TT programming
On Gen12 (the oldest we support on Mesa right now for TR-TT) we
started having per-engine TR-TT registers and we are supposed to make
all contexts share the same TR-TT programming.

On LNL+, this is documented in the BSpec page for the TRTT_CNTRL
register (68417), with more details in HSDs 14020454786 and
16022013154.

On Gen12 platforms this information is a little harder to find and
there's a whole trail of HSDs leading up to 1209977595, which links to
the documents that describe the programming. BSpec for TR-TT on Gen12
is very confusing as it still contains registers and other information
from Gen11 that were not removed.

Regarding the additional BLT and COMP registers, please notice that on
the BSpec pages for the TR-TT registers, the "Register Instance"
section only lists the GFX registers as non-privileged. However, the
"User Mode Privileged Commands" lists the other instances of the TR-TT
Regsiters as non-privileged, which matches what we see: there's no
need to put these addresses in the FORCE_TO_NONPRIV registers.

Notice that for now, when TR-TT is being used we only expose a single
queue, so this change effectively does nothing until we start exposing
extra queues. I left that part for later to help bisectability.

v2:
 - s/trtt_init_context_state/trtt_init_queues_state/ (José)
 - pass device as the argument to init_queues_state (José)
v3:
 - use async_submit_end (José)

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>
2024-07-22 10:04:34 -07:00
Paulo Zanoni
6415027d85 anv/trtt: submit a separate batch in anv_trtt_init_context_state()
Having this as a separate batch was the normal behavior until
7da5b1caef ("anv: move trtt submissions over to the
anv_async_submit").

While it certainly sounds better to do everything related to TR-TT
initialization in one batch, we need to revert it back to be a
separate batch (but now using the new anv_async_submit infrastructure)
because we'll want to run this batch on every engine.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>
2024-07-22 10:04:34 -07:00
Paulo Zanoni
abbb4b20f3 anv/trtt: check the return value of anv_trtt_init_context_state()
I haven't seen this happening anywhere, but let's have it for
correctness.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>
2024-07-22 10:04:34 -07:00
Paulo Zanoni
41a95d0b13 anv/sparse: use ANV_SPARSE_BLOCK_SIZE instead of tile_size when possible
When I wrote sparse resources support for Anv we didn't have TileYs
support so I made non-opaque binds work even for non-standard block
shapes, which meant the block size could be either 64k or 4k. Since
then we merged TileYs support and changed our sparse resources
implementation to treat all the non-standard block shape cases as
"everything is the miptail", which means non-opaque binds are not
possible. So here we adjust the code to more explicitly represent
that.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>
2024-06-24 17:54:30 +00:00
Paulo Zanoni
8271e12b8e anv/sparse: unify and rework tile size calculation
There are 3 different places in our code where we calculate the tile
size and until recently the 3 implementations were different and with
slight bugs. Unify everything and also change the calculation to use
tile_info->phys_extent_B.

While doing this we move the isl_surf_get_tile_info() calls from
anv_sparse_calc_block_shape() to its callers so we total amount of
times we call it doesn't change.

v2: Adjust the patch now that tile_info is not part of isl_surf
anymore.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>
2024-06-24 17:54:30 +00:00
Paulo Zanoni
2ac35116d1 anv/sparse: remove obsolete linear tiling code path
The code that tries to create a "pretend block shape" for linear
tiling surfaces was necessary back when we were going to support
sparse residency (non-opaque binds) for non-standard block shapes
(since there was uncertainty about TileYs support). That hasn't been
the case since before we merged sparse resources upstream, so remove
the code and leave an assertion instead, just in case.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>
2024-06-24 17:54:30 +00:00
Paulo Zanoni
2f65acfbb8 anv/sparse: fix TR-TT page table bo size and flags
Since commit 18d8c3ca33 we were allocating a little more than what
we were actually using (2621440 bytes instead of 2097152, aka 0x280000
instead of 0x200000), and we were not properly marking the BO as
internal. No applications should be misbehaving because of this.

Fixes: 18d8c3ca33 ("anv: Add missing ANV_BO_ALLOC_INTERNAL")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>
2024-06-24 17:54:30 +00:00
Paulo Zanoni
23e91fdd64 anv/sparse: dump info about opaque binds when DEBUG_SPARSE
I've found myself adding this piece of code to our codebase when
debugging some Zink sparse failures recently, so let's upstream it.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>
2024-06-24 17:54:30 +00:00
Paulo Zanoni
6a6d449a1d anv/sparse: fix reporting of VK_SPARSE_IMAGE_FORMAT_SINGLE_MIPTAIL_BIT
This calculation was wrong for both compressed formats and
multi-sampled images. As a result, we misreported the image as having
a single miptail.

No Vulkan or GL CTS tests were tripping on this bug. I found this
while looking for tile size calculations after fixing a similar bug
elsewhere in the code.

The calculation should now match what we have in
anv_sparse_bind_image_memory(), which is widely tested.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>
2024-06-24 17:54:30 +00:00
Paulo Zanoni
789b53c523 anv/sparse: fix the image property sizes for multi-sampled images
We have to take the number of samples into account when calculating
the tile size. If we don't do this, multi-sampled images may end up
falling in the "goto out_everything_is_miptail" case, while in reality
multi-sampled images don't even have miptails.

Also assert that the value is one of the only two values we expect
this to be. This assert would have been useful to catch this issue,
since with multi-sampled images we were getting values like 16k or 32k
depending on the number of samples.

This helps move forward progress in some Zink tests, but does not
make them fully pass yet, as those tests are full of sub-cases and
this only helps some of them:
  KHR-GL46.sparse_texture2_tests.UncommittedRegionsAccess
  KHR-GL46.sparse_texture2_tests.SparseTexture2Commitment
  KHR-GL46.sparse_texture2_tests.SparseTexture2Lookup

Fixes: 7ef3d652b2 ("anv/sparse: enable MSAA for Sparse when applicable")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>
2024-06-24 17:54:30 +00:00
Paulo Zanoni
5c18ccd2d3 anv/sparse: reject 1D sparse residency images
The Vulkan spec splits sparse resources in two different features:
sparse binding and sparse residency.

Sparse binding is much simpler. It requires the resources to be fully
bound before being used and it treats them as a black box. We're
required to support sparse binding for all the formats that are
supported by non-sparse, but that's easy beacause this feature is
simpler.

Now sparse residency is the one where we're allowed to partially bind
resources, and the one that comes with more complicated features such
as block shapes and non-opaque binding of images. This feature is
subdivided into:
  - sparseResidencyBuffer
  - sparseResidencyImage2D
  - sparseResidencyImage3D
  - sparseResidency{2,4,8,16}Samples (which refers to 2D images)

Notice that there's no sparseResidencyImage1D. And if you read the
specs it's clear that sparse residency is meant for non-1D images.
Still, supporting it didn't require any extra effort in Anv so we just
did it.

That's until we started running GL CTS tests on Zink. There's a CTS
test that checks for the standard block shapes. It creates 1D images
and expects the block shapes for them to be the standard 2D block
shapes. While we could very well just patch
anv_sparse_calc_image_format_properties() to return the standard 2D
block shapes for 1D images, that's just wrong (block shapes for 1D
images are just line segments, not rectangles!) so let's just reject
this all until maybe one day Vulkan defines sparseResidencyImage1D and
we get GL_ARB_sparse_texture3 to match it, or somebody decides to
change the GL CTS test.

Testcase: KHR-GL46.sparse_texture2_tests.StandardPageSizesTestCase
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>
2024-06-24 17:54:30 +00:00
Lionel Landwerlin
7da5b1caef anv: move trtt submissions over to the anv_async_submit
We can remove a bunch of TRTT specific code from the backends as well
as manual submission tracking.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28975>
2024-06-13 08:29:25 +00:00
Lionel Landwerlin
8c7e1052a3 anv: simplify TRTT initialization
Drop usage of pthread mutex so initialization never fails.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28975>
2024-06-13 08:29:25 +00:00
Paulo Zanoni
e3e5f8e6db anv/sparse: assert a format can't be standard and non-standard
A format can't be standard and non-standard at the same time. If we
ever hit this assertion, it's because something behind the scenes has
evolved (such as the tiling formats) so something that was marked as
non-standard became standard. Add an assertion so we can quickly catch
these issues in the future and adjust the code.

I don't want to mix this assertion with the one in the line above
since that one is the most useful assertion we have in all the sparse
code, so it's good to know which one we're hitting.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27306>
2024-05-15 08:00:16 +00:00
Paulo Zanoni
8abfdfe576 anv/sparse: exclude Xe2's Tile64's non-standard block shapes
The Tile64 format from Xe2 is weird and some of its MSAA shapes are
non-standard. Reject them. Otherwise, we'll get dEQP failures such as:

  deqp-vk: ../../src/intel/vulkan/anv_sparse.c:829: anv_sparse_calc_image_format_properties: Assertion `is_standard || is_known_nonstandard_format' failed.

Many tests can reproduce this issue, including:

  dEQP-VK.memory.requirements.extended.image.sparse_tiling_optimal

Testcase: dEQP-VK.memory.requirements.extended.image.sparse_tiling_optimal
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27306>
2024-05-15 08:00:16 +00:00
Paulo Zanoni
e69c7cd149 anv/sparse: fix block_size_B when the image is multi-sampled
This is all that's needed to make anv_sparse_bind_image_memory() work
with multi-sampled images.

The assert() we just added would have been really helpful when
debugging this.

All the dEQP tests with "sparse" in their names are passing *even*
without this patch. Real-world applications show very clear visual
corruption for sparse MSAA images bound through non-opaque binds since
only a fraction of the the actual image ends up being bound.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27306>
2024-05-15 08:00:15 +00:00
Paulo Zanoni
620f1d1a7a anv/sparse: properly reject sample counts we don't support
Yes, I understand that this looks like the kind of check that the
applications should be doing instead of us, but if we don't that, dEQP
will have failures. If we claim support for any multi-sampled sparse
feature, dEQP will try to create multi-sampled sparse images with all
possible sample counts, including the ones supported by non-sparse but
not supported by sparse (x8 and x16 on Tile64 platforms) and also the
ones not supported at all, like x32 and x64.

This change affects a number of dEQP tests, including:
  - dEQP-VK.api.info.sparse_image_format_properties2.2d.optimal.r32g32_sfloat

Without this patch, and with sparse multi-sampling enabled, this would
hit the following assertion:
  anv_sparse.c:866: anv_sparse_calc_image_format_properties: Assertion `is_standard || is_known_nonstandard_format' failed.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27306>
2024-05-15 08:00:15 +00:00
Paulo Zanoni
af725a2ccc anv/sparse: we can't do multi-sampled depth/stencil sparse images
Our hardware has more than one layout for multi-sampled images that
use the tiling formats that give us the sparse standard block shapes:
see enum isl_msaa_layout. Only the layout we use for colored images is
compatible with the standard block shapes, so it's the only one we can
expose for multi-sampled sparse.

This change affects a number of dEQP tests, including:
  - dEQP-VK.memory.requirements.create_info.image.sparse_residency_aliased_tiling_optimal

Without this patch, and with sparse multi-sampling enabled, this test
would hit the following assertion:
  anv_sparse.c:866: anv_sparse_calc_image_format_properties: Assertion `is_standard || is_known_nonstandard_format' failed.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27306>
2024-05-15 08:00:15 +00:00
Paulo Zanoni
6d38801ebd anv/sparse: add the MSAA block shape tables
We're not enabling sparse on multi-sampled images yet, but having the
table here is a first step. The current approach should make the code
a little more compact.

These tables are in section 33.4.3: Standard Sparse Image Block Shapes
of the Vulkan 1.3 spec.

PS: I know we've questioned the need for us to have these tables here
as they are something dEQP should check, but I've hit the "this shape
is not standard" assertion multiple times during development of the
various sparse features, and that really helps narrowing down the
problems. For example, see the next 2 patches in this MR.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27306>
2024-05-15 08:00:15 +00:00
José Roberto de Souza
18d8c3ca33 anv: Add missing ANV_BO_ALLOC_INTERNAL
Some places doing driver internal allocations was not setting
ANV_BO_ALLOC_INTERNAL, so adding the flag in those places here.

This will increase the accuracy of the RMV report.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28677>
2024-04-19 13:15:01 +00:00
Paulo Zanoni
f17d7655fe anv/xe: add a 'flags' parameter to the vm_bind() kmd_backend function
For now there's only one flag, but we're about to add another.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28792>
2024-04-18 19:42:27 +00:00
Paulo Zanoni
a791805d10 anv/sparse: rework anv_free_sparse_bindings() error handling
None of the callers of anv_free_sparse_bindings() check for its return
result, and they also don't have a way to propagate it up the stack.
So just don't return error codes that won't be checked. Instead,
add an assertion so at least we can detect failures in our CI or
development runs.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28724>
2024-04-16 01:52:28 +00:00
Paulo Zanoni
95dc34cd97 anv/sparse: replace device->using_sparse with device->num_sparse_resources
The device->using_sparse variable is only used at cmd_buffer_barrier()
to decide if we need to apply the heavier-weight flushes that are only
applicable to sparse resources. The big problem here is that we need
to apply the flushes to the non-image and non-buffer memory barriers,
so we were trying to limit those only to applications that ever submit
a sparse resource to the sparse queue.

The reason why we were applying this only to devices that ever
submitted sparse resources is that dxvk games have this thing where
during startup they create and then delete tiny sparse resources, so
switching device->using_sparse to true at resource creation would make
basically every dxvk game start applying the heavier-weight
workaround.

The problem with all that is that even if an application creates a
sparse resource but doesn't ever bind them, the resource should still
behave as an unbound resource (because they are bound with a NULL
bind), so the flushes affecting them should happen. This case is
exercised by vkd3d-proton/test_buffer_feedback_instructions_sm51.

In order to satisfy all the above cases and only really apply the
heavier-weight flushes to applications actually using sparse
resources, let's just count the number of sparse resources that
currently exist and then apply the workaround only if it's not zero.
That covers the dxvk case since dxvk deletes the resources as soon as
they create, so num_sparse_resources goes back to 0.

Testcase: vkd3d-proton/test_buffer_feedback_instructions_sm51
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10960
Fixes: 6368c1445f ("anv/sparse: add the initial code for Sparse Resources")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28724>
2024-04-16 01:52:28 +00:00
Paulo Zanoni
0c1dbfe899 anv/sparse: remove unused dump_vk_sparse_memory_bind()
This went unused a while ago. If we decide we want it again we can
just add it back.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28724>
2024-04-16 01:52:28 +00:00
Paulo Zanoni
ba3b1c2d12 anv/sparse: adjust sparse_bind_image_memory debug messages
Since we moved the dump_anv_vm_bind() call to anv_sparse_bind(), that
BEGIN/END block stopped making sense, so just keep the first set of
messages.

Also wrap everything around a single INTEL_DEBUG() check so we'll only
run this check once when debug is disabled (we don't care about
running the check multiple times if it's enabled).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28724>
2024-04-16 01:52:28 +00:00
Paulo Zanoni
f73385f8ff anv/sparse: remove unnecessary popcount assertions
In both cases we end up calling anv_image_aspect_to_plane(), which
already includes the same assertion.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28724>
2024-04-16 01:52:28 +00:00
Paulo Zanoni
2f5638cf2e anv/sparse: remove useless isl_surf_get_tile_info() call
If isl_surf_get_tile_info() returned the struct instead of having it
passed as a pointer, gcc would have detected this. I can write patches
for that if we want it.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28724>
2024-04-16 01:52:28 +00:00
José Roberto de Souza
9102cb972a anv: Replace the 2 sparse booleans by 1 enum
Having just one place to check the Sparse type is less error prone.
For example in i915 it was always setting sparse_uses_trtt to true
even if running in gfx 9 that don't support sparse.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28161>
2024-03-14 15:53:22 +00:00
Paulo Zanoni
a8f7d26c2b anv: change the vm_bind-related kmd_backend vfuncs to return VkResult
All these vfuncs funnel down to either stubs or the xe_vm_bind_op()
function. By returning int we're shifting VkResult generation to the
callers, which are simply not doing the correct job. If they get
VkResult they can simply throw the errors up the stack without having
to erroneously try to figure out what really happened.

Today the callers are returning either VK_ERROR_UNKNOWN or
VK_ERROR_OUT_OF_DEVICE_MEMORY, but after the patch we're returning
either VK_ERROR_OUT_OF_HOST_MEMORY or VK_ERROR_DEVICE_LOST.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27926>
2024-03-08 23:14:09 +00:00
Paulo Zanoni
8051919b3c anv/sparse: leave the semaphore waits and signals to the vm_bind ioctl
We can now finally leave the semaphore waits and signals to the
vm_bind ioctl, making vm_bind operations truly asynchronous.

This was previously done for TR-TT in 18bd00c024 ("anv/trtt: don't
wait/signal syncobjs using the CPU anymore").

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27926>
2024-03-08 23:14:09 +00:00
Paulo Zanoni
aa07d8a04c anv/sparse: don't issue a single bind operation per vm_bind ioctl
The xe.ko driver finally fixed bug 746, which means we can finally
pass multiple bind operations in a single ioctl. There's a dEQP test
that issues 960 bind operations in a single call, so our gains here
have potential, although most real-world apps are not even remotely
close to this.

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/746
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27926>
2024-03-08 23:14:09 +00:00
Paulo Zanoni
2526308dcd anv/sparse: allow binding operations to match the resource size
The resource size doesn't need to match the binding granularity. For
example, if the user wants to create a 32kb buffer, Anv will require
its memory to have 64kb, but the buffer size will still be the
original 32kb. And the spec says:

  VUID-VkSparseMemoryBind-size-01100:
    "size must be less than or equal to the size of the resource minus
     resourceOffset"
  VUID-VkSparseMemoryBind-size-01102:
    "size must be less than or equal to the size of memory minus
     memoryOffset"

So when binding such buffer, size should actually be the lesser of the
two values: 32kb, and we have to accept that. Since our binding
granularity is 64kb, we're safe to simply extend the requested size to
match our binding granularity, since we already require the memory to
be appropriately sized.

None of this is exercised by dEQP. This was caught by
piglit/arb_sparse_buffer-basic using Zink.

Testcase: piglit/arb_sparse_buffer-basic
Issue: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10220
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26410>
2024-02-21 22:58:42 +00:00
Paulo Zanoni
a501a840a3 anv/sparse: add an extra step before anv_sparse_bind_resource_memory()
I need to add some sparse-related checks that require having the
anv_buffer and anv_image, and putting them directly inside
anv_queue_submit_sparse_bind_locked() doesn't feel like the right
thing to do. Here we change the interface so now we have
anv_sparse_bind_buffer() and anv_sparse_bind_image_opaque() as the
main interface into anv_sparse.c, so they both can call the lower
level anv_sparse_bind_resource_memory() function.

In the next patch we'll be adding changing the code of the functions
we just created, justifying their addition.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26410>
2024-02-21 22:58:42 +00:00
Rohan Garg
c69650a95e isl,blorp,anv: introduce ISL_TILING_64_XE2 for Xe2+ platforms
Xe2+ changed the msaa mapping for 2D/3D Tile64 surfaces, introduce a
Xe2+ specific enum to handle this change.

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27113>
2024-01-24 17:01:48 +01:00
Lionel Landwerlin
e1b9a6e4f3 anv: initial RMV support
Launch with :

$ MESA_VK_TRACE=rmv MESA_VK_TRACE_TRIGGER=/tmp/trig ./my_app

In another terminal, trigger a capture :

$ touch /tmp/trig

The application with create a snapshot and print out :

RMV capture saved to '/tmp/my_app_2024.01.19_10.56.33.rmv'

Then just open it with RMV :

./RadeonMemoryVisualizer /tmp/my_app_2024.01.19_10.56.33.rmv

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26843>
2024-01-23 17:24:19 +00:00
Paulo Zanoni
bf0f261c1e anv/sparse: document USAGE_2D_3D_COMPATIBLE as non-standard too
The standard block shapes (and by extension, the tiling formats they
require) are simply incompatible with getting a 2D view of a 3D image.
I couldn't find in the Vulkan spec anything related to what are the
expectations when trying to use both at the same time.

So here we "document" that this case is known non-standard. Please
notice that since we report residencyStandard3DBlockShape as true we
were actually supposed to support this case, but I can't see how this
would be possible, so set is_known_nonstandard_format to true so we
can avoid the assert() that comes right after.

Fixes the following when using Zink:
  KHR-GL46.sparse_texture_tests.SparseTextureAllocation

Also "moves forward" the following test on Zink, so it now hits a
different assertion:
  KHR-GL46.sparse_texture_tests.SparseTextureCommitment

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26454>
2023-12-06 00:29:58 +00:00
Paulo Zanoni
181aa83027 anv/tr-tt: assert the bind size is a multiple of the granularity
If the size here is not a multiple of the granularity (64kb) then
we'll miss our "pages" estimation by 1. We could fix this with
DIV_ROUND_UP() or by simply putting a "+1" there, but the upper layers
should now be preventing this case so let's just put the assertion
here.

Previously it was possible to hit this case with Zink by running
under certain conditions piglit/arb_sparse_buffer-basic.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26454>
2023-12-06 00:29:58 +00:00
Paulo Zanoni
c87f7c13fa anv/sparse: reject binds that are not a multiple of the granularity
From the spec:

  "Resources can be bound at some defined (sparse block) granularity."

  "The sparse block size in bytes for sparse buffers and
   fully-resident images is reported as
   VkMemoryRequirements::alignment. alignment represents both the
   memory alignment requirement and the binding granularity (in bytes)
   for sparse resources."

Not only the upper layer (the Spec) doesn't allow this, the lower
layers (both the vm_bind ioctl and TR-TT) also work on a granularity.
Just check for this case and return an error.

Before this check, what would happen was:
  - for the vm_bind backend, the vm_bind ioctl would fail
  - for the TR-TT backend, we'd understimate l1_binds_capacity and
    fail an assertion, or we'd just silently bind 64kb instead of the
    original size

Currently, some Zink tests such as piglit/arb_sparse_buffer-basic can
trigger this behavior, but we're working to fix Zink for this case
(and that commit may be merged before this one).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26454>
2023-12-06 00:29:58 +00:00
Paulo Zanoni
563678f310 anv/sparse: don't support YCBCR 2x1 compressed formats
Regarding supporting these formats, the spec says:

  "A sparse image created using VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT
   supports all non-compressed color formats with power-of-two element
   size that non-sparse usage supports. Additional formats may also be
   supported and can be queried via
   vkGetPhysicalDeviceSparseImageFormatProperties.
   VK_IMAGE_TILING_LINEAR tiling is not supported."

Regarding the formats themselves, the spec says:
  "VK_FORMAT_B8G8R8G8_422_UNORM specifies a four-component, 32-bit
   format containing a pair of G components, an R component, and a B
   component, collectively encoding a 2×1 rectangle of unsigned
   normalized RGB texel data. One G value is present at each i
   coordinate, with the B and R values shared across both G values and
   thus recorded at half the horizontal resolution of the image. This
   format has an 8-bit B component in byte 0, an 8-bit G component for
   the even i coordinate in byte 1, an 8-bit R component in byte 2,
   and an 8-bit G component for the odd i coordinate in byte 3. This
   format only supports images with a width that is a multiple of two.
   For the purposes of the constraints on copy extents, this format is
   treated as a compressed format with a 2×1 compressed texel block."

Since these formats are to be considered compressed 2x1 blocks and we
don't necessarily have to support non-compressed formats that
non-sparse support, we can claim them as not supported with sparse.

In addition to all of that, if you look at isl_gfx125_filter_tiling()
you'll see that we don't even support Tile64 for these formats, so
sparse residency (i.e., non-opaque image binds) doesn't really make
sense for them yet.

The Vulkan spec defines 4 other YCBCR "2x1 compressed" formats like
the ones we have in this commit, but we don't support them even
without sparse, so there's no reason to check them here.

A recent change in VK-GL-CTS made tests that use these formats go from
unsupported to failures:
  7ecc7716a983 ("Do not use and check for STORAGE image support, when
  it is not used in the test")

This commit "fixes" the following VK-GL-CTS failures (by making them
return NotSupported):
  dEQP-VK.sparse_resources.image_block_shapes.2d.b8g8r8g8_422_unorm.samples_1
  dEQP-VK.sparse_resources.image_block_shapes.2d.g8b8g8r8_422_unorm.samples_1
  dEQP-VK.sparse_resources.image_block_shapes.2d_array.b8g8r8g8_422_unorm.samples_1
  dEQP-VK.sparse_resources.image_block_shapes.2d_array.g8b8g8r8_422_unorm.samples_1

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>
2023-11-17 17:58:29 +00:00
Paulo Zanoni
fda5163f34 anv/trtt: properly handle the lifetime of TR-TT batch BOs
We need to wait for the batches to complete before we return the BOs
to the pool. We were previously doing this completely synchronously,
which made the code unnecessarily wait. Now we have a timeline syncobj
that signals completion of the previous BOs, so sometimes we check
where we are in the timeline and then return the BOs that we know are
unused.

This, in addition to the previous patch that made us wait for the
other syncobjs through the execbuf ioctl instead of through the CPU,
makes TR-TT batches at least an order of magnitude faster. Still, I
don't think we'll notice any changes in games's FPS as they don't bind
sparse resources that often.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>
2023-11-17 17:58:29 +00:00
Paulo Zanoni
18bd00c024 anv/trtt: don't wait/signal syncobjs using the CPU anymore
Pass them as part of the TR-TT batch. This is what a lot of the
previous commits were building up to.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>
2023-11-17 17:58:28 +00:00
Paulo Zanoni
040063c156 anv/sparse: move waiting/signaling syncobjs to the backends
Move waiting/signaling to the backends so we can fix each backend
separately.

As I write this patch the vm_bind backend is back to using synchronous
vm_binds so we can't pass syncobjs to the synchronous vm_bind ioctl
anymore. We'll need more discussions and possibly some rework before
we go back to asynchronous vm_binds. This commit should allow us to
fix the TR-TT backend in the next commit and leave vm_bind for later.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>
2023-11-17 17:58:28 +00:00
Paulo Zanoni
f6d28bec6d anv/sparse: add 'queue' to anv_sparse_submission
If we're going to move syncobj waiting/signaling down to the backend
we're going to need a queue to signal as lost in case those operations
fail.

In some places of the stack we don't have a queue available, such as
when we're creating or destroying resources. For those, for vm_bind
cases we don't use the queue for anything so passing it as NULL is
fine. For TR-TT we are already using device->trtt.queue.

For TR-TT specifically this also means we're going to start using the
actual queues from the call stack instead of trtt->queue, but that
shouldn't make any difference since we only ever have one queue.
Still, this is more technigally correct.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>
2023-11-17 17:58:28 +00:00
Paulo Zanoni
576275907a anv/sparse: pass anv_sparse_submission to the backend functions
Our ultimate goal is to have the backend functions deal with the wait
and signal syncobjs instead of waiting for them on the CPU inside
anv_queue_submit_sparse_bind_locked(). For that, we'll need waits and
signals parameters to be passed all the way to the backend functions
that actually make the submission, and this is what this patch does,
through struct anv_sparse_submission.

This patch just deals with passing the parameters to the functions,
nothing is using the new variables yet. There should be no functional
changes here. The goal here is to make code review easier.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>
2023-11-17 17:58:28 +00:00