Xe2+ platforms don't use fast-type buffer for its new design.
We don't have to track different fast-clear types, so we just
return the highest level of support.
Fixes: Vulkan CTS
dEQP-VK.api.copy_and_blit.core.resolve_image.whole_array_image
_one_region.8_bit_not_all_remaining_layers
src/intel/vulkan/anv_private.h:5439: anv_image_get_fast_clear_type_addr:
Assertion `device->info->ver < 20' failed.
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29966>
Xe2+ doesn't use aux tracking buffers, and we should not
have access to the fast-clear type and compression state.
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29966>
Newer hardware is smart enough to know that if something writes to a
NULL tile and immediately reads back the value (from the cache), the
value should read back as zero, not whatever was written to the cache
but not the memory. Due to that, we don't need to flush the tile
cache, which is quite expensive.
Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11029
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29953>
Coverity has spotted a place where we could in theory overflow. In
reality it wont happen as the potential overflow is a bitfield with a
maximum of two values. Add an `assume()` statement to help out the
compiler and document our assumption.
fixes: dc1aedef2b
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29825>
One variant uses a protected scratch surface the other not.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29778>
Replacing the i915_perf_version that is i915 specific by a feature
mask makes easier to support Xe KMD.
Also this allow us to group a bool and a int into a single enum(int).
No changes in behavior is expected here.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29312>
GuC offers a mechanism for KMD/UMD to provide workload hints and one of
that strategy is low latency hint. We can utilize this hint when the
workload is more latency sensitive like compute usecases.
Signed-off-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28282>
Stop using the option in the generic pass
nir_lower_compute_system_values and use the same code as brw uses for
compute instead.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29828>
Function anv_physical_device_try_create() creates the devinfo variable
and then at some point it copies its contents to device->info:
device->info = devinfo;
Much much later we're calling:
intel_common_update_device_info(fd, &devinfo);
... which is updating devinfo but not device->info. As a consequence,
we're only creating one queue, as engine_class_supported_count[klass]
is zero for everybody.
Fixes: 5b8b4f7878 ("intel/dev: Add engine_class_supported_count to intel_device_info")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29927>
These were only applied in emit_apply_pipe_flushes but in theory could
be required for some other individually shot pipe controls.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29897>
This fixes corruption of push constants on Xe2 due to a mismatch in
the uniform layout implemented by the compiler and assumed by the
driver. To fix it we need to align the push constant ranges computed
by the Vulkan driver to a multiple of the GRF size of the platform.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29926>
Plumb the prog_data bits recently introduced for ALU-based
interpolation down to 3DSTATE_PS_EXTRA emission in the Vulkan driver.
Even though this is only going to be used on Xe2+ for now there seems
to be no reason not to plumb the bits on all platforms back to gfx11,
since the 3DSTATE_PS_EXTRA enables already existed on ICL.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29847>
This fixes some cacheline flush artifacts
Signed-off-by: Juston Li <justonli@google.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29882>
Next patch will need to frequently get the count of supported engine
for compute and copy engines, so to reduce the overhead of doing
KMD queries at every call here caching this information into
intel_device_info struct.
With that ANV and Iris would need to set this information as intel/dev
can't depend on intel/common, so here adding a single function
to update intel_device_info with all fields filled by intel/common
functions.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29899>
When I wrote sparse resources support for Anv we didn't have TileYs
support so I made non-opaque binds work even for non-standard block
shapes, which meant the block size could be either 64k or 4k. Since
then we merged TileYs support and changed our sparse resources
implementation to treat all the non-standard block shape cases as
"everything is the miptail", which means non-opaque binds are not
possible. So here we adjust the code to more explicitly represent
that.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>
There are 3 different places in our code where we calculate the tile
size and until recently the 3 implementations were different and with
slight bugs. Unify everything and also change the calculation to use
tile_info->phys_extent_B.
While doing this we move the isl_surf_get_tile_info() calls from
anv_sparse_calc_block_shape() to its callers so we total amount of
times we call it doesn't change.
v2: Adjust the patch now that tile_info is not part of isl_surf
anymore.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>
The code that tries to create a "pretend block shape" for linear
tiling surfaces was necessary back when we were going to support
sparse residency (non-opaque binds) for non-standard block shapes
(since there was uncertainty about TileYs support). That hasn't been
the case since before we merged sparse resources upstream, so remove
the code and leave an assertion instead, just in case.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>
Since commit 18d8c3ca33 we were allocating a little more than what
we were actually using (2621440 bytes instead of 2097152, aka 0x280000
instead of 0x200000), and we were not properly marking the BO as
internal. No applications should be misbehaving because of this.
Fixes: 18d8c3ca33 ("anv: Add missing ANV_BO_ALLOC_INTERNAL")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>
I've found myself adding this piece of code to our codebase when
debugging some Zink sparse failures recently, so let's upstream it.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>
This calculation was wrong for both compressed formats and
multi-sampled images. As a result, we misreported the image as having
a single miptail.
No Vulkan or GL CTS tests were tripping on this bug. I found this
while looking for tile size calculations after fixing a similar bug
elsewhere in the code.
The calculation should now match what we have in
anv_sparse_bind_image_memory(), which is widely tested.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>
We have to take the number of samples into account when calculating
the tile size. If we don't do this, multi-sampled images may end up
falling in the "goto out_everything_is_miptail" case, while in reality
multi-sampled images don't even have miptails.
Also assert that the value is one of the only two values we expect
this to be. This assert would have been useful to catch this issue,
since with multi-sampled images we were getting values like 16k or 32k
depending on the number of samples.
This helps move forward progress in some Zink tests, but does not
make them fully pass yet, as those tests are full of sub-cases and
this only helps some of them:
KHR-GL46.sparse_texture2_tests.UncommittedRegionsAccess
KHR-GL46.sparse_texture2_tests.SparseTexture2Commitment
KHR-GL46.sparse_texture2_tests.SparseTexture2Lookup
Fixes: 7ef3d652b2 ("anv/sparse: enable MSAA for Sparse when applicable")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>
The Vulkan spec splits sparse resources in two different features:
sparse binding and sparse residency.
Sparse binding is much simpler. It requires the resources to be fully
bound before being used and it treats them as a black box. We're
required to support sparse binding for all the formats that are
supported by non-sparse, but that's easy beacause this feature is
simpler.
Now sparse residency is the one where we're allowed to partially bind
resources, and the one that comes with more complicated features such
as block shapes and non-opaque binding of images. This feature is
subdivided into:
- sparseResidencyBuffer
- sparseResidencyImage2D
- sparseResidencyImage3D
- sparseResidency{2,4,8,16}Samples (which refers to 2D images)
Notice that there's no sparseResidencyImage1D. And if you read the
specs it's clear that sparse residency is meant for non-1D images.
Still, supporting it didn't require any extra effort in Anv so we just
did it.
That's until we started running GL CTS tests on Zink. There's a CTS
test that checks for the standard block shapes. It creates 1D images
and expects the block shapes for them to be the standard 2D block
shapes. While we could very well just patch
anv_sparse_calc_image_format_properties() to return the standard 2D
block shapes for 1D images, that's just wrong (block shapes for 1D
images are just line segments, not rectangles!) so let's just reject
this all until maybe one day Vulkan defines sparseResidencyImage1D and
we get GL_ARB_sparse_texture3 to match it, or somebody decides to
change the GL CTS test.
Testcase: KHR-GL46.sparse_texture2_tests.StandardPageSizesTestCase
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29337>
Some applications may benefit from this while some can get a performance
hit. Default to false and make it possible to toggle only for selected
workloads.
See workaround 14022483228 for some measurements.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29760>
This is a follow up to 059e82a4 ("anv: remove descriptor array bounds
checking"), that kind of bound checking is not required by the spec.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29663>
Makes Cyberpunk, Hitman and Total War Warhammer 3 run on LNL.
Fixes: c9e41f25a1 ("anv: Add heaps for Xe KMD in platforms without LLC")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29775>
In this assert we want to enforce that if a cached buffer is created
it is a cached+coherent as Xe KMD don't support cached+incoherent.
Did not caught this issue because it only reproduces in platforms with
GPU outside of LLC.
Fixes: 9d8d5cf8c9 ("anv: Remove block promoting non CPU mapped bos to coherent")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29826>
This field encodes bits [27:6] of the scratch surface state offset
according to the hardware spec, already on XeHP platforms. However,
on previous platforms we were passing bits [25:4] instead, which was
apparently okay for two reasons:
1/ We never used more than 8 MB of scratch surface states apparently.
2/ A shift right by 2 was implicitly happening while copying the
value of r0.5 into the address register holding the extended
descriptor, which with the ExBSO addressing mode disabled
considered bits [31:12] as the surface state index within the
pool.
However on Xe2 ExBSO addressing mode is always enabled for the UGM
shared function, so we have to add an extra SHR instruction to format
the extended descriptor regardless, and there is no point in
disobeying the hardware spec passing a left-shifted offset.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29543>
Compressed memory types are not CPU visible and Vulkan specification
don't have any requirement about that but some applications like
vkcube fails to run without a host visible option, so here appending
default_buffer_mem_types and compressed_mem_types.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28833>
Xe2 replaces auxiliary surface mapping by software to compress buffers,
instead it reserves part of the memory for the compression purpose.
To enable compression in Xe2 it is necessary bind memory with one of
the PAT indexes that has compression enabled.
It is still always returning false in anv_image_is_pat_compressible()
as it still needs more work before compression can be enabled but the
foundation for the compressed allocation is here.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28833>
We'll reduce pitch alignment in a following patch. However, CCS
fast-clears don't seem to work unless the pitch is 512B aligned.
Disable fast clears for unaligned pitches.
Prevents the next patch from failing the following piglit tests:
* fbo-attachments-blit-scaled-linear
* hiz-stencil-test-fbo-d24s8
* hiz
* polygon-mode-facing
* clearbuffer-mixed-format
* glsl-lod-bias (transient failure)
No failures have been observed in anv, but there are more restrictions
for fast-clears in that driver compared to iris.
Note:
* The -fbo flag is necessary to make these fail. Otherwise, they end up
with aligned render targets.
* Each of these tests allocate an image that has a pitch greater than
512B and they collectively cover all the misalignment options - 128B,
256B and 384B.
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29659>
Intel modifiers supporting compression are specified to be compatible
with the display engine, even if they won't actually be used for
scanout.
Attempting to capture a wider scope of modifiers resulted in test
errors. I chose to narrow the scope instead of digging into them.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29659>