Pretty big change... Sorry for that.
I can't exactly remember why I created 2 heaps. I think it's because I
mistakenly thought the samplers in the binding sampler pointers needed
to be indexed from the binding table. But that's not the case, they
just need to be in the dynamic state heap.
In the future, this change will allow to also allocate buffers for
push constant data in the newly created dynamic_visible_pool which
will be useful on < Gfx12.0 where this is the only place push constant
data can live for compute shaders.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30047>
Commit 94989b45a5 ("anv,driconf: Add fake non device local memory WA
for Total War: Warhammer 3") implemented a workaround to make
Warhammer 3 work on ADL, but the game still doesn't work on LNL, which
uses xe.ko, and MTL, which uses i915.ko: it still fails at launch
claiming it couldn't allocate memory.
So in this implementation, instead of clearing DEVICE_LOCAL_BIT we
just duplicate our memory types, one having the bit and one not
having.
v2:
- Check for VK_MAX_MEMORY_TYPES (José)
- Invert the order of the memory types (José)
- Fix white space issue (José)
v3:
- Comment our non-spec-compliance (José)
- Remove useless lines (José)
Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8721
Fixes: 94989b45a5 ("anv,driconf: Add fake non device local memory WA for Total War: Warhammer 3")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30162>
Replacing the i915_perf_version that is i915 specific by a feature
mask makes easier to support Xe KMD.
Also this allow us to group a bool and a int into a single enum(int).
No changes in behavior is expected here.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29312>
GuC offers a mechanism for KMD/UMD to provide workload hints and one of
that strategy is low latency hint. We can utilize this hint when the
workload is more latency sensitive like compute usecases.
Signed-off-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28282>
Function anv_physical_device_try_create() creates the devinfo variable
and then at some point it copies its contents to device->info:
device->info = devinfo;
Much much later we're calling:
intel_common_update_device_info(fd, &devinfo);
... which is updating devinfo but not device->info. As a consequence,
we're only creating one queue, as engine_class_supported_count[klass]
is zero for everybody.
Fixes: 5b8b4f7878 ("intel/dev: Add engine_class_supported_count to intel_device_info")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29927>
Next patch will need to frequently get the count of supported engine
for compute and copy engines, so to reduce the overhead of doing
KMD queries at every call here caching this information into
intel_device_info struct.
With that ANV and Iris would need to set this information as intel/dev
can't depend on intel/common, so here adding a single function
to update intel_device_info with all fields filled by intel/common
functions.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29899>
Some applications may benefit from this while some can get a performance
hit. Default to false and make it possible to toggle only for selected
workloads.
See workaround 14022483228 for some measurements.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29760>
Xe2 replaces auxiliary surface mapping by software to compress buffers,
instead it reserves part of the memory for the compression purpose.
To enable compression in Xe2 it is necessary bind memory with one of
the PAT indexes that has compression enabled.
It is still always returning false in anv_image_is_pat_compressible()
as it still needs more work before compression can be enabled but the
foundation for the compressed allocation is here.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28833>
As each anv_device has its own address space it was necessary create
one dummy_aux_bo per anv_device.
Also this workaround requires us to disable the
buffer_length_in_aux_addr optimization, that is done in the physical
device creating because isl_dev of physical device is copied
to isl_dev in anv_device.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29619>
7da5b1caef ("anv: move trtt submissions over to the anv_async_submit")
added a hard dependency on timeline semaphore which is still optional.
And since it gates the sparseBinding feature, we should not use it if
sparseBinding is not enabled.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 7da5b1caef ("anv: move trtt submissions over to the anv_async_submit")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29779>
With this change, the engine initialization batches are build and
submitted at vkCreateDevice() but the function doesn't wait for them
to complete. Instead we wait at vkDestroyDevice() or whenever another
submission happens on the queue, we check whether the initialization
batch has completed (without waiting) and free it if completed.
Seems to be about 25% reduction time of vkCreateDevice()
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28975>
The array pool does a single allocation and then splits it out. The
downside is that the pool is not lockless, but for border colors it
likely doesn't matter much as there is a max border colors for 4k.
Seems to be a 30% time reduction for vkCreateDevice()
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28975>
We can remove a bunch of TRTT specific code from the backends as well
as manual submission tracking.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28975>
Drop usage of pthread mutex so initialization never fails.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28975>
This is a performance tuning value, recommended value is 512 on DG2.
On DG2 this was in the privileged register RT_CTRL.
Minor CFE_STATE defintion fixes from Jose.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29616>
Our implementation is a no-op for the following reasons :
- ISL always tries to go for the smallest tiling mode (see
isl_surf_choose_tiling())
- In the few cases where we need to use Tile64 for compression
workarounds, VK_MESA_image_alignment_control doesn't require use
to disable compression
- vkd3d-proton has the ability to disable compression using
VK_EXT_image_compression_control, disabling Tile64 requirements
and ensuring ISL can select a 4k tiling mode
So vkd3d-proton should always be able to get a 4k tiling mode if it
wants to.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29175>
WA 14018283232 indicates that we need to emit the resource barrier
when the following expression toggles value :
STATE_DEPTH_BOUNDS::depthboundstestenable & 3DSTATE_PS_EXTRA:: Pixel Shader Kills Pixel & 3DSTATE_PS_EXTRA:: Pixel Shader Valid
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29297>
so that the queue count override logic can catch Android system
properties.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29492>
Xe2 platforms allows for a larger compute shared memory(SLM).
For LNL this limit is 160KB but due to a workaround the limit is 128K.
BSpec: 71053
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28910>
So that as soon as pipelines are freed, they're removed from the
cache.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11185
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Tested-by: Brian Paul <brian.paul@broadcom.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29283>
The newer platforms can't support 8x and 16x since Tile64's shape for
them is not a standard block shape (and claiming standard block shapes
is higher priority than supporting things without it). The TileYs
platforms are fine.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27306>
Toggle on dithering if it has been enabled on device and is set on the
rendering flags used.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19414>
The debug identifier is put into the captured buffers for error
capture. This helps us figure out what version of the driver people
are running when encountering a GPU hang. This identifier has the
git-sha1 + driver name.
libintel_dev is also a dependency of the compiler so any change to the
git-sha1 also triggers recompile which we want to avoid.
This changes moves the debug identifier to src/intel/common which
drivers already depend on, so the compiler is not affected anymore.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11136
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29128>
The runtime grew support for VkPhysicalDevicePresentationPropertiesANDROID,
so we can use that now and get rid of anv_GetPhysicalDeviceProperties2.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27717>