Up to now preferred SLM size was being set to maximum preferred SLM
size for GFX 12.5 platforms and to workgroup SLM size for Xe2 but
neither of those values are the optimal.
The optimal value is:
<number of workgroups that can run per subslice> * <workgroup SLM size>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28910>
Xe2 has 2 requirements for preferred SLM size:
- this value needs to be >= then SLM size
- this value must be less than shared SLM/L1$ RAM in the sub-slice of platform
Also Xe2 don't have the special '0' encode that sets preferred SLM
allocation size to the maximum supported.
So here setting a value that is equal or larger than SLM size.
It was always setting SLM_ENCODES_128K for LNL A0 stepping probably
because of Wa_16018610683 but this restriction applies to all Xe2
platforms, also because of the first restriction mentioned here
this workaround is not being properly implemented, will fix that
in the next patch.
We should have a formula to calculate a preferred SLM allocation size
for gfx125 and Xe2 platfoms but until that this is enough to fix at
least the applications and tests below on LNL:
- GFXBench Aztec Ruins VK
- GravityMark VK
- Wildlife Extreme VK
- 5 crucible tests
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28910>
This functions were inlined in a header and duplicated between brw and
elk.
That would be enough reasons to move to a C file but next patches
will add more code to support Xe2 platforms, what would cause more
code to be inlined, duplicating even more code and increasing lib
size.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28910>
These 2 compute code paths were checking for
anv_cmd_buffer_is_render_queue() before calling
flush_pipeline_select_gpgpu() causing cmd_buffer->state.current_pipeline
to never to be set to GPGPU, trigerring
assert(cmd_buffer->state.current_pipeline == GPGPU) when running in
the compute engine.
So here just dropping the anv_cmd_buffer_is_render_queue() check.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28053>
A number of allocations during command buffer building are sourced
from the dynamic state heap. They're not actually access using an
offset in the dynamic state heap, it just happens to be a conveninent
place.
Use different helpers for thoses so we dynamically change the dynamic
state heap location in the next commits.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22151>
Take the opportunity to fix the flush of the descriptor buffer surface
when needed. Previously we would only flush it if the shader used one
of the push descriptor.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27504>
This adds a few new fields in the brw_cs_prog_data struct and then
uses them to fill in the relevant COMPUTE_WALKER fields.
Although the Tile Layout field theoretically has different settings for
32/64/128bpe, it appears that the recommended programming is to always
pick either TileY 32bpe or Linear. It's not very practical to look at
the surface formats involved, anyway.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27167>
We're missing the ISA code in renderdoc. You can reproduce with the
Sascha Willems graphics pipeline demo.
The change is large here because we have to fix a confusion between
anv_shader_bin & anv_pipeline_executable. anv_pipeline_executable is
there as a representation for the user and multiple
anv_pipeline_executable can point to a single anv_shader_bin.
In this change we split the anv_shader_bin related logic that was
added in anv_pipeline_add_executable*() and move it to a new
anv_pipeline_account_shader() function.
When importing RT libraries, we add all the anv_pipeline_executable
from the libraries.
When importing Gfx libraries, we add the anv_pipeline_executable only
if not doing link time optimization.
anv_shader_bin related properties are added whenever we're importing a
shader from a library, compiling or finding in the cache.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3d49cdb71e ("anv: implement VK_EXT_graphics_pipeline_library")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26594>