One variant uses a protected scratch surface the other not.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29778>
Plumb the prog_data bits recently introduced for ALU-based
interpolation down to 3DSTATE_PS_EXTRA emission in the Vulkan driver.
Even though this is only going to be used on Xe2+ for now there seems
to be no reason not to plumb the bits on all platforms back to gfx11,
since the 3DSTATE_PS_EXTRA enables already existed on ICL.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29847>
This field encodes bits [27:6] of the scratch surface state offset
according to the hardware spec, already on XeHP platforms. However,
on previous platforms we were passing bits [25:4] instead, which was
apparently okay for two reasons:
1/ We never used more than 8 MB of scratch surface states apparently.
2/ A shift right by 2 was implicitly happening while copying the
value of r0.5 into the address register holding the extended
descriptor, which with the ExBSO addressing mode disabled
considered bits [31:12] as the surface state index within the
pool.
However on Xe2 ExBSO addressing mode is always enabled for the UGM
shared function, so we have to add an extra SHR instruction to format
the extended descriptor regardless, and there is no point in
disobeying the hardware spec passing a left-shifted offset.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29543>
See also HSDES#14015504893 regarding the region-based tessellation
redistribution feature which allows fine-tuning the number of regions
per patch. This sets it to the recommended value, since region-based
redistribution is enabled by default.
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29562>
Up to now preferred SLM size was being set to maximum preferred SLM
size for GFX 12.5 platforms and to workgroup SLM size for Xe2 but
neither of those values are the optimal.
The optimal value is:
<number of workgroups that can run per subslice> * <workgroup SLM size>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28910>
Xe2 has 2 requirements for preferred SLM size:
- this value needs to be >= then SLM size
- this value must be less than shared SLM/L1$ RAM in the sub-slice of platform
Also Xe2 don't have the special '0' encode that sets preferred SLM
allocation size to the maximum supported.
So here setting a value that is equal or larger than SLM size.
It was always setting SLM_ENCODES_128K for LNL A0 stepping probably
because of Wa_16018610683 but this restriction applies to all Xe2
platforms, also because of the first restriction mentioned here
this workaround is not being properly implemented, will fix that
in the next patch.
We should have a formula to calculate a preferred SLM allocation size
for gfx125 and Xe2 platfoms but until that this is enough to fix at
least the applications and tests below on LNL:
- GFXBench Aztec Ruins VK
- GravityMark VK
- Wildlife Extreme VK
- 5 crucible tests
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28910>
This functions were inlined in a header and duplicated between brw and
elk.
That would be enough reasons to move to a C file but next patches
will add more code to support Xe2 platforms, what would cause more
code to be inlined, duplicating even more code and increasing lib
size.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28910>
Workaround describes that we need to set instance level distribution
granularity when primitive id is used by the draw.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27955>
When moving the static part, I missed that the
pipeline->primitive_id_override field isn't set yet when we check it
to emit 3DSTATE_TE.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 1e081bd680 ("anv: split 3DSTATE_TE packing between static & dynamic parts")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27692>
When URB state for DS changes, we need to emit URB setup for VS with
256 handles and 0 for rest, commit this using a HDC flush before
setting real values.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26920>
Patch adds a structure holding urb configuration. This makes it nicer
to pass it around as example for blorp. We need to be able to sometimes
compare with last urb configuration to be able to implement some
workaround.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26920>
Before this, the render pass code or the driver combined the pipeline
create flags and the implicit flags from the render pass, but the
pipeline create flags will need to be sanitized when they are dynamic
state, so we need to do it in vk_graphics_state where we know that
information.
We also weren't combining pipeline flags correctly when linking, which
on turnip was being hidden by the lack of sanitizing for driver-provided
flags. We can't combine them correctly if they're part of the render
pass state, so they need to be pulled out into the overall pipeline
state.
For drivers using emulated renderpasses or tracking feedback loop
information themselves, this won't make a difference, but we have to
adapt turnip to not pass pipeline flags. This also means that we can
drop all handling of feedback_loop_input_only in turnip and just set it
in the runtime.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25436>
Switch to using VkPipelineCreateFlags2KHR, and use the new common helper
to get the right flags.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25436>
While we don't need to emit all of the unused mesh/task states when mesh
is disabled, if we don't have them we fail some assertions in the
difference checks due to the corresponding state being empty.
This may happen when going from a mesh pipeline to a non-mesh one, or
one that uses task shaders to one that doesn't.
It may be possible to avoid having to do this, but I'd rather start from
a working state and optimize it later.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25109>
3DSTATE_VFG was moved into a section that only gets emitted for legacy
pipelines, not mesh pipelines.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0ce772bd19 ("anv: split 3DSTATE_VFG emission")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25109>
The goal of this change it to move away from a single batch buffer
containing all kind of pipeline instructions to a list of instructions
we can emit separately.
We will later implement pipeline diffing and finer state tracking that
will allow fewer instructions to be emitted.
This changes the following things :
* instead of having a batch & partially packed instructions, move
everything into the batch
* add a set of pointer in the batch that allows us to point to each
instruction (almost... we group some like URB instructions,
etc...).
At pipeline emission time, we just go through all of those pointers
and emit the instruction into the batch. No additional packing is
involved.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
Leave the static part in genX_pipeline.c and only repack the dynamic
part in genX_gfx_state.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
We can reduce the amount of packing we do by only packing the dynamic
part.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
This bit is set in the dynamic state emission. This is currently not
breaking anything because LEADING=0.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 71ebd9b9d7 ("anv,hasvk: respect provoking vertex setting on geometry shaders")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24536>
The initialization can be simplified and the real programming moved
over to genX_pipeline.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24632>
Following 71ebd9b9d7, 3DSTATE_GS can be emitted as part of the
pipeline batch and as a dynamic state. Just do the latter.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 71ebd9b9d7 ("anv,hasvk: respect provoking vertex setting on geometry shaders")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24632>
Mismatch allocator could cause bad things, so better set the allocator
on anv_reloc_list_init() and use it in every reloc function.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24411>