In iris, this should avoid some partial resolves when copying between
images. In anv, this will reduce restrictions on dmabufs which have
clear color support in the next patch.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31136>
blorp_copy_get_formats() tries to make the source and destination view
formats match as much as possible. This avoids some casting in the copy
shader, but it makes determining the format that will be used for a
surface impossible without having the ISL surface for both that surface
and a source or destination.
We'd like to enable the Vulkan driver to know as early as possible what
format an image may be reinterpreted as for correctness. So, determine
the copy formats more independently and expose a helper which does so
for drivers.
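As a rough illustration of choosing a copy format from one surface alone, the sketch below picks an integer format purely from the bits-per-block of a single surface, without looking at the other side of the copy. The enum and function names are hypothetical and greatly simplified; this is not the helper added by this patch.

```c
#include <stdint.h>

/* Hypothetical enum for illustration; these are not the ISL format names. */
enum copy_format {
   COPY_FMT_UNSUPPORTED = 0,
   COPY_FMT_R8_UINT,
   COPY_FMT_R8G8_UINT,
   COPY_FMT_R8G8B8A8_UINT,
   COPY_FMT_R16G16B16A16_UINT,
   COPY_FMT_R32G32B32_UINT,
   COPY_FMT_R32G32B32A32_UINT,
};

/* Pick a copy format using only the size of one surface's blocks, so the
 * answer is known before the other surface of the copy is known.
 */
static enum copy_format
copy_format_for_bpb(uint32_t bits_per_block)
{
   switch (bits_per_block) {
   case 8:   return COPY_FMT_R8_UINT;
   case 16:  return COPY_FMT_R8G8_UINT;
   case 32:  return COPY_FMT_R8G8B8A8_UINT;
   case 64:  return COPY_FMT_R16G16B16A16_UINT;
   case 96:  return COPY_FMT_R32G32B32_UINT;
   case 128: return COPY_FMT_R32G32B32A32_UINT;
   default:  return COPY_FMT_UNSUPPORTED;
   }
}
```

Because the result depends on a single input, a driver could in principle call such a function at image creation time.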
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31136>
blorp_copy() will sometimes use a complex shader if the source and
destination surface formats differ. For example, it will do this when
both formats support CCS_E, but have differing numbers of
bits-per-channel.
To reduce the chance of using this complex shader during transfers
between images and buffers, ensure the same format is used. We can't
completely prevent the complex shader because a copy may happen between
surface formats that have a different number of bits-per-pixel.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31136>
Convert MI_LOAD_REGISTER_MEM and MI_LOAD_REGISTER_IMM to use
mi_builder in CmdBeginTransformFeedbackEXT.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31502>
This avoids sprinkling those all over the code base. Debug breakpoints
are put in there too.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Backport-to: 24.2
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31481>
A few EU validation tests had to be updated to account for larger GRF,
extra supported types for 3-src instructions and the lack of AccWrEnable
in Xe2.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31299>
Oddly, the current tests size the linear and tiled buffers the same.
They don't need to be. Compute the maximum area we want to check
(aligned to the logical tile size), allocate the linear buffer using
that aligned size, and allocate the tiled buffer using a size aligned
to the physical tile size.
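The sizing described above could be sketched as follows. The struct layouts, field names and function names are assumptions made for illustration only and do not come from the test code.

```c
#include <stdint.h>

/* Hypothetical tile description: logical size in elements, physical size
 * in bytes.
 */
struct tile_info {
   uint32_t logical_w_el;
   uint32_t logical_h_el;
   uint32_t phys_size_B;
};

struct test_sizes {
   uint32_t width_el;
   uint32_t height_el;
   uint64_t linear_size_B;
   uint64_t tiled_size_B;
};

/* Round v up to the next multiple of a (a must be non-zero). */
static uint64_t
align_up(uint64_t v, uint64_t a)
{
   return (v + a - 1) / a * a;
}

static struct test_sizes
compute_test_sizes(struct tile_info t, uint32_t max_w_el, uint32_t max_h_el,
                   uint32_t bytes_per_el)
{
   struct test_sizes s;

   /* Area to check, aligned to the logical tile size. */
   s.width_el = (uint32_t)align_up(max_w_el, t.logical_w_el);
   s.height_el = (uint32_t)align_up(max_h_el, t.logical_h_el);

   /* The linear buffer only needs to cover the aligned area. */
   s.linear_size_B = (uint64_t)s.width_el * s.height_el * bytes_per_el;

   /* The tiled buffer is sized separately, aligned to the physical tile. */
   s.tiled_size_B = align_up(s.linear_size_B, t.phys_size_B);

   return s;
}
```

The two sizes are computed independently, so they may or may not end up equal for a given tiling and element size.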
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31469>
The workaround is already implemented by
batch_emit_pipe_control_write(), so we don't need to do it here as well.
This was spotted by Lionel Landwerlin. The credits go to him, I just
wrote the patch.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31412>
Commit a603cc0633 ("anv: move some pc was to
batch_emit_pipe_control_write") moved some WAs from
emit_apply_pipe_flushes() to batch_emit_pipe_control_write(), but it
turns out one of them was already there since cf7e1f3817 ("anv,
iris: add missing CS_STALL bit for GPGPU texture invalidation").
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31412>
This is also encouraged by another workaround, Wa_14018813551.
Both workarounds state that StackIDControlOverride_RTGlobals should
always be set to 0 (i.e. 2k).
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30937>
Take advantage of 3 spare JSL devices in the Collabora lab to balance
the load of these jobs:
job name         avg duration (min)
---              ---
anv-jsl          15
anv-jsl-angle    20
iris-jsl-deqp    18
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31414>
We can't always negate the ALU instruction's cmod, because negating it
can produce different results when an argument is a NaN float. We can
still do that if the condition is == or !=.
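The NaN behaviour can be checked with plain C floats. The sketch below uses hypothetical names (they are not compiler identifiers) and assumes IEEE 754 comparison semantics, i.e. no -ffast-math. It tests whether evaluating the negated condition gives the same answer as logically inverting the original result.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical condition codes for illustration only. */
enum cmp { CMP_EQ, CMP_NE, CMP_LT, CMP_GE };

static bool
eval_cmp(enum cmp c, float a, float b)
{
   switch (c) {
   case CMP_EQ: return a == b;
   case CMP_NE: return a != b;
   case CMP_LT: return a < b;
   case CMP_GE: return a >= b;
   }
   return false;
}

static enum cmp
negate_cmp(enum cmp c)
{
   switch (c) {
   case CMP_EQ: return CMP_NE;
   case CMP_NE: return CMP_EQ;
   case CMP_LT: return CMP_GE;
   case CMP_GE: return CMP_LT;
   }
   return c;
}

/* True when the negated condition equals the logical NOT of the original. */
static bool
negation_is_exact(enum cmp c, float a, float b)
{
   return eval_cmp(negate_cmp(c), a, b) == !eval_cmp(c, a, b);
}
```

With a NaN operand, both `a < b` and `a >= b` are false, so negating `<` into `>=` is not equivalent to inverting the result. `==` and `!=` remain exact complements even with NaN.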
Fixes: 0ba9497e ("intel/fs: Improve discard_if code generation")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11800
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31042>
Per-primitive inputs have their own separate section in the FS thread
payload and are not considered when setting the mask in
3DSTATE_SBE's ConstantInterpolationEnable.
This is also consistent with what is done for brw_interp_reg().
Fixes:
- dEQP-VK.mesh_shader.ext.misc.clip_geom_provoking_last
- dEQP-VK.mesh_shader.ext.misc.clip_geom_and_task_shader_provoking_last
Backport-to: 24.2
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11844
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31417>
When allocating a buffer normally, this flag gets to the allocator from
the memory requirements, but when sparse bindings are created we were
checking for it but never setting it.
Fixes sparse descriptor buffers on Xe2.
Makes the failure on TRTT more obvious.
Fixes: c6a91f1695 ("anv: add new heap/pool for descriptor buffers")
Fixes: 692e1ab2c1 ("anv: get rid of the second dynamic state heap")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31372>
Valgrind doesn't seem to know that drmSyncobjQuery() writes to the
variable that we pass as 'last_value'. This gets rid of:
==6275== Conditional jump or move depends on uninitialised value(s)
==6275== at 0x5308370: anv_sparse_trtt_garbage_collect_batches (anv_sparse.c:540)
==6275== by 0x53091E2: anv_sparse_bind_trtt (anv_sparse.c:825)
==6275== by 0x5309771: anv_sparse_bind (anv_sparse.c:953)
==6275== by 0x5309A3B: anv_free_sparse_bindings (anv_sparse.c:1041)
==6275== by 0x529FF21: anv_DestroyBuffer (anv_buffer.c:248)
==6275== by 0x932ADBD: ??? (in /usr/lib/x86_64-linux-gnu/libVkLayer_khronos_validation.so)
==6275== by 0x127AA2: MyVkBuffer::~MyVkBuffer() (sparse.cpp:364)
==6275== by 0x12B2D4: MyApp::test1_trivial_sparse() (sparse.cpp:1421)
==6275== by 0x13E01A: MyApp::run_test(int) (sparse.cpp:6594)
==6275== by 0x13E3B0: main (sparse.cpp:6656)
==6275== Uninitialised value was created by a stack allocation
==6275== at 0x53082D3: anv_sparse_trtt_garbage_collect_batches (anv_sparse.c:525)
An alternative to these Valgrind macros would simply have been to
zero-initialize last_value.
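The two options can be sketched side by side. This is a standalone illustration, not the driver code: fake_query() is a hypothetical stand-in for a call that fills its output through the kernel, and MARK_DEFINED is a local wrapper that falls back to a no-op when the Valgrind headers are not installed.

```c
#include <stdint.h>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define MARK_DEFINED(ptr, len) VALGRIND_MAKE_MEM_DEFINED((ptr), (len))
#  endif
#endif
#ifndef MARK_DEFINED
#  define MARK_DEFINED(ptr, len) ((void)0)
#endif

/* Hypothetical stand-in for a query whose write Valgrind cannot see. */
static int
fake_query(uint64_t *out)
{
   *out = 42;
   return 0;
}

/* Option 1: tell Valgrind the memory is defined after the call. */
static int
query_marked(uint64_t *value)
{
   uint64_t last_value;
   int ret = fake_query(&last_value);
   if (ret != 0)
      return ret;
   MARK_DEFINED(&last_value, sizeof(last_value));
   *value = last_value;
   return 0;
}

/* Option 2: zero-initialize so the variable is never undefined. */
static int
query_zero_init(uint64_t *value)
{
   uint64_t last_value = 0;
   int ret = fake_query(&last_value);
   if (ret != 0)
      return ret;
   *value = last_value;
   return 0;
}
```

The macro approach has no effect outside Valgrind, while zero-initialization is simpler but could hide a genuinely missing write.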
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31332>
I just noticed that my custom sparse program was not working correctly
when I used ANV_QUEUE_OVERRIDE (instead of enabling the compute queue
by default or using INTEL_ENGINE_CLASS_COMPUTE, which was removed by
commit 600d88ab3c ("intel: Remove INTEL_ENGINE_CLASS_COMPUTE and
INTEL_ENGINE_CLASS_COPY parameters")).
It turns out we were not setting the same engine class type when using
ANV_QUEUE_OVERRIDE vs the other cases. Move the code around so the
behavior can stay the same.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31332>
This avoids massively long shader compile times when there is lots of
spilling, at a minor cost of a few more spills/fills. Choose 15 as it is
already the default used by the Cyberpunk 2077 driconf workaround.
Surprisingly, the number of additional spills/fills is minuscule in
fossil-db:
Instructions in all programs: 152680595 -> 152681525 (+0.0%)
SENDs in all programs: 7672789 -> 7672789 (+0.0%)
Loops in all programs: 48469 -> 48469 (+0.0%)
Cycles in all programs: 11981743456 -> 11984228708 (+0.0%)
Spills in all programs: 42989 -> 42779 (-0.5%)
Fills in all programs: 76380 -> 76776 (+0.5%)
partly because of the chaotic unpredictability that the choice of
register to spill has on a shader. For example, this patch massively
helps some shaders in terms of spills/fills:
Spills helped fossils/fossil-db/steam-native/red_dead_redemption2.vk-g6.foz/4101ff9c9b83bf22/SIMD8 fragment: 3208 -> 2894 (-9.8%)
Fills helped fossils/fossil-db/steam-native/red_dead_redemption2.vk-g6.foz/4101ff9c9b83bf22/SIMD8 fragment: 7258 -> 6795 (-6.4%)
Spills helped fossils/q2rtx/q2rtx-rt-pipeline.976f4ab1c0fee975.1.foz/c496e8a549f6b4bf/compute: 109 -> 92 (-15.6%)
Related: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31133
Related: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9241
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11709
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11844
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31269>
Having the actual generated assembly is helpful when trying to figure
out if the code emission and disassembly are implemented correctly.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31305>
Tests like the following assert on LNL:
dEQP-VK.pipeline.monolithic.sampler.border_swizzle.r8_srgb.gbar.custom.gather_1.no_swizzle_hint
dEQP-VK.api.image_clearing.core.clear_color_image.2d.optimal.single_layer.e5b9g9r9_ufloat_pack32
This is because blorp tries, for example, to set up a render target with
L8_UNORM_SRGB (which is mapped to the R8_UNORM_SRGB of Vulkan), but that
format is not supported for rendering.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 1c7fe9ad1b ("anv: Support fast clears in anv_CmdClearColorImage")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31357>
During compute state save/restore, let's track all the descriptor sets.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30798>
When tests were added, there was a single pipe (float), so there wasn't
a pipe to compare in `operator==`. Add it there now and adjust
expectations accordingly.
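The tests themselves use a C++ `operator==`; the same idea is shown here as a plain C comparison function. The struct and field names are hypothetical. The point is only that an equality check which skips the pipe field would treat dependencies on different pipes as equal.

```c
#include <stdbool.h>

/* Hypothetical types for illustration; not the compiler's definitions. */
enum dep_pipe { DEP_PIPE_NONE, DEP_PIPE_FLOAT, DEP_PIPE_INT, DEP_PIPE_LONG };

struct dep {
   unsigned regdist;
   enum dep_pipe pipe;
   unsigned sbid;
   unsigned mode;
};

/* Compare every field, including the pipe. */
static bool
dep_equal(const struct dep *a, const struct dep *b)
{
   return a->regdist == b->regdist &&
          a->pipe == b->pipe &&
          a->sbid == b->sbid &&
          a->mode == b->mode;
}
```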
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31335>