Now that I got a hands on access to this GPU and could run deqp-vk, it uses
blob v676.0 and the values are different from v744.19. Not only they
are different, with the values from v744 there are CTS test faulures.
Fixes at least ASTC tests.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29786>
this is now at parity with what we did. this gets rid of all divergence between
us and common vk enqueue.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28682>
it worked out because the offset was 0 but it was semantically wrong and rather
confusing
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28682>
implementation from lavapipe. there's no CTS coverage for this but hopefully
it works...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28682>
We need some manual logic to work out the size of pData, so we handroll this
one. This fixes push DUT with emulated secondaries.
Affects dEQP-VK.binding_model.shader_access.secondary_cmd_buf.*push*templ* if
emulated secondaries are used.
Neither panvk nor dozen support push DUT yet, so this isn't hurting anyone and
doesn't need to be cc'd stable. But hopefully panvk & dozen get on that :}
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28682>
We need to extend the lifetime of DUTs for capture/replay based secondaries.
Copy the reference counting boilerplate from vk_descriptor_set_layout.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Suggested-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28682>
Fix "dEQP-VK.draw.renderpass.offscreen_viewport.x_off_screen_negative_y*" tests.
Also did a bit of clean up around emit_viewport.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29802>
Most devices with a740 have blob v6xx which doesn't have TP_UBWC_FLAG_HINT
set. Match them for better compatibility by default.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29754>
On a740 TPL1_DBG_ECO_CNTL1.TP_UBWC_FLAG_HINT must be the same between
all drivers in the system, somehow having different values affects
BLIT_OP_SCALE. We cannot automatically match blob's value, so the
best thing we could do is a toggle.
Example:
FD_DEV_FEATURES=enable_tp_ubwc_flag_hint=0
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29754>
si_get_shader_prefetch_size is called each time a shader is changed.
Since the size of a given variant never changes, we can compute the
value once and store the result.
This has to be done in 2 places:
* si_create_shader_variant for all types of shaders
* si_create_compute_state_async for compute shader, when a shader
is loaded from the cache.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29304>
LLVM implements multiple workarounds for gfx11.
The problem is that they're not applied for shaders built in
parts.
LLVM will be modified to be more conservative and apply the
workaround in more places but in the meantime, add a simpler
implementation in the NIR to LLVM backend: insert a wait at
the end of each shader part.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10785
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29304>
7da5b1caef ("anv: move trtt submissions over to the anv_async_submit")
added a hard dependency on timeline semaphore which is still optional.
And since it gates the sparseBinding feature, we should not use it if
sparseBinding is not enabled.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 7da5b1caef ("anv: move trtt submissions over to the anv_async_submit")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29779>
We no longer need to reserve registers for constructing spill/fill
messages. We have split sends and construct message headers in new
temporary registers with a very short lifespan which are simply added
to the existing interference graph as new nodes and allocated via the
normal mechanism.
This means that when we need to spill for the first time, we can avoid
discarding and recomputing the entire interference graph. We also avoid
needing to recreate all spill candidate information once ra_allocate()
fails, because the graph remains valid, and none of the existing nodes
had any changes to their interference. The existing spill candidates
remain valid.
This will slightly help improve compile time when needing to spill.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25811>
Instead of reserving a register to contain the spill header, which
gets marked live for the entire program, we can just emit the ALU
instructions to build it on the fly. (This is similar to the way
we handle scratch on Alchemist with the newer LSC data port.)
There are a couple of downsides that make this not obviously a win.
First, in order to construct the scratch header on Gfx9-12, we have
to use fields from g0, which will have to remain live anywhere that
scratch access is required. This could negate the register pressure
benefits of creating the header on the fly. However, g0 is oft used
in other places anyway, so it may already be there. Another is that
it's a non-trivial number of ALU instructions to construct the value.
Still, trading lower pressure (so fewer spills, less memory access
and stalls) for more cheap ALU seems like it ought to be a win.
There is another valuable benefit: by not reserving a register, we
eliminate the need to reconstruct the interference graph. (The next
patch will actually do so.)
shader-db on Icelake shows spills/fills at 54/53 helped, 4/10 hurt,
and an 8% increase in ALU on affected shaders. Synmark's OglCSDof
(a benchmark that spills) performance remains the same on Alderlake.
fossil-db on Icelake shows a 5.6%/5.1% reduction in spills/fills and a
4% reduction in scratch memory size on affected shaders. Instruction
counts go up by 11.07%, but cycle estimates only increase by 0.57%.
Assassin's Creed Odyssey and Wolfenstein Youngblood both see 20-30%
reductions in spills/fills, a significant improvement.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25811>
Makes sure that sample_functions is not modified while shaders are
running.
Fixes: 7ebf7f4 ("llvmpipe: Compile sample functioins on demand")
Reviewed-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29699>
sample_functions can be reallocated between get_sample_function and llvmpipe_clear_sample_functions_cache.
Fixes: 7ebf7f4 ("llvmpipe: Compile sample functioins on demand")
Reviewed-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29699>
Was missing from the original commit.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: f164819698 ("panvk: Advertise VK_EXT_shader_module_identifier")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29800>
NIR_DEBUG=validate_ssa_dominance failed because dgc_cs_emit() weren't
actually in the if.
Fixes: 33a849e004 ("radv: emit indirect sets for indirect compute pipelines with DGC")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29782>
Fix returned fd by populating directly from the handle, instead of
from the fds array which is never populated.
Fixes: 7ae4a2ae34 ("u_gralloc/fallback: Extract modifier from QCOM native_handle")
Reviewed-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29785>
Here we move anything that expects the IR to have already been linked
so that in a future patch we can use glsl_to_nir() to convert IR that
has only been compiled.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29761>