7da5b1caef ("anv: move trtt submissions over to the anv_async_submit")
added a hard dependency on timeline semaphores, which are still optional.
And since it gates the sparseBinding feature, we should not use it if
sparseBinding is not enabled.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 7da5b1caef ("anv: move trtt submissions over to the anv_async_submit")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29779>
We no longer need to reserve registers for constructing spill/fill
messages. We have split sends and construct message headers in new
temporary registers with a very short lifespan which are simply added
to the existing interference graph as new nodes and allocated via the
normal mechanism.
This means that when we need to spill for the first time, we can avoid
discarding and recomputing the entire interference graph. We also avoid
needing to recreate all spill candidate information once ra_allocate()
fails, because the graph remains valid, and none of the existing nodes
had any changes to their interference. The existing spill candidates
remain valid.
This should slightly improve compile time when spilling is needed.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25811>
Instead of reserving a register to contain the spill header, which
gets marked live for the entire program, we can just emit the ALU
instructions to build it on the fly. (This is similar to the way
we handle scratch on Alchemist with the newer LSC data port.)
There are a couple of downsides that make this not obviously a win.
First, in order to construct the scratch header on Gfx9-12, we have
to use fields from g0, which will have to remain live anywhere that
scratch access is required. This could negate the register pressure
benefits of creating the header on the fly. However, g0 is often used
in other places anyway, so it may already be live. Second, constructing
the value takes a non-trivial number of ALU instructions.
Still, trading lower pressure (so fewer spills, less memory access
and stalls) for more cheap ALU seems like it ought to be a win.
There is another valuable benefit: by not reserving a register, we
eliminate the need to reconstruct the interference graph. (The next
patch will actually do so.)
shader-db on Icelake shows spills/fills at 54/53 helped, 4/10 hurt,
and an 8% increase in ALU on affected shaders. Synmark's OglCSDof
(a benchmark that spills) performance remains the same on Alderlake.
fossil-db on Icelake shows a 5.6%/5.1% reduction in spills/fills and a
4% reduction in scratch memory size on affected shaders. Instruction
counts go up by 11.07%, but cycle estimates only increase by 0.57%.
Assassin's Creed Odyssey and Wolfenstein Youngblood both see 20-30%
reductions in spills/fills, a significant improvement.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25811>
Makes sure that sample_functions is not modified while shaders are
running.
Fixes: 7ebf7f4 ("llvmpipe: Compile sample functioins on demand")
Reviewed-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29699>
sample_functions can be reallocated between get_sample_function and llvmpipe_clear_sample_functions_cache.
Fixes: 7ebf7f4 ("llvmpipe: Compile sample functioins on demand")
Reviewed-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29699>
Was missing from the original commit.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: f164819698 ("panvk: Advertise VK_EXT_shader_module_identifier")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29800>
NIR_DEBUG=validate_ssa_dominance failed because the dgc_cs_emit() calls
weren't actually inside the if.
Fixes: 33a849e004 ("radv: emit indirect sets for indirect compute pipelines with DGC")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29782>
Fix the returned fd by populating it directly from the handle, instead
of from the fds array, which is never populated.
Fixes: 7ae4a2ae34 ("u_gralloc/fallback: Extract modifier from QCOM native_handle")
Reviewed-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29785>
Here we move anything that expects the IR to have already been linked
so that in a future patch we can use glsl_to_nir() to convert IR that
has only been compiled.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29761>
When calculating the height for multi line uploads we should ensure that
we do not exceed max_dim rather than using at least max_dim.
The assert is also changed to ensure that we do not upload more than the
source size.
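
A minimal sketch of the clamping described above, not the driver's actual
code. The helper name `upload_rows`, its parameters, and the reading of
max_dim as a cap on rows per upload are assumptions for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of rows to upload in one multi-line chunk.
 * rows_left is what remains in the source, max_dim is the per-upload cap. */
static uint32_t
upload_rows(uint32_t rows_left, uint32_t max_dim)
{
   /* Clamp down to max_dim (a MIN) rather than up to at least max_dim (a MAX). */
   uint32_t rows = rows_left < max_dim ? rows_left : max_dim;

   /* Never upload more than the source holds. */
   assert(rows <= rows_left);
   return rows;
}
```

With a MAX in place of the MIN, a source with 3 rows left and a cap of 4
would request 4 rows, which is the over-read the assert guards against.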
Fixes: 22e44d54fd
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29784>
Steps for uprev:
- copy files from BLAKE3/c to src/util/blake3/
- edit README
- `for file in *.asm; do mv "$file" "${file%.asm}.masm"; done`
- keep
  - blake3.h (no relevant changes), only change BLAKE3_VERSION_STRING
  - blake3_sse2_x86-64_unix.S (no changes)
  - blake3_avx512_x86-64_unix.S (no changes)
  - blake3_sse41_x86-64_unix.S (no changes)
Acked-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29687>
We frequently create a new display, query some stuff, then throw it away.
Using different queue names for the different queries is a little more
expressive when debugging.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29787>
The intention of this block was to set one of the flags used to select
a PAT index, but it was doing more than that.
It was promoting WB+0 way coherency BOs to WC+1 way coherency, possibly
causing regressions on platforms without LLC.
anv_device_get_pat_entry() returns WC (write-combining) if no flag is
set, so we don't need this block after all.
Reported-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Fixes: a65e982b44 ("anv: Split ANV_BO_ALLOC_HOST_CACHED_COHERENT into two actual flags")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29769>
LDP may use a negative offset (which is valid). Because
`struct ir3_register` uses `{,u}int32_t` for the immediate
values, `extract_reg_uim()` wasn't sign-extending negative
immediate values.
Addresses:
```
src/freedreno/isa/encode.h:84:
pack_field: Assertion '!(( val & ~BITFIELD64_MASK(1 + high - low)) &&
(~val & ~BITFIELD64_MASK(1 + high - low)))' failed.
```
seen in https://gitlab.freedesktop.org/mesa/mesa/-/issues/11153 .
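
A small sketch of why the missing sign extension trips that assertion. It
is not the ir3 encoder code: the helper names and the 9-bit field width
used in the note below are assumptions, and only the shape of the check
is taken from the assertion quoted above:

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low `bits` bits set (assumes 1 <= bits <= 63). */
static uint64_t
low_mask(unsigned bits)
{
   return (UINT64_C(1) << bits) - 1;
}

/* Same idea as the quoted pack_field check: a value fits if the bits above
 * the field are all zero (unsigned) or all one (sign-extended). */
static bool
fits_in_field(uint64_t val, unsigned bits)
{
   uint64_t high = ~low_mask(bits);
   return !((val & high) && (~val & high));
}

/* Widening through the unsigned 32-bit view zero-extends the value. */
static uint64_t
widen_unsigned(uint32_t uim)
{
   return (uint64_t)uim;
}

/* Widening through the signed 32-bit view sign-extends the value. */
static uint64_t
widen_signed(int32_t iim)
{
   return (uint64_t)(int64_t)iim;
}
```

For an offset of -4 and a 9-bit field, the zero-extended value has a mix
of set and clear bits above the field and is rejected, while the
sign-extended value has all of them set and is accepted.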
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29768>
cmd->state.attachments was accessed out of bounds, which somehow caused
the tracepoint to be skipped instead of crashing.
drawcall_bandwidth_per_sample_sum was divided by 0 when there were no
draw calls in a renderpass.
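
A minimal sketch of the division guard, assuming the average is simply
reported as 0 for a renderpass without draw calls. The helper name and
that fallback value are illustrative, not taken from the driver:

```c
#include <stdint.h>

/* Hypothetical helper: average bandwidth per sample across draw calls.
 * Returns 0 when the renderpass recorded no draw calls, so the sum is
 * never divided by zero. */
static uint32_t
avg_bandwidth_per_sample(uint32_t bandwidth_per_sample_sum,
                         uint32_t drawcall_count)
{
   if (drawcall_count == 0)
      return 0;

   return bandwidth_per_sample_sum / drawcall_count;
}
```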
Fixes: 1aab0fc4f5 ("tu: Add attachments' UBWC info to renderpass tracepoint")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29752>
That's not enough to make
"dEQP-VK.api.device_init.create_instance_device_intentional_alloc_fail.basic"
happy, but it contributes to it.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29783>
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 2eaa437574 ("panvk: Use memory pools for internal GPU data attached to vulkan objects")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29783>
This fixes a failure in "dEQP-VK.api.buffer.basic.size_max_uint64".
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 822478ec20 ("panvk: Move the VkBuffer logic to its own source file")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29783>
Contributes to fixing "dEQP-VK.api.object_management.alloc_callback_fail.device".
We need a way to report errors in mempools.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29783>
When pan_blend_shader_key_table_create was failing, we weren't
destroying the mutex and panvk_pool.
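
A sketch of the unwind pattern, with boolean flags standing in for the
real mutex, pool and table. The struct, the function name and the
`table_ok` switch are hypothetical and exist only to make the failure
path observable:

```c
#include <stdbool.h>

/* Stand-in for the real object: flags record which resources are live. */
struct blend_cache {
   bool lock_live;
   bool pool_live;
   bool table_live;
};

/* Hypothetical init: `table_ok` simulates whether table creation succeeds. */
static bool
blend_cache_init(struct blend_cache *cache, bool table_ok)
{
   cache->lock_live = true;   /* stands in for the mutex init */
   cache->pool_live = true;   /* stands in for the pool init */
   cache->table_live = false;

   if (!table_ok)
      goto err_release;

   cache->table_live = true;
   return true;

err_release:
   /* Undo, in reverse order, everything set up before the failure. */
   cache->pool_live = false;
   cache->lock_live = false;
   return false;
}
```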
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29783>
Fix a crash when a null handle is passed.
(dEQP-VK.api.null_handle.destroy_command_pool)
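
A minimal sketch of the early return on a null handle, using a
hypothetical pool type rather than the panvk one. The `live_pools`
counter is an assumption added only so the behaviour can be checked:

```c
#include <stdlib.h>

/* Hypothetical pool object, not the driver's command pool. */
struct cmd_pool {
   int placeholder;
};

/* Number of pools currently allocated (illustration only). */
static int live_pools = 0;

static struct cmd_pool *
cmd_pool_create(void)
{
   struct cmd_pool *pool = calloc(1, sizeof(*pool));
   if (pool)
      live_pools++;
   return pool;
}

static void
cmd_pool_destroy(struct cmd_pool *pool)
{
   /* A null handle is a valid argument: there is nothing to destroy. */
   if (!pool)
      return;

   free(pool);
   live_pools--;
}
```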
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: afbac1af77 ("panvk: Move the VkCommandPool logic to panvk_cmd_pool.{c,h}")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29783>
Gert has committed using both the com and co.uk email addresses. They lead
to the same inbox, but let's make sure they get counted as one
contributor in the git history.
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29240>
Tomeu hasn't been working at Collabora for a while now, so let's invert
the mapping so that his private email address is more prominent.
Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29240>
CTS tests both layered and separate DPB, but radv wasn't handling
layered properly when used with the tier 2 dpb handling.
This adjusts the addresses to use the layer index for tier2.
Fixes dEQP-VK.video.decode.*layered*
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29758>