Our backend does a somewhat unusual sequence:
1. Set up the interference graph
2. Try to register allocate
3. Fail and realize we have to spill
4. Recreate(!) the interference graph with different node counts,
because unfortunately spills and fills may need temporary registers
set aside for that purpose, which can no longer be used generally.
5. Ask for the best spill node because we know we must spill
On step 4, ra_realloc_interference_graph() reallocs the in_stack
bitset for the new nodes. However, it leaves the new bitset words
uninitialized, because it's supposed to be set up by ra_select().
On step 5, however, the Intel backend calls ra_get_best_spill_node()
_without_ first calling ra_select() (or ra_allocate()). So at that
point, the in_stack bitset is not properly initialized, and we'll
end up reading uninitialized garbage in ra_get_best_spill_node(),
and non-deterministically end up skipping candidates for spilling.
While debugging this, I observed ra_get_best_spill_node() seeing
non-zero in_stack bits set while g->tmp.stack_count was 0. So no
nodes could possibly be in the stack.
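For illustration only, here is a minimal sketch of the hazard: realloc()
keeps the old words of a bitset but leaves any appended words
indeterminate. The helper names (grow_bitset_zeroed, bitset_test) are
hypothetical and are not the util/register_allocate code; the sketch
only shows what zeroing the new tail would look like.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define WORD_BITS 32u
#define WORDS_FOR(bits) (((bits) + WORD_BITS - 1) / WORD_BITS)

/* Hypothetical helper, not the Mesa function: grow a bitset from old_bits
 * to new_bits and zero the words that realloc() appended. */
static uint32_t *
grow_bitset_zeroed(uint32_t *set, unsigned old_bits, unsigned new_bits)
{
   assert(new_bits > 0 && new_bits >= old_bits);

   unsigned old_words = WORDS_FOR(old_bits);
   unsigned new_words = WORDS_FOR(new_bits);

   uint32_t *grown = realloc(set, new_words * sizeof(uint32_t));
   if (grown == NULL)
      return NULL; /* the caller still owns the old allocation */

   /* realloc() preserves the old contents but leaves the tail
    * indeterminate, so clear it explicitly. */
   if (new_words > old_words)
      memset(grown + old_words, 0,
             (new_words - old_words) * sizeof(uint32_t));

   return grown;
}

/* Returns 1 if the given bit is set, 0 otherwise. */
static int
bitset_test(const uint32_t *set, unsigned bit)
{
   return (set[bit / WORD_BITS] >> (bit % WORD_BITS)) & 1u;
}
```

Without the memset(), a bitset_test() on any of the new bits could
return either value, which matches the symptom described above.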
We could simply initialize the memory, but there's a deeper problem:
in Chaitin-Briggs allocators, the list of spill candidates is built in
the "Select" step. In our implementation, we technically don't make a
list of candidates, but rather flag registers that *aren't* candidates.
By never running ra_allocate() on our new graph, we never produce that
info. So when we ask for a spill node, we consider *all* registers as
spill candidates, which is far from ideal.
To fix this, we simply call ra_allocate() to rebuild that information
on the new graph. It's worth noting that it may not be quite the same
as the information we had for our old graph, too, as we reserved some
registers, increasing interference.
This escaped our notice for a long time because our allocation loop
tries to spill a single register, tries to allocate, and repeats if
it fails. Because retrying calls ra_select(), which initializes the
spill candidate info, this non-determinism only happened for the first
register selected. However, recently the backend gained support for
spilling multiple registers in each loop step, which highlighted this
problem, as different per-step-spill-sizes produced different results
due to this non-determinism.
Cc: mesa-stable
Fixes: e99081e76d ("intel/fs/ra: Spill without destroying the interference graph")
The first time we spill a register, we may need to discard and rebuild
the interference graph with some registers reserved, as we unfortunately
have to use registers to send messages to spill registers. We also need
to set spill costs.
It makes sense to do both of these tasks in choose_spill_reg(), rather
than open coding it in the middle of the spill/retry-allocation loop.
We also introduce a new boolean for whether the current interference
graph supports spilling, in order to simplify some logic.
Cc: mesa-stable
Fixes: e99081e76d ("intel/fs/ra: Spill without destroying the interference graph")
NIR_DEBUG=validate_ssa_dominance failed because the dgc_cs_emit() calls
weren't actually inside the if.
Fixes: 33a849e004 ("radv: emit indirect sets for indirect compute pipelines with DGC")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29782>
Fix returned fd by populating directly from the handle, instead of
from the fds array which is never populated.
Fixes: 7ae4a2ae34 ("u_gralloc/fallback: Extract modifier from QCOM native_handle")
Reviewed-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29785>
Here we move anything that expects the IR to have already been linked
so that in a future patch we can use glsl_to_nir() to convert IR that
has only been compiled.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29761>
When calculating the height for multi-line uploads, we should ensure that
we do not exceed max_dim, rather than using at least max_dim.
The assert is also changed to ensure that we do not upload more than the
source size.
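As a rough sketch of the difference (all names here are hypothetical,
and the real upload path has more inputs than this), clamping from
above picks the smaller of the requested height and max_dim, and the
assert checks the upload against the source size:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: never return more than max_dim rows. Taking the
 * larger of the two values instead would force at least max_dim rows. */
static uint32_t
clamp_upload_height(uint32_t height, uint32_t max_dim)
{
   return height < max_dim ? height : max_dim;
}

/* Hypothetical helper: size in bytes of one upload, checked against the
 * number of bytes actually available in the source. */
static uint64_t
upload_size(uint32_t height, uint32_t row_bytes, uint32_t max_dim,
            uint64_t src_size)
{
   uint64_t size =
      (uint64_t)clamp_upload_height(height, max_dim) * row_bytes;
   assert(size <= src_size);
   return size;
}
```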
Fixes: 22e44d54fd
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29784>
Steps for uprev:
- copy files from BLAKE3/c src/util/blake3/
- edit README
- `for file in *.asm; do mv "$file" "${file%.asm}.masm"; done`
- keep
- blake3.h (no relevant changes), only change BLAKE3_VERSION_STRING
- blake3_sse2_x86-64_unix.S (no changes)
- blake3_avx512_x86-64_unix.S (no changes)
- blake3_sse41_x86-64_unix.S (no changes)
Acked-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29687>
We frequently create a new display, query some stuff, then throw it away.
Using different queue names for the different queries is a little more
expressive when debugging.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29787>
The intention of this block was to set one of the flags that are used
to select a PAT index, but it was doing more than that.
It was promoting WB+0 way coherency BOs to WC+1 way coherency, possibly
causing regressions on platforms without LLC.
anv_device_get_pat_entry() returns WC (write-combining) if no flags are
set, so we don't need this block after all.
Reported-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Fixes: a65e982b44 ("anv: Split ANV_BO_ALLOC_HOST_CACHED_COHERENT into two actual flags")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29769>
When LDP uses a negative offset (which is valid), since
`struct ir3_register` uses `{i,u}nt32_t` for the immediate
values, using `extract_reg_uim()` wasn't sign-extending
negative immediate values.
Addresses:
```
src/freedreno/isa/encode.h:84:
pack_field: Assertion '!(( val & ~BITFIELD64_MASK(1 + high - low)) &&
(~val & ~BITFIELD64_MASK(1 + high - low)))' failed.
```
seen in https://gitlab.freedesktop.org/mesa/mesa/-/issues/11153 .
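A small sketch of why the assertion trips, assuming a 13-bit field
purely for illustration. `fits_field()` mirrors the condition in the
assertion quoted above; `mask64()`, `struct reg_imm` and the two
`widen_*()` helpers are hypothetical stand-ins, not the ir3 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for a 64-bit mask covering the low `bits` bits. */
static uint64_t
mask64(unsigned bits)
{
   return bits >= 64 ? ~0ull : (1ull << bits) - 1;
}

/* Same condition as the quoted assertion: the bits above the field must
 * be all zero (unsigned fit) or all one (sign-extended negative). */
static bool
fits_field(int64_t val, unsigned low, unsigned high)
{
   uint64_t v = (uint64_t)val;
   uint64_t outside = ~mask64(1 + high - low);
   return !((v & outside) && (~v & outside));
}

/* Hypothetical register: one 32-bit slot viewed as signed or unsigned. */
struct reg_imm {
   union {
      int32_t iim_val;
      uint32_t uim_val;
   };
};

/* Widening the unsigned view zero-extends, so a negative immediate
 * turns into a large positive 64-bit value. */
static int64_t
widen_unsigned(const struct reg_imm *r)
{
   return (int64_t)r->uim_val;
}

/* Widening the signed view sign-extends and keeps the value negative. */
static int64_t
widen_signed(const struct reg_imm *r)
{
   return (int64_t)r->iim_val;
}
```

With `iim_val = -4`, `fits_field(widen_unsigned(&r), 0, 12)` is false
while `fits_field(widen_signed(&r), 0, 12)` is true.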
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29768>
cmd->state.attachments was accessed out of bounds, which somehow,
instead of crashing, caused the tracepoint to be skipped.
drawcall_bandwidth_per_sample_sum was divided by 0 when there were no
draw calls in a renderpass.
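Both problems come down to a missing guard. A minimal sketch, with
hypothetical helper names and simplified types rather than the actual
tracepoint code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: average a per-drawcall sum, reporting 0 when the
 * renderpass had no draw calls instead of dividing by zero. */
static uint32_t
average_per_drawcall(uint64_t sum, uint32_t drawcall_count)
{
   if (drawcall_count == 0)
      return 0;
   return (uint32_t)(sum / drawcall_count);
}

/* Hypothetical helper: bounds-checked lookup that returns NULL rather
 * than reading past the end of the attachment array. */
static const int *
attachment_at(const int *attachments, uint32_t count, uint32_t index)
{
   return index < count ? &attachments[index] : NULL;
}
```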
Fixes: 1aab0fc4f5 ("tu: Add attachments' UBWC info to renderpass tracepoint")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29752>
That's not enough to make
"dEQP-VK.api.device_init.create_instance_device_intentional_alloc_fail.basic"
happy, but it contributes to it.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29783>
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 2eaa437574 ("panvk: Use memory pools for internal GPU data attached to vulkan objects")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29783>
This fixes a failure on "dEQP-VK.api.buffer.basic.size_max_uint64".
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 822478ec20 ("panvk: Move the VkBuffer logic to its own source file")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29783>
Contributes to fixing "dEQP-VK.api.object_management.alloc_callback_fail.device".
We need a way to report errors in mempools.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29783>
When pan_blend_shader_key_table_create was failing, we weren't
destroying the mutex and panvk_pool.
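The general shape of such a fix is the usual unwind-on-failure pattern.
The sketch below is a toy model: the struct, the boolean "live" flags
and table_create() are hypothetical and only stand in for the real
mutex, pool and key table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy model of an object that owns a mutex, a pool and a table. */
struct device_state {
   bool mutex_live;
   bool pool_live;
   void *table;
};

/* Hypothetical allocator that can be told to fail. */
static void *
table_create(bool fail)
{
   return fail ? NULL : malloc(16);
}

/* Returns 0 on success, -1 on failure. On failure, everything set up
 * before the failing step is torn down again, in reverse order. */
static int
device_state_init(struct device_state *s, bool fail_table)
{
   s->mutex_live = true;
   s->pool_live = true;

   s->table = table_create(fail_table);
   if (s->table == NULL)
      goto err_cleanup;

   return 0;

err_cleanup:
   s->pool_live = false;
   s->mutex_live = false;
   return -1;
}

/* Releases everything owned by a successfully initialized state. */
static void
device_state_finish(struct device_state *s)
{
   free(s->table);
   s->table = NULL;
   s->pool_live = false;
   s->mutex_live = false;
}
```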
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29783>
Fix a crash when a null handle is passed.
(dEQP-VK.api.null_handle.destroy_command_pool)
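Vulkan allows VK_NULL_HANDLE to be passed to vkDestroyCommandPool(), in
which case the call must do nothing. A minimal sketch of the guard,
using a hypothetical cmd_pool type rather than the panvk structures:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical pool object that owns one backing allocation. */
struct cmd_pool {
   void *storage;
};

/* Returns a new pool, or NULL if either allocation fails. */
static struct cmd_pool *
cmd_pool_create(void)
{
   struct cmd_pool *pool = calloc(1, sizeof(*pool));
   if (pool == NULL)
      return NULL;

   pool->storage = malloc(64);
   if (pool->storage == NULL) {
      free(pool);
      return NULL;
   }
   return pool;
}

/* Destroying a null pool is a no-op, so return before dereferencing. */
static void
cmd_pool_destroy(struct cmd_pool *pool)
{
   if (pool == NULL)
      return;

   free(pool->storage);
   free(pool);
}
```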
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: afbac1af77 ("panvk: Move the VkCommandPool logic to panvk_cmd_pool.{c,h}")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29783>
Gert has committed using both the com and co.uk email addresses. They lead
to the same inbox, but let's make sure they get counted as one
contributor in the git history.
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29240>
Tomeu hasn't been working at Collabora for a while now, so let's invert
the mapping so his private email address is more prominent.
Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29240>
CTS tests both layered and separate DPB, but radv wasn't handling
layered properly when used with the tier 2 dpb handling.
This adjusts the addresses to use the layer index for tier2.
Fixes dEQP-VK.video.decode.*layered*
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29758>
There's no good reason for this to be header-only besides laziness on my
part when I first wrote a few "small" helpers. Some of those are pretty
good sized and don't need to be inlined.
Keeping the original copyright since this is just moving code.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28793>