This isn't needed anymore, because vk_format_get_component_bits now accesses the proper
pipe formats and therefore returns the correct bit count since the following commit:
57c81bab04 ("vulkan/format: Translate two 420_UNORM formats properly")
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30493>
unified-runtime tries to call this unconditionally. It handles errors
correctly, but calling None here isn't an error, it's a crash. Just
return CL_INVALID_VALUE so we don't crash.
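For illustration only, the guard can be sketched in C. rusticl itself is
written in Rust, so this is an analogue rather than the actual change:
query_fn, call_optional_query and query_forty_two are hypothetical names,
and NULL stands in for Rust's None. Only the CL_INVALID_VALUE code is
taken from the OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>

#define CL_SUCCESS        0
#define CL_INVALID_VALUE  (-30)   /* values as defined in CL/cl.h */

/* Hypothetical optional entry point; NULL plays the role of Rust's None. */
typedef int (*query_fn)(void *out);

static int
call_optional_query(query_fn fn, void *out)
{
   /* Report an error instead of jumping through a missing pointer. */
   if (fn == NULL)
      return CL_INVALID_VALUE;

   return fn(out);
}

/* Example implementation used when the entry point is present. */
static int
query_forty_two(void *out)
{
   assert(out != NULL);
   *(int *)out = 42;
   return CL_SUCCESS;
}
```

A caller that already handles error codes then sees an ordinary failure
when the entry point is missing.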
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30699>
When RADV_DEBUG=shaders is set, printing e.g. different NIR shaders from
different threads at the same time makes the output unreadable. Use a mutex
to synchronize shader dumping so that all shaders get printed in one piece.
Since we're writing everything to a file or terminal anyway, the
performance impact of forcing singlethreaded compilation is negligible.
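A minimal sketch of the idea, assuming a single process-wide lock and
plain pthreads. shader_dump_mtx and dump_shader_locked are hypothetical
names, not the ones used in RADV, and the real code may use a different
mutex primitive.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical global lock; the real patch may use a different primitive. */
static pthread_mutex_t shader_dump_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Print one shader as a single uninterrupted block.
 * Returns the number of lines written, including the header line. */
static size_t
dump_shader_locked(FILE *out, const char *name,
                   const char *const *lines, size_t num_lines)
{
   assert(out != NULL && name != NULL);

   /* Everything between lock and unlock is emitted without interleaving
    * from other threads that go through this function. */
   pthread_mutex_lock(&shader_dump_mtx);
   fprintf(out, "== %s ==\n", name);
   for (size_t i = 0; i < num_lines; i++)
      fprintf(out, "%s\n", lines[i]);
   fflush(out);
   pthread_mutex_unlock(&shader_dump_mtx);

   return num_lines + 1;
}
```

Holding the lock for the whole dump is what keeps each shader in one
piece; locking per line would still allow interleaving.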
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25215>
Our u_hexdump() squeezes 16-byte chunks filled with zeros, where the unix
hexdump squeezes repeated 16-byte chunks. Turns out panfrost/panvk dumps
can be pretty big when a VM dump is requested
(PANVK_DEBUG/PAN_MESA_DEBUG=dump) and memory regions are
filled with repeated non-zero patterns (like a Z16_UNORM buffer cleared
to 1.0, AKA 0xffff).
Avoiding the repetition of such non-zero patterns in dumps significantly
reduces the size of the dumps. It also clears any confusion for people
used to the original hexdump semantics where a star means the previous
line is repeated.
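The unix-style semantics can be sketched roughly as below. This is not
the u_hexdump() implementation: hexdump_squeezed and the output format
are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CHUNK 16

/* Hypothetical helper: dumps `size` bytes and collapses runs of identical
 * 16-byte chunks into a single "*" line. Returns the number of lines
 * written. */
static unsigned
hexdump_squeezed(FILE *fp, const uint8_t *data, size_t size)
{
   unsigned lines = 0;
   bool in_run = false;

   assert(fp != NULL && (data != NULL || size == 0));

   for (size_t off = 0; off < size; off += CHUNK) {
      size_t len = size - off < CHUNK ? size - off : CHUNK;

      /* A full chunk equal to the previous one is squeezed, whatever
       * its contents (not only zeros). */
      if (off >= CHUNK && len == CHUNK &&
          memcmp(data + off, data + off - CHUNK, CHUNK) == 0) {
         if (!in_run) {
            fprintf(fp, "*\n");
            lines++;
            in_run = true;
         }
         continue;
      }

      in_run = false;
      fprintf(fp, "%08zx:", off);
      for (size_t i = 0; i < len; i++)
         fprintf(fp, " %02x", data[off + i]);
      fprintf(fp, "\n");
      lines++;
   }

   return lines;
}
```

With this, a 64-byte buffer filled with 0xff takes two lines (the first
chunk and one star) instead of four.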
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30692>
We stick to a rule in the driver that each field is only set in a
single place. Therefore when merging instructions, we
should never have any bit set to 1 from both sides.
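As a sketch of the invariant (not the driver's actual merge helper;
packed_dwords_overlap and merge_packed_dwords are hypothetical names),
merging two partially packed instructions could be checked like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if any bit is set in both packed instructions. */
static bool
packed_dwords_overlap(const uint32_t *a, const uint32_t *b, size_t n)
{
   for (size_t i = 0; i < n; i++) {
      if (a[i] & b[i])
         return true;
   }
   return false;
}

/* OR two partially packed instructions together, asserting the rule that
 * no bit comes from both sides. */
static void
merge_packed_dwords(uint32_t *dst, const uint32_t *a, const uint32_t *b,
                    size_t n)
{
   assert(!packed_dwords_overlap(a, b, n));
   for (size_t i = 0; i < n; i++)
      dst[i] = a[i] | b[i];
}
```

If two places in the driver fill the same field, the overlap check trips
in debug builds instead of silently ORing the two values together.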
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30684>
Currently we can end up merging 2 prepacked 3DSTATE_CLIP instructions
where 2 different places in the driver fill the MaximumVPIndex.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 50f6903bd9 ("anv: add new low level emission & dirty state tracking")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30684>
lower_driver_param_to_ubo would call ir3_const_state_mut
unconditionally. However, since 850f2aab03 ("ir3, tu: Use a UBO for VS
primitive params on a750+"), it can be called for the binning VS,
causing an assert. This commit makes sure to only call
ir3_const_state_mut when it's really necessary to have mutable access to
the const state.
Fixes: 2c47ad7774 ("ir3: make ir3_const_state less error-prone to use")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30718>
This commit solves the fence-register shortage in the blit functions by
checking the number of fence registers after updating the batch.
If too many registers are used, the batch entries and relocs for the
current blit function are removed by setting batch->ptr and reloc_count
back to their values before the blit call and calling
drm_intel_gem_bo_clear_relocs. This truncated batch is flushed, and the
batch is updated again for the current blit function.
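The control flow can be modelled with a toy batch. Everything here is
hypothetical (struct toy_batch, MAX_FENCES, the toy_* helpers and the
sizes they add); the real code operates on the driver's batch and on
libdrm relocation lists.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FENCES 4   /* assumed limit, for illustration only */

struct toy_batch {
   size_t used;            /* stand-in for batch->ptr */
   unsigned reloc_count;
   unsigned fence_count;
   unsigned flushes;
};

/* Submit the batch and start over with an empty one. */
static void
toy_flush(struct toy_batch *b)
{
   b->used = 0;
   b->reloc_count = 0;
   b->fence_count = 0;
   b->flushes++;
}

/* Append the commands and relocs of one blit. */
static void
toy_emit_blit(struct toy_batch *b, unsigned fences_needed)
{
   b->used += 8;
   b->reloc_count += 2;
   b->fence_count += fences_needed;
}

static void
toy_blit(struct toy_batch *b, unsigned fences_needed)
{
   assert(fences_needed <= MAX_FENCES);

   /* Remember the batch state from before this blit. */
   const struct toy_batch saved = *b;

   toy_emit_blit(b, fences_needed);

   if (b->fence_count > MAX_FENCES) {
      /* Truncate to the saved state, flush, then emit the blit again. */
      b->used = saved.used;
      b->reloc_count = saved.reloc_count;
      b->fence_count = saved.fence_count;
      toy_flush(b);
      toy_emit_blit(b, fences_needed);
   }
}
```

Checking after the emit, rather than predicting the count up front,
keeps the common path unchanged and only pays for a re-emit when the
limit is actually exceeded.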
Cc: mesa-stable
Signed-off-by: GKraats <vd.kraats@hccnet.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26769>
If there are no uniforms to push, don't emit the AND or invalidate the
shader analysis. This affects only compute shaders.
The impact is not significant, since lots of shaders end up pushing
uniforms. Fossil-db numbers (restricted to compute pipelines only) for DG2:
```
Totals:
Instrs: 3071016 -> 3070894 (-0.00%)
Cycle count: 8320268863 -> 8320264519 (-0.00%)
Totals from 122 (2.70% of 4520) affected shaders:
Instrs: 10675 -> 10553 (-1.14%)
Cycle count: 2060003 -> 2055659 (-0.21%)
```
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30631>
We need to flip triangles from CW to CCW based on the domain origin
specified as dynamic state. Instead of tracking all this on the CPU,
add a scratch register and do the conversion in the MME.
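The conversion itself is tiny. Written as plain C rather than MME code,
and with hypothetical names (enum tri_winding, apply_domain_origin), it
amounts to the sketch below. Which origin value triggers the flip is left
to the caller here, since this message does not spell it out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum; the hardware encoding is not shown in this message. */
enum tri_winding {
   TRI_WINDING_CW,
   TRI_WINDING_CCW,
};

/* Swap CW and CCW when the dynamic domain origin requires it. */
static enum tri_winding
apply_domain_origin(enum tri_winding shader_winding, bool flip_for_origin)
{
   assert(shader_winding == TRI_WINDING_CW ||
          shader_winding == TRI_WINDING_CCW);

   if (!flip_for_origin)
      return shader_winding;

   return shader_winding == TRI_WINDING_CW ? TRI_WINDING_CCW
                                           : TRI_WINDING_CW;
}
```

Keeping the shader's winding in a scratch register lets the macro apply
this whenever either input changes, without the CPU tracking both.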
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30703>
This effectively splits the two states apart so that we can set them
independently. Inside the macros, we only update states that have
actually changed which should also be a bit more efficient.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30703>
We're always storing it in a scratch register for register pressure
reasons anyway. We may as well just stash it there as a state reg and
we can avoid emitting it all over the place. This reduces each draw
call to nvk_flush_gfx_state() followed by the actual draw, which is now
independent of any dynamic state.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30703>
mme_set_priv_reg() needs the first three registers to send data to/from
FALCON04. If we don't reserve these in the register space, it may stomp
other things. This only really matters pre-Volta where we need to use
privileged registers for conservative rasterization. However, it's a
good idea to reserve the space nonetheless.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30703>
Instead of the state part of the simulator being baked in, it's now
broken out into a pluggable component that the simulator talks to via a
function pointer interface. This will let us run the simulator without
the full state simulator under the hood.
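A function pointer interface of this kind might look like the sketch
below. The names (struct sim_state_ops, struct sim, the array_* backend)
and the choice of a load/store pair are assumptions for the example; the
real interface may expose more or different hooks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interface; the real one may have more or different hooks. */
struct sim_state_ops {
   uint32_t (*load)(void *state, uint16_t addr);
   void (*store)(void *state, uint16_t addr, uint32_t value);
};

/* Simulator core: only knows about the ops table and an opaque pointer. */
struct sim {
   const struct sim_state_ops *ops;
   void *state;
};

/* Read-modify-write through the pluggable backend. */
static void
sim_add_to_state(struct sim *sim, uint16_t addr, uint32_t delta)
{
   assert(sim->ops != NULL);
   uint32_t old = sim->ops->load(sim->state, addr);
   sim->ops->store(sim->state, addr, old + delta);
}

/* A trivial backend: a flat array instead of a full state simulator. */
#define ARRAY_STATE_SIZE 16

struct array_state {
   uint32_t regs[ARRAY_STATE_SIZE];
};

static uint32_t
array_load(void *state, uint16_t addr)
{
   struct array_state *s = state;
   return s->regs[addr % ARRAY_STATE_SIZE];
}

static void
array_store(void *state, uint16_t addr, uint32_t value)
{
   struct array_state *s = state;
   s->regs[addr % ARRAY_STATE_SIZE] = value;
}

static const struct sim_state_ops array_state_ops = {
   .load = array_load,
   .store = array_store,
};
```

A test can then hand the core a small backend like array_state_ops and
inspect the array afterwards, without pulling in the full state model.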
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30703>
out_args->scratch_offset and in_wg_id_x will alias on <gfx9.
To avoid the conversion code reading a garbage WG ID, move the
scratch/ring offset writing to the very end.
Fixes: 1e354172 ("radv,aco: Convert 1D ray launches to 2D")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30707>
A loop that looks like:
    loop {
        do_work_1();
        if (cond) {
            break;
        } else {
        }
        do_work_2();
        break;
    }
We can't pull that break ahead of do_work_1() after hoisting the initial
do_work_1() out of the loop. So bail in this case.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11711
Fixes: 6b4b044739 ("nir/opt_loop: add loop peeling optimization")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30702>