This complements our existing nir_get_io_index_src helper. Most, but annoyingly
not all, stores put their data source in source 0. Having a helper for this lets
us reduce special casing in a bunch of random places.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39939>
In some cases, this would incorrectly set higher dpbArraySize
when overwriting already existing dpb slot.
This didn't seem to cause any issues, but the extra slot would
have zero va which was wrong.
Get the actual ref count from codec param, instead of using
cmd->num_refs which always includes current slot. Also add sanity
check that the ref surface was found.
Fixes: 79af03556c ("ac: Add VCN ac_video_dec implementation")
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39877>
These tests were previously skipped because they contain dynamic loops
in the VS, which can cause GPU resets on VC4. However, (1) the only
tests that cause GPU resets are the ones that have divergent loops and
(2) now, the compiler is able to fail shader linking when it finds
divergent loops.
Therefore, allow tests with non-divergent loops to run on the CI and
add tests with divergent loops to the fail list.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39768>
VC4 hardware doesn't have a dispatch mask for the VS, so divergent
loops can have undefined/garbage contents in some execution channels,
potentially causing infinite loops and GPU hangs.
Fail shader linking instead of hanging the GPU when a divergent loop is
detected in a vertex shader.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39768>
Add load_texture_scale to the list of intrinsics whose divergence
depends on their sources. This is needed to support running divergence
analysis on VC4.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39768>
This fixes some dead assignment issues detected with static analyzer.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39578>
On V3D 4.2 (Raspberry Pi 4), there is a hardware bug where the binner
can trigger a GPU reset in some situations where primitives are
discarded, such as due to primitive restarts.
The way to avoid this is to force the binner to do always something, by
emitting the proper CL. In this case we decided to always set point
size, as it is a very simple and fast operation.
This fixes resets caused by
dEQP-VK.pipeline.monolithic.input_assembly.primitive_restart.*.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39826>
We just read this from the NIR and store it in iris_compiled_shader,
there's no reason for the backend compiler to be involved.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
These days, our system value concept is just about iris_program
communicating to iris_state which values to upload into a UBO.
Nowhere in that process is the backend compiler involved, so it
doesn't make sense for there to be brw/elk mechanisms.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
iris needs this, but anv does not, and it's just a small wrapper around
common NIR lowering anyway. This also removes some brw/elk splitting.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
nir_create_passthrough_tcs already validates the result, we don't need
to validate a second time.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39926>
When these are correctly mapped to draw usage, we can't rely on them
being globally initialized in tu_init_hw(). They need to be re-
initialized after rp stomping.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39819>
Use information about register usage to decide what to dump for register
summaries. If --allregs, we continue to dump all regs that we have seen
a register write for. Otherwise we dump registers that are written
since last summary (as before) along with regs with the appropriate
usage.
A new column at the start of the line will show a '?' for registers that
don't match the existing usage (ie. they have been written but usage
doesn't match current "draw"). This could simply mean it is the first
"draw" in the render-pass (collecting per-renderpass values). Or it
could mean a mis-attributed or missing register usage. This new column
is only included for a6xx and newer, since older gens don't have usage
specified in xml.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14829
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39819>
We have other better tests for this now. And the next patch will
balloon the reference trace to 90MB due to large # of draw calls. So
just drop this test.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39819>
The existing implementation did not account for register names that
contain the suffix multiple times (ie. FOO_HIT_COUNT_HI).
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39819>
The NIR_PASS macro only overwrites this when the pass actually makes
progress. If the pass doesn't make progress, the variable stays
uninitialized.
Clang correctly spots this and warns about it.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39968>
The hardware only provides 13 bits for encoding the stack base (in
dwords). That translates to the stack base being required to be below
8192 dwords, or 32kB. It's possible to exceed this - LDS is 64kB after
all. Add an explicit check to make sure we don't end up with offsets
that overflow the hw's address fields. This fixes Metro Exodus Enhanced
Edition, which was using ray queries in a 1024-thread sized workgroup,
resulting in exactly 64kB of LDS being required for the stack.
This check isn't required for RT pipelines as we always use 32 or 64
wide workgroups with no other LDS used, so it's impossible to reach this
stack base limit.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39691>
Just seeing that a passthrough GS was already bound is not sufficient to
know that it is a *matching* passthrough GS. If the application binds a
new VS that requires a different passthrough GS key than the previous
VS, then we need to bind a different passthrough GS.
Fixes: 5bc8284816 ("hk: add Vulkan driver for Apple GPUs")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39624>