If you were to try to reg allocate to more full regs than the file has,
you'll end up bit clearing off of available[]/available_to_evict[] and
corrupting the RB tree. Let's make sure we don't land there again.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39997>
As the comment says, we want to limit our pressure based on underlying HW
reg file size, not max it out to HW reg file size. This caused us to not
spill when we should when the HW reg size was bigger than the ISA reg file
size, leading to OOB writes in RA when it tried to allocate to the limit
pressure we spilled to.
Fixes segfaults in llama.cpp's test-backend-ops.
Fixes: e6e34883a9 ("ir3: Add wavesize control")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14846
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39997>
To demonstrate that this helper is legitimately useful cross-tree and should
be made common code :-)
Sample AGX IR with this patch:
10 = mov_imm #0x3f5680f1 /* 0.837905 */
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40021>
This helper is generally useful when trying to prettyprint a 32-bit value, so
make it available to the rest of the tree.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40021>
Otherwise, the following error is observed:
lvp_pipeline.c:422:28:
error: variable 'progress' is used uninitialized whenever
'if' condition is false [-Werror, -Wsometimes-uninitialized]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40046>
The way I understand the HW docs is that the polygon offset is applied
always in 24bit depth domain (there are no polygon offset depth format
control registers like r600 has), so we need to manually rescale for
16bit buffers.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39196>
The main change here is adding an architecture argument to
pan_blend_can_fixed_function, so that we can take into
account fixed function hardware limitations in particular
generations.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39705>
Blend shaders operate on 4 components, and this makes
a difference for some operations (particularly blends
with constant values). Usually the hardware handles the
conversion smoothly, but there are a few special cases where
there is an alpha channel in the "wrong" place; we need
to handle those specially.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39705>
We can still render to it, but hardware blending needs a slightly
different path (the supplied GL_R8 internal format did not work
correctly).
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39705>
If the output format has no alpha channel then DST_ALPHA is the same
as CONST_ONE, and hence the blend operation becomes trivial (opaque).
This also fixes some piglit test failures, possibly because the
fixed function blending hardware isn't really set up to handle RGB1.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39705>
We were hitting an assert on a piglit test on midgard. Note that,
oddly, PIPE_FORMAT_R10G10B10X2_UINT is not defined, so we cannot
add that case.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39705>
The Vulkan spec states:
"If logicOpEnable is VK_TRUE, then a logical operation selected by
logicOp is applied between each color attachment and the
fragment’s corresponding output value, and blending of all
attachments is treated as if it were disabled. Any attachments
using color formats for which logical operations are not supported
simply pass through the color values unmodified."
pack_blend() was only checking blendEnable from the attachment state,
causing hardware blending to be applied even when logic ops were enabled.
This is the v3dv equivalent of the RADV fix in commit c172f6ef01
("radv: fix disabling logic op for srgb/float formats when blending
is enabled").
Fixes: dEQP-VK.pipeline.monolithic.logic_op_na_formats.*_blend
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40025>
Since the stride is always 32 dwords, we need to treat the workgroup
size as multiples of that value. Using MAX2() only works for cases where
the workgroup size is less than 32, which was hit by some CTS with 1x1
workgroups.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39981>
The Mali GPUs have native support for user-defined depth clamp bounds
via the LOW_DEPTH_CLAMP/HIGH_DEPTH_CLAMP registers and the
depth_clamp_mode field. Wire up the existing runtime plumbing to these
registers so applications can specify a custom depth clamp range instead
of always clamping to the viewport's minDepth/maxDepth.
While at it, drop the redundant CLAMP on depth values in the CSF path.
Since VK_EXT_depth_range_unrestricted is not supported, panvk_depth_range()
is already guaranteed to produce values in the 0..1 range.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39925>
While reworking image resolves completely in RADV, I found a very weird
bug where the only fix was to emit caches immediately after
decompressing the source resolve image (after FMASK_DECOMPRESS).
I have been struggling this for few hours and figured that it was
something related to context rolls (ie. as long the context was rolled
out, emitting the flushes immediately was required).
It turns out this was a known hardware bug on GFX6 that was implemented
in PAL. Though PAL only applies on GFX6 but GFX7-8 are also affected
based on my testing. Note that RadeonSI flushes CB_META too.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39959>
The new compression scheme introduced in Xe2 also applies to Xe3, so
we're liable for the same bugs.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2418c91537 ("anv/drirc: disable Xe2 CCS drm modifiers for GTK engine")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39953>
This is intended to enable rusticl to use get_query_result_resource()
for timestamp queries, for hw which cannot convert ticks to us on the
GPU (or for which doing the conversion on the GPU is expensive). In
this case, the query result buffer is not exposed to the app, so we
can still do the necessary conversion on the CPU.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39995>
Prefer to be explicit when handling it, like is done for regular
nir_instr_type_call.
Even though functions called by cmat_call have restrictions on them ("no
tangled instructions" for example), which could allow a couple of passes
to treat them differently, there's no tracking of what functions are
used only in such cases, so being conservative here should be safe.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39903>
arb_point_sprite-mipmap renders polygons with polygon mode set to POINT.
However, r300 point-sprite setup only treated MESA_PRIM_POINTS as point
draws, so sprite coord replacement was disabled for polygon primitives
that were rasterized as points. This produced wrong texcoord orientation
and failed the piglit test.
Detect point rasterization from the primitive plus rasterizer fill/cull
state and use that in both HWTCL and SWTCL draw paths when updating
is_point flag.
The test now pass on RV370 and fails with the rest of the CI HW, but the
remaining issues seem to be some LOD boundary mismatch at point size 22,
the hardware samples level 0 where test expects level 1. In total only 4
cases now fail instead of 82 before.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39850>
Instead of using whatever group was set by the previous
instruction. No behavior change, just normalizes what
we generate.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39843>
The `group()` helper creates the new builder "relative" to the existing
one, so this was resulting in some uniform instructions having
a non-zero channel offset ("group") -- which was surprising and had no
practical effect.
Normalize to always use group = 0. No change in behavior expected.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39842>