The 64-bit mask was truncated to 32 bits, so when its low half was 0, the base was -1.
By accident, u_bit_consecutive64(-1, 65) is the original mask, so we uploaded a
single garbage value.
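
The truncation can be reproduced in isolation. The sketch below is not the
radv code: it assumes the base is computed with a find-first-set on the mask,
uses the GCC/Clang builtins `__builtin_ffs` and `__builtin_ffsll`, and all
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: lowest set bit of a mask that was truncated to 32 bits.
 * Returns -1 when the low half of the mask is 0. */
static int first_bit_truncated(uint64_t mask)
{
   uint32_t low = (uint32_t)mask;       /* the upper 32 bits are lost here */
   return __builtin_ffs((int)low) - 1;  /* ffs(0) is 0, so this yields -1 */
}

/* Hypothetical: the same query done on the full 64-bit mask. */
static int first_bit_64(uint64_t mask)
{
   return __builtin_ffsll((long long)mask) - 1;
}

static void truncation_example(void)
{
   uint64_t mask = UINT64_C(1) << 40;   /* only a bit in the high half is set */
   assert(first_bit_truncated(mask) == -1);
   assert(first_bit_64(mask) == 40);
}
```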
Fixes: 7f6262bb85 ("radv: allow holes in inline push constants")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42182>
Newer kernels print the chip-id as plain hex rather than in the unsigned dotted "ipv4" style.
Update parsing to handle this. See kernel commit cc53487e01fc
("drm/msm/adreno: Change chip_id format").
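
A minimal sketch of accepting both formats follows. It is not the parser
touched by this change: the function name is hypothetical, and both the byte
packing of the dotted form and the assumption that the hex form fits in 32
bits are guesses made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical parser: accepts "core.major.minor.patch" (old style) or a
 * hex string with or without a "0x" prefix (new style). */
static bool parse_chip_id_sketch(const char *s, uint32_t *chip_id)
{
   unsigned core, major, minor, patch;

   /* Old style: four dot-separated decimal fields. */
   if (sscanf(s, "%u.%u.%u.%u", &core, &major, &minor, &patch) == 4) {
      if (core > 0xff || major > 0xff || minor > 0xff || patch > 0xff)
         return false;
      /* Assumed packing: one byte per field, core in the top byte. */
      *chip_id = (core << 24) | (major << 16) | (minor << 8) | patch;
      return true;
   }

   /* New style: a single hex number. */
   char *end;
   unsigned long long v = strtoull(s, &end, 16);
   if (end == s || v > UINT32_MAX)
      return false;
   *chip_id = (uint32_t)v;
   return true;
}
```

Trying the dotted form first keeps old kernels working; a hex string never
matches four dot-separated fields, so it falls through to the second branch.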
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42193>
Previously this helper function would not capture the xshm opcode from
the server's shm reply, but drisw_glx requires the value to work
properly.
Fixes: 5f4eccf1 ("glx: Check that xshm can be attached")
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40926>
NEON always flushes subnormals to zero; previously lp_test_arit
special-cased vector paths to suppress the resulting failures.
The proper fix mirrors x86: set FPSCR/FPCR FZ so VFP also flushes,
keeping scalar and vector paths consistent with the C reference.
util_fpstate_{get,set,set_denorms_to_zero} now read/write FPSCR
(ARMv7) or FPCR (AArch64) via inline asm. flush_denorm_to_zero
in lp_test_arit flushes subnormal inputs on ARM/AArch64 to match.
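
For illustration, flushing a subnormal input on the reference side could look
like the sketch below. This is an assumption-laden stand-in, not the
lp_test_arit code: the `_sketch` name and the bit helpers are hypothetical,
and preserving the sign of the flushed zero is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

static uint32_t bits_from_float(float f)
{
   uint32_t bits;
   memcpy(&bits, &f, sizeof(bits));
   return bits;
}

static float float_from_bits(uint32_t bits)
{
   float f;
   memcpy(&f, &bits, sizeof(f));
   return f;
}

/* Hypothetical: replace a subnormal binary32 value with a zero of the same
 * sign; every other value is returned unchanged. */
static float flush_denorm_to_zero_sketch(float x)
{
   uint32_t bits = bits_from_float(x);
   /* Subnormal: exponent field is all zeros and the mantissa is non-zero. */
   bool subnormal = (bits & 0x7f800000u) == 0 && (bits & 0x007fffffu) != 0;
   return subnormal ? float_from_bits(bits & 0x80000000u) : x;
}
```

Working on the bit pattern avoids any floating-point arithmetic on the input,
so the result does not depend on the current flush-to-zero mode of the FPU.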
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42178>
For the BIR-compiler, 64-bit values were not properly tracked in the
spill logic and PHIs were always assumed to be 32 bits. This could
create issues where only one half of the value was reloaded or spills
would overlap each other, leading to garbage values. This patch fixes
these issues by keeping track of how many words each value needs. Also,
it adds a constraint for SHADD sources where it splits and collects them
right before the SHADD instruction itself to make it easier for RA to
handle the register pairs.
Fixes: 4542982062 ("pan/compiler: Use SHADDX instruction for i64 add")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42080>
On v14+, multiview is not lowered to per-view output stores. Rename
"multiview" to "per_view_outputs" to make it clear that this logic only
applies when the shader uses nir_intrinsic_store_per_view_output.
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42049>
Replace PAN_MAX_MULTIVIEW_VIEW_COUNT with a helper taking the GPU
architecture, so both the compiler and PanVK can query the right limit.
Also raise the maximum multiview view count to 16 on v14+, up from 8 on
older generations.
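
The shape of such a helper might be as follows. The function name and the use
of a plain integer for the architecture version are assumptions; only the
limits (16 on v14+, 8 before) come from the description above.

```c
#include <assert.h>

/* Hypothetical helper: maximum multiview view count for a given GPU
 * architecture major version. */
static inline unsigned
pan_max_multiview_view_count_sketch(unsigned arch)
{
   return arch >= 14 ? 16 : 8;
}

static void max_view_count_example(void)
{
   assert(pan_max_multiview_view_count_sketch(10) == 8);
   assert(pan_max_multiview_view_count_sketch(14) == 16);
}
```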
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42049>
On v14+, the view mask moved from PRIMITIVE_FLAGS to PRIMITIVE_FLAGS_2.
The multiview vertex shader unrolling no longer needs to be handled in
software. The GPU now runs one shader invocation per view, where each
writes a single view and the view index is passed through a preload.
Fixes: 4258888f4d ("pan/genxml: Add v14 definition")
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42049>
On v14+, the GPU runs one vertex shader invocation per view, where each
writes a single view and the view index is passed through
BI_PRELOAD_VIEW_ID.
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42049>
Move the ACORN random number generator from src/nouveau/compiler/acorn/
to src/compiler/rust/acorn/ so it can be shared between different
driver hardware test infrastructures.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42165>
The ISA.xml for Valhall did not match ARSHIFT exactly, as it was based on
RSHIFT. We could still generate ARSHIFT_OR, however, so in certain trace
dumps the output would be empty.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42040>
No hardware restriction limits the current size; it was selected
manually.
Increase it to 256, which aligns better with other hardware and is the
minimum requirement for Vulkan 1.4.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42212>
Currently we shift the viewport to implement the FRONT_AND_BACK culling
mode.
However, as culling should only take effect on triangles, this shift
should only be applied when the active rasterizing primitive is
triangles.
Check the primitive topology before applying the viewport shift.
This fixes the new Vulkan CTS test
`dEQP-VK.glsl.builtin_var.frontfacing.add_ubo_load.{point,line}_list.front_and_back`
introduced in CTS 1.4.6.0.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42164>
Now that we have a unified layout for timestamps, we can implement
timestamp writes on DMA and Compute subchannels.
This also exposes timestamps on non-graphics queues.
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42208>
If it's a variant that we could even possibly construct from NIR, allow
it. We'll legalize these in a second pass. For OpMov, this means it's now
allowed to move darn near anything. We'll need a lowering pass to sort
that out.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42200>
The hardware doesn't have MKVEC.v4i8 in its current form. Instead, it
has a MKVEC.v4i8 that takes two i8's and an i16 and you have to do two
of them in order to build a full MKVEC.v4i8. It also has a MKVEC.v2i16
which does exactly what it says on the tin.
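
As a rough model of the description above, the C sketch below builds a full
four-byte vector from two of the two-i8-plus-i16 operations. The function
names are hypothetical, and the lane order (bytes in the low half, the i16 in
the high half) is an assumption made for illustration, not something taken
from the ISA.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the hardware op: two 8-bit lanes plus one 16-bit
 * lane. Assumed layout: b0 in bits 0-7, b1 in bits 8-15, hi16 in 16-31. */
static uint32_t mkvec_v4i8_hw(uint8_t b0, uint8_t b1, uint16_t hi16)
{
   return (uint32_t)b0 | ((uint32_t)b1 << 8) | ((uint32_t)hi16 << 16);
}

/* A full four-byte MKVEC needs two hardware ops: the first packs the upper
 * two bytes into an i16, the second combines it with the lower two bytes. */
static uint32_t mkvec_v4i8_full(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
   uint16_t hi = (uint16_t)mkvec_v4i8_hw(b2, b3, 0);
   return mkvec_v4i8_hw(b0, b1, hi);
}

/* MKVEC.v2i16 modelled as a plain pack of two 16-bit halves. */
static uint32_t mkvec_v2i16(uint16_t lo, uint16_t hi)
{
   return (uint32_t)lo | ((uint32_t)hi << 16);
}

static void mkvec_example(void)
{
   assert(mkvec_v4i8_full(0x11, 0x22, 0x33, 0x44) == 0x44332211u);
   assert(mkvec_v2i16(0x2211, 0x4433) == 0x44332211u);
}
```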
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42200>