Instead of expecting just 1 address bit to be flipped by 1 coordinate bit,
expect any address bits to be flipped by 1 coordinate bit. If multiple
coordinate bits flip the same address bit, that means all those coordinate
bits are XOR'd.
v2: also print 128bpp
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41431>
It may have been accidentally left in the code.
If there is any doubt about this, then the reason is the same
as accepting screen=NULL in context_create or any other function.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41429>
Sample mask should only be limited to current sample bit when using
sample rate shading, and sample shading flag in multisample state
should be considered.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41283>
Instead of dirtying the root buffer and re-uploading the whole thing for each
draw where a per-draw value like the draw ID is changed, use a smaller
secondary buffer for per-draw data. We can also skip flushing state for every
indiviual batched draw and just flush once for the whole draw command.
This may also be useful in the future for handling how sized index buffers from
maintenance5 and null index buffers from maintenance6 work with robustness2,
allowing us to pass through indexed draw parameters and lower the index buffer
read into the shader with bounds checks.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41399>
`si_init_gfx_screen` already initializes screen state functions, so
avoid doing it twice. This was regressed by d1c57f742e.
Detected by LSan when applications using vaapi exit.
Fixes: d1c57f742e ("radeonsi/gfx: add si_gfx_screen.c")
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: llyyr <llyyr.public@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41442>
List primitives would be handled by geometry unroll as if restart is always
enabled, telling the unroll shader to restart the primitive at the usual
restart index. This would produce invalid geometry for list primitives where
restart is disabled and the restart index is used as a valid index.
Instead, always force the restart index for unrolling to UINT32_MAX when
restart is disabled, and refactor the index promotion logic accordingly.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41333>
simd_ballot/quad_any/quad_all (and probably simd_any/simd_all) appear to
generally be broken within conditional blocks, not just with simd_is_first.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41186>
lower_boolean_reduce only works if the number of components is 1, and even
asserts on this in its prologue. Otherwise, given a boolean vector type, it
may produce output using ballot/vote with a boolean vector input.
Acked-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41186>
v3dv_CmdFillBuffer was passing only the user-supplied dstOffset to
meta_fill_buffer, ignoring the destination VkBuffer's mem_offset.
When several VkBuffers share one VkDeviceMemory at different offsets
(sub-allocation) the fill landed on whichever VkBuffer was
bound at offset 0 of the memory object instead of the requested one.
Fixes: 5ed78d91fe ("v3dv: implement vkCmdFillBuffer")
Assisted-by: Claude Opus 4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41436>
Drivers using blorp on ELK platforms don't need the special
color->depth conversion path that needs 64bit floating point math.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Allen Ballway <ballway@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39341>
This shouldn't be necessary because SDMA can detile the image just fine,
only buffer->image and image->image need to fallback.
It just works on GFX10+ because RADV is using NBC views, and I think
it works on eg. VEGA10 just by luck due to different
swizzles/alignments.
Fixes: 3d803d7a2e ("radv: Use compute copy for emulated formats")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41384>
KHR-Single-GL46.arrays_of_arrays_gl.SubroutineFunctionCalls2 subtests
pass, but some are slow and we keep skipping them.
copy_image.* now takes 2:30 of runtime on my T14s and has some interesting
fails in rgb9e5, though a750 CI seems to pass.
texture_swizzle.functional* now takes 6.5s of runtime on my T14s.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41245>
pre_chain.rp_trace usage relied on a bunch of bad assumptions
and together with u_trace_move didn't cause issues until
u_trace is started to be refactored. Fixing those bad assumptions
and correctly initializing and freeing pre_chain.rp_trace
also requires fixing u_trace_move at the same time.
u_trace_move fixes:
- If dst had trace chunks in it - we may have leaked them.
- The correct list move pattern is "list_replace -> list_inithead"
not "list_replace -> list_delinit"
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41390>
Format draw and draw_indexed Perfetto events with their vertex count.
For draw_indirect and draw_indexed_indirect, include the draw count
when indirect tracing is enabled (MESA_GPU_TRACES=indirects), otherwise
fall back to the static name.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41374>