Implement shader-based global blending and pre-multiplied alpha support
for YUV compositing, allowing transparent overlays and alpha-channel
based transparency with RGBA overlays.
Handle pre-multiplied alpha images by un-multiplying their colour
values, so that straight-alpha blending (which is easier to implement)
can be applied.
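
As a rough illustration of the un-multiply step, here is a minimal CPU-side
sketch in C. It is not the shader code from this change; the function name
and the 8-bit integer representation are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical helper: convert one pre-multiplied 8-bit colour channel
 * back to straight alpha by dividing it by the alpha value. */
static uint8_t unpremultiply_channel(uint8_t c, uint8_t a)
{
   if (a == 0)
      return 0; /* fully transparent: the colour is undefined, use 0 */

   /* c = straight * a / 255, so straight = c * 255 / a (rounded). */
   unsigned v = (c * 255u + a / 2u) / a;

   /* Clamp, since invalid input (c > a) would overflow 8 bits. */
   return v > 255u ? 255u : (uint8_t)v;
}
```

Once the colour is back in straight-alpha form, the usual
`src * a + dst * (1 - a)` blend can be used for both kinds of input.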
Thanks nyanmisaka for the help, and for pointing out the difference
between pre-multiplied alpha and straight alpha.
Thanks David Rosca and Benjamin Cheng for improvements to the code and
spotting errors.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/12977
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41090>
Fix typos in the sizes of proj and chroma_proj in the GLSL pseudo-code
comment of cs_create_shader.
Thanks Benjamin Cheng <benjamin.cheng@amd.com> for finding it.
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41090>
Instead of expecting 1 coordinate bit to flip exactly 1 address bit,
allow 1 coordinate bit to flip any number of address bits. If multiple
coordinate bits flip the same address bit, that means all those coordinate
bits are XOR'd.
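
The XOR relationship can be sketched as below. This is only a model of the
idea, assuming a hypothetical table `xor_masks` in which entry i holds the
set of coordinate bits that flip address bit i; none of these names come
from the actual tool.

```c
#include <stdint.h>

/* Parity (XOR of all bits) of a 32-bit value. */
static unsigned parity32(uint32_t v)
{
   v ^= v >> 16;
   v ^= v >> 8;
   v ^= v >> 4;
   v ^= v >> 2;
   v ^= v >> 1;
   return v & 1u;
}

/* Hypothetical model: address bit i is the XOR of every coordinate bit
 * selected by xor_masks[i]. One coordinate bit may appear in several
 * masks, i.e. it may flip several address bits. */
static uint32_t address_from_coord(const uint32_t *xor_masks,
                                   unsigned num_addr_bits, uint32_t coord)
{
   uint32_t addr = 0;

   for (unsigned i = 0; i < num_addr_bits; i++)
      addr |= (uint32_t)parity32(coord & xor_masks[i]) << i;

   return addr;
}
```

With masks `{0x1, 0x3, 0x4}`, coordinate bit 0 flips address bits 0 and 1,
and address bit 1 is the XOR of coordinate bits 0 and 1.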
v2: also print 128bpp
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41431>
It may have been accidentally left in the code.
If there is any doubt about this, then the reason is the same
as accepting screen=NULL in context_create or any other function.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41429>
The sample mask should only be limited to the current sample bit when
using sample rate shading, and the sample shading flag in the multisample
state should be taken into account.
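
One possible reading of this rule, written as a small C sketch. The
function and parameter names are hypothetical, and treating "the shader
runs per sample" and "the multisample state enables sample shading" as two
separate inputs is an assumption, not a description of the driver code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: restrict the sample mask to the current sample
 * only when the fragment shader actually runs at sample rate. */
static uint32_t effective_sample_mask(uint32_t api_mask,
                                      bool shader_runs_per_sample,
                                      bool ms_state_sample_shading,
                                      unsigned sample_id)
{
   if (shader_runs_per_sample || ms_state_sample_shading)
      return api_mask & (1u << sample_id);

   /* Pixel-rate shading: one invocation covers all enabled samples. */
   return api_mask;
}
```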
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41283>
Instead of dirtying the root buffer and re-uploading the whole thing for each
draw where a per-draw value like the draw ID is changed, use a smaller
secondary buffer for per-draw data. We can also skip flushing state for every
individual batched draw and just flush once for the whole draw command.
This may also be useful in the future for handling how sized index buffers from
maintenance5 and null index buffers from maintenance6 work with robustness2,
allowing us to pass through indexed draw parameters and lower the index buffer
read into the shader with bounds checks.
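
The benefit of splitting out per-draw data can be sketched with a toy
model. All structure names, sizes and the byte counter below are invented
for illustration and do not reflect the driver's real layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layouts: a large root buffer and a small per-draw one. */
struct root_data { uint32_t values[64]; };
struct draw_data { uint32_t draw_id; };

struct cmd_state {
   struct root_data root;
   struct draw_data draw;
   bool root_dirty;
   bool draw_dirty;
   uint64_t bytes_uploaded; /* stand-in for the real upload cost */
};

/* Changing the draw ID only dirties the small per-draw buffer. */
static void set_draw_id(struct cmd_state *s, uint32_t id)
{
   if (s->draw.draw_id != id) {
      s->draw.draw_id = id;
      s->draw_dirty = true;
   }
}

/* Upload only what changed since the last flush. */
static void flush_state(struct cmd_state *s)
{
   if (s->root_dirty) {
      s->bytes_uploaded += sizeof(s->root);
      s->root_dirty = false;
   }
   if (s->draw_dirty) {
      s->bytes_uploaded += sizeof(s->draw);
      s->draw_dirty = false;
   }
}
```

In this model a draw ID change costs 4 bytes of upload instead of the
256 bytes of the whole root buffer.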
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41399>
`si_init_gfx_screen` already initializes screen state functions, so
avoid doing it twice. This was regressed by d1c57f742e.
Detected by LSan when applications using vaapi exit.
Fixes: d1c57f742e ("radeonsi/gfx: add si_gfx_screen.c")
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: llyyr <llyyr.public@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41442>
List primitives were handled by geometry unroll as if restart were always
enabled, telling the unroll shader to restart the primitive at the usual
restart index. This produced invalid geometry for list primitives where
restart is disabled and the restart index is used as a valid index.
Instead, always force the restart index for unrolling to UINT32_MAX when
restart is disabled, and refactor the index promotion logic accordingly.
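
A minimal sketch of the selection logic described above, assuming indices
are promoted to 32 bits before being compared. The function names are
hypothetical; the per-size restart values are the usual all-ones values
for 8-, 16- and 32-bit indices.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: pick the restart index handed to the unroll
 * path, which is assumed to compare against 32-bit promoted indices. */
static uint32_t unroll_restart_index(bool restart_enabled,
                                     unsigned index_size_bytes)
{
   if (!restart_enabled)
      return UINT32_MAX; /* never matches a promoted 8/16-bit index */

   switch (index_size_bytes) {
   case 1:
      return 0xFFu;
   case 2:
      return 0xFFFFu;
   default:
      return UINT32_MAX;
   }
}

/* True if a promoted index would restart the primitive. */
static bool index_triggers_restart(uint32_t promoted_index,
                                   bool restart_enabled,
                                   unsigned index_size_bytes)
{
   return promoted_index ==
          unroll_restart_index(restart_enabled, index_size_bytes);
}
```

With restart disabled, a 16-bit index of 0xFFFF is promoted to 0x0000FFFF
and no longer compares equal to the restart value.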
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41333>
simd_ballot/quad_any/quad_all (and probably simd_any/simd_all) appear to
generally be broken within conditional blocks, not just with simd_is_first.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41186>
lower_boolean_reduce only works if the number of components is 1, and even
asserts on this in its prologue. Otherwise, given a boolean vector type, it
may produce output using ballot/vote with a boolean vector input.
Acked-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41186>
v3dv_CmdFillBuffer was passing only the user-supplied dstOffset to
meta_fill_buffer, ignoring the destination VkBuffer's mem_offset.
When several VkBuffers share one VkDeviceMemory at different offsets
(sub-allocation), the fill landed on whichever VkBuffer was
bound at offset 0 of the memory object instead of the requested one.
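
The offset arithmetic involved can be sketched as follows. The struct and
function here are simplified stand-ins invented for the example, not the
real v3dv types; only the idea of adding the buffer's mem_offset to the
user-supplied dstOffset comes from the description above.

```c
#include <stdint.h>

/* Hypothetical, simplified view of a buffer bound into device memory. */
struct example_buffer {
   uint64_t mem_offset; /* where the buffer starts inside the memory */
   uint64_t size;
};

/* Offset of the fill inside the backing memory object: the binding
 * offset of the buffer plus the offset within the buffer. */
static uint64_t fill_offset_in_memory(const struct example_buffer *buf,
                                      uint64_t dst_offset)
{
   return buf->mem_offset + dst_offset;
}
```

For a buffer bound at offset 4096, a fill at dstOffset 16 should start at
byte 4112 of the memory object, not at byte 16.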
Fixes: 5ed78d91fe ("v3dv: implement vkCmdFillBuffer")
Assisted-by: Claude Opus 4.7
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41436>
Drivers using blorp on ELK platforms don't need the special
color->depth conversion path that needs 64-bit floating-point math.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Allen Ballway <ballway@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39341>
This shouldn't be necessary because SDMA can detile the image just fine;
only buffer->image and image->image need to fall back.
It just works on GFX10+ because RADV is using NBC views, and I think
it works on e.g. VEGA10 just by luck due to different
swizzles/alignments.
Fixes: 3d803d7a2e ("radv: Use compute copy for emulated formats")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41384>