Small PS have their VGPR usage equal to the number of input VGPRs,
and this reduces it.
4 input VGPRs removed in most cases.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41226>
Small PS have their VGPR usage equal to the number of input VGPRs,
and this reduces it.
1 input VGPR removed from the PS prolog in most cases.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41226>
Small PS have their VGPR usage equal to the number of input VGPRs,
and this reduces it.
2 input VGPRs removed from the PS prolog in most cases.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41226>
Small PS have their VGPR usage equal to the number of input VGPRs,
and this reduces it.
4 input VGPRs removed in most cases.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41226>
This was missing and it slightly improves code generation.
8 is always correct with maximum sample shading.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41226>
The VGPR indices will be dynamic. This replaces hardcoded VGPR indices
with enums in the PS prolog key.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41226>
Color interpolation (INTERP_MODE_NONE) has unknown barycentrics
and it could be flat shading at runtime.
It's a problem when shader_info is expected to match what's actually
used.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41226>
Add shader-based global blending and pre-multiplied alpha support to
YUV compositing, allowing transparent overlays and alpha-channel-based
transparency with RGBA overlays.
Handle pre-multiplied alpha images by un-multiplying their colours, so
that straight-alpha blending (which is easier to implement) can be
applied.
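As a rough illustration of the un-multiply step, here is a minimal CPU-side
sketch in C. It is not the shader code from this change: the struct and
function names are hypothetical, it works on normalized RGB values rather
than YUV, and it assumes an opaque destination.

```c
/* Hypothetical RGBA pixel with components in [0, 1]. */
struct rgba {
   float r, g, b, a;
};

/* Convert pre-multiplied alpha to straight alpha by dividing the colour
 * channels by alpha. Fully transparent pixels are left unchanged to
 * avoid dividing by zero. */
static struct rgba unpremultiply(struct rgba c)
{
   if (c.a > 0.0f) {
      c.r /= c.a;
      c.g /= c.a;
      c.b /= c.a;
   }
   return c;
}

/* Straight-alpha "over" blend of src onto an opaque dst. The per-pixel
 * alpha is scaled by a global alpha factor (assumed to be in [0, 1]). */
static struct rgba blend_straight(struct rgba src, struct rgba dst,
                                  float global_alpha)
{
   float a = src.a * global_alpha;
   struct rgba out = {
      src.r * a + dst.r * (1.0f - a),
      src.g * a + dst.g * (1.0f - a),
      src.b * a + dst.b * (1.0f - a),
      1.0f,
   };
   return out;
}
```

With this split, a pre-multiplied overlay only needs the extra
unpremultiply() call before it goes through the same straight-alpha blend
as every other overlay.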
Thanks nyanmisaka for the help, and for pointing out the difference
between pre-multiplied alpha and straight alpha.
Thanks David Rosca and Benjamin Cheng for improvements to the code and
spotting errors.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/12977
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41090>
Fix typos in the sizes of proj and chroma_proj in the GLSL pseudo-code
comment in cs_create_shader.
Thanks Benjamin Cheng <benjamin.cheng@amd.com> for finding them.
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41090>
It may have been accidentally left in the code.
If there is any doubt about this, then the reason is the same
as accepting screen=NULL in context_create or any other function.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41429>
`si_init_gfx_screen` already initializes screen state functions, so
avoid doing it twice. This was regressed by d1c57f742e.
Detected by LSan when applications using vaapi exit.
Fixes: d1c57f742e ("radeonsi/gfx: add si_gfx_screen.c")
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: llyyr <llyyr.public@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41442>
KHR-Single-GL46.arrays_of_arrays_gl.SubroutineFunctionCalls2 subtests
pass, but some are slow and we keep skipping them.
copy_image.* now takes 2:30 of runtime on my T14s and has some interesting
failures in rgb9e5, though a750 CI seems to pass.
texture_swizzle.functional* now takes 6.5s of runtime on my T14s.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41245>