We recently made some effort to reduce the LDS use of TCS:
The lowering no longer uses the same output location mapping when
storing TCS outputs to LDS and VRAM. This means that the same
patch will use a different amount of LDS and VRAM.
Therefore, we need to properly calculate the patch size in VRAM
when determining the number of output patches.
Fixes: 0e481a4adc
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28739>
This better describes what it actually is.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28764>
Add a new float16_hi_shaded_mask to keep track of which PS input
slots use their high 16 bits, based on the high_16bits of the
NIR IO semantics. Then, set ATTR1_VALID accordingly.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28764>
Instead of taking a lot of booleans, refactor offset_to_ps_input
to take an enum which represents the PS input type. This helps
code readability as we keep adding more and more input types.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28764>
Enabling FP16_INTERP_MODE makes no sense for other types of inputs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28764>
Apparently, nir_lower_io leaves dead code in shaders, which
prevented us from deleting the IO variables properly.
Fixes: dbfb96f08f
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28764>
For PS epilogs, this isn't necessary because spi_shader_col_format is
already cleared. This will help for implementing color attachment
remapping because colors_written is always the original mapping.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28781>
KHR_video_encode_queue is dependent on KHR_video_queue.
Not exposing this makes encode-only usecases not functional,
eg. SteamVR Link.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28779>
b-frames can be considered as reference, so the NAL type
should refer to reference type and either RASL or TRAIL
depending on the irap_pic_flag.
Fixes: 72f52329c ("vulkan/video: add a nal_unit lookup for hevc")
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28657>
For purposes of address reports, it makes far more sense to report the
actually bound range rather than the full bo_size. RMV code used
effective size, so reproduce that here.
No other code looks at bo_size, so this should be quite safe.
Also fixes a theoretical correctness issue where plane aspect for
DISJOINT image was not passed to GetImageMemoryRequirements2 in internal
code.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10996
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28718>
This prepares ac_nir_lower_ngg_ms for the possibility of packing
two 16-bit outputs into the same 32-bit output slot, by handling
the high_16bits flag in NIR IO semantics.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28704>
Previously, it would hit an assertion when used with scalarized
IO, because the scalarization will split the primitive indices
store into smaller, per-component stores.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28704>
This was an NV_mesh_shader-only feature and we should have already
removed it. We don't want to carry it forward anymore, because it
would needlessly complicate implementing new features.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28704>
Checking for the lib flags is unnecessary because in this case
next_stage is NONE and it's already handled correctly by the helper.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28672>
Store the pipeline layout in radv_graphics_pipeline to simplify the
import. This will also allow us to generate a graphics pipeline key
from pCreateInfo more easily.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28672>
Try to use v_fma_mix{lo,hi}_f16 if possible instead of v_cvt_pkrtz_f16_f32.
To ensure correct rounding we have to make sure that the fp16 rounding mode
can be rtz first.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28670>
Otherwise, the returned VA from vkGetBufferDeviceAddress() or via
VK_EXT_device_address_binding_report doesn't match and applications
would have to mask out.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28652>
Since DGC preprocessing for IBO is supported, the driver generates
an indexed indirect draw but SQTT markers were missing and this
introduced complete non-sense in RGP captures.
Fixes: e59a16bbb8 ("radv: use an indirect draw when IBO isn't updated as part of DGC")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28710>