this existed due to the min/max, per the comment. now that we don't do min/max,
the whole routine is NaN correct so the fixup is pointless.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>
everybody has abs on fmul, not everyone has abs on bcsel. should help agx and
bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>
this does two things:
* ignores sign of negative numbers which let us play fast and loose later in th
series
* avoids an expensive fsign instruction in favour of a cheap bitwise op
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>
worse in terms of NIR instruction count but lets the fabs fold easier. (on agx,
which has fabs on comparisons and fmul but not on bcsel. should be no worse if
ISA has fabs on all 3.)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>
just implement what the comment says, don't be clever. the clever thing is worse
on all architectures i'm familiar with, because the fdiv will turn into
fmul+frcp.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>
PPS header should use pps_loop_filter_across_slices_enabled_flag instead
of slice_loop_filter_across_slices_enabled_flag according to HEVC SPEC.
Slice header should also use pps_loop_filter_across_slices_enabled_flag
as one of the condition to determine if slice flag needs to be present.
V2: Apply pps_loop_filter_across_slices_enabled_flag to loop_filter as well
So modify loop_filter_across_slices_enabled value to be pps one instead
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31064>
There are two bugs:
- VK_KHR_maintenance5 added VkBufferUsageFlags2CreateInfoKHR, so
checking for pCreateInfo->usage is incomplete
- this was also missing the usage flag for descriptor buffer with samplers
This fixes recent VKCTS coverage in
dEQP-VK.binding_model.descriptor_buffer.*.
Fixes: 059391b631 ("radv: use 32bit va range for sparse descriptor buffers")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31054>
Upcoming more strict context vs glsl version checks will fail otherwise.
Since standalone compiler requires ARB_ES3_2_compatibility that requires
GL 4.5 we simply set that as the context version.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31061>
Otherwise we can end up with uninitialized values, this fixes following
valgrind warning:
==71705== Uninitialised byte(s) found during client check request
==71705== at 0x73B6DB8: util_bitpack_uint (bitpack_helpers.h:55)
==71705== by 0x73B6DB8: GFX11_PIPE_CONTROL_pack (gen11_pack.h:19885)
==71705== by 0x73B6DB8: iris_emit_raw_pipe_control (iris_state.c:10022)
==71705== by 0x6F93386: iris_emit_pipe_control_write (iris_pipe_control.c:97)
==71705== by 0x6FBCCAA: iris_resource_update_indirect_color (iris_resolve.c:1241)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30990>
Otherwise we can end up with uninitialized values, this fixes following
valgrind warning:
==31283== Uninitialised byte(s) found during client check request
==31283== at 0x503E4DE: anv_batch_bo_finish (anv_batch_chain.c:345)
==31283== by 0x504220A: anv_cmd_buffer_end_batch_buffer (anv_batch_chain.c:1103)
==31283== by 0x55A0E4F: end_command_buffer (genX_cmd_buffer.c:3455)
==31283== by 0x55A0E82: gfx11_EndCommandBuffer (genX_cmd_buffer.c:3466)
==31283== by 0x11233A: ??? (in /usr/bin/vkcube)
==31283== by 0x10BDEE: ??? (in /usr/bin/vkcube)
==31283== by 0x49B5149: (below main) (in /usr/lib64/libc.so.6)
==31283== Address 0xc10c4d8 is 1,240 bytes inside a block of size 8,192 client-defined
==31283== at 0x5036EF6: anv_bo_pool_alloc (anv_allocator.c:1284)
==31283== by 0x503E0E1: anv_batch_bo_create (anv_batch_chain.c:262)
==31283== by 0x5040D3F: anv_cmd_buffer_init_batch_bo_chain (anv_batch_chain.c:868)
==31283== by 0x504F9C1: anv_create_cmd_buffer (anv_cmd_buffer.c:147)
==31283== by 0x6B718C4: vk_common_AllocateCommandBuffers (vk_command_pool.c:206)
==31283== by 0x4FB06B2: vkAllocateCommandBuffers (trampoline.c:1996)
==31283== by 0x111E6B: ??? (in /usr/bin/vkcube)
==31283== by 0x10BDEE: ??? (in /usr/bin/vkcube)
==31283== by 0x49B5149: (below main) (in /usr/lib64/libc.so.6)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30990>
If the next renderpass uses the same depth attachment, clears it
with generic clear - ZPASS_DONE may somehow read stale values that
are apparently invalidated by CCU_INVALIDATE_DEPTH.
Fixes:
dEQP-VK.fragment_operations.early_fragment.sample_count_early_fragment_tests_depth_alpha_to_coverage_samples_2_maintenance5
dEQP-VK.fragment_operations.early_fragment.sample_count_early_fragment_tests_depth_alpha_to_coverage_samples_4_maintenance5
dEQP-VK.fragment_operations.early_fragment.sample_count_early_fragment_tests_depth_samples_2_maintenance5
dEQP-VK.fragment_operations.early_fragment.sample_count_early_fragment_tests_depth_samples_4_maintenance5
When running them with TU_DEBUG=sysmem
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30858>
Some NIR passes (such as nir_opt_varyings) rely on having the
XFB info in explicit I/O intrinsics. If we want to use those,
we need to add this info.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28676>
This code will be shared between RADV and RadeonSI.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28676>
Per description of VkPipelineShaderStageCreateFlags
```
VK_PIPELINE_SHADER_STAGE_CREATE_REQUIRE_FULL_SUBGROUPS_BIT specifies
that the subgroup sizes must be launched with all invocations active in
the task, mesh, or compute stage.
```
Future CTS tests will use that.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31023>
Certain AOSP targets feature a glibc or musl-based build,
where (__ANDROID__) isn't defined, but:
- ANDROID is defined
- the relevant headers are present.
For such builds, buffer_handle_t shouldn't be defined
internally.
Reviewed-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31030>
When there are no VS outputs, we expect that the drivers set
the LS-HS vertex stride to zero, which will produce the
same result as no_inputs_in_lds did.
Remove the unnecessary code path from the output lowering.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30962>