This doesn't work with NIR_DEBUG=serialize or NIR_DEBUG=clone.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37573>
When NIR_DEBUG=serialize or NIR_DEBUG=clone is used, NIR_PASS recreates
nir_function_impl and nir_variable objects, causing use-after-free since
radv_nir_lower_rt_abi() keeps pointers to those in local variables.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37573>
When NIR_DEBUG=serialize or NIR_DEBUG=clone is used, NIR_PASS recreates
nir_function_impl and nir_variable objects, causing use-after-free since
ac_nir_lower_ngg_nogs() keeps pointers to those in local variables.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13946
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37573>
Feedback requires transfer capability, so we must skip feedback cmd
pool initialization on incompatible queue families. Meanwhile, use
pool_handle for all validity check needed.
- fence and semaphore feedback: skip feedback cmd alloc and record when
pool_handle is VK_NULL_HANDLE
- event feedback: not affected as we patch in-place upon recording
- query feedback: assert the feedback cmd alloc is on supported queue
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38016>
The framebuffer dimension exposed to apps is still 16k but since the
driver allows 32k image on GFX12+, meta operations might perform
operations (like a copy) using graphics.
While we are at it, use the correct bitfield for setting BR_X/BR_Y on
GFX12.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37974>
enums2names.py is only uses in one place. I propose to remove the -I
argument that is not strictly necessary as we can already get the
header name from the `-H` argument.
That modification is motivated by the need to help ninja-to-soong to
generate proper rule for the Android build system.
ninja-to-soong can't differenciate output file location and a string
matching the output file name.
Ref #14072
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37785>
Apparently various tessellation parameters come specified from
TESS_EVAL stage in GLSL while they come from the TESS_CTRL stage in
HLSL.
We switch to store the tesselation params more like shader_info with 0
values for unspecified fields. That let's us merge it with a simple OR
with values from from tcs/tes and the resulting merge can be used for
state programming.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a91e0e0d61 ("brw: add support for separate tessellation shader compilation")
Fixes: 50fd669294 ("anv: prep work for separate tessellation shaders")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37979>
It must not be larger than maxInlineUniformBlockSize.
Fixes VKCTS 1.4.4.0's
dEQP-VK.api.maintenance3_check.support_count_inline_uniform_block*.
Cc: mesa-stable
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38002>
Based on RADV.
The Vulkan spec says:
"If bindingCount is zero or if this structure is not included in
the pNext chain, the VkDescriptorBindingFlags for each descriptor
set layout binding is considered to be zero. Otherwise, the
descriptor set layout binding at
VkDescriptorSetLayoutCreateInfo::pBindings[i] uses the flags in
pBindingFlags[i]."
Fixes dEQP-VK.api.maintenance3_check.* in VKCTS 1.4.4.0.
Cc: mesa-stable
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38002>
Introduces an opt pass that attempts to optimize
load_barycentric_at_{sample,offset} with simpler load_barycentric_*
equivalents where possible, and optionally lowers
load_barycentric_at_sample to load_barycentric_at_offset with a position
derived from the sample ID instead.
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37658>
The subgroup lowering may generate new fp64 vector operations, so
ensure that those are lowered before calling nir_lower_doubles().
Issue spotted by Georg Lehmann.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38003>
hk_heap is called during command buffer recording, which may be
concurrent, so writing dev->heap without synchronization is a data race.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Fixes: 5bc8284816 ("hk: add Vulkan driver for Apple GPUs")
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37973>
lowering bitsize before lowering idiv is silly, since then it forces us
down the software int32 division path instead of the much faster
int8/int16 lowered path. Relevant CTS tests:
dEQP-VK.spirv_assembly.type.scalar.i16.div_comp,
dEQP-VK.spirv_assembly.type.scalar.i8.rem_comp,
Go from:
SIMD8 shader: 46 instructions. 1 loops. 4716 cycles. 0:0 spills:fills
SIMD8 shader: 1008 instructions. 0 loops. 3600 cycles. 0:0 spills:fills, 8 sends
to:
SIMD8 shader: 17 instructions. 1 loops. 2556 cycles. 0:0 spills:fills
SIMD8 shader: 464 instructions. 0 loops. 1394 cycles. 0:0 spills:fills, 8 sends
No stats change on fossil-db (which has very little int8/int16 and even
less integer division, apparently).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37966>
Skip waiting/signaling on semaphores with stages not related
to a given subqueue
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37810>
- rename vk_stage_to_subqueue_mask -> vk_stages_to_subqueue_mask
- handle stage masks instead of single stages.
- Add which sync scope it is reading to better reason with the mask semantics.
- Handle ALL_COMMANDS as well as TOP/BOTTOM (using sync scopes)
- add timestamp utility vk_stage_to_timestamp_subqueue_mask
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37810>
Until now, the driver has been using a single set for internal state
while leaving maxBoundDescriptorSets to the remaining 15.
This gives us no room for optimizations of driver sets, which might
become an issue in the future.
To remedy this, we therefore reduce maxBoundDescriptorSets to 7. This
aligns with the proprietary driver and gives us the space to optimize
the driver sets.
We might increase this in the future if we see that we don't need all
the driver sets we now reserve.
Reviewed-by: John Anthony <john.anthony@arm.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37978>