Similar to nir_lower_alu_width(), the callback can return the
desired number of components for a phi, or 0 for no lowering.
The previous behavior of nir_lower_phis_to_scalar() with lower_all=true
is now available via nir_lower_all_phis_to_scalar(), while the previous
behavior with lower_all=false now corresponds to nir_lower_phis_to_scalar()
with a NULL callback.
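
As a rough illustration of the callback contract, here is a self-contained
sketch. The types and helpers (phi_info, phi_width_cb, scalarize_64bit_phis,
lowered_width) are hypothetical stand-ins, not the real NIR structures or
signatures; only the return convention (desired component count, or 0 for
no lowering) is taken from this message. The NULL-callback heuristic is not
modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a phi; not the real NIR phi instruction. */
struct phi_info {
   unsigned num_components;
   unsigned bit_size;
};

/* Hypothetical callback type: returns the desired number of components
 * for the phi, or 0 to request no lowering. */
typedef unsigned (*phi_width_cb)(const struct phi_info *phi, void *data);

/* Example policy (an assumption): scalarize 64-bit phis, leave the rest. */
static unsigned
scalarize_64bit_phis(const struct phi_info *phi, void *data)
{
   (void)data;
   return phi->bit_size == 64 ? 1 : 0;
}

/* Width the phi would end up with under the given callback. A result of
 * 0, or one that is not smaller than the current width, leaves the phi
 * unchanged. */
static unsigned
lowered_width(const struct phi_info *phi, phi_width_cb cb, void *data)
{
   assert(phi != NULL && cb != NULL);
   unsigned want = cb(phi, data);
   if (want == 0 || want >= phi->num_components)
      return phi->num_components;
   return want;
}
```

With this policy, a 4-component 64-bit phi would be split into scalars while
a 4-component 32-bit phi would be kept as a vec4.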
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>
This check causes unnecessary overhead and can be replaced by simply
checking whether a phi_src is from a loop continue block.
Except for rare edge cases, the result will be the same.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>
With the check for the cap gone, no driver will expose
ARB_shader_clock. Unfortunately, the CI doesn't catch this
because it doesn't provide expectations for whether a test should
pass or be skipped. In this case
spec@arb_shader_clock@execution@clock
spec@arb_shader_clock@execution@clock2x32
went from pass to skip. (Tested on r600, but on radeonsi one
can also see that the extension ARB_shader_clock is no longer
available).
Adding the test for the cap back in fixes this.
Fixes: 2ce201707e (Add support for EXT_shader_clock)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35992>
Add a new field userq_num_hqds to drm_amdgpu_info_hw_ip to expose the
number of available hardware queue descriptors (HQDs) for user queues.
This allows userspace to query the maximum number of user queues that
can be created for a particular IP block.
Link to the driver-side patch:
https://lists.freedesktop.org/archives/amd-gfx/2025-June/126686.html
v2: we should also put userq_num_hqds into radeon_info and
print it where other fields are printed. (Marek Olšák)
v3: rename num_userqs to num_queue_slots
and add print log in ac_print_gpu_info. (Marek Olšák)
v4: rename userq_num_hqds to userq_num_slots in hw_ip_info,
and update the hw information (Marek Olšák)
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35850>
Traces jobs upload their results at the end, making them incompatible
with the current design of CI-tron which doesn't allow internet access
for security reasons, so they are not included for now.
We're working on a solution for controlled access to specific domains,
and will add the traces jobs once that's ready.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35215>
It's done later in nir_lower_io_passes only for shader stages not
supporting indirect access.
Unfortunately, we have to add a hack into nir_lower_io_passes to get rid of
output loads. A later commit will remove it.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35945>
If the last block is empty, nir_block_last_instr returns NULL, which
sets the cursor to NULL, which crashes.
I think this can't crash currently because if xfb is present, there is
always at least 1 output store in the last block due to
lower_io_vars_to_temporaries, but that won't be true after we stop
calling it in a later commit.
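
A minimal sketch of the failure mode and one possible guard. The types and
helpers here (struct block, struct cursor, block_last_instr,
cursor_at_block_end) are hypothetical stand-ins, not the real NIR block and
cursor API, and the actual fix in this commit may be shaped differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for an instruction and a block. */
struct instr { int id; };

struct block {
   struct instr *instrs;
   size_t num_instrs;
};

/* Mirrors the behavior described above: NULL for an empty block. */
static struct instr *
block_last_instr(struct block *b)
{
   return b->num_instrs ? &b->instrs[b->num_instrs - 1] : NULL;
}

enum cursor_kind { CURSOR_AFTER_INSTR, CURSOR_AFTER_BLOCK };

struct cursor {
   enum cursor_kind kind;
   struct instr *instr;   /* valid only for CURSOR_AFTER_INSTR */
   struct block *block;
};

/* Place the cursor after the last instruction when there is one, and
 * fall back to the end of the block when the block is empty, so the
 * cursor never points at a NULL instruction. */
static struct cursor
cursor_at_block_end(struct block *b)
{
   struct instr *last = block_last_instr(b);
   struct cursor c = { CURSOR_AFTER_BLOCK, NULL, b };
   if (last) {
      c.kind = CURSOR_AFTER_INSTR;
      c.instr = last;
   }
   return c;
}
```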
Fixes: fa9cee4247 - glsl: implement lower_xfb_varying() as a NIR pass
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35945>
The GLSL compiler has always lowered inputs to temps for VS and GS, so
indirect VS and GS inputs are completely untested and broken in a lot of
drivers. Exclude them from driver support because the GLSL compiler will
no longer do that lowering unconditionally.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35945>
These drivers set lower_all_io_to_temps = true, which means all indirect
access is always lowered except TCS, which is skipped by
nir_lower_io_vars_to_temporaries. Based on that, these drivers have never
received indirect IO for non-TCS shaders.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35945>
From the Vulkan spec:
`If pColorAttachmentLocations is NULL, it is
equivalent to setting each element to its index
within the array.`
Use similar logic to what we do in
CmdSetRenderingInputAttachmentIndices to handle
this behaviour properly.
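
The identity fallback quoted from the spec can be sketched as follows. The
function name, parameter names, and MAX_COLOR_ATTACHMENTS limit are
hypothetical and chosen for illustration; this is not the driver's actual
code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_COLOR_ATTACHMENTS 8 /* hypothetical limit for this sketch */

/* Resolve the effective location of each color attachment. When the
 * application passes NULL for the locations array, every element maps
 * to its own index, as the spec text above requires. */
static void
resolve_color_attachment_locations(uint32_t count,
                                   const uint32_t *locations_or_null,
                                   uint32_t *out)
{
   assert(count <= MAX_COLOR_ATTACHMENTS);
   for (uint32_t i = 0; i < count; i++)
      out[i] = locations_or_null ? locations_or_null[i] : i;
}
```

Storing the resolved array rather than the raw pointer means later code does
not need to special-case NULL.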
Signed-off-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35948>
using a screen method for this is broken since the value can change
before it is flushed; it must be passed along with the methods that use it
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35866>
it's possible for multiple user semaphores to be signaled in one batch,
and these all have the same mechanics as wait semaphores, which means
they unfortunately need their own submit in order to preserve ownership
when resetting the batch state
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35866>
functionally this is the same as other types of timeline semaphores, but
it is not actually the same as other types of timeline semaphores, e.g.,
in vulkan it would be VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_D3D12_FENCE_BIT
whereas other types of timeline semaphores would have different handle types
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35866>