Commit graph

215870 commits

Author SHA1 Message Date
antonino
cd269ebfa2 nir: allow using sysvals in nir_lower_clip
By passing a `NULL` pointer for the `ucp_enable` argument the
value will be loaded from a sysval
2025-12-15 23:30:56 +01:00
antonino
0068da6f45 nir: add clip_plane_enable sysval 2025-12-15 23:30:27 +01:00
antonino
e564c97a22 zink: advertise fragment_color_clamped when precompiling
Zink will do color clamp lowering internally in order to handle
precompilation so this cap needs to be advertised.
2025-12-15 23:30:23 +01:00
antonino
532661d5e1 zink: handle color clamp lowering 2025-12-15 23:30:15 +01:00
antonino
3a51333e99 zink: add a lowering pass for st sysvals
The uber shader relies on a number of sysvals being lowered to push
constants in order to dynamically control the behaviour of some
lowering passes.

Add a pass to perform such lowering.
2025-12-15 23:30:08 +01:00
antonino
e5adbb9769 nir: sysval to disable/enable clamp_color lowering
Allow using `nir_intrinsic_load_clamp_color_enabled` to control the
color clamp lowering pass.
2025-12-15 23:29:40 +01:00
antonino
fcf9161909 nir: add clamp_color_enabled intrinsic 2025-12-15 23:29:12 +01:00
antonino
4f4ca234cc nir: don't pass shader in nir_lower_clamp_color_outputs
nir_builder contains a pointer to the shader so there is no point in
passing it as an extra argument.
2025-12-15 23:29:08 +01:00
antonino
4581de5886 zink: send the st key to shaders as a push constant
Subsequent commits rely on the key being available in push constants.
2025-12-15 23:29:01 +01:00
antonino
f2ecf95fa7 zink: uber shaders logic
Introduce the logic to implement uber shaders.

The way variants are handled changes significantly: an uber program is
expected to be compiled asynchronously and is used whenever possible.

Specialized variant shaders are compiled asynchronously, though they
might be compiled synchronously if the uber program can't be used.

Each variant is a separate program as that simplifies gpl/obj caching.

A new key is introduced, the st_key, that keeps track of the state of
the features emulated by the uber shader.

This is split into a dynamic part, always sent through push constants, and
a more compact part that is used as a key for caching optimized
variants.
2025-12-15 22:44:55 +01:00
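The st_key split described in f2ecf95fa7 above can be pictured with a minimal C sketch. Everything in it is hypothetical: the struct, its fields, the bit layout, and the function name are invented for illustration and are not taken from zink. It only shows the general idea of keeping a wider dynamic state block (the kind of data that could be uploaded as push constants) next to a compact integer derived from it (the kind of value that could key a cache of optimized variants).

```c
#include <stdint.h>

/* Hypothetical dynamic state block: one field per emulated feature.
 * In this sketch it stands in for data uploaded as push constants. */
struct example_st_key_dynamic {
   uint32_t clip_plane_enable; /* bitmask, one bit per user clip plane */
   uint32_t clamp_color;       /* non-zero when color clamping is on */
   uint32_t flat_flags;        /* non-zero when flat shading is on */
};

/* Hypothetical compact form: pack the same state into 32 bits so it can
 * be compared and hashed cheaply when looking up an optimized variant.
 * Bits 0-7: clip planes, bit 8: clamp_color, bit 9: flat_flags. */
static uint32_t
example_st_key_compact(const struct example_st_key_dynamic *dyn)
{
   uint32_t key = dyn->clip_plane_enable & 0xffu;
   key |= (dyn->clamp_color ? 1u : 0u) << 8;
   key |= (dyn->flat_flags ? 1u : 0u) << 9;
   return key;
}
```

Two draws whose dynamic blocks pack to the same compact value could share one optimized variant in such a scheme, while the uber program reads the dynamic block directly.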
Anna Maniscalco
446e49f145 zink: prog->base.uses_shobj instead of per screen bool
Whether a prog can use shobj can depend on more things than
screen->info.have_EXT_shader_object so use the prog specific bool.
2025-12-10 23:27:36 +01:00
Anna Maniscalco
a363ec306d zink: refactor get_shader_module_for_stage_optimal
Factor out key calculation.
2025-12-10 23:27:36 +01:00
antonino
2ffd1211de zink: implement passing sys values as push constants
Lower sysvals for `flat_flags` and `pv_last_vert` as push constants.

This allows changing them dynamically without recompiling shaders, which
is necessary for uber shaders.
2025-12-10 23:27:36 +01:00
antonino
e94891fa30 zink: fix crash in replicate_derefs
Only call `nir_src_as_deref` when needed to avoid crashes.
2025-12-10 23:27:36 +01:00
Michael Cheng
8ba197c9ef anv: Switch shaders to dedicated VMA allocator
Switched to the new VMA allocator that provides explicit GPU VA
control via util_vma_heap.

This is architectural preparation for ray tracing capture/replay,
which requires the ability to reserve and allocate shaders at specific
VAs. The state pool's free-list design makes VA reservation difficult
to add, while the new chunk allocator is designed for explicit VA
management from the ground up.

Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38869>
2025-12-10 20:32:10 +00:00
Michael Cheng
1fa327ac32 anv: Add VMA allocator for shader binaries
Introduce a VMA-first chunk allocator for shader binaries to eventually
replace the anv_state_pool-based implementation. This allocator works
directly with GPU virtual addresses through util_vma_heap, making the
virtual address space an explicit resource managed by ANV.

No functional change in this commit.

v2(Michael Cheng): Use existing instruction state pool anv_va_range

v3(Lionel): Simplify allocator

Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38869>
2025-12-10 20:32:10 +00:00
Lionel Landwerlin
20f320b7c7 anv: program STATE_BASE_ADDRESS instruction ptr using pdevice address
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Michael Cheng <michael.cheng@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38869>
2025-12-10 20:32:10 +00:00
Lionel Landwerlin
7cc9d8eec7 anv: fixup error path for shader allocation
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d39e443ef8 ("anv: add infrastructure for common vk_pipeline")
Acked-by: Michael Cheng <michael.cheng@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38869>
2025-12-10 20:32:10 +00:00
Lionel Landwerlin
567c1b3af4 anv: add missing device_memory_report for shaders
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d39e443ef8 ("anv: add infrastructure for common vk_pipeline")
Acked-by: Michael Cheng <michael.cheng@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38869>
2025-12-10 20:32:09 +00:00
Lionel Landwerlin
efe60d2940 intel: remove unused show_shader_stage debug option
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Michael Cheng <michael.cheng@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38869>
2025-12-10 20:32:09 +00:00
Lionel Landwerlin
37789249a1 anv: fix internal representations of shaders
The shader assembly was only available when not hitting the cache.

Additionally the serialized shader code was also the relocated variant
which meant that it could differ from one run to the next. Instead
serialize the unrelocated code produced by the compiler.

With this change we now decode the copy of the ISA we have on the
host.

NIR dumps are only available for shaders not loaded from the cache
(much like the other drivers).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8f4c2bd566 ("anv: add runtime shader statistic support")
Acked-by: Michael Cheng <michael.cheng@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38869>
2025-12-10 20:32:08 +00:00
Nanley Chery
fe372f3b1b anv: Don't allow STORAGE + CCS for Y_TILED mod
This can happen as a result of us adding on CCS to modifiers which don't
support it on gfx9-11.

Fixes image corruption seen with the following test:

   $ mpv av://lavfi:testsrc --config=no --vo=gpu-next --scale=ewa_lanczossharp --fs

Fixes: 01c4ea771c ("anv: Enable storage accesses with modifiers on gfx12+")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12910
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38855>
2025-12-10 20:09:09 +00:00
Caio Oliveira
7bd238fa5a brw: Properly set 'desc as register' for SEND in assembler
The non-split SEND case was missing setting this.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38876>
2025-12-10 19:46:52 +00:00
Marek Olšák
308da55f1a radv,radeonsi: use FRAG_RESULT_DUAL_SRC_BLEND
this is slightly nicer

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38604>
2025-12-10 19:16:46 +00:00
Marek Olšák
9a2f1be814 nir: add FRAG_RESULT_DUAL_SRC_BLEND and an option to use it
This is potentially nicer for some drivers. AMD drivers will use it.

mesa_frag_result_get_color_index will be used often.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38604>
2025-12-10 19:16:46 +00:00
Chia-I Wu
ddd0b0c3a8 panvk: rework calculate_task_axis_and_increment
We used to maximize threads_per_task, but that is ideal only when the system
has a single GPU client. When there are multiple GPU clients, we want
a smaller threads_per_task so that cores can be shared more fairly among
the clients.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37988>
2025-12-10 18:54:48 +00:00
Chia-I Wu
5fd32d79ee panvk: fix calculate_task_axis_and_increment
task_axis selects the dim of the global workgroup, not the dim of the
local workgroup.

v2: fix assert for dEQP-VK.compute.pipeline.basic.empty_workgroup*

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org> (v1)
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37988>
2025-12-10 18:54:48 +00:00
Chia-I Wu
546d73721b panvk: set compute_ep_limit on v12+
Set compute_ep_limit to max_tasks_per_core on v12+. It is generally a
good idea to queue as many tasks as possible to better utilize the
cores.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37988>
2025-12-10 18:54:48 +00:00
Chia-I Wu
bcd2e62ad0 panfrost: make RUN_COMPUTE.ep_limit configurable
Since v12, RUN_COMPUTE.ep_limit specifies the size of the compute task
queue.  RUN_COMPUTE stalls when there are more tasks in the queue than
the specified ep_limit.

Sensible values are 0 (treated as 4), 4, or 16 (max_tasks_per_core).

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37988>
2025-12-10 18:54:48 +00:00
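The ep_limit behaviour described in bcd2e62ad0 and 546d73721b above can be sketched in a few lines of C. The function names are hypothetical and this is not the panfrost or panvk code; only the values 0, 4 and 16, the "0 is treated as 4" rule, and the v12 threshold come from the commit messages. Returning 0 for architectures older than v12 is an assumption made for the sketch.

```c
#include <assert.h>

/* Hypothetical helper: the queue size the hardware effectively uses.
 * Per the commit message, an ep_limit of 0 is treated as 4. */
static unsigned
example_effective_ep_limit(unsigned ep_limit)
{
   return ep_limit == 0 ? 4 : ep_limit;
}

/* Hypothetical helper: pick the ep_limit to program. On v12+ the sketch
 * queues as many tasks as a core can hold; on older architectures it
 * returns 0 (an assumption, since the field is described only for v12+). */
static unsigned
example_choose_ep_limit(unsigned arch, unsigned max_tasks_per_core)
{
   assert(max_tasks_per_core > 0);
   return arch >= 12 ? max_tasks_per_core : 0;
}
```

With max_tasks_per_core equal to 16, the sketch picks 16 on v12 and later, which matches the largest of the sensible values listed in the commit message.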
Yiwei Zhang
c696ec3b73 venus: add missing VKAPI_ATTR/CALL
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14446
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38882>
2025-12-10 18:11:07 +00:00
Aaron Ruby
b17896f693 device-select-layer: Implement VkNegotiateLayerInterface::pfnGetDeviceProcAddr
This must be implemented for loaderLayerInterfaceVersion >= 2. The only
interface that's allowed to be set to null is pfnGetPhysicalDeviceProcAddr.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38270>
2025-12-10 17:36:21 +00:00
Pohsiang (John) Hsu
6173ff73c7 mediafoundation: remove unused templ and small code cleanup
Reviewed-by: Yubo Xie <yuboxie@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38857>
2025-12-10 17:13:34 +00:00
Pohsiang (John) Hsu
23516579a8 mediafoundation: remove unneeded memset (~34KB for hevc)
Reviewed-by: Yubo Xie <yuboxie@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38857>
2025-12-10 17:13:34 +00:00
Silvio Vilerino
c0039ce657 d3d12: Prefer video encode suballocated buffer mode for subregion notification mode
Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38857>
2025-12-10 17:13:33 +00:00
Pohsiang (John) Hsu
d16b651fdd mediafoundation: add some end of function error logging for diagnosing error
Reviewed-by: Yubo Xie <yuboxie@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38857>
2025-12-10 17:13:33 +00:00
Pohsiang (John) Hsu
47dc4b90e4 mediafoundation: propagate PrepareForEncode error up.
Reviewed-by: Yubo Xie <yuboxie@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38857>
2025-12-10 17:13:33 +00:00
Pohsiang (John) Hsu
10138e5b42 mediafoundation: turn on slice auto on frames with dirty rect only
Reviewed-by: Yubo Xie <yuboxie@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38857>
2025-12-10 17:13:32 +00:00
Yonggang Luo
095c2acf01 meson: do not reconstruct ICD paths
This is a follow up of
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20907/diffs?commit_id=b6a344f4baa1ee2c784ca74499dc9fe3b4519013

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38637>
2025-12-10 14:46:11 +00:00
Yonggang Luo
be4ad5c819 meson: Remove VK_ICD_FILENAMES totally from source tree.
This is a follow up of
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28516

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@google.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> hk changes
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> for RADV changes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38637>
2025-12-10 14:46:11 +00:00
Dylan Baker
938fb7703e anv/video: Cast intentional read past end of struct member to void*
Coverity notices that we read past the end of the array we're pointing
to. This is intentional: we want to copy additional members from the
source struct into the target pointer. As such, cast to a `void *`,
which makes Coverity happy.

CID: 1649589
Fixes: 314de7af06 ("anv: Initial support for VP9 decoding")
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38438>
2025-12-10 14:18:59 +00:00
Tapani Pälli
c9bc373f7c crocus: add struct crocus_scissor_state to clamp values to 16bit
This is a port of iris driver commit 193e494e6a to crocus.

Fixes: bc1a6b0a41 ("gallium: change pipe_scissor_state to 32 bit integer")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14428
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38862>
2025-12-10 14:04:56 +00:00
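The clamping idea behind c9bc373f7c above can be illustrated with a small C sketch. The struct and function names are hypothetical stand-ins, not the crocus or gallium definitions, and the use of signed 32-bit inputs is an assumption; the commit message only says that values are clamped to 16 bits after the shared scissor state moved to 32-bit integers.

```c
#include <stdint.h>

/* Hypothetical 32-bit scissor rectangle, standing in for the shared state. */
struct example_scissor_32 {
   int32_t minx, miny, maxx, maxy;
};

/* Hypothetical driver-private copy with 16-bit fields. */
struct example_scissor_16 {
   uint16_t minx, miny, maxx, maxy;
};

/* Saturate a 32-bit value into the range [0, UINT16_MAX]. */
static uint16_t
example_clamp_u16(int32_t v)
{
   if (v < 0)
      return 0;
   if (v > UINT16_MAX)
      return UINT16_MAX;
   return (uint16_t)v;
}

/* Convert the wide rectangle into the 16-bit form, field by field. */
static struct example_scissor_16
example_scissor_to_16(const struct example_scissor_32 *in)
{
   struct example_scissor_16 out;
   out.minx = example_clamp_u16(in->minx);
   out.miny = example_clamp_u16(in->miny);
   out.maxx = example_clamp_u16(in->maxx);
   out.maxy = example_clamp_u16(in->maxy);
   return out;
}
```

Saturating instead of truncating keeps an oversized rectangle at the largest representable bound rather than letting it wrap around to a small value.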
Valentine Burley
c56543874c zink/ci: Document recent Turnip flakes
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38867>
2025-12-10 13:32:08 +00:00
Georg Lehmann
621465e417 nir/opt_uniform_subgroup: handle more trivial shuffles/votes
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38867>
2025-12-10 13:32:08 +00:00
Georg Lehmann
e648e551c1 nir/opt_uniform_subgroup: wire up mbcnt_amd path
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38867>
2025-12-10 13:32:08 +00:00
Georg Lehmann
5778436e99 nir/opt_uniform_subgroup: use nir_shader_intrinsics_pass
Nothing here needs the recursion of the full lower_instructions pass.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38867>
2025-12-10 13:32:08 +00:00
Georg Lehmann
5f28bb72a7 nir/divergence_analysis: fix swizzle_amd without fetch inactive
Fixes: ad5be40303 ("nir: add fetch inactive index to quad_swizzle_amd/masked_swizzle_amd")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38867>
2025-12-10 13:32:08 +00:00
Georg Lehmann
1fc38d8539 nir/opt_uniform_subgroup: fix swizzle_amd without fetch_inactive
Fixes: ad5be40303 ("nir: add fetch inactive index to quad_swizzle_amd/masked_swizzle_amd")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38867>
2025-12-10 13:32:08 +00:00
Georg Lehmann
e11d7f06d0 nir/opt_uniform_subgroup: don't try to optimize non trivial clustered reduce
Fixes: 535caaf3e0 ("nir: Optimize uniform iadd, fadd, and ixor reduction operations")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38867>
2025-12-10 13:32:08 +00:00
Valentine Burley
a265cdaa18 ci/deqp: Backport Android logcat commit
Instead of manually applying the patch, backport the version that landed
in main, which requires a cmake argument to enable.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38071>
2025-12-10 11:31:33 +00:00
Valentine Burley
4cbf5062b7 ci: Uprev GL & GLES CTS
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38071>
2025-12-10 11:31:33 +00:00