Commit graph

20529 commits

Author SHA1 Message Date
Jaishankar Rajendran
cd941d3970 vulkan/runtime: enable parametrization of ASTC software decode
Enable the driver to select :
  - LUT allocation alignment
  - LUT memory flags selection

Signed-off-by: Prakhar Vishwakarma <prakhar.vishwakarma@intel.com>
Signed-off-by: Jaishankar Rajendran <jaishankar.rajendran@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41205>
2026-04-27 15:17:04 +00:00
Samuel Pitoiset
bd62c72223 radv: cleanup invalidating vertex draw state
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41159>
2026-04-27 10:32:10 +00:00
Samuel Pitoiset
8c425351e9 radv: stop dirtying some states after DGC execute
The Vulkan spec says:
    "After a call to vkCmdExecuteGeneratedCommandsEXT, command buffer
     state will become undefined according to the tokens executed. This
     table specifies the relationship between tokens used and state
     invalidation."

The application must re-bind the states that are updated using DGC.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41159>
2026-04-27 10:32:10 +00:00
Samuel Pitoiset
4996cd82f6 radv: only emit the "normal" index buffer when needed with DGC
Only if DGC emits an indexed draw without providing the index buffer
as part of the tokens. This avoids emitting useless packets.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41159>
2026-04-27 10:32:10 +00:00
Samuel Pitoiset
dc816ce4ac radv: remove an useless check when emitting the index buffer
index_type is uint32_t, so this checks is always FALSE.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41159>
2026-04-27 10:32:09 +00:00
Trigger Huang
8d60001d69 radv: enable protected memory
Advertise protectedMemory feature for application when TMZ is available

Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40619>
2026-04-27 09:03:34 +00:00
Julia Zhang
1f9d1366f8 radv: save protected queue and non-protected queue seperately
Save protected radv_queue in device->queues_protected so it can be
relased in radv_destroy_device.

Without this, device->queues[] will point to a new queues to record
the queue created according to the queueCreateInfo. When queueCreateInfo
include both protected and unprotected queue info, the device->queues[]
will be created twice and record one queue at each time. So we will lose
either protected queue or unprotected queue which created first.

Signed-off-by: Julia Zhang <Julia.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40619>
2026-04-27 09:03:34 +00:00
Julia Zhang
0e36d7112c radv: set TMZ bit in sdma_copy packet
Pass secure and set TMZ bit in sdma_copy packet for protected image

Signed-off-by: Julia Zhang <Julia.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40619>
2026-04-27 09:03:34 +00:00
Julia Zhang
bfc54d444d radv: enable surface protected capability
Pass protected support flag to wsi_device to enable surface protected
capability when radv_physical_device::has_tmz_support.

Signed-off-by: Julia Zhang <Julia.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40619>
2026-04-27 09:03:34 +00:00
Julia Zhang
60bd766299 radv: create encrypted BOs for protected cmd_buffers
Create encrypted fence_bo and eop_bug_bo when the radv_cmd_buffer is
created from a protected pool which is marked with flag
VK_COMMAND_POOL_CREATE_PROTECTED_BIT

Signed-off-by: Julia Zhang <Julia.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40619>
2026-04-27 09:03:34 +00:00
Julia Zhang
501f72bc89 radv: allocate encrypted rings BOs
Pass secure flag to radv_update_preamble_cs() if this queue is created
with vk flag VK_QUEUE_PROTECTED_BIT and create encrypted tess/ge ring
BOs and compute_scratch_bo according to this secure flag.

Signed-off-by: Julia Zhang <Julia.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40619>
2026-04-27 09:03:34 +00:00
Trigger Huang
68db27f0b4 radv: add protected type bits for memory requirements
Add protected type bits for memory requirements of protected resources

Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40619>
2026-04-27 09:03:34 +00:00
Trigger Huang
540864685d radv: support secure submission
Set AMDGPU_IB_FLAGS_SECURE on IBs to support secure submission

Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40619>
2026-04-27 09:03:33 +00:00
Trigger Huang
535207a075 radv: allow creation of protected queues
Advertise VK_QUEUE_PROTECTED_BIT on gfx and transfer queues to allow
creation of protected queues.

Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40619>
2026-04-27 09:03:33 +00:00
Trigger Huang
af35a99435 radv: supports protected memory allocation
Add memory type for protected memory to support TMZ encrypted memory
allocation

Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40619>
2026-04-27 09:03:33 +00:00
Julia Zhang
6496f9d123 radv: add new option RADV_DEBUG=notmz
Used for enable/disable TMZ support of radv.

Signed-off-by: Julia Zhang <Julia.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40619>
2026-04-27 09:03:32 +00:00
Samuel Pitoiset
ac52fb569a radv: fix a potential NULL pointer dereference when emitting VBOs
vkCmdBindVertexBuffers() -> draw with mesh shaders will just segfault.
This sequence doesn't make real sense but it's possible.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41161>
2026-04-27 07:40:14 +00:00
Samuel Pitoiset
782254b820 radv: re-introduce DGC+multiview support and enable it for vkd3d-proton only
The Vulkan spec says:
    "VUID-vkCmdExecuteGeneratedCommandsEXT-None-11062
     If a rendering pass is currently active, the view mask must be 0."

So, it's invalid with VK_EXT_device_generated_commands but it's allowed
in DX12, it seems we missed this during the spec review.

Crimson Desert uses this and emulating in vkd3d-proton would be complex,
so let's re-introduce this support only for vkd3d-proton.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41153>
2026-04-27 07:08:23 +00:00
Samuel Pitoiset
2d78546d59 radv: store the number of PS params heuristic to radv_compiler_info
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This improves compatibility between eg. NAVI33 and PHOENIX because
NGG culling is disabled by default on GFX11+.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41137>
2026-04-27 06:26:25 +00:00
Samuel Pitoiset
48db5c0378 radv: pass radv_compiler_info to radv_pipeline_get_shader_key()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41137>
2026-04-27 06:26:25 +00:00
Peyton Lee
9225ba47d5 amd/vpelib: Support vpe 2.0
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Support vpe 2.0
Update vpelib to support vpe 2.0 includes new color formats,
blending, and 3dlut fast loading.

Signed-off-by: Peyton Lee <peytolee@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41073>
2026-04-27 03:06:42 +00:00
Peyton Lee
85a5d6233b amd/vpelib: add alpha fill support check
Add helper functions check_alpha_fill_support()
Also fix incorrect color format naming.

Signed-off-by: Peyton Lee <peytolee@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41073>
2026-04-27 03:06:42 +00:00
Marek Olšák
bfb6c41b64 amd: remove unnecessary and transitive #includes
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reported by clang tools.
See: https://clangd.llvm.org/guides/include-cleaner

struct ac_cmdbuf had to be moved to ac_cmdbuf_base.h because we can't
include ac_cmdbuf.h->sid.h->amdgfxregs.h in radeon_winsys.h for r300.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41091>
2026-04-24 21:53:07 +00:00
Andrzej Datczuk
d476c96bad radv: enable advertising of VK_KHR_pipeline_library under llvm
KHR_pipeline_library is a base extension whose semantics can only
be exercised by extensions: EXT_graphics_pipeline_library
and KHR_ray_tracing_pipeline. Both remain gated under LLVM,
so advertising the KHR base extension is inert for conformant apps.

The reason for the change is a hard requirement for KHR_pipeline_library
in DXVK 2.7+. DX games under Proton, which uses DXVK, fail adapter
creation if this extension is absent. DXVK supports scenario when
KHR_pipeline_library is available but two dependent extensions aren't.

The !use_llvm condition originated in f1095260a4 when
KHR_pipeline_library was first wired up for ray tracing only. It was
touched also in 045c96d896 when EXT_graphics_pipeline_library also took
it as a dependency.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41134>
2026-04-24 15:59:59 +00:00
Rhys Perry
f9f60aa844 radv: don't use radv_optimize_nir after lowering indirect derefs for RT
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Just these three passes seem necessary. radv_rt_nir_to_asm() will call
radv_optimize_nir() later.

fossil-db (navi21):
Totals from 32 (0.02% of 202427) affected shaders:
Instrs: 205974 -> 205948 (-0.01%)
CodeSize: 1131492 -> 1131352 (-0.01%)
SpillSGPRs: 321 -> 320 (-0.31%)
Latency: 2171106 -> 2170677 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 540282 -> 540174 (-0.02%)
VClause: 5579 -> 5578 (-0.02%)
SClause: 4586 -> 4582 (-0.09%)
Copies: 23543 -> 23535 (-0.03%)
PreSGPRs: 2444 -> 2443 (-0.04%)
VALU: 129415 -> 129399 (-0.01%)
SMEM: 7175 -> 7170 (-0.07%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31265>
2026-04-24 11:01:03 +00:00
Rhys Perry
91d555c2cb radv: lower indirect derefs after linking
Scratch access isn't very optimizable, so more stores are optimized away
if we lower indirect derefs after both linking and radv_optimize_nir.

fossil-db (navi21):
Totals from 1264 (0.62% of 202427) affected shaders:
Instrs: 1504703 -> 1504708 (+0.00%); split: -0.02%, +0.02%
CodeSize: 8031388 -> 8031020 (-0.00%); split: -0.02%, +0.02%
SpillSGPRs: 1865 -> 1869 (+0.21%)
Latency: 12106362 -> 12106464 (+0.00%); split: -0.01%, +0.01%
InvThroughput: 4056269 -> 4056044 (-0.01%); split: -0.01%, +0.00%
VClause: 13927 -> 13940 (+0.09%)
SClause: 32382 -> 32396 (+0.04%); split: -0.03%, +0.08%
Copies: 188004 -> 187897 (-0.06%); split: -0.17%, +0.11%
Branches: 39045 -> 39052 (+0.02%); split: -0.01%, +0.03%
PreSGPRs: 79885 -> 79814 (-0.09%); split: -0.11%, +0.02%
VALU: 1072639 -> 1072532 (-0.01%); split: -0.01%, +0.00%
SALU: 187317 -> 187375 (+0.03%); split: -0.11%, +0.14%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31265>
2026-04-24 11:01:03 +00:00
Rhys Perry
1943e88d56 radv: move ac_nir_lower_indirect_derefs to end of radv_shader_spirv_to_nir
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31265>
2026-04-24 11:01:03 +00:00
Samuel Pitoiset
cf9fb46e54 radv: zero-initialize radv_cmd_state only when a cmdbuf is reset
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Command buffers are zero-allocated, so this is only needed when reset.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41107>
2026-04-24 06:28:41 +00:00
Samuel Pitoiset
0b21aaaa59 radv: remove redundant initialization when beginning a cmdbuf
radv_cmd_state is already zero-initialized few lines above.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41107>
2026-04-24 06:28:41 +00:00
Samuel Pitoiset
92a5526435 radv: move shader_upload_seq to radv_cmd_buffer_queue_state
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41107>
2026-04-24 06:28:40 +00:00
Samuel Pitoiset
b9b9850d82 radv: move uses_perf_counters to radv_cmd_buffer_queue_state
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41107>
2026-04-24 06:28:40 +00:00
Samuel Pitoiset
f8aed0793b radv: move queue related cmd buffer state to a new struct
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41107>
2026-04-24 06:28:39 +00:00
Emma Anholt
7094e30a00 ci/amd: Switch radv-raven-traces-restricted over to gpu-trace-replay.sh
The new tool has much better image diffing presentation (thanks to
Danilo's work on turnip's private trace CI), better performance, flake
checking within a single run, parallelized downloads along with replays,
system monitoring for replay debug (OOMs especially), and DXVK support
(I've added a few traces, but not most of the collection because I didn't
want to block on stabilizing this job with everything).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41115>
2026-04-23 22:54:12 +00:00
Samuel Pitoiset
91f5fcdcd5 radv: advertise VK_KHR_shader_constant_data
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40722>
2026-04-23 11:12:06 +00:00
Liu, Mengyang
40fa195cd0 aco: fix broken VGPRs reservation for 64-bit attributes in VS prologs
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
After 8e6bff4caa, the large attribute counts as two slots in
`num_attributes` if the vertex shader consumes more than two
channels of it, even though `misaligned_mask` marks only the
lower slot.

Fixes: 8e6bff4caa ("radv: Lower 64-bit VS inputs to 32-bit")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41071>
2026-04-23 08:20:47 +00:00
Samuel Pitoiset
371316e989 radv: use radv_compiler_info everywhere during compilation
This prevents the compiler to access the logical/physical devices and
the instance during compilation.

The main goal is to make it more robust against cache related issues
when something isn't hashed correctly (this used to happen a lot in the
past). Also it would be much more robust for sharing binaries between
two GPUs in the same generation (eg. Vangogh/Rembrandt) because
everything needed for compilation is in radv_compiler_info. There is
still some work to do to achieve that but it's making good progress.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40992>
2026-04-23 07:56:48 +00:00
Samuel Pitoiset
4a91fd8bab radv: add a radv_compiler_info object
This object contains everything needed for compiling AMD binaries
from SPIR-V to assembly.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40992>
2026-04-23 07:56:48 +00:00
Samuel Pitoiset
6e86d2877b radv/rt: pass more parameters to radv_rt_nir_to_asm()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40992>
2026-04-23 07:56:47 +00:00
Valentine Burley
34ffa61805 radv/ci: Add more ASAN VKCTS jobs on Cezanne
Use more devices for the radv-cezanne-vkcts-asan job, but decrease
concurrency from 4 to 3 to avoid OOMs.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41099>
2026-04-23 07:34:03 +00:00
Qiang Yu
b41cd59790 ac,radeonsi,radv: use V_581A_* engine sel for non-pws acquire_mem packet
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
V_581B_PFP and V_581B_ME is for pws acquire_mem. Current code
does not cause any problem because we won't pass engine arg
directly to acqure_mem packet. But use a native V_581A_* arg
for better coding.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41069>
2026-04-23 02:48:06 +00:00
Qiang Yu
89c1bf34ed ac,radeonsi,radv: fix print IB assertion fail for reserved fields
New IB print will assert reserved packet field to be zero.

Fixes: 1c75cd958f ("ac: enable the new auto-generated CP packet parser")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41069>
2026-04-23 02:48:06 +00:00
David Rosca
3d0239cff9 radv/video: Fix initializing rc structs with default rate control
Fixes: 32a02720a8 ("radv/video: Init session and update rate control in ControlVideoCoding")
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41027>
2026-04-22 14:49:27 +00:00
David Rosca
906be9bc7e radv: Fix uint32 overflow in slice offset calculation
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41027>
2026-04-22 14:49:27 +00:00
Samuel Pitoiset
a73fc90bcd radv: fix GPU hangs with PS epilogs and secondaries properly
The previous fix was incomplete because if the same graphics pipeline
and the same PS epilog are rebind after vkCmdExecuteCommands(), the PS
epilog state wouldn't be re-emitted, and it will use a wrong VA (in case
both fragment shader user SGPRs aren't similar either).

Resetting the PS epilog to NULL in the primary should prevent any
issues, but this tracking still need to be improved because it caused
two issues recently.

Fixes: 1a00587c44 ("radv: fix a GPU hang with PS epilogs and secondary command buffers")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15176
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41056>
2026-04-22 08:03:35 +00:00
Samuel Pitoiset
9d17a7bdb4 spirv,treewide: rework specialization constant
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
With SPV_KHR_constant_data, it's allowed to specialize array of
constants.

RustiCL changes are from Karol Herbst <kherbst@redhat.com>.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41046>
2026-04-22 06:57:55 +00:00
Eric R. Smith
4ae192a3d9 glsl, spirv: Improve accuracy of asin() and acos()
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The polynomial used for asin_expr() was suboptimal (and its source was
not documented).

A better approximation is found in the _Handbook_of_Mathematical_Functions_
by Abramowitz and Stegun, which is used in Nvidia's Cg toolkit. However,
while this approximation gives a good absolute error bound, its relative
error exceeds the 4096 ulp allowed by the Vulkan spec. Taking a page
from the spirv implementation of asin(), we implement a piecewise
approximation where a Taylor series is used for small values of |x|.
This patch also harmonizes the GLSL and Vulkan implementations by moving
the implementation to common code (nir_builder).

Running tests on asin() with a grid of 64000 samples between 0.0 and +1.0,
the original asin() at 32 bits has:
```
                       glsl                       spirv
  RMSE:            1.756451e-04                 1.609091e-04
  worst abs error: 3.904104e-04 at 0.937001     3.904104e-04 at 0.937001
  worst ulp error: 11800 at 6.2499e-05          3826 at 0.841331
```
whereas the new implementation has for both:
```
  RMSE:            2.528056e-05
  worst abs error: 4.962087e-05 at 0.451149
  worst ulp error: 2379 at 0.215106
```

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40862>
2026-04-21 21:10:22 +00:00
Samuel Pitoiset
ebf2797da2 vulkan,treewide: stop passing vk_device to vk_pipeline_robustness_state_fill()
This will be helpful for RADV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41029>
2026-04-21 17:29:04 +00:00
Rhys Perry
bddd8b36a6 aco: use RegisterDemand::operator[] more
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39690>
2026-04-21 11:16:26 +00:00
Rhys Perry
176b075129 aco: prefer spilling smaller temporaries if it finishes spilling
fossil-db stats seemed less positive when updating process_block() too.

fossil-db (navi31):
Totals from 41 (0.05% of 84369) affected shaders:
Instrs: 294758 -> 294694 (-0.02%); split: -0.11%, +0.09%
CodeSize: 1566136 -> 1564392 (-0.11%); split: -0.21%, +0.10%
SpillSGPRs: 2306 -> 2143 (-7.07%); split: -8.37%, +1.30%
Latency: 3877251 -> 3868194 (-0.23%); split: -0.29%, +0.05%
InvThroughput: 881747 -> 882352 (+0.07%); split: -0.01%, +0.08%
SClause: 6498 -> 6494 (-0.06%); split: -0.09%, +0.03%
Copies: 33582 -> 33900 (+0.95%); split: -0.23%, +1.18%
Branches: 6799 -> 6801 (+0.03%)
VALU: 192977 -> 192646 (-0.17%); split: -0.21%, +0.04%
SALU: 28082 -> 28395 (+1.11%); split: -0.27%, +1.39%
VOPD: 1939 -> 1959 (+1.03%); split: +1.19%, -0.15%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39690>
2026-04-21 11:16:26 +00:00
Rhys Perry
0ffbc30d7f aco: refactor spiller to use spills_needed variable
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39690>
2026-04-21 11:16:26 +00:00