Commit graph

73373 commits

Author SHA1 Message Date
Rob Clark
71e76f3637 freedreno: Remove use of fd_perfcntr_type/result_type
Everything is "UINT64, AVERAGE", so no need to get this from the table.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40522>
2026-04-24 21:28:30 +00:00
Emma Anholt
ed729bf948 ci/llvmpipe: Disable some traces too close to the timeout.
I did my stress testing mostly outside of North America work hours, but it
turns out once the runners have 60-70% background CPU usage, these ones
intermittently time out.

Reported-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41163>
2026-04-24 18:06:48 +00:00
Silvio Vilerino
e4c9d57ddf d3d12: Flush stale video encode wait registrations when reusing ID3D12Fence objects
Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41160>
2026-04-24 16:52:14 +00:00
Silvio Vilerino
fb13c044a8 Revert "d3d12: Video sliced encode: Use same ID3D12Fence/different per slice values as optimization"
This reverts commit b83a931cb1 as it causes
regressions with dirty rects enabled on some HW platforms that signal
out-of-order completion and require individual fence objects per slice.

Fixes: b83a931cb1 ("d3d12: Video sliced encode: Use same ID3D12Fence/different per slice values as optimization")

Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41160>
2026-04-24 16:52:14 +00:00
Derek Lesho
ce45069c49 zink: Guard bo map/unmap on map_count.
Otherwise zink_bo_map can return cpu_ptr being destroyed by zink_bo_unmap.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41127>
2026-04-24 13:44:50 +00:00
Pavel Ondračka
caeaa6bad2 i915/ci: update expectations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41149>
2026-04-24 10:39:50 +00:00
Pavel Ondračka
1ca70a7d6c r300/ci: update expectations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41149>
2026-04-24 10:39:50 +00:00
Rob Herring (Arm)
4e8e4ca2fc ethosu: Add minimum and maximum operators
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:16 +00:00
Rob Herring (Arm)
03e29e2fa5 teflon: Add minimum and maximum operations
Add the plumbing for minimum and maximum operations.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:16 +00:00
Rob Herring (Arm)
dce4b0313a ethosu: Add reshape operation
A reshape operation just changes the dimensions of a tensor, but doesn't
change the data at all. So we just point the OFM to the IFM data and
we're done.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:16 +00:00
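The reshape commit above (dce4b0313a) says the output feature map (OFM) is simply pointed at the input feature map (IFM) data. As a rough illustration of that aliasing idea only, here is a minimal Python sketch. The helper `make_feature_map` is hypothetical and is not the ethosu driver's data structure; it just builds two differently shaped views over one buffer.

```python
# Hypothetical stand-in for a feature map: a shaped view over a byte buffer.
# Assumption: this only illustrates aliasing, not the driver's real structs.
def make_feature_map(buffer, shape):
    # memoryview.cast() reinterprets the same bytes with new dimensions;
    # no data is copied.
    return memoryview(buffer).cast("B", shape=shape)


data = bytearray(range(6))           # backing storage shared by both views
ifm = make_feature_map(data, [2, 3])  # input feature map, 2x3
ofm = make_feature_map(data, [3, 2])  # "reshaped" output, 3x2, same bytes

data[1] = 99                          # a write to the storage shows up in both
ifm_rows = ifm.tolist()
ofm_rows = ofm.tolist()
```

Because both views share storage, the reshape costs nothing beyond bookkeeping for the new dimensions.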
Rob Herring (Arm)
08d93a60f5 ethosu: Add quantize operation
The quantize operation lowers to a pooling nop operation.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:15 +00:00
Rob Herring (Arm)
e6f4f6aa5d teflon: Add quantize operation
Add the plumbing for quantize operations.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:15 +00:00
Rob Herring (Arm)
2fe1301e5e ethosu: Add LeakyRelu operation
Add support for LeakyRelu operations. These are implemented as a pooling
LUT.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:15 +00:00
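The LeakyRelu commit above (2fe1301e5e) describes the operation as a pooling LUT. One way to picture a lookup-table activation on int8 data is sketched below. The function names, the quantization parameters, and the rounding rule are all assumptions made for illustration; the driver's actual table layout and rounding may differ.

```python
# Assumption: int8 tensors with affine quantization, real = scale * (q - zp).
# build_leaky_relu_lut and apply_lut are hypothetical helper names.
def build_leaky_relu_lut(in_scale, in_zp, out_scale, out_zp, alpha):
    lut = []
    for q in range(-128, 128):
        real = in_scale * (q - in_zp)               # dequantize the input code
        y = real if real >= 0.0 else alpha * real   # LeakyRelu in real space
        q_out = round(y / out_scale) + out_zp       # requantize (Python rounding)
        lut.append(max(-128, min(127, q_out)))      # clamp to the int8 range
    return lut


def apply_lut(lut, q):
    # Index 0 of the table corresponds to the int8 value -128.
    return lut[q + 128]


lut = build_leaky_relu_lut(in_scale=0.5, in_zp=0,
                           out_scale=0.5, out_zp=0, alpha=0.25)
```

Since an int8 input has only 256 possible values, the whole activation can be precomputed once and then applied as a single table lookup per element.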
Rob Herring (Arm)
15bc152185 teflon: Add LeakyRelu operation
Add the plumbing for LeakyRelu operations.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:14 +00:00
Rob Herring (Arm)
3487b15312 ethosu: Add hard swish operation
Hard swish lowers to a pooling operation with a LUT.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:14 +00:00
Rob Herring (Arm)
f2800fe13b teflon: Add hard swish operation
Add the plumbing for hard swish operations.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:14 +00:00
Rob Herring (Arm)
a305dfd54b ethosu: Add logistic and TANH operations
Logistic and TANH operations are similar and both lower to a pooling
operation with a LUT.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:13 +00:00
Rob Herring (Arm)
6933207435 teflon: Add TANH operation support
Add the plumbing for TANH operations.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:13 +00:00
Rob Herring (Arm)
df051917a5 ethosu: Add multiply operation support
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:13 +00:00
Rob Herring (Arm)
024c70fbb3 teflon: Add multiply operation
Add the plumbing for multiply operations.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:12 +00:00
Rob Herring (Arm)
d55a574898 ethosu: Support element wise op with constant IFM buffer
Element wise operations can have a constant data buffer.

Re-order things a bit to group all the IFM2 setup together.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:12 +00:00
Rob Herring (Arm)
1f579379c1 ethosu: Rename ethosu_lower_add to ethosu_lower_eltwise
The ethosu_lower_add() function can handle other element wise operations
such as multiply, minimum, and maximum, so rename it in preparation to
add those operations.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:12 +00:00
Rob Herring (Arm)
fe97dab8b0 ethosu: Add fully-connected operation
Add support for fully-connected convolution. FC convolution lowering is
nearly the same, so refactor the existing convolution code to support
both.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:11 +00:00
Rob Herring (Arm)
ed65f84921 ethosu: Support axis 1 concatenation
For axis 1 concatenation, the OFM strides need to match the IFM strides.

Presumably axis -3 can also be supported, but there haven't been any
models with -3. Not sure what axis 2 would need either.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:11 +00:00
Rob Herring (Arm)
aaaca26fd2 ethosu: Fix concatenation OFM scaling
Some pooling operations like concatenation are NOPs requiring different
scaling calculations.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:11 +00:00
Rob Herring (Arm)
d772f36741 ethosu: Move stride calculation to lowering
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:10 +00:00
Rob Herring (Arm)
ed2c19a411 ethosu: Store ethosu_tensor struct ptr in feature map
Some of the tensor info is needed at various points during lowering.
Instead of storing the tensor index and looking it up every time, store
a pointer to the tensor struct instead.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:10 +00:00
Rob Herring (Arm)
915cd57c08 ethosu: Add a common initializer for struct ethosu_operation
The struct ethosu_operation structure has the same initialization in
multiple ops. More ops with the same duplication are about to be added.
Move this out to a common initializer function.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:09 +00:00
Rob Herring (Arm)
76ad93bf93 ethosu: Make quantization shift signed
The vela compiler defines shift as signed and some upcoming LUT code
allows for negative shifts, so make shift signed everywhere.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>
2026-04-24 09:22:09 +00:00
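The commit above (76ad93bf93) makes the quantization shift signed because negative shifts are expected. A minimal sketch of what a signed shift could mean when applying a fixed-point scale follows. `scale_value` is a hypothetical helper, and the round-to-nearest rule for right shifts is an assumption, not taken from the driver or from vela.

```python
# Hypothetical helper: apply a fixed-point scale (multiplier, shift) to a value.
# Assumption: positive shift = rounding right shift, negative shift = left shift.
def scale_value(value, multiplier, shift):
    product = value * multiplier
    if shift > 0:
        # Add half of the divisor before shifting to round to nearest.
        return (product + (1 << (shift - 1))) >> shift
    # Zero or negative shift: scale up (or leave unchanged) instead.
    return product << -shift
```

With an unsigned field, a shift of -2 would be read as a very large positive shift, which is why the type matters once negative values can occur.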
Dave Airlie
3f5d54ab8c nouveau: drop sector promotion.
Just like the fix for nvk, just drop this in the GL driver as well.

Cc: mesa-stable
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41143>
2026-04-24 04:20:10 +00:00
Virgile Bello
50ab52f135 microsoft/compiler, d3d12: preserve TCS outputs and pad TES inputs for cross-stage signature matching
Four linked D3D12 pipeline-validation problems with GLSL TCS on DXIL:

1) dxil_nir_kill_unused_outputs killed TCS outputs read back by the
   patch-constant function after a barrier, zeroing the tess factors.
   Keep shader_out locations with any intra-shader load_deref live
   regardless of next_stage_read_mask.

2) is_dead_in_variable dropped TES padding placeholders (no local
   uses) in nir_remove_dead_variables. Also honor
   prev_stage_written_mask so padded TES inputs stay alive.

3) Preserving (1) leaves HS with outputs the DS doesn't declare,
   breaking pipeline validation (e.g. piglit's barrier.shader_test).
   Add dxil_nir_pad_tes_input_signature, called from both link paths,
   to synthesize matching TES inputs (reusing each TCS output's type
   so sig shape and stride match byte-for-byte) plus the tess-level
   inputs -- subsuming the tess-level-only block previously in
   dxil_spirv_nir_link. Scope the per-variable padding to TCS
   outputs that TCS itself reads back via load_deref: outputs that
   neither TES nor TCS consumes get killed from the HS signature,
   so padding them into DS would make the DS input signature longer
   than HS output and break validation for SSO pipelines whose TCS
   declares unused per-patch writes (arb_separate_shader_objects/
   mix-and-match-tcs-tes).

4) remove_hs_intrinsics rewrote load_output but not
   load_per_vertex_output in HS main. With (1) keeping outputs alive,
   GLSL reads of outputs in main whose result survives DCE (UAV
   atomics, non-tess per-vertex output writes) left
   LoadOutputControlPoint in the control-point function, which dxil.dll
   rejects outside the PCF (CreatePipelineState then fails with
   E_INVALIDARG). Treat load_per_vertex_output like load_output.

Validated on piglit arb_tessellation_shader/execution (WARP + DXC
1.8.2403): barrier now passes; the previously-crashing
tcs-output-unmatched and variable-indexing/tcs-output-array-* fail
gracefully matching baseline; isoline/isoline-no-tcs remain flakes
(pre-existing canary corruption, unrelated).

d3d12-quick_shader.txt drops barrier; d3d12-flakes.txt adds
isoline-no-tcs alongside isoline.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41028>
2026-04-23 18:45:01 +00:00
Virgile Bello
1d923fdd2b microsoft/compiler, d3d12: flip tess winding at caller, not in nir_to_dxil
get_tessellator_output_primitive used to unconditionally invert CW<->CCW
on the assumption the input was GL-origin (lower-left). That was wrong
for any upper-left caller — including spirv_to_dxil, whose SPIR-V sources
(DXC, glslang) already align with D3D winding.

Make nir_to_dxil copy info.tess.ccw through and expect upper-left. The
d3d12 gallium driver (GL) flips before the conversion to preserve its
output. spirv_to_dxil and dozen (Vulkan, UPPER_LEFT default) are unchanged.

Assisted-by: Claude Opus 4.7 <noreply@anthropic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41028>
2026-04-23 18:45:01 +00:00
Valentine Burley
4e4207e639 zink/ci: Remove Cezanne job
The devices will be repurposed for a different job.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41099>
2026-04-23 07:34:03 +00:00
jinmiliu
809bf45c12 radeonsi: enable protected context support for Android
Enable protected context capability for Android
when TMZ support is available. This is needed for Widevine L1 secure
video playback on Android, which requires a protected context.

Signed-off-by: jinmiliu <jinming.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40980>
2026-04-23 05:23:57 +00:00
Qiang Yu
b41cd59790 ac,radeonsi,radv: use V_581A_* engine sel for non-pws acquire_mem packet
V_581B_PFP and V_581B_ME are for the pws acquire_mem packet. The current
code does not cause any problems because we don't pass the engine arg
directly to the acquire_mem packet, but use a native V_581A_* arg
for cleaner code.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41069>
2026-04-23 02:48:06 +00:00
Qiang Yu
89c1bf34ed ac,radeonsi,radv: fix print IB assertion fail for reserved fields
The new IB printer asserts that reserved packet fields are zero.

Fixes: 1c75cd958f ("ac: enable the new auto-generated CP packet parser")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41069>
2026-04-23 02:48:06 +00:00
GKraats
686266d2f1 crocus: Fix shader precompilation on Gen6 and higher
By default, crocus precompiles shaders to avoid on-screen stuttering
caused by compiling shaders at draw time.
Unfortunately, on Intel Gen 6 and higher the precompiled version of the
fragment shader is not used, so every fragment shader is compiled twice.
These duplicate fragment shaders are also added to the memory cache
and disk cache.
This happens because precompilation sets wrong values for variables in the key
in crocus_create_fs_state() in src/gallium/drivers/crocus/crocus_program.c,
which differ from the values set by crocus_populate_fs_key() in src/gallium/drivers/crocus/crocus_state.c.

This commit solves 3 problems:

- it adjusts the predicted value 'input_slots_valid' on Gen 6
- it adjusts the predicted value 'ignore_sample_mask_out' on Gen 6 and higher
- it predicts the value 'multisample_fbo', which helps if samplemask is used

Cc: mesa-stable
Signed-off-by: GKraats <vd.kraats@hccnet.nl>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35605>
2026-04-22 20:50:29 +00:00
Valentine Burley
96d17d18be zink/ci: Move Turnip flakes to correct list
These belong in the zink directory, not freedreno. Also add 2-sample
variants and document the origin.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41111>
2026-04-22 19:56:11 +00:00
Silvio Vilerino
e56354661b mediafoundation: Create readable dpb buffers with PIPE_BIND_RENDER_TARGET and PIPE_BIND_SHARED for DX11 sharing
Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41110>
2026-04-22 18:08:30 +00:00
Silvio Vilerino
f07be3b416 d3d12: Create PIPE_BIND_SHARED resources with D3D12_RESOURCE_FLAG_ALLOW_SIMULTANEOUS_ACCESS
Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41110>
2026-04-22 18:08:30 +00:00
Emma Anholt
3a8ff22336 ci: Delete references to various broken traces.
These are all being removed from the repos, so no need to leave the old
notes around.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
2026-04-22 17:39:31 +00:00
Emma Anholt
886fd59951 ci/lavapipe: Use anholt's new GPU trace snapshot comparison tool.
The new tool has much better image diffing presentation (thanks to
Danilo's work on turnip's private trace CI), better performance, flake
checking within a single run, parallelized downloads along with replays,
the ability to cache downloaded files to improve runtime, and system
monitoring (for debugging OOM-related slowdowns).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
2026-04-22 17:39:31 +00:00
Emma Anholt
2ee4da8677 ci/llvmpipe: Use anholt's new GPU trace snapshot comparison tool.
The new tool has much better image diffing presentation (thanks to
Danilo's work on turnip's private trace CI), better performance, flake
checking within a single run, parallelized downloads along with replays,
the ability to cache downloaded files to improve runtime, and system
monitoring (for debugging OOM-related slowdowns).

./bin/update_traces_checksum.sh still updates based on the output of a CI
run, but you can also apply a patch file that the tool generates, if you
do offline runs using your traces.toml.

New traces being replayed, in less overall runtime (2 minutes instead of 3):

- minetest/minetest-high-v3.trace (new version, not the old flaky one)
- neverball/neverball-v2.trace
- ror/ror-default.trace
- supertuxkart/supertuxkart-mansion-egl-gles-v2.b.trace
- valve/counterstrike-v2.trace
- valve/portal-2-v2.trace
- xonotic/xonotic-keybench-high-v2.trace

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
2026-04-22 17:39:31 +00:00
Martin Roukala (né Peres)
931d7d1fad zink/ci: mark blender-demo-cube_diorama as flaky on gfx1201
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41100>
2026-04-22 17:19:22 +00:00
Daniel Schürmann
1f9a0490c6 nir/opt_loop: Don't peel initial break from do-while loops
As the main purpose of this optimization is to transform
while- into do-while loops, don't apply it to loops which are
already in do-while form. Also set nir_loop::do_while after
this transformation, so that it is only applied once.

Totals from 576 (0.28% of 202440) affected shaders: (Navi48)
Instrs: 1337529 -> 1253438 (-6.29%); split: -6.36%, +0.07%
CodeSize: 8390852 -> 7837328 (-6.60%); split: -6.61%, +0.01%
VGPRs: 50856 -> 50844 (-0.02%)
SpillSGPRs: 42198 -> 35395 (-16.12%); split: -16.13%, +0.01%
SpillVGPRs: 47608 -> 44620 (-6.28%)
Latency: 31043828 -> 44143753 (+42.20%); split: -0.06%, +42.26%
InvThroughput: 6973433 -> 10079000 (+44.53%); split: -0.08%, +44.61%
VClause: 26839 -> 24718 (-7.90%); split: -7.91%, +0.00%
SClause: 21831 -> 21583 (-1.14%); split: -1.52%, +0.38%
Copies: 183503 -> 150040 (-18.24%); split: -18.84%, +0.61%
Branches: 27738 -> 26848 (-3.21%); split: -5.12%, +1.91%
PreSGPRs: 40233 -> 39083 (-2.86%); split: -2.88%, +0.02%
PreVGPRs: 38745 -> 38903 (+0.41%); split: -0.02%, +0.43%
VALU: 688396 -> 645948 (-6.17%); split: -6.17%, +0.01%
SALU: 189792 -> 177642 (-6.40%); split: -6.97%, +0.57%
VMEM: 121500 -> 112748 (-7.20%)
SMEM: 38765 -> 37767 (-2.57%); split: -2.58%, +0.00%
VOPD: 102488 -> 89071 (-13.09%); split: +0.24%, -13.33%

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40349>
2026-04-22 10:34:58 +00:00
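The nir/opt_loop commit above (1f9a0490c6) talks about turning while-loops into do-while loops by peeling the initial break. The two loop shapes can be sketched as follows. This is plain Python control flow standing in for the idea, not NIR; `run_while_form` and `run_do_while_form` are hypothetical names, and the real pass works on NIR control-flow nodes.

```python
# while-form: the break test sits at the top of every iteration.
def run_while_form(cond, body, state):
    while True:
        if not cond(state):
            break
        state = body(state)
    return state


# do-while form: the first test is peeled out in front of the loop,
# and the loop itself tests at the bottom of each iteration.
def run_do_while_form(cond, body, state):
    if cond(state):
        while True:
            state = body(state)
            if not cond(state):
                break
    return state
```

Both functions compute the same result. A loop that already has the second shape gains nothing from peeling again, which matches the commit's reason for skipping such loops.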
Pavel Ondračka
485586b184 r300,i915/ci: update expectations
More accurate asin and atan push a few tests over the instruction limit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41094>
2026-04-22 10:16:43 +00:00
Valentine Burley
220d01fd2a zink/ci: Document recent flakes
These flakes have caused job failures in the last two weeks.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41095>
2026-04-22 09:46:30 +00:00
Lionel Landwerlin
6031d52393 anv: implement VK_EXT_primitive_restart_index
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40776>
2026-04-22 08:52:57 +00:00
Samuel Pitoiset
9d17a7bdb4 spirv,treewide: rework specialization constant
With SPV_KHR_constant_data, it is allowed to specialize arrays of
constants.

RustiCL changes are from Karol Herbst <kherbst@redhat.com>.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41046>
2026-04-22 06:57:55 +00:00
Eric R. Smith
4ae192a3d9 glsl, spirv: Improve accuracy of asin() and acos()
The polynomial used for asin_expr() was suboptimal (and its source was
not documented).

A better approximation is found in the _Handbook_of_Mathematical_Functions_
by Abramowitz and Stegun, which is used in Nvidia's Cg toolkit. However,
while this approximation gives a good absolute error bound, its relative
error exceeds the 4096 ulp allowed by the Vulkan spec. Taking a page
from the spirv implementation of asin(), we implement a piecewise
approximation where a Taylor series is used for small values of |x|.
This patch also harmonizes the GLSL and Vulkan implementations by moving
the implementation to common code (nir_builder).

Running tests on asin() with a grid of 64000 samples between 0.0 and +1.0,
the original asin() at 32 bits has:
```
                       glsl                       spirv
  RMSE:            1.756451e-04                 1.609091e-04
  worst abs error: 3.904104e-04 at 0.937001     3.904104e-04 at 0.937001
  worst ulp error: 11800 at 6.2499e-05          3826 at 0.841331
```
whereas the new implementation has for both:
```
  RMSE:            2.528056e-05
  worst abs error: 4.962087e-05 at 0.451149
  worst ulp error: 2379 at 0.215106
```

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40862>
2026-04-21 21:10:22 +00:00
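The asin()/acos() commit above (4ae192a3d9) describes a piecewise approximation: a Taylor series for small |x| and the Abramowitz and Stegun polynomial elsewhere. The sketch below shows that general shape in double-precision Python. The coefficients are the ones from the Handbook's formula 4.4.45, while the crossover point `SMALL_X` and the number of Taylor terms are assumptions chosen for illustration; the values used in nir_builder are not given in the commit message.

```python
import math

# Abramowitz & Stegun 4.4.45 coefficients for
#   asin(x) ~= pi/2 - sqrt(1 - x) * (A0 + A1*x + A2*x^2 + A3*x^3),  0 <= x <= 1
A0, A1, A2, A3 = 1.5707288, -0.2121144, 0.0742610, -0.0187293

# Assumption: hypothetical crossover between the two pieces.
SMALL_X = 0.25


def asin_approx(x):
    ax = abs(x)
    if ax < SMALL_X:
        # Taylor series around 0: x + x^3/6 + 3x^5/40 (good relative error).
        x2 = ax * ax
        y = ax * (1.0 + x2 * (1.0 / 6.0 + x2 * (3.0 / 40.0)))
    else:
        # Polynomial approximation (good absolute error away from 0).
        poly = A0 + ax * (A1 + ax * (A2 + ax * A3))
        y = math.pi / 2 - math.sqrt(1.0 - ax) * poly
    # asin is odd, so restore the sign of the input.
    return math.copysign(y, x)


def worst_abs_error(samples=1000):
    # Compare against math.asin on a uniform grid over [-1, 1].
    return max(abs(asin_approx(i / samples) - math.asin(i / samples))
               for i in range(-samples, samples + 1))
```

Calling `worst_abs_error()` gives a quick absolute-error check. It does not reproduce the 32-bit ulp figures quoted in the commit, which depend on the real implementation and its float precision.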