fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 17:48:15 +02:00

Author	SHA1	Message	Date
Rob Clark	71e76f3637	freedreno: Remove use of fd_perfcntr_type/result_type Everything is "UINT64, AVERAGE", so no need to get this from the table. Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40522>	2026-04-24 21:28:30 +00:00
Emma Anholt	ed729bf948	ci/llvmpipe: Disable some traces too close to the timeout. I did my stress testing mostly outside of north america work hours, but it turns out once the runners have 60-70% background CPU usage, these ones intermittently time out. Reported-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41163>	2026-04-24 18:06:48 +00:00
Silvio Vilerino	e4c9d57ddf	d3d12: Flush stale video encode wait registrations when reusing ID3D12Fence objects Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41160>	2026-04-24 16:52:14 +00:00
Silvio Vilerino	fb13c044a8	Revert "d3d12: Video sliced encode: Use same ID3D12Fence/different per slice values as optimization" This reverts commit `b83a931cb1` as it causes regressions with dirty rects enabled on some HW platforms that signal out of order completion and require individual fence objects per slice Fixes: `b83a931cb1` ("d3d12: Video sliced encode: Use same ID3D12Fence/different per slice values as optimization") Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41160>	2026-04-24 16:52:14 +00:00
Derek Lesho	ce45069c49	zink: Guard bo map/unmap on map_count. Otherwise zink_bo_map can return cpu_ptr being destroyed by zink_bo_unmap. Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41127>	2026-04-24 13:44:50 +00:00
Pavel Ondračka	caeaa6bad2	i915/ci: update expectations Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41149>	2026-04-24 10:39:50 +00:00
Pavel Ondračka	1ca70a7d6c	r300/ci: update expectations Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41149>	2026-04-24 10:39:50 +00:00
Rob Herring (Arm)	4e8e4ca2fc	ethosu: Add minimum and maximum operators Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:16 +00:00
Rob Herring (Arm)	03e29e2fa5	teflon: Add minimum and maximum operations Add the plumbing for minimum and maximum operations. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:16 +00:00
Rob Herring (Arm)	dce4b0313a	ethosu: Add reshape operation A reshape operation just changes the dimensions of a tensor, but doesn't change the data at all. So we just point the OFM to the IFM data and we're done. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:16 +00:00
Rob Herring (Arm)	08d93a60f5	ethosu: Add quantize operation The quantize operation lowers to a pooling nop operation. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:15 +00:00
Rob Herring (Arm)	e6f4f6aa5d	teflon: Add quantize operation Add the plumbing for quantize operations. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:15 +00:00
Rob Herring (Arm)	2fe1301e5e	ethosu: Add LeakyRelu operation Add support for LeakyRelu operations. These are implemented as a pooling LUT. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:15 +00:00
Rob Herring (Arm)	15bc152185	teflon: Add LeakyRelu operation Add the plumbing for LeakyRelu operations. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:14 +00:00
Rob Herring (Arm)	3487b15312	ethosu: Add hard swish operation Hard swish lowers to a pooling operation with a LUT. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:14 +00:00
Rob Herring (Arm)	f2800fe13b	teflon: Add hard swish operation Add the plumbing for hard swish operations. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:14 +00:00
Rob Herring (Arm)	a305dfd54b	ethosu: Add logistic and TANH operations Logistic and TANH operations are similar and both lower to pooling operation with a LUT. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:13 +00:00
Rob Herring (Arm)	6933207435	teflon: Add TANH operation support Add the plumbing for TANH operations. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:13 +00:00
Rob Herring (Arm)	df051917a5	ethosu: Add multiply operation support Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:13 +00:00
Rob Herring (Arm)	024c70fbb3	teflon: Add multiply operation Add the plumbing for multiply operations. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:12 +00:00
Rob Herring (Arm)	d55a574898	ethosu: Support element wise op with constant IFM buffer Element wise operations can have a constant data buffer. Re-order things a bit to group all the IFM2 setup together. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:12 +00:00
Rob Herring (Arm)	1f579379c1	ethosu: Rename ethosu_lower_add to ethosu_lower_eltwise The ethosu_lower_add() function can handle other element wise operations such as multiply, minimum, and maximum, so rename it in preparation to add those operations. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:12 +00:00
Rob Herring (Arm)	fe97dab8b0	ethosu: Add fully-connected operation Add support for fully-connected convolution. FC convolution lowering is nearly the same, so refactor the existing convolution code to support both. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:11 +00:00
Rob Herring (Arm)	ed65f84921	ethosu: Support axis 1 concatention For axis 1 concatenation, the OFM strides need to match the IFM strides. Presumably axis -3 can also be supported, but there haven't been any models with -3. Not sure what axis 2 would need either. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:11 +00:00
Rob Herring (Arm)	aaaca26fd2	ethosu: Fix concatenation OFM scaling Some pooling operations like concatenation are NOPs requiring different scaling calculations. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:11 +00:00
Rob Herring (Arm)	d772f36741	ethosu: Move stride calculation to lowering Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:10 +00:00
Rob Herring (Arm)	ed2c19a411	ethosu: Store ethosu_tensor struct ptr in feature map Some of the tensor info is needed at various points during lowering. Instead of storing the tensor index and looking it up every time, store a point to the tensor struct instead. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:10 +00:00
Rob Herring (Arm)	915cd57c08	ethosu: Add a common initializer for struct ethosu_operation The struct ethosu_operation structure has the same initialization in multiple ops. More ops with the same duplication are about to be added. Move this out to a common initializer function. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:09 +00:00
Rob Herring (Arm)	76ad93bf93	ethosu: Make quantization shift signed The vela compiler defines shift as signed and some upcoming LUT code allows for negative shifts, so make shift signed everywhere. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39975>	2026-04-24 09:22:09 +00:00
Dave Airlie	3f5d54ab8c	nouveau: drop sector promotion. Just like the fix for nvk, just drop this in the GL driver as well. Cc: mesa-stable Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41143>	2026-04-24 04:20:10 +00:00
Virgile Bello	50ab52f135	microsoft/compiler, d3d12: preserve TCS outputs and pad TES inputs for cross-stage signature matching Four linked D3D12 pipeline-validation problems with GLSL TCS on DXIL: 1) dxil_nir_kill_unused_outputs killed TCS outputs read back by the patch-constant function after a barrier, zeroing the tess factors. Keep shader_out locations with any intra-shader load_deref live regardless of next_stage_read_mask. 2) is_dead_in_variable dropped TES padding placeholders (no local uses) in nir_remove_dead_variables. Also honor prev_stage_written_mask so padded TES inputs stay alive. 3) Preserving (1) leaves HS with outputs the DS doesn't declare, breaking pipeline validation (e.g. piglit's barrier.shader_test). Add dxil_nir_pad_tes_input_signature, called from both link paths, to synthesize matching TES inputs (reusing each TCS output's type so sig shape and stride match byte-for-byte) plus the tess-level inputs -- subsuming the tess-level-only block previously in dxil_spirv_nir_link. Scope the per-variable padding to TCS outputs that TCS itself reads back via load_deref: outputs that neither TES nor TCS consumes get killed from the HS signature, so padding them into DS would make the DS input signature longer than HS output and break validation for SSO pipelines whose TCS declares unused per-patch writes (arb_separate_shader_objects/ mix-and-match-tcs-tes). 4) remove_hs_intrinsics rewrote load_output but not load_per_vertex_output in HS main. With (1) keeping outputs alive, GLSL reads of outputs in main whose result survives DCE (UAV atomics, non-tess per-vertex output writes) left LoadOutputControlPoint in the control-point function, which dxil.dll rejects outside the PCF (CreatePipelineState then fails with E_INVALIDARG). Treat load_per_vertex_output like load_output. 
Validated on piglit arb_tessellation_shader/execution (WARP + DXC 1.8.2403): barrier now passes; the previously-crashing tcs-output-unmatched and variable-indexing/tcs-output-array-* fail gracefully matching baseline; isoline/isoline-no-tcs remain flakes (pre-existing canary corruption, unrelated). d3d12-quick_shader.txt drops barrier; d3d12-flakes.txt adds isoline-no-tcs alongside isoline. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41028>	2026-04-23 18:45:01 +00:00
Virgile Bello	1d923fdd2b	microsoft/compiler, d3d12: flip tess winding at caller, not in nir_to_dxil get_tessellator_output_primitive used to unconditionally invert CW<->CCW on the assumption the input was GL-origin (lower-left). That was wrong for any upper-left caller — including spirv_to_dxil, whose SPIR-V sources (DXC, glslang) already align with D3D winding. Make nir_to_dxil copy info.tess.ccw through and expect upper-left. The d3d12 gallium driver (GL) flips before the conversion to preserve its output. spirv_to_dxil and dozen (Vulkan, UPPER_LEFT default) are unchanged. Assisted-by: Claude Opus 4.7 <noreply@anthropic.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41028>	2026-04-23 18:45:01 +00:00
Valentine Burley	4e4207e639	zink/ci: Remove Cezanne job The devices will be repurposed for a different job. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41099>	2026-04-23 07:34:03 +00:00
jinmiliu	809bf45c12	radeonsi: enable protected context support for Android Enable protected context capability for Android when TMZ support is available. This is needed for Widevine L1 secure video playback on Android, which requires a protected context. Signed-off-by: jinmiliu <jinming.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40980>	2026-04-23 05:23:57 +00:00
Qiang Yu	b41cd59790	ac,radeonsi,radv: use V_581A_* engine sel for non-pws acquire_mem packet V_581B_PFP and V_581B_ME is for pws acquire_mem. Current code does not cause any problem because we won't pass engine arg directly to acqure_mem packet. But use a native V_581A_* arg for better coding. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41069>	2026-04-23 02:48:06 +00:00
Qiang Yu	89c1bf34ed	ac,radeonsi,radv: fix print IB assertion fail for reserved fields New IB print will assert reserved packet field to be zero. Fixes: `1c75cd958f` ("ac: enable the new auto-generated CP packet parser") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41069>	2026-04-23 02:48:06 +00:00
GKraats	686266d2f1	crocus: Fix shader precompilation on Gen6 and higher By default crocus precompiles shaders, to avoid stuttering at screens, caused by compiling shaders at the drawing phase. Unfortunately at intel Gen 6 and higher the precompiled version of the fragment shaders is not used and every fragment shader is compiled twice. These double fragment shaders also are added to the memory cache and disk cache. This is caused by setting wrong values to variables at the key during precompiling at routine crocus_create_fs_state() at src/gallium/drivers/crocus/crocus_program.c, which differ from values at crocus_populate_fs_key() at src/gallium/drivers/crocus/crocus_state.c. This commit solves 3 problems: it adjusts the predicted value 'input_slots_valid' at Gen 6 it adjusts the predicted value 'ignore_sample_mask_out' at Gen 6 and higher it predicts the value 'multisample_fbo' , which helps if samplemask is used Cc: mesa-stable Signed-off-by: GKraats <vd.kraats@hccnet.nl> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35605>	2026-04-22 20:50:29 +00:00
Valentine Burley	96d17d18be	zink/ci: Move Turnip flakes to correct list These belong in the zink directory, not freedreno. Also add 2-sample variants and document the origin. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41111>	2026-04-22 19:56:11 +00:00
Silvio Vilerino	e56354661b	mediafoundation: Create readable dpb buffers with PIPE_BIND_RENDER_TARGET and PIPE_BIND_SHARED for DX11 sharing Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41110>	2026-04-22 18:08:30 +00:00
Silvio Vilerino	f07be3b416	d3d12: Create PIPE_BIND_SHARED resources with D3D12_RESOURCE_FLAG_ALLOW_SIMULTANEOUS_ACCESS Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41110>	2026-04-22 18:08:30 +00:00
Emma Anholt	3a8ff22336	ci: Delete references to various broken traces. These are all being removed from the repos, so no need to leave the old notes around. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>	2026-04-22 17:39:31 +00:00
Emma Anholt	886fd59951	ci/lavapipe: Use anholt's new GPU trace snapshot comparison tool. The new tool has much better image diffing presentation (thanks to Danilo's work on turnip's private trace CI), better performance, flake checking within a single run, parallelized downloads along with replays, and ability to cache downloaded files to improve runtime, and system monitoring (for debugging OOM-related slowdowns). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>	2026-04-22 17:39:31 +00:00
Emma Anholt	2ee4da8677	ci/llvmpipe: Use anholt's new GPU trace snapshot comparison tool. The new tool has much better image diffing presentation (thanks to Danilo's work on turnip's private trace CI), better performance, flake checking within a single run, parallelized downloads along with replays, and ability to cache downloaded files to improve runtime, and system monitoring (for debugging OOM-related slowdowns). ./bin/update_traces_checksum.sh still updates based on the output of a CI run, but you can also apply a patch file that the tool generates, if you do offline runs using your traces.toml. New traces being replayed, in less overall runtime (2 minutes instead of 3): - minetest/minetest-high-v3.trace (new version, not the old flaky one) - neverball/neverball-v2.trace - ror/ror-default.trace - supertuxkart/supertuxkart-mansion-egl-gles-v2.b.trace - valve/counterstrike-v2.trace - valve/portal-2-v2.trace - xonotic/xonotic-keybench-high-v2.trace Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>	2026-04-22 17:39:31 +00:00
Martin Roukala (né Peres)	931d7d1fad	zink/ci: mark blender-demo-cube_diorama as flaky on gfx1201 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41100>	2026-04-22 17:19:22 +00:00
Daniel Schürmann	1f9a0490c6	nir/opt_loop: Don't peel initial break from do-while loops As the main purpose of this optimization is to transform while- into do-while loops, don't apply on loops which are already in do-while form. Also set nir_loop::do_while after this transformation, so that it is only applied once. Totals from 576 (0.28% of 202440) affected shaders: (Navi48) Instrs: 1337529 -> 1253438 (-6.29%); split: -6.36%, +0.07% CodeSize: 8390852 -> 7837328 (-6.60%); split: -6.61%, +0.01% VGPRs: 50856 -> 50844 (-0.02%) SpillSGPRs: 42198 -> 35395 (-16.12%); split: -16.13%, +0.01% SpillVGPRs: 47608 -> 44620 (-6.28%) Latency: 31043828 -> 44143753 (+42.20%); split: -0.06%, +42.26% InvThroughput: 6973433 -> 10079000 (+44.53%); split: -0.08%, +44.61% VClause: 26839 -> 24718 (-7.90%); split: -7.91%, +0.00% SClause: 21831 -> 21583 (-1.14%); split: -1.52%, +0.38% Copies: 183503 -> 150040 (-18.24%); split: -18.84%, +0.61% Branches: 27738 -> 26848 (-3.21%); split: -5.12%, +1.91% PreSGPRs: 40233 -> 39083 (-2.86%); split: -2.88%, +0.02% PreVGPRs: 38745 -> 38903 (+0.41%); split: -0.02%, +0.43% VALU: 688396 -> 645948 (-6.17%); split: -6.17%, +0.01% SALU: 189792 -> 177642 (-6.40%); split: -6.97%, +0.57% VMEM: 121500 -> 112748 (-7.20%) SMEM: 38765 -> 37767 (-2.57%); split: -2.58%, +0.00% VOPD: 102488 -> 89071 (-13.09%); split: +0.24%, -13.33% Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40349>	2026-04-22 10:34:58 +00:00
Pavel Ondračka	485586b184	r300,i915/ci: update expectations More accurate asin and atan push few tests over the instruction limit. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41094>	2026-04-22 10:16:43 +00:00
Valentine Burley	220d01fd2a	zink/ci: Document recent flakes These flakes have caused job failures in the last two weeks. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41095>	2026-04-22 09:46:30 +00:00
Lionel Landwerlin	6031d52393	anv: implement VK_EXT_primitive_restart_index Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40776>	2026-04-22 08:52:57 +00:00
Samuel Pitoiset	9d17a7bdb4	spirv,treewide: rework specialization constant With SPV_KHR_constant_data, it's allowed to specialize array of constants. RustiCL changes are from Karol Herbst <kherbst@redhat.com>. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41046>	2026-04-22 06:57:55 +00:00
Eric R. Smith	4ae192a3d9	glsl, spirv: Improve accuracy of asin() and acos() The polynomial used for asin_expr() was suboptimal (and its source was not documented). A better approximation is found in the _Handbook_of_Mathematical_Functions_ by Abramowitz and Stegun, which is used in Nvidia's Cg toolkit. However, while this approximation gives a good absolute error bound, its relative error exceeds the 4096 ulp allowed by the Vulkan spec. Taking a page from the spirv implementation of asin(), we implement a piecewise approximation where a Taylor series is used for small values of \|x\|. This patch also harmonizes the GLSL and Vulkan implementations by moving the implementation to common code (nir_builder). Running tests on asin() with a grid of 64000 samples between 0.0 and +1.0, the original asin() at 32 bits has: ``` glsl spirv RMSE: 1.756451e-04 1.609091e-04 worst abs error: 3.904104e-04 at 0.937001 3.904104e-04 at 0.937001 worst ulp error: 11800 at 6.2499e-05 3826 at 0.841331 ``` whereas the new implementation has for both: ``` RMSE: 2.528056e-05 worst abs error: 4.962087e-05 at 0.451149 worst ulp error: 2379 at 0.215106 ``` Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Acked-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40862>	2026-04-21 21:10:22 +00:00

73373 commits