fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-17 11:48:05 +02:00

Author	SHA1	Message	Date
Alyssa Rosenzweig	d3051b2eb0	nir/lower_bitmap: use more effective NIR * use tex builder * drop a silly bunch of wrapping Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36050>	2025-07-21 12:11:42 +00:00
Alyssa Rosenzweig	15c950cd49	nir/lower_drawpixels: use tex builder Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36050>	2025-07-21 12:11:41 +00:00
Alyssa Rosenzweig	6b34e2174e	nir: introduce ergonomic tex builder for intrinsics, we have these really nice builders using designated initializers + macros to specify optional indices. texture instrs have even more craziness involved, but we can do the same trick. this commit takes the existing "fixed form" deref-centric tex builders and generalizes them to work with non-deref textures, making it useful also for GL and late VK passes, while providing an API that strives to be ergonomic and consistent. this series only implements a subset of possible texture operations for now, but more generalizing could be added as people have need. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36050>	2025-07-21 12:11:41 +00:00
Alyssa Rosenzweig	b9c2579ae0	nir: unmark 24b multiply as associative Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Suggested-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>	2025-07-21 11:42:19 +00:00
Alyssa Rosenzweig	076f245df8	nir: restrict associativity to binary operations mathematically, associativity is only defined for binary operations. I have no idea what "associativity" would even mean for imad. I can kinda see the idea for iadd3 but iadd3 should not be formed until after reassociating adds so the point is moot. Unmark the "associative" ternary operations, and assert that associativity implies binary. nothing uses associativity yet, so this doesn't cause any functional change. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>	2025-07-21 11:42:19 +00:00
Alyssa Rosenzweig	e466b8735b	nir: introduce "inexact associative" property nothing currently uses the associative flag, but they will change soon. we need to stop incorrectly marking fmul/fadd/etc as associative, because they're not, but they almost are. distinguish these properties so we can correctly handle floating point rules without any opcode-based special casing. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>	2025-07-21 11:42:19 +00:00
Alyssa Rosenzweig	421d0e0953	nir: mark exact fmul in ldexp lowering this chain of fmul is deliberately chosen for floating point precision reasons, it needs to be exact, or else we might try to reassociate it and break subnormal handling. avoids regressing dEQP-VK.glsl.builtin.precision.ldexp_subnormals.* Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>	2025-07-21 11:42:18 +00:00
Rhys Perry	8fd5266b69	nir/divergence: ignore boolean phis for ignore_undef_if_phi_srcs The only user of this option (ACO) doesn't support this for boolean phis. fossil-db (navi21): Totals from 1208 (1.51% of 79825) affected shaders: Instrs: 826592 -> 823201 (-0.41%); split: -0.41%, +0.00% CodeSize: 4228296 -> 4224280 (-0.09%); split: -0.11%, +0.01% Latency: 3030803 -> 3028410 (-0.08%); split: -0.08%, +0.01% InvThroughput: 578588 -> 578693 (+0.02%); split: -0.00%, +0.02% VClause: 19500 -> 19494 (-0.03%) Copies: 60914 -> 57589 (-5.46%); split: -5.47%, +0.01% PreVGPRs: 50759 -> 50774 (+0.03%) VALU: 528582 -> 528671 (+0.02%); split: -0.00%, +0.02% SALU: 121134 -> 117646 (-2.88%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Backport-to: 25.1 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13455 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13509 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36005>	2025-07-21 08:27:01 +00:00
Konstantin Seurer	df44b353ad	radv: Optimize ray tracing position fetch Gets rid of a lot of indirection when fetching triangle positions. Storing the primitive address increases register pressure by a bit but the traversal shader which should have the highest register demand should not be affected when position fetch is not used. Totals: Instrs: 4021686 -> 4022435 (+0.02%); split: -0.01%, +0.03% CodeSize: 21235812 -> 21235832 (+0.00%); split: -0.02%, +0.02% Latency: 23402275 -> 23412110 (+0.04%); split: -0.04%, +0.09% InvThroughput: 4352818 -> 4352206 (-0.01%); split: -0.04%, +0.02% VClause: 101906 -> 102058 (+0.15%); split: -0.03%, +0.18% Copies: 342210 -> 342368 (+0.05%); split: -0.09%, +0.14% Branches: 114988 -> 114993 (+0.00%) PreVGPRs: 26551 -> 27111 (+2.11%) VALU: 2249366 -> 2249524 (+0.01%); split: -0.01%, +0.02% SALU: 529828 -> 529808 (-0.00%); split: -0.01%, +0.00% Reviewed-by: Natalie Vock <natalie.vock@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35533>	2025-07-19 16:07:59 +00:00
Faith Ekstrand	9fbb57e0a4	nir,nak: Add a nir_texop_sample_pos_nv and plumb it through Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36207>	2025-07-18 22:21:46 +00:00
Faith Ekstrand	557ac588e4	nir/instr_set: Rework tex instr hash/compare We were missing a couple bits from hash and a bunch of stuff from the comparison. This puts most of nir_tex_instr into a single pack_tex helper that's used by both and grabs everything we were missing. Cc: mesa-stable Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36234>	2025-07-18 17:10:20 -04:00
Alyssa Rosenzweig	2308960bed	treewide: use nir_mov_scalar Via Coccinelle patch: @@ expression builder, scalar; @@ -nir_channel(builder, scalar.def, scalar.comp) +nir_mov_scalar(builder, scalar) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Eric Engestrom <eric@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36142>	2025-07-16 18:59:16 +00:00
Alyssa Rosenzweig	186db0ebfe	nir: add nir_mov_scalar helper I keep reaching for this helper but it doesn't exist. So I fixed that. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Eric Engestrom <eric@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36142>	2025-07-16 18:59:16 +00:00
Alyssa Rosenzweig	98aad84d73	hk: push descriptor set addresses saves an indirection and sets us up for more goodness. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36127>	2025-07-16 18:27:18 +00:00
Alyssa Rosenzweig	24c708564f	nir: add bindless_sampler_agx intrinsic to facilitate pushing on AGX. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36127>	2025-07-16 18:27:17 +00:00
Alyssa Rosenzweig	58cc66238a	nir/opt_preamble: add sampler class AGX will use. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36127>	2025-07-16 18:27:17 +00:00
Georg Lehmann	d672737372	nir,aco: add byte_perm_amd Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>	2025-07-16 11:46:52 +00:00
Mary Guillemard	90438bae51	nir: Add NVIDIA-specific muladd intrinsics Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32777>	2025-07-15 23:34:31 +00:00
Natalie Vock	9707b30965	nir,aco: Add ds_bvh_stack_rtn This is a ds instruction that also overwrites its first input, so introduce a new ds format with two outputs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269>	2025-07-15 21:34:39 +00:00
Dave Airlie	2273b6c46a	nak: add divergent attribute and wrapper for nir_load_sysval_nv This wraps the sysval load in a builder where we can add proper divergence for ctaid later. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36105>	2025-07-15 19:07:11 +00:00
Marek Olšák	6286c1c66f	nir/opt_vectorize_io: optionally vectorize loads with holes e.g. load X; load W; ==> load XYZW. Verified with a shader test. This will be used by AMD drivers. See the code comments. Reviewed-by: Simon Perretta <simon.perretta@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36098>	2025-07-15 16:29:30 +00:00
Romaric Jodin	b4977a1605	nir/lower_bit_size: Avoid round-trip conversion when possible When we detect that the source is a conversion generated by the pass, try to get the real source instead of doing a round-trip conversion. Make sure that the nir_alu_type and the bit_size is the same between what we need and what's before the detected conversion. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35744>	2025-07-15 15:32:58 +00:00
Marek Olšák	0fdd6de65f	nir/lower_io: validate locations more accurately Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36091>	2025-07-15 13:38:29 +00:00
Marek Olšák	b0494f9485	nir/opt_varyings: optimize the consumer after constant propagation and dedupli. A TF2 shader propagates 0 to the consumer, which eliminates 1 input if we run algebraic opts and DCE before compaction. This is a prerequisite for removing all IO var optimizations from the GLSL linker that are redundant with nir_opt_varyings. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36091>	2025-07-15 13:38:29 +00:00
Marek Olšák	9607852c30	nir/opt_varyings: use nir_scalar Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36091>	2025-07-15 13:38:29 +00:00
Christian Gmeiner	ec9a2aa2e4	nir: Unvendor sampler_lod_parameters(_pan) Will be used by etnaviv too. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35753>	2025-07-12 10:48:03 +00:00
Qiang Yu	25897f0692	nir/recompute_io_bases: fix for per primitive IO It does not handle per primitive output and count per primitive input. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35931>	2025-07-11 02:25:51 +00:00
Qiang Yu	35e3f4ee92	nir: fix PRIMITIVE_INDICES mistreated as varying It's a sysval in mesh shader, but it share the same slot number with VARYING_SLOT_TESS_LEVEL_INNER. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35931>	2025-07-11 02:25:51 +00:00
Alyssa Rosenzweig	329413992e	nir/lower_tex: revert "optimize LOD bias lower for txl" This reverts commit `f853d285ef`. Failing a GL CTS test https://gitlab.khronos.org/Tracker/vk-gl-cts/-/issues/5866 .. apparently I ran VK CTS but not GL CTS on that MR. Oops. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35949>	2025-07-10 15:00:28 -04:00
Alyssa Rosenzweig	ee26938faf	nir,agx: switch to bindless_image_agx intrinsic this is more explicit than vec2's and hence has fewer footguns. in particular it's easier to handle with preambles in a sane way. modelled on what ir3 does. there's probably room for more clean up but for now this unblocks what I want to do. stats don't seem concerning. Totals from 692 (1.29% of 53701) affected shaders: MaxWaves: 441920 -> 442112 (+0.04%) Instrs: 1588748 -> 1589304 (+0.03%); split: -0.05%, +0.08% CodeSize: 11487976 -> 11491620 (+0.03%); split: -0.04%, +0.07% ALU: 1234867 -> 1235407 (+0.04%); split: -0.06%, +0.10% FSCIB: 1234707 -> 1235249 (+0.04%); split: -0.06%, +0.10% IC: 380514 -> 380518 (+0.00%) GPRs: 117292 -> 117332 (+0.03%); split: -0.08%, +0.11% Preamble instrs: 314064 -> 313948 (-0.04%); split: -0.05%, +0.01% Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35949>	2025-07-10 14:55:17 -04:00
Alyssa Rosenzweig	78f4c7c6a4	nir: fix AGX intrinsic flag by inspection. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35949>	2025-07-10 14:55:17 -04:00
Alyssa Rosenzweig	f10e96586f	nir/rewrite_image_intrinsic: handle non-derefs it is sometimes useful to turn lowered bindless intrinsics into bound or vice versa, and it is annoying to do so without this helper, so generalize the helper. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Job Noorman <job@noorman.info> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35949>	2025-07-10 14:55:17 -04:00
Alyssa Rosenzweig	569046d95e	nir/rewrite_image_intrinsic: handle explicit coord for agx. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Job Noorman <job@noorman.info> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35949>	2025-07-10 14:55:17 -04:00
Alyssa Rosenzweig	d55bdb4ec5	nir/opt_preamble: add "register class" concept Class represents an indexed "ideal" register class, where non-general classes only allow defs that choose that class in the def_size callback. nir_opt_preamble will try to assign specialized classes where possible, falling back to the general class once the special-purpose classes are exhausted. AGX will use this mechanism to promote bindless texture handles to bound texture registers where possible, falling back to pushing the handle as a uniform where not possible. Supporting multiple classes in nir_opt_preamble allows this multi-level hoisting to work in a single nir_opt_preamble call with proper global behaviour. Add this concept to nir_opt_preamble so we can use it in AGX later in this MR. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Job Noorman <job@noorman.info> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35949>	2025-07-10 14:55:17 -04:00
Marek Olšák	2ba2a61101	nir: switch indirect IO load lowering to nir_lower_io_indirect_loads for GLSL This reduces GLSL compile times with the gallium noop driver by 0.6%. This might decrease register usage and do less code reordering because nir_lower_io_vars_to_temporaries is no longer called for inputs, which moved most input loads to the top. radeonsi+ACO shader-db results are noise. More uniforms are identified as inlinable. TOTALS FROM ALL SHADERS (58138): VGPRs: 2152680 -> 2158032 (0.25 %) Code Size: 71008908 -> 71064812 (0.08 %) bytes Max Waves: 916943 -> 916924 (-0.00 %) Inline Uniforms: 6395 -> 6414 (0.30 %) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36018>	2025-07-10 16:37:45 +00:00
Marek Olšák	a4e522f8b0	nir: add new pass nir_opt_move_to_top This can be used to move input loads to top after we stop using nir_lower_io_vars_to_temporaries that does it unconditionally. It's more flexible than what nir_lower_io_vars_to_temporaries was doing, and can be extended to handle any instructions. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36018>	2025-07-10 16:37:44 +00:00
Marek Olšák	3dd9a9782b	nir: add new pass nir_lower_io_indirect_loads This is a partial replacement for nir_lower_io_vars_to_temporaries. It supports all input and output loads. It doesn't handle stores. The motivation is to improve compile times. The main differences compared to nir_lower_io_vars_to_temporaries are: - it only lowers indirect loads to temps and doesn't touch direct loads which improves compile times and removes the need for nir_lower_vars_to_ssa afterward because indirect temp access can't be lowered to SSA - it doesn't move all input loads to the top; it only moves those input loads to the top whose indirect loads are lowered (which improves register usage because direct loads are not moved) - it doesn't have to deal with complexities of variables Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36018>	2025-07-10 16:37:44 +00:00
Mel Henning	94f4fc12ea	nir/divergence_analysis: Add NV_shader_sm_builtins Fixes crucible func.nv.shader-sm-builtins.q0 Fixes: `a3839dbb90` ("nak: Change divergence analysis pass order") Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36011>	2025-07-09 16:47:28 +00:00
Simon Perretta	f89fb76671	nir/lower_io_to_scalar: add case for lowering push constants Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36000>	2025-07-09 12:58:29 +00:00
Simon Perretta	d3e3e0e3d2	nir/builder: add nir_ibitfield_extract_imm helper Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36000>	2025-07-09 12:58:29 +00:00
Simon Perretta	1a4e22b01a	nir/builder: add nir_bitfield_insert_imm helper Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36000>	2025-07-09 12:58:29 +00:00
Simon Perretta	e2ece5ef25	nir/serialize: increase the op limit to 10 bits/1024 Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36000>	2025-07-09 12:58:29 +00:00
Simon Perretta	1f1b3cc200	nir/precompiled: add shader stage option to nir_precompiled_build_variant Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36001>	2025-07-09 13:14:41 +01:00
Simon Perretta	5b29daf7bc	nir/precompiled: add helper to emit an enum map for multiple targets Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36001>	2025-07-09 13:14:41 +01:00
Alyssa Rosenzweig	fc95397957	nir/lower_alu: optimize min/max signed zeros we don't usually need a multi-instruction lowering. with the agx change in the next commit, honeykrisp results: Totals from 3589 (6.64% of 54019) affected shaders: MaxWaves: 3598144 -> 3598400 (+0.01%); split: +0.02%, -0.01% Instrs: 1445830 -> 1332394 (-7.85%) CodeSize: 10696356 -> 9742130 (-8.92%) Fills: 721 -> 723 (+0.28%); split: -0.14%, +0.42% Scratch: 3980 -> 3968 (-0.30%) ALU: 1156426 -> 1043198 (-9.79%) FSCIB: 1156426 -> 1043196 (-9.79%) IC: 267202 -> 267166 (-0.01%) GPRs: 208765 -> 208712 (-0.03%); split: -0.16%, +0.14% Uniforms: 683643 -> 683677 (+0.00%); split: -0.01%, +0.01% Preamble instrs: 1163325 -> 1159314 (-0.34%) control results alone: Totals: Instrs: 110168 -> 107171 (-2.72%) Totals from 71 (22.26% of 319) affected shaders: Instrs: 48895 -> 45898 (-6.13%) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35989>	2025-07-08 17:09:16 +00:00
Alyssa Rosenzweig	042adf3cc5	nir/opt_algebraic: optimize signed pow in Control used in a post-processing shader which goes 896 instrs -> 749 instrs. In my Control fossil: Totals from 2 (0.63% of 319) affected shaders: Instrs: 2078 -> 1841 (-11.41%) CodeSize: 14540 -> 12800 (-11.97%) ALU: 1779 -> 1626 (-8.60%) FSCIB: 1779 -> 1626 (-8.60%) Uniforms: 370 -> 372 (+0.54%) In radv_fossils, there are affected shaders in Dredge. Totals from 4 (0.01% of 54019) affected shaders: Instrs: 2306 -> 2294 (-0.52%) CodeSize: 16594 -> 16534 (-0.36%) ALU: 2010 -> 2004 (-0.30%) FSCIB: 2010 -> 2004 (-0.30%) Uniforms: 1138 -> 1146 (+0.70%) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35989>	2025-07-08 17:09:16 +00:00
Alyssa Rosenzweig	2765017553	nir: fuse ffma even with float controls The fmul+fadd -> fma rules in nir_opt_algebraic are marked imprecise, because they are a contraction. However, they respect signed zero/Inf/NaN rules. As such, it is legal to do this fusion with shader float controls as long as the exact bit is not set (mapping to SPIR-V NoContract). Unfortunately, NIR's imprecise rules do not distinguish between contraction issues versus float special case issues, forcing nir_search to skip all imprecise rules when any shader float control modes are used. This notably affects DXVK, which sets shader float controls to get D3D11 float behaviour and hence loses FMA fusing. Therefore, we plumb in the exact bit to express NoContract independent of the float controls, and weaken the requirement for fma fusion to allowable contraction. For fma splitting, it's a similar issue, as inexact GLSL fma in SPIR-V is just a multiply add that we're allowed to contract rather than the real deal. Drivers that use their own FMA fusing passes (notably, Intel and AMD) are unaffected, but DXVK-capable drivers using fuse_ffma should like this. 
Results on hk shown: Totals from 2194 (4.06% of 54019) affected shaders: MaxWaves: 2174272 -> 2175936 (+0.08%); split: +0.08%, -0.01% Instrs: 1173283 -> 1131494 (-3.56%); split: -3.57%, +0.01% CodeSize: 8568168 -> 8381724 (-2.18%); split: -2.18%, +0.01% Spills: 1094 -> 747 (-31.72%) Fills: 988 -> 681 (-31.07%) Scratch: 4444 -> 3820 (-14.04%) ALU: 953032 -> 913149 (-4.18%); split: -4.19%, +0.01% FSCIB: 953032 -> 913149 (-4.18%); split: -4.19%, +0.01% IC: 215398 -> 215274 (-0.06%) GPRs: 139865 -> 139032 (-0.60%); split: -1.56%, +0.96% Uniforms: 414886 -> 414466 (-0.10%); split: -0.14%, +0.04% Preamble instrs: 646398 -> 644017 (-0.37%); split: -0.43%, +0.07% Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35989>	2025-07-08 17:09:16 +00:00
Daniel Schürmann	2c51a8870d	nir: add nir_vectorize_cb callback parameter to nir_lower_phis_to_scalar() Similar to nir_lower_alu_width(), the callback can return the desired number of components for a phi, or 0 for no lowering. The previous behavior of nir_lower_phis_to_scalar() with lower_all=true can be elicited via nir_lower_all_phis_to_scalar() while the previous behavior with lower_all=false now corresponds to nir_lower_phis_to_scalar() with NULL callback. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>	2025-07-08 15:33:59 +00:00
Daniel Schürmann	23b7b3b919	nir/lower_phis_to_scalar: remove exec_list dead_instrs No need to free the instructions at this point. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>	2025-07-08 15:33:59 +00:00
Daniel Schürmann	f6e0f4813c	nir: remove recursive check in nir_lower_phis_to_scalar() This check causes unnecessary overhead and can be replaced by simply checking whether a phi_src is from a loop continue block. Except for rare edge cases, the result will be the same. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>	2025-07-08 15:33:59 +00:00
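
Note on the associativity commits (b9c2579ae0, 076f245df8, e466b8735b): the distinction they draw between exactly associative and only "almost" associative operations can be illustrated with a small sketch. This is not Mesa code. The helper names below are hypothetical, Python floats stand in for shader floats, and `umul24_model` assumes a 24-bit multiply that reads the low 24 bits of each source and keeps the low 32 bits of the product, which may not match the NIR opcode in every detail.

```python
MASK32 = 0xFFFFFFFF
MASK24 = 0xFFFFFF


def iadd32(a, b):
    """32-bit wrapping integer add (exactly associative)."""
    return (a + b) & MASK32


def fadd(a, b):
    """IEEE-754 double add; the result is rounded after every operation."""
    return a + b


def umul24_model(a, b):
    """Assumed model of a 24-bit multiply: low 24 bits in, low 32 bits out."""
    return ((a & MASK24) * (b & MASK24)) & MASK32


def is_associative_on(op, a, b, c):
    """Check (a op b) op c == a op (b op c) for one concrete triple."""
    return op(op(a, b), c) == op(a, op(b, c))


# Wrapping integer add: regrouping never changes the result.
print(is_associative_on(iadd32, 0xFFFFFFFF, 2, 3))

# Float add: regrouping changes the rounding, so the results differ.
print(is_associative_on(fadd, 0.1, 0.2, 0.3))

# 24-bit multiply model: the intermediate 2**24 loses its top bit when it
# is fed back in as a source, so regrouping changes the result.
print(is_associative_on(umul24_model, 0x1000, 0x1000, 1))
```

Under these assumptions only the wrapping integer add survives regrouping, which is the kind of case an "associative" flag can safely describe; the float add is the "inexact associative" case, where reassociation is only acceptable when exact results are not required.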

6380 commits