fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-16 18:28:05 +02:00

Author	SHA1	Message	Date
squidbus	5b34d1ff34	nir: Only attempt subgroups lower_boolean_reduce for single component. lower_boolean_reduce only works if the number of components is 1, and even asserts on this in its prologue. Otherwise, given a boolean vector type, it may produce output using ballot/vote with a boolean vector input. Acked-by: Aitor Camacho <aitor@lunarg.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41186>	2026-05-11 09:50:27 +00:00
Daniel Schürmann	0832f3251c	nir/opt_algebraic: extend some extract_u8 pattern to extract_i8 and remove some duplicate extract pattern. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41385>	2026-05-09 21:23:40 +00:00
Daniel Schürmann	9895b5e5da	nir/opt_algebraic: optimize downcast followed by upcast to extract Totals from 217 (0.10% of 208640) affected shaders: (Navi48) Instrs: 283561 -> 282870 (-0.24%) CodeSize: 1604864 -> 1601136 (-0.23%); split: -0.24%, +0.01% Latency: 2992301 -> 2990107 (-0.07%); split: -0.09%, +0.02% InvThroughput: 602722 -> 601316 (-0.23%); split: -0.23%, +0.00% Copies: 26490 -> 26471 (-0.07%); split: -0.10%, +0.03% VALU: 147735 -> 147176 (-0.38%) SALU: 51545 -> 51541 (-0.01%) VOPD: 11140 -> 11204 (+0.57%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41385>	2026-05-09 21:23:40 +00:00
Georg Lehmann	1716cbff37	nir,amd: reassociate fadd to create more fma/mad ACO's backend fusing is quite competent, but it cannot reorder adds. This adds a simple algebraic pass to do that for us. Foz-DB Navi10: Totals from 13568 (18.76% of 72319) affected shaders: MaxWaves: 304722 -> 304004 (-0.24%); split: +0.10%, -0.33% Instrs: 15084252 -> 14993010 (-0.60%); split: -0.61%, +0.00% CodeSize: 81480188 -> 81372600 (-0.13%); split: -0.17%, +0.04% VGPRs: 741580 -> 743680 (+0.28%); split: -0.10%, +0.38% SpillSGPRs: 9418 -> 9434 (+0.17%) Latency: 154602014 -> 154312940 (-0.19%); split: -0.29%, +0.10% InvThroughput: 44628554 -> 44442595 (-0.42%); split: -0.47%, +0.05% VClause: 300035 -> 300054 (+0.01%); split: -0.31%, +0.31% SClause: 370992 -> 370640 (-0.09%); split: -0.15%, +0.06% Copies: 1162401 -> 1162800 (+0.03%); split: -0.30%, +0.33% Branches: 300646 -> 300654 (+0.00%); split: -0.00%, +0.01% PreSGPRs: 673675 -> 675057 (+0.21%); split: -0.00%, +0.21% PreVGPRs: 633017 -> 634768 (+0.28%); split: -0.29%, +0.57% VALU: 10800351 -> 10712041 (-0.82%); split: -0.82%, +0.00% SALU: 1752917 -> 1753203 (+0.02%); split: -0.04%, +0.06% Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41348>	2026-05-08 11:49:43 +00:00
Georg Lehmann	9e87090db4	nir/loop_analyze: do not count fmul towards the limit when only used by fadd As always with loop unrolling, don't look too closely at stats, but they confirm more loops are now unrolled. Foz-DB Navi10: Totals from 66 (0.09% of 72319) affected shaders: MaxWaves: 1464 -> 1424 (-2.73%); split: +0.82%, -3.55% Instrs: 101778 -> 173128 (+70.10%) CodeSize: 544148 -> 905392 (+66.39%) VGPRs: 3652 -> 3788 (+3.72%); split: -0.77%, +4.49% SpillSGPRs: 105 -> 75 (-28.57%) Latency: 1197088 -> 1033471 (-13.67%); split: -17.08%, +3.41% InvThroughput: 315257 -> 293245 (-6.98%); split: -13.29%, +6.31% VClause: 1663 -> 3057 (+83.82%); split: -0.12%, +83.94% SClause: 2797 -> 4496 (+60.74%); split: -0.21%, +60.96% Copies: 6472 -> 11219 (+73.35%); split: -0.08%, +73.42% Branches: 2695 -> 4697 (+74.29%); split: -0.56%, +74.84% PreSGPRs: 3418 -> 3619 (+5.88%); split: -0.79%, +6.67% PreVGPRs: 3305 -> 3423 (+3.57%); split: -1.06%, +4.63% VALU: 73061 -> 124934 (+71.00%) SALU: 11775 -> 20803 (+76.67%); split: -0.99%, +77.66% VMEM: 2729 -> 4627 (+69.55%) SMEM: 3796 -> 5869 (+54.61%); split: -0.18%, +54.79% Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41348>	2026-05-08 11:49:43 +00:00
Georg Lehmann	25add9cbd1	nir/opt_peephole_select: do not count fmul towards the limit when only used by fadd Foz-DB Navi10: Totals from 4077 (5.64% of 72319) affected shaders: MaxWaves: 84057 -> 83325 (-0.87%); split: +0.07%, -0.94% Instrs: 6019711 -> 6007338 (-0.21%); split: -0.27%, +0.07% CodeSize: 32373984 -> 32356152 (-0.06%); split: -0.18%, +0.13% VGPRs: 236588 -> 238172 (+0.67%); split: -0.05%, +0.72% SpillSGPRs: 7341 -> 7367 (+0.35%); split: -0.65%, +1.01% Latency: 61833147 -> 61386674 (-0.72%); split: -0.91%, +0.19% InvThroughput: 22328993 -> 22364077 (+0.16%); split: -0.16%, +0.32% VClause: 97803 -> 97832 (+0.03%); split: -0.29%, +0.32% SClause: 147544 -> 146274 (-0.86%); split: -1.19%, +0.33% Copies: 606083 -> 593887 (-2.01%); split: -2.27%, +0.26% Branches: 171344 -> 164203 (-4.17%); split: -4.17%, +0.00% PreSGPRs: 234116 -> 234922 (+0.34%); split: -0.17%, +0.52% PreVGPRs: 211250 -> 211374 (+0.06%); split: -0.00%, +0.06% VALU: 4130666 -> 4132669 (+0.05%); split: -0.11%, +0.16% SALU: 854007 -> 852585 (-0.17%); split: -0.77%, +0.61% VMEM: 162718 -> 162755 (+0.02%); split: -0.00%, +0.03% SMEM: 237856 -> 236323 (-0.64%); split: -0.65%, +0.00% Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41348>	2026-05-08 11:49:43 +00:00
Georg Lehmann	0dd50a426e	nir: fix fp_math_ctrl in fisnan Otherwise, nir_opt_algebraic will replace it with false. Fixes: `63d199a01e` ("nir: remove special fp_math_ctrl rules") Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41420>	2026-05-08 08:20:16 +00:00
Kenneth Graunke	4018aea9fa	nir: Set FRAG_RESULT_DUAL_SRC_BLEND in outputs_written when lowering Detecting dual source blending is currently annoying: you can either look at info->fs.color_is_dual_source, or FRAG_RESULT_DUAL_SRC_BLEND being in the info->outputs_written bitfield. The former is only set if nir_shader_gather_info runs prior to nir_lower_io lowering it to FRAG_RESULT_DUAL_SRC_BLEND. The latter is only set if nir_shader_gather_info runs after the nir_lower_io lowering. Just make the IO lowering also set the outputs_written flag so if you're trying to use FRAG_RESULT_DUAL_SRC_BLEND, you can always check outputs_written without worrying about pass ordering. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41122>	2026-05-07 08:29:40 +00:00
Karol Herbst	3df48dec23	nir/lower_cl_images: call nir_progress on every function llvmpipe supports real function calls, so we need to call nir_progress on every function, not just the entry point. Cc: mesa-stable Reviewed-by: Dave Airlie <airlied@redhat.com> Acked-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41404>	2026-05-07 05:35:12 +00:00
Pavel Ondračka	0f75fa5bfd	nir/tests: add partial unroll OOB tests Assisted-by: OpenAI Codex (GPT-5.5) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41203>	2026-05-06 20:08:13 +00:00
Pavel Ondračka	e517e3da0b	nir/tests: add helpers for counting used/unused instructions Assisted-by: OpenAI Codex (GPT-5.5) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41203>	2026-05-06 20:08:12 +00:00
Pavel Ondračka	959f59b3f0	nir: fix partial loop unroll OOB check for loops not starting at 0 is_access_out_of_bounds() decides whether the residual loop (created by partial_unroll) will access arrays out of bounds by checking whether array_length is less than or equal to trip_count. That assumes the induction variable starts at 0. For example glamor gradient shader shader-db/shaders/glamor/4.shader_test: uniform float stops[18]; for (i = 1; i < n_stop; i++) if (stop_len < stops[i]) break; trip_count is guessed as 17 from the array indexing, so the residual loop's index begins at 18, out of bounds for the 18-element array, yet 18 <= 17 is false, so the OOB removal is skipped and the residual loop is not eliminated. Correctly consider the start value for the OOB check. This lets glamor gradient shaders with loops starting at i=1 unroll the same way as i=0 loops. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41203>	2026-05-06 20:08:12 +00:00
Rhys Perry	ec59b59b97	nir: rename nir_src_parent_instr to nir_src_use_instr sed -i "s/nir_src_parent_instr/nir_src_use_instr/" `find ./ -type f` sed -i "s/nir_src_parent_if/nir_src_use_if/" `find ./ -type f` sed -i "s/nir_src_set_parent/nir_src_set_use/" `find ./ -type f` There are two kinds of "parent" in relation to a src/def: - the instruction where the def or src's def is defined - the instruction which the src is a part of and where the def is used Clarify that the parent here is where the src's def is used, not where it's defined. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41344>	2026-05-06 17:09:22 +00:00
Lionel Landwerlin	c30a4d4fdb	anv/brw/nir: fix wa_18019110168 Several things were wrong : - incorrect offset in the FS push constant data - incorrect encoding of the 32bit values with 2 fields (remap table offset & provoking vertex) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31384>	2026-05-06 09:49:41 +00:00
Job Noorman	0703f27d6a	nir/opt_offsets: add support for @load/store_global_ir3 Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41342>	2026-05-05 06:25:49 +02:00
Job Noorman	c784af5ca0	ir3: always use byte offset for @load/store_global_ir3 Before a7xx, ldg/stg.a use an offset in units of their type size while on a7xx and later, the offset is always in bytes. Currently, @load/store_global_ir3 take their offset in dwords (32-bits). This has a few downsides: offsets need an extra shl during codegen on a7xx and addressing sub-dword-aligned addresses is only possible by doing 64-bit math on the base address. Improve the situation by always using a byte offset for @load/store_global_ir3 and adding the offset_shift index to support type units pre-a7xx. While we're at it, add the base index as well to support all ldg/stg.g features in @load/store_global_ir3. Supporting these renewed intrinsics consists of two parts: - ir3_nir_lower_io_offsets legalizes the offset_shift on a6xx: for ldg.a/stg.a, the offset has to be in units of the type size so extra shifts are inserted to accomplish this if necessary. On a7xx, offsets are always in bytes so nothing needs to be done. - The intrinsics are emitted as ldg/stg if the offset is a small enough constant and as ldg.a/stg.a otherwise. a6xx supports an extra shift for ldg.a/stg.a that only applies to the GPR offset (not the immediate base); NIR is pattern matched at this point to extract this if possible. All users of @load/store_global_ir3 are updated to generate the offset in units of bytes. ir3_nir_analyze_ubo_ranges is updated to take the new offset_shift into account. Totals from 2029 (1.15% of 176266) affected shaders: MaxWaves: 26728 -> 26660 (-0.25%); split: +0.01%, -0.26% Instrs: 1314089 -> 1278603 (-2.70%); split: -2.72%, +0.02% CodeSize: 2739108 -> 2633236 (-3.87%); split: -3.87%, +0.01% NOPs: 197537 -> 200843 (+1.67%); split: -1.62%, +3.30% MOVs: 43771 -> 44025 (+0.58%); split: -1.11%, +1.69% Full: 31849 -> 31948 (+0.31%); split: -0.03%, +0.34% (ss): 37965 -> 42027 (+10.70%); split: -3.47%, +14.17% (sy): 13752 -> 13566 (-1.35%); split: -4.04%, +2.68% (ss)-stall: 154238 -> 170353 (+10.45%); split: -1.72%, +12.16% (sy)-stall: 804442 -> 806518 (+0.26%); split: -4.65%, +4.91% Preamble Instrs: 326728 -> 293488 (-10.17%) Cat0: 217926 -> 220947 (+1.39%); split: -1.58%, +2.96% Cat1: 50182 -> 50446 (+0.53%); split: -0.97%, +1.49% Cat2: 460987 -> 452101 (-1.93%); split: -2.26%, +0.33% Cat3: 390696 -> 361271 (-7.53%) Cat7: 39148 -> 38688 (-1.18%); split: -1.24%, +0.06% Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41342>	2026-05-05 06:25:49 +02:00
Job Noorman	53d96aed05	nir/get_io_offset_src_number: support @load/store_global_ir3 Signed-off-by: Job Noorman <jnoorman@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41342>	2026-05-05 06:25:49 +02:00
Faith Ekstrand	84bbfaa7e5	pan/bi: Delete the old texel buffer intrinsics Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>	2026-05-05 01:27:16 +00:00
Faith Ekstrand	7d5cb2884c	pan/bi: Allow setting the table on lea_attr_pan Also allow us to set AUTO32 while we're at it. Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>	2026-05-05 01:27:16 +00:00
Faith Ekstrand	2369808cd1	pan,nir: Add Bifrost texturing intrinsics These are funky enough that they make more sense as intrinsics than texture opcodes. Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>	2026-05-05 01:27:16 +00:00
Faith Ekstrand	0d549f5bde	nir: Add a new nir_op_f2u32_rtne Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>	2026-05-05 01:27:16 +00:00
Faith Ekstrand	58cba7887a	nir: Add a new nir_texop_gradient_pan Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>	2026-05-05 01:27:16 +00:00
Faith Ekstrand	e0fffabda7	nir/builder: Allow backend1/2 in nir_build_tex() Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>	2026-05-05 01:27:16 +00:00
Faith Ekstrand	337aaa0ab9	pan,nir: Add cube face intrinsics Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>	2026-05-05 01:27:15 +00:00
Rhys Perry	081feabf9c	nir/search: fix nir_algebraic_automaton after constant folding op(bcsel) Likely fixes https://gitlab.freedesktop.org/mesa/mesa/-/jobs/98917704 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `f4812dc11d` ("nir/opt_constant_folding: constant-fold op(bcsel(), #c) -> bcsel(.., #c1, #c2)") Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41343>	2026-05-04 17:27:38 +00:00
Daniel Schürmann	f4812dc11d	nir/opt_constant_folding: constant-fold op(bcsel(), #c) -> bcsel(.., #c1, #c2) for all ALU instructions except fneg instead of using nir_opt_algebraic for a small subset. Totals from 17711 (8.49% of 208640) affected shaders: (Navi48) MaxWaves: 364391 -> 364397 (+0.00%); split: +0.01%, -0.01% Instrs: 33873994 -> 33780398 (-0.28%); split: -0.31%, +0.03% CodeSize: 198627596 -> 198259724 (-0.19%); split: -0.23%, +0.05% VGPRs: 1435516 -> 1435144 (-0.03%); split: -0.04%, +0.02% SpillSGPRs: 652827 -> 654577 (+0.27%); split: -0.00%, +0.27% SpillVGPRs: 594840 -> 593598 (-0.21%); split: -0.28%, +0.07% Scratch: 31791360 -> 31543552 (-0.78%) Latency: 417824569 -> 415881858 (-0.46%); split: -0.48%, +0.02% InvThroughput: 80376232 -> 80307996 (-0.08%); split: -0.10%, +0.01% VClause: 557238 -> 554770 (-0.44%); split: -0.50%, +0.06% SClause: 688297 -> 688125 (-0.02%); split: -0.04%, +0.02% Copies: 3571756 -> 3566704 (-0.14%); split: -0.44%, +0.29% Branches: 628710 -> 628576 (-0.02%); split: -0.07%, +0.05% PreSGPRs: 1100316 -> 1103478 (+0.29%); split: -0.02%, +0.30% PreVGPRs: 1132139 -> 1128765 (-0.30%); split: -0.30%, +0.00% VALU: 18944830 -> 18912030 (-0.17%); split: -0.20%, +0.03% SALU: 4363054 -> 4342748 (-0.47%); split: -0.57%, +0.10% VMEM: 1894420 -> 1891754 (-0.14%); split: -0.19%, +0.05% SMEM: 1073860 -> 1073741 (-0.01%); split: -0.01%, +0.00% VOPD: 1734659 -> 1735718 (+0.06%); split: +0.20%, -0.14% Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40848>	2026-05-04 09:42:59 +00:00
Daniel Schürmann	8b1c60add4	nir/opt_constant_folding: create const_value_for_alu() helper Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40848>	2026-05-04 09:42:59 +00:00
Georg Lehmann	52b195b4e8	nir/opt_algebraic: add more fmulz pattern Totals from 3 (0.00% of 202440) affected shaders: (Navi48) Instrs: 5684 -> 5641 (-0.76%); split: -0.77%, +0.02% CodeSize: 30952 -> 30708 (-0.79%); split: -0.80%, +0.01% Latency: 9236 -> 9199 (-0.40%); split: -0.42%, +0.02% InvThroughput: 2287 -> 2273 (-0.61%) VALU: 3900 -> 3884 (-0.41%) SALU: 305 -> 289 (-5.25%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40848>	2026-05-04 09:42:59 +00:00
Georg Lehmann	38e691fc0a	nir/opt_varyings: do no_signed_zero linking even for non removable stores E.g. position in VS. Foz-DB Navi48: Totals from 948 (0.79% of 120695) affected shaders: MaxWaves: 26816 -> 26828 (+0.04%) Instrs: 799692 -> 796993 (-0.34%); split: -0.34%, +0.01% CodeSize: 3855744 -> 3846816 (-0.23%); split: -0.24%, +0.01% VGPRs: 50256 -> 50220 (-0.07%) Latency: 2209359 -> 2207667 (-0.08%); split: -0.09%, +0.01% InvThroughput: 305260 -> 303519 (-0.57%); split: -0.57%, +0.00% VClause: 11640 -> 11643 (+0.03%); split: -0.01%, +0.03% SClause: 21152 -> 21149 (-0.01%) Copies: 51658 -> 51675 (+0.03%); split: -0.11%, +0.14% Branches: 18656 -> 18655 (-0.01%) PreVGPRs: 37999 -> 37984 (-0.04%) VALU: 469752 -> 467406 (-0.50%); split: -0.50%, +0.00% SALU: 105433 -> 105323 (-0.10%); split: -0.11%, +0.00% Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41292>	2026-05-03 19:55:10 +00:00
Georg Lehmann	fac4edbcba	nir/opt_varyings: back propagate signed zero information to outputs Foz-DB Navi48: Totals from 809 (0.67% of 120695) affected shaders: MaxWaves: 21804 -> 21808 (+0.02%) Instrs: 863131 -> 861310 (-0.21%); split: -0.22%, +0.01% CodeSize: 4535500 -> 4523232 (-0.27%); split: -0.30%, +0.03% VGPRs: 47304 -> 47280 (-0.05%) SpillSGPRs: 170 -> 82 (-51.76%) Latency: 6791484 -> 6786880 (-0.07%); split: -0.07%, +0.00% InvThroughput: 906281 -> 905301 (-0.11%); split: -0.11%, +0.00% VClause: 16910 -> 16917 (+0.04%); split: -0.01%, +0.05% SClause: 21856 -> 21827 (-0.13%); split: -0.14%, +0.01% Copies: 61890 -> 61436 (-0.73%); split: -0.80%, +0.06% Branches: 19725 -> 19640 (-0.43%) PreSGPRs: 38011 -> 37851 (-0.42%) PreVGPRs: 36482 -> 36454 (-0.08%) VALU: 465316 -> 464323 (-0.21%); split: -0.22%, +0.00% SALU: 143757 -> 143395 (-0.25%); split: -0.33%, +0.08% VMEM: 36827 -> 36806 (-0.06%) SMEM: 37769 -> 37768 (-0.00%) Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41292>	2026-05-03 19:55:10 +00:00
Georg Lehmann	b2bc57551a	nir/instr_set: allow cse with fp_math_ctrl mismatches for intrinsics Just like for ALU. No Foz-DB changes. Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41292>	2026-05-03 19:55:10 +00:00
Marek Olšák	f583f6e717	nir: use nir_build_frag_coord everywhere nir_build_frag_coord generates the correct sysval loads based on NIR options. nir_load_frag_coord shouldn't be used directly because drivers don't have to support it. v2: RADV can't use it because nir->options isn't set, so use load_pixel_coord. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>	2026-05-03 13:03:01 +00:00
Marek Olšák	b63a9a8b39	nir: add direct lowered frag_coord building to replace lowering passes Instead of lowering frag_coord 4 times during compilation, just use this. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>	2026-05-03 13:03:00 +00:00
Marek Olšák	9c5ad16819	nir/opt_frag_coord_to_pixel_coord: handle frag_coord_xy Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>	2026-05-03 13:03:00 +00:00
Marek Olšák	076b0aaf1d	nir/lower_wpos_ytransform: handle frag_coord_xy Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>	2026-05-03 13:03:00 +00:00
Marek Olšák	e49f29f25e	nir: add frag_coord_xy to strengthen and simplify pixel_coord lowering Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>	2026-05-03 13:03:00 +00:00
Daniel Schürmann	012d72f2b0	nir/opt_algebraic: add some imul24_relaxed pattern Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41178>	2026-05-01 10:07:26 +00:00
Daniel Schürmann	708093d830	nir/opt_algebraic: use imul24_relaxed for lowered dot4x8_add Totals from 28 (0.04% of 72819) affected shaders: (Navi10) MaxWaves: 181 -> 186 (+2.76%) Instrs: 406735 -> 338360 (-16.81%) CodeSize: 2913588 -> 2469712 (-15.23%) VGPRs: 5520 -> 5468 (-0.94%) SpillVGPRs: 32 -> 0 (-inf%) LDS: 64512 -> 62464 (-3.17%) Scratch: 10240 -> 0 (-inf%) Latency: 11028252 -> 4357120 (-60.49%) InvThroughput: 11004126 -> 4079018 (-62.93%) VClause: 1686 -> 2055 (+21.89%); split: -0.89%, +22.78% SClause: 890 -> 852 (-4.27%) Copies: 4516 -> 2644 (-41.45%); split: -41.59%, +0.13% PreSGPRs: 982 -> 974 (-0.81%) PreVGPRs: 5356 -> 4284 (-20.01%) VALU: 370529 -> 330201 (-10.88%) SALU: 28850 -> 1170 (-95.94%) VMEM: 2616 -> 2560 (-2.14%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41178>	2026-05-01 10:07:25 +00:00
Lorenzo Rossi	63aceb07ff	nir/opt_sink: Add pan-specific load_input Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>	2026-04-30 18:26:10 +00:00
Lorenzo Rossi	30d8f9c554	nir/lower_point_size: Handle 16-bit point sizes panfrost has float16 point size, handling that precision too allows the compiler to call lower_point_size later in the compilation pipeline Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>	2026-04-30 18:26:10 +00:00
Lorenzo Rossi	2a7d817591	nir/opt_algebraic: optimize fadd/fmul with 16-bit source and constant Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41096>	2026-04-30 17:33:09 +00:00
Lorenzo Rossi	89436db611	nir: Extract float_is_half tests in common code Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41096>	2026-04-30 17:33:09 +00:00
Karol Herbst	4e67582ddf	nir: add fmul_rtz optimizations NVK is only going to use it for `fmul_rtz(frcp(ipa), ipa)` patterns, so try not too hard to optimize this. Totals from 10 (0.00% of 1212873) affected shaders: CodeSize: 34480 -> 34288 (-0.56%); split: -0.60%, +0.05% Static cycle count: 6225 -> 6132 (-1.49%); split: -1.57%, +0.08% Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41179>	2026-04-30 15:42:40 +00:00
Karol Herbst	2e09b4ac68	nir: handle fmul_rtz in a couple of places Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41179>	2026-04-30 15:42:40 +00:00
Karol Herbst	4e520f671c	nir: add fmul_rtz It's needed in NVK for correctness with interpolation. Backport-to: 26.1 Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41179>	2026-04-30 15:42:40 +00:00
Marek Olšák	a3e3bf0ac2	nir/opt_dce: add shader_info::assert_inputs_not_dead Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41166>	2026-04-30 07:07:32 +00:00
Marek Olšák	7bd5856cc6	nir/opt_dce: factor out dead instruction removal into a helper Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41166>	2026-04-30 07:07:32 +00:00
Alyssa Rosenzweig	0c49738211	nir/opt_reassociate: fix exactness bug For an inexact-associative operation (fadd or fmul), can_reassociate ensures the root of the chain is inexact to allow reassociating. However, build_chain just checks for opcodes to match up after, although we do sum up exactness across the chain. Although an Effort Was Made, it still seems incorrect to reassociate %3 = fadd! %0, %1 %4 = fadd %3, %2 to instead be (ex.) %3 = fadd! %0, %2 %4 = fadd! %3, %1 Closes: #14418 Fixes: `e0b0f7e73c` ("nir: add ALU reassocation pass") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41162>	2026-04-28 21:14:56 +00:00
Georg Lehmann	599a52174b	nir: disable fp class analysis for 64bit transcendentals Some backends have terrible precision for these fp64 opcodes, so don't try to do anything clever. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15334 Fixes: `5a298f3560` ("nir: rewrite fp range analysis as a fp class analysis") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41206>	2026-04-28 13:26:42 +00:00
Simon Perretta	57791c4a99	pco: track how many tg4/raw sample comps are needed Rather than always emitting and swizzling 16 components for raw samples, scale it by the number actually needed as defined by the selected tg4 channel/components. Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Frank Binns <frank.binns@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40687>	2026-04-28 12:04:03 +01:00

7455 commits