fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 00:38:06 +02:00

Author	SHA1	Message	Date
Georg Lehmann	abfd6a4df9	nir: don't assume indices are always 32bit when accessing them as raw data Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40299>	2026-03-10 07:46:20 +00:00
Georg Lehmann	aa831b6690	nir/opt_algebraic: skip more redundant alignment iand Useful for smaller/larger loads. Also there is no reason to be bitsize specific here if we use a signed constant. Foz-DB Navi48: Totals from 8 (0.01% of 114655) affected shaders: Instrs: 7629 -> 7612 (-0.22%) CodeSize: 40772 -> 40692 (-0.20%) Latency: 54880 -> 54944 (+0.12%) InvThroughput: 8879 -> 8880 (+0.01%); split: -0.08%, +0.09% VALU: 4029 -> 4027 (-0.05%); split: -0.15%, +0.10% SALU: 1260 -> 1249 (-0.87%) Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40292>	2026-03-10 06:57:50 +00:00
Karol Herbst	9d90cbc314	nak: add input predicate to load_global_nv and OpLd This is new in SM75 (Turing). Let's use it because it allows us to get rid of the if/else around bound checked global loads. There are some changes in fossils, but it seems that's mostly due to CFG optimizations doing things a bit differently? Totals: CodeSize: 9442152688 -> 9442133184 (-0.00%); split: -0.00%, +0.00% Static cycle count: 6120910991 -> 6120907718 (-0.00%); split: -0.00%, +0.00% Spills to reg: 184789 -> 184810 (+0.01%) Fills from reg: 223831 -> 223860 (+0.01%); split: -0.00%, +0.01% Totals from 334 (0.03% of 1163204) affected shaders: CodeSize: 22020752 -> 22001248 (-0.09%); split: -0.10%, +0.01% Static cycle count: 26582978 -> 26579705 (-0.01%); split: -0.01%, +0.00% Spills to reg: 3110 -> 3131 (+0.68%) Fills from reg: 3401 -> 3430 (+0.85%); split: -0.03%, +0.88% Reviewed-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Acked-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40272>	2026-03-10 00:10:05 +00:00
Georg Lehmann	6936282bd3	nir/opt_algebraic: remove min(a, >= 1.0) before fsat Foz-DB Navi48: Totals from 86 (0.08% of 114655) affected shaders: Instrs: 217553 -> 217408 (-0.07%); split: -0.07%, +0.01% CodeSize: 1159992 -> 1159380 (-0.05%); split: -0.06%, +0.01% Latency: 1657600 -> 1657533 (-0.00%); split: -0.01%, +0.00% InvThroughput: 203205 -> 203178 (-0.01%); split: -0.02%, +0.00% SClause: 5245 -> 5244 (-0.02%) Copies: 13726 -> 13716 (-0.07%); split: -0.14%, +0.07% VALU: 130151 -> 130039 (-0.09%); split: -0.09%, +0.00% SALU: 26476 -> 26474 (-0.01%); split: -0.02%, +0.01% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40281>	2026-03-09 21:11:25 +00:00
Georg Lehmann	108a4d4341	nir: create more fsat using range analysis Foz-DB Navi48: Totals from 5922 (5.17% of 114655) affected shaders: Instrs: 5188307 -> 5184193 (-0.08%); split: -0.09%, +0.01% CodeSize: 27852544 -> 27843252 (-0.03%); split: -0.05%, +0.01% Latency: 28723967 -> 28714268 (-0.03%); split: -0.04%, +0.01% InvThroughput: 4745002 -> 4742298 (-0.06%); split: -0.07%, +0.01% VClause: 68649 -> 68650 (+0.00%) SClause: 103932 -> 103917 (-0.01%); split: -0.02%, +0.00% Copies: 244683 -> 244706 (+0.01%); split: -0.01%, +0.02% PreSGPRs: 272361 -> 272362 (+0.00%); split: -0.00%, +0.00% VALU: 3248960 -> 3245520 (-0.11%); split: -0.11%, +0.00% SALU: 516784 -> 516796 (+0.00%); split: -0.01%, +0.01% VOPD: 8910 -> 8895 (-0.17%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40281>	2026-03-09 21:11:25 +00:00
Alyssa Rosenzweig	edccd06a0b	nir/lower_subgroups: fix boolean clustered reductions It is legal to have a cluster size larger than the subgroup/ballot size, but our lowering would blow up in this case due to the nir_ishl_imm overflowing in the lowering. Fortunately, this is easy to handle. Fixes sub_group_clustered_reduce_logical_and() Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40224>	2026-03-09 14:50:37 +00:00
Kenneth Graunke	952bf55483	nir: Fix divergence of Intel URB input/output handle intrinsics Tessellation evaluation shaders have a single convergent URB handle (for the common patch data) used by all lanes. Every other stage's IO handles have separate handles in each lane. Thanks to Alyssa Rosenzweig for catching this bug. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40280>	2026-03-09 02:38:59 +00:00
Georg Lehmann	7c217e540c	nir: add a pass to optimize fp_math_ctrl Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40098>	2026-03-07 08:16:27 +01:00
Georg Lehmann	f474e9853e	nir: add fp class analysis tests Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:45 +00:00
Georg Lehmann	4885e5cf3a	nir: remove more fsat using range analysis Foz-DB Navi48: Totals from 3018 (3.65% of 82636) affected shaders: MaxWaves: 69274 -> 69280 (+0.01%) Instrs: 7165414 -> 7157581 (-0.11%); split: -0.12%, +0.01% CodeSize: 38890212 -> 38823132 (-0.17%); split: -0.18%, +0.00% VGPRs: 228672 -> 228624 (-0.02%) Latency: 64789026 -> 64784877 (-0.01%); split: -0.01%, +0.00% InvThroughput: 11805156 -> 11802642 (-0.02%); split: -0.02%, +0.00% VClause: 136900 -> 136886 (-0.01%); split: -0.03%, +0.02% SClause: 150135 -> 150130 (-0.00%); split: -0.01%, +0.01% Copies: 574690 -> 574894 (+0.04%); split: -0.03%, +0.06% Branches: 187169 -> 187086 (-0.04%); split: -0.04%, +0.00% PreSGPRs: 190074 -> 190067 (-0.00%); split: -0.00%, +0.00% PreVGPRs: 189564 -> 189538 (-0.01%); split: -0.02%, +0.00% VALU: 3955188 -> 3949411 (-0.15%); split: -0.15%, +0.00% SALU: 1114659 -> 1114729 (+0.01%); split: -0.02%, +0.03% SMEM: 231080 -> 231077 (-0.00%); split: -0.00%, +0.00% VOPD: 116150 -> 116180 (+0.03%); split: +0.04%, -0.02% Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:45 +00:00
Georg Lehmann	506bb5a609	nir/search_helpers: use fp class analysis more Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:45 +00:00
Georg Lehmann	a9e75d8ee4	nir: remove nir_analyze_fp_range Use fp class analysis instead. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	eb431efc19	nir/search_helpers: switch to fp class analysis Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	58799c4e7c	nir/gather_tcs_info: use nir_analyze_fp_class directly The information around positive one helps in theory. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	0ecf2c322e	nir: add fp class analysis for fround_even Foz-DB Navi48: Totals from 383 (0.33% of 114655) affected shaders: MaxWaves: 9806 -> 9808 (+0.02%) Instrs: 502508 -> 501762 (-0.15%); split: -0.16%, +0.01% CodeSize: 2711404 -> 2707604 (-0.14%); split: -0.15%, +0.01% VGPRs: 24360 -> 24348 (-0.05%) Latency: 2068105 -> 2066817 (-0.06%); split: -0.07%, +0.01% InvThroughput: 370962 -> 370081 (-0.24%) VClause: 7045 -> 7041 (-0.06%) SClause: 10551 -> 10559 (+0.08%); split: -0.08%, +0.15% Copies: 29135 -> 29117 (-0.06%); split: -0.12%, +0.05% Branches: 17333 -> 17328 (-0.03%) PreSGPRs: 21511 -> 21510 (-0.00%) PreVGPRs: 18555 -> 18545 (-0.05%) VALU: 274445 -> 273874 (-0.21%); split: -0.21%, +0.00% SALU: 78819 -> 78779 (-0.05%); split: -0.07%, +0.02% VMEM: 10918 -> 10913 (-0.05%) SMEM: 17662 -> 17656 (-0.03%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	7509b4a199	nir: add fp class analysis for fsub Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	d8734e5453	nir: add fp class analysis for shadow compare Foz-DB Navi48: Totals from 145 (0.18% of 82636) affected shaders: Instrs: 280871 -> 280729 (-0.05%) CodeSize: 1545724 -> 1545488 (-0.02%); split: -0.02%, +0.00% Latency: 10840265 -> 10840216 (-0.00%); split: -0.00%, +0.00% InvThroughput: 2093707 -> 2093646 (-0.00%) SClause: 4483 -> 4481 (-0.04%) VALU: 188142 -> 188039 (-0.05%) SALU: 22238 -> 22236 (-0.01%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	6d3a279a3b	nir: add fp class analysis for some intrinsics I also tried ddx/ddy, but that was not worth it. Foz-DB Navi48: Totals from 1019 (1.23% of 82636) affected shaders: Instrs: 516459 -> 515700 (-0.15%); split: -0.17%, +0.02% CodeSize: 2712428 -> 2707008 (-0.20%); split: -0.21%, +0.01% VGPRs: 70152 -> 70140 (-0.02%) Latency: 1799198 -> 1795926 (-0.18%); split: -0.19%, +0.00% InvThroughput: 233497 -> 232628 (-0.37%); split: -0.37%, +0.00% VClause: 15315 -> 15346 (+0.20%); split: -0.11%, +0.31% Copies: 30009 -> 30035 (+0.09%); split: -0.06%, +0.14% VALU: 305519 -> 304727 (-0.26%); split: -0.27%, +0.01% SALU: 45855 -> 45854 (-0.00%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	73bce23f65	nir: add fp class analysis for flog2 Foz-DB Navi48: Totals from 230 (0.28% of 82636) affected shaders: Instrs: 599005 -> 598615 (-0.07%); split: -0.09%, +0.02% CodeSize: 3110528 -> 3103136 (-0.24%); split: -0.24%, +0.00% Latency: 3661526 -> 3663241 (+0.05%); split: -0.01%, +0.05% InvThroughput: 526561 -> 526487 (-0.01%); split: -0.01%, +0.00% Copies: 33735 -> 33820 (+0.25%); split: -0.06%, +0.31% VALU: 378034 -> 377904 (-0.03%); split: -0.03%, +0.00% SALU: 65156 -> 65045 (-0.17%); split: -0.19%, +0.02% Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	81e272aa1d	nir: add fp class analysis for sin/cos Foz-DB Navi48: Totals from 264 (0.32% of 82636) affected shaders: CodeSize: 1688676 -> 1688672 (-0.00%) Latency: 510773 -> 510772 (-0.00%) InvThroughput: 138569 -> 138568 (-0.00%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	5a298f3560	nir: rewrite fp range analysis as a fp class analysis Knowing if a value is not larger than one helps proving finite results of fmul/fadd and will allow skipping/creating more fsat. Knowing that a value is larger than one helps proving non zero results of fmul. Separating positive and negative zero also has advantages when signed zero correctness is required. Foz-DB Navi48: Totals from 1344 (1.63% of 82636) affected shaders: Instrs: 5319389 -> 5312280 (-0.13%); split: -0.14%, +0.01% CodeSize: 29702516 -> 29665684 (-0.12%); split: -0.13%, +0.01% Latency: 40694344 -> 40694545 (+0.00%); split: -0.01%, +0.02% InvThroughput: 7481192 -> 7480403 (-0.01%); split: -0.02%, +0.01% VClause: 121947 -> 121946 (-0.00%); split: -0.00%, +0.00% SClause: 104972 -> 104923 (-0.05%); split: -0.05%, +0.00% Copies: 371098 -> 371092 (-0.00%); split: -0.02%, +0.02% Branches: 122929 -> 122919 (-0.01%); split: -0.01%, +0.00% PreSGPRs: 82506 -> 82510 (+0.00%); split: -0.00%, +0.01% PreVGPRs: 79175 -> 79168 (-0.01%) VALU: 2906718 -> 2904777 (-0.07%); split: -0.07%, +0.00% SALU: 726256 -> 723454 (-0.39%); split: -0.39%, +0.00% VMEM: 205021 -> 205016 (-0.00%) SMEM: 163972 -> 163916 (-0.03%) VOPD: 303354 -> 303298 (-0.02%); split: +0.02%, -0.04% Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	32b5719a9f	nir/opt_algebraic: add is_not_uint_zero for b2i16(uge) pattern More fallout from `f2a59fdea6`. is_not_zero now always returns whether the result is a floating point zero. When combined with the fp denorm handling that will be added to floating point range analysis, this is false for many sensible integer values. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	ab773fc5d4	nir/opt_algebraic: fix frsq clamp pattern This is not NaN correct. And also make the pattern 32bit only because the constant is hard coded FLT_MAX. Fixes: `780b5c1037` ("nir/algebraic: Simplify some Inf and NaN avoidance code") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:42 +00:00
Georg Lehmann	ba30de1f97	nir/opt_algebraic: remove pattern that skips iabs with range analysis Fixes: `f2a59fdea6` ("nir: remove non float nir_analyse_range support") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:41 +00:00
Caio Oliveira	da57fbfb07	nir: Fix constant folding for iadd_sat Use INT_MIN instead of INT_MAX for underflow. Fixes: `cc4b50b023` ("nir/opcodes: use u_overflow to fix incorrect checks") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pelloux@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40252>	2026-03-06 22:26:07 +00:00
Emma Anholt	2ec8ecd7de	nir: Do NIR_DEBUG=print under a lock. With most Vulkan engines doing multithreaded compiles, NIR_DEBUG=print has been a frustrating racy mess. Take a lock when we're doing per-pass printing, so that the output is coherent. This unfortunately single-threads the compiler process itself in that case, but when you're NIR_DEBUG=printing, that's probably not a big deal. An assert is introduced to make sure that nobody nests NIR_PASS() in a way that would break printing. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40126>	2026-03-06 19:50:38 +00:00
Alyssa Rosenzweig	1c1c119d7b	nir/lower_io: handle Intel URB intrinsics useful to query these too, they're kinda like load_ssbo/store_ssbo. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40181>	2026-03-06 13:28:32 +00:00
Lionel Landwerlin	e14d6b535c	brw/nir: add new intrinsics to load data from the indirect address This address is delivered on Gfx12.5+ in compute/mesh/task shaders from the command stream instruction. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40174>	2026-03-06 06:34:43 +00:00
Lionel Landwerlin	7b1533414a	brw/nir: enable constant offsets for global_constant_uniform_block_intel Will be useful to retain the base offset added in `0e9453291c` ("brw: improve push constant loading using base offsets") once we move push constant data loading into NIR. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40174>	2026-03-06 06:34:43 +00:00
Rhys Perry	e43caba5f4	nir/range_analysis: use sparse array for float analysis This seems to be faster. ministat (nir_analyze_fp_range): Difference at 95.0% confidence -592900 +/- 2302.24 -27.6432% +/- 0.0998961% (Student's t, pooled s = 2719.05) ministat (overall): Difference at 95.0% confidence -76.8333 +/- 27.2345 -0.632558% +/- 0.223407% (Student's t, pooled s = 46.867) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>	2026-03-05 11:26:25 +00:00
Rhys Perry	aecbb2a903	nir/range_analysis: use function pointers for lookup Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>	2026-03-05 11:26:25 +00:00
Rhys Perry	2731c34891	nir/range_analysis: use SSA index for hash table keys Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>	2026-03-05 11:26:25 +00:00
Rhys Perry	5e376e3ed2	nir: add nir_fp_analysis_state Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>	2026-03-05 11:26:25 +00:00
Rhys Perry	c0079e09ca	nir/range_analysis: set deleted key If (uintptr_t)&deleted_key is small enough, inserting entries into the hash table might not work correctly. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Backport-to: 26.0 Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40190>	2026-03-05 11:26:25 +00:00
Georg Lehmann	6a218e346d	nir: remove lower_vector_cmp Use nir_lower_alu_width or nir_lower_alu_to_scalar instead. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40197>	2026-03-04 19:50:28 +00:00
Georg Lehmann	3e6e1e213c	nir: remove fall_equal/fany_nequal opcodes Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40197>	2026-03-04 19:50:27 +00:00
Georg Lehmann	d6977adc09	nir/lower_bool_to_float: assert that vector comparisons were lowered There are no backends that handle the vector comparisons with float result. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40197>	2026-03-04 19:50:27 +00:00
Karol Herbst	e1ed7de274	nir: fix nir_round_int_to_float for fp16 fp16 has quite the limited value range and with bigger integers nir_round_int_to_float might return Inf where it shouldn't depending on the rounding mode. Fixes conversions half_rt[npz]_(u)?(int\|long) CL CTS tests. Cc: mesa-stable Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40163>	2026-03-04 14:32:35 +00:00
Karol Herbst	8e8fb2ebaa	nir: fix nir_alu_type_range_contains_type_range for fp16 to int The special value "Inf" doesn't fit into an int and therefore we have to clamp regardless of whether all the other values would fit. And because f2u32 and f2u64 define out-of-range conversions as UB in nir, we need to clamp. This change should have no effect for non saturating conversions. Fixes "conversions long_sat_*half" CL CTS tests Cc: mesa-stable Suggested-by: Rob Clark <rob.clark@oss.qualcomm.com> Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40163>	2026-03-04 14:32:35 +00:00
Daniel Schürmann	56f5e35d95	nir/opt_remove_phis: recursively check loop header phis for triviality This only checks for one level of nested phis as the potential cost of recursive checks outweighs the rare cases. Totals from 393 (0.35% of 112055) affected shaders: (Navi48) Instrs: 920765 -> 915832 (-0.54%); split: -0.54%, +0.00% CodeSize: 4887052 -> 4867876 (-0.39%); split: -0.39%, +0.00% SpillSGPRs: 464 -> 411 (-11.42%) Latency: 6868149 -> 6856413 (-0.17%); split: -0.21%, +0.04% InvThroughput: 841067 -> 839821 (-0.15%); split: -0.17%, +0.02% Copies: 73573 -> 72021 (-2.11%) Branches: 25973 -> 25343 (-2.43%) PreSGPRs: 34110 -> 33454 (-1.92%) PreVGPRs: 24594 -> 24593 (-0.00%) VALU: 513068 -> 512816 (-0.05%); split: -0.05%, +0.00% SALU: 133157 -> 130038 (-2.34%) VOPD: 9773 -> 9673 (-1.02%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40165>	2026-03-04 14:03:40 +00:00
Rob Clark	dfaa4375c3	rusticl: Let backend control convert_alu_types lowering Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40179>	2026-03-03 12:13:04 -08:00
Georg Lehmann	7194dfcc2c	nir/opt_algebraic: optimize b2i(a) * b to bcsel Foz-DB Navi48: Totals from 3180 (2.77% of 114655) affected shaders: MaxWaves: 85526 -> 85446 (-0.09%) Instrs: 2681446 -> 2678641 (-0.10%); split: -0.17%, +0.07% CodeSize: 14295536 -> 14284628 (-0.08%); split: -0.13%, +0.05% VGPRs: 174792 -> 174636 (-0.09%); split: -0.16%, +0.07% SpillSGPRs: 306 -> 308 (+0.65%) Latency: 14078973 -> 14070122 (-0.06%); split: -0.07%, +0.01% InvThroughput: 2774242 -> 2764051 (-0.37%); split: -0.37%, +0.00% VClause: 41744 -> 41734 (-0.02%); split: -0.10%, +0.07% SClause: 58176 -> 58154 (-0.04%); split: -0.05%, +0.01% Copies: 222967 -> 223108 (+0.06%); split: -0.14%, +0.20% Branches: 57317 -> 57322 (+0.01%) PreSGPRs: 140454 -> 140451 (-0.00%); split: -0.01%, +0.00% PreVGPRs: 131649 -> 131540 (-0.08%); split: -0.09%, +0.01% VALU: 1509318 -> 1505443 (-0.26%); split: -0.26%, +0.00% SALU: 384419 -> 385838 (+0.37%); split: -0.01%, +0.38% VOPD: 13272 -> 13286 (+0.11%); split: +0.14%, -0.03% Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40160>	2026-03-02 15:58:30 +00:00
Georg Lehmann	3d304d5647	nir/opt_algebraic: remove is_used_once on outer instruction This just prevents useful optimizations. is_used_once only makes sense on inner instructions, to prevent creating more new instructions than will be removed. Foz-DB Navi48: Totals from 16989 (14.82% of 114655) affected shaders: MaxWaves: 434379 -> 434353 (-0.01%); split: +0.01%, -0.01% Instrs: 29030794 -> 29022514 (-0.03%); split: -0.07%, +0.04% CodeSize: 155293092 -> 155262816 (-0.02%); split: -0.05%, +0.03% VGPRs: 1093980 -> 1094088 (+0.01%); split: -0.01%, +0.02% SpillSGPRs: 9801 -> 9803 (+0.02%); split: -0.03%, +0.05% Latency: 356327270 -> 356283384 (-0.01%); split: -0.03%, +0.02% InvThroughput: 58239439 -> 58229374 (-0.02%); split: -0.03%, +0.01% VClause: 451716 -> 451815 (+0.02%); split: -0.07%, +0.09% SClause: 654614 -> 654556 (-0.01%); split: -0.03%, +0.03% Copies: 1809805 -> 1809297 (-0.03%); split: -0.20%, +0.17% Branches: 552382 -> 552384 (+0.00%); split: -0.00%, +0.00% PreSGPRs: 947188 -> 947224 (+0.00%); split: -0.01%, +0.02% PreVGPRs: 879583 -> 880173 (+0.07%); split: -0.01%, +0.08% VALU: 16317859 -> 16309975 (-0.05%); split: -0.07%, +0.02% SALU: 4256121 -> 4259315 (+0.08%); split: -0.05%, +0.12% SMEM: 1067069 -> 1067070 (+0.00%) VOPD: 440855 -> 440792 (-0.01%); split: +0.05%, -0.07% Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>	2026-03-02 15:24:36 +00:00
Georg Lehmann	41878e5714	nir_opt_algebraic: remove unneeded is_not_const These were needed when we didn't constant fold inside nir_search, to prevent infinite loops. But now all they do is slow down pattern matching. Foz-DB Navi48: Totals from 107 (0.09% of 114655) affected shaders: Instrs: 162439 -> 162481 (+0.03%); split: -0.01%, +0.03% CodeSize: 943056 -> 942988 (-0.01%); split: -0.03%, +0.02% Latency: 971667 -> 970865 (-0.08%); split: -0.09%, +0.00% InvThroughput: 164452 -> 164521 (+0.04%); split: -0.02%, +0.07% Copies: 7980 -> 7982 (+0.03%) VALU: 103572 -> 103566 (-0.01%); split: -0.05%, +0.04% SALU: 12825 -> 12878 (+0.41%) VOPD: 5235 -> 5190 (-0.86%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>	2026-03-02 15:24:36 +00:00
Georg Lehmann	374cbc17a4	nir_opt_algebraic: reassociate fadd into ffma where one factor is a constant This restriction doesn't really make sense, probably an accident. Foz-DB Navi48: Totals from 2290 (2.00% of 114655) affected shaders: MaxWaves: 57496 -> 57510 (+0.02%); split: +0.06%, -0.03% Instrs: 2817419 -> 2816209 (-0.04%); split: -0.12%, +0.08% CodeSize: 15218816 -> 15220576 (+0.01%); split: -0.09%, +0.10% VGPRs: 147456 -> 147384 (-0.05%); split: -0.07%, +0.02% Latency: 13757114 -> 13751833 (-0.04%); split: -0.13%, +0.09% InvThroughput: 2463343 -> 2462482 (-0.03%); split: -0.07%, +0.04% VClause: 40137 -> 40153 (+0.04%); split: -0.07%, +0.11% SClause: 57351 -> 57385 (+0.06%); split: -0.12%, +0.18% Copies: 135482 -> 136258 (+0.57%); split: -0.22%, +0.79% Branches: 30886 -> 30894 (+0.03%) PreSGPRs: 113470 -> 113462 (-0.01%); split: -0.03%, +0.02% PreVGPRs: 117554 -> 117591 (+0.03%); split: -0.01%, +0.04% VALU: 1682734 -> 1681557 (-0.07%); split: -0.10%, +0.03% SALU: 390685 -> 391301 (+0.16%); split: -0.07%, +0.22% VOPD: 6159 -> 6254 (+1.54%); split: +1.72%, -0.18% Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>	2026-03-02 15:24:36 +00:00
Georg Lehmann	b949122908	nir/opt_algebraic: remove loops for b2f/b2i equality handling The feq/fneu patterns already existed, and there is no reason to use bit size based loops here. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>	2026-03-02 15:24:36 +00:00
Georg Lehmann	83091276f8	nir_opt_algebraic: remove more specific cmp+bcsel opts Only some minimal difference from pattern ordering: Foz-DB Navi48: Totals from 3 (0.00% of 114655) affected shaders: Instrs: 4556 -> 4533 (-0.50%) CodeSize: 23716 -> 23608 (-0.46%) Latency: 27424 -> 26336 (-3.97%) InvThroughput: 4674 -> 4672 (-0.04%) SClause: 107 -> 105 (-1.87%) Copies: 351 -> 346 (-1.42%) Branches: 130 -> 126 (-3.08%) VALU: 2598 -> 2595 (-0.12%) SALU: 561 -> 555 (-1.07%) SMEM: 169 -> 167 (-1.18%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>	2026-03-02 15:24:36 +00:00
Georg Lehmann	4190241795	nir/opt_algebraic: optimize all comparisons of b2f/b2i with constants Foz-DB Navi48: Totals from 857 (0.75% of 114655) affected shaders: Instrs: 1136993 -> 1132422 (-0.40%); split: -0.48%, +0.08% CodeSize: 6096636 -> 6070832 (-0.42%); split: -0.48%, +0.06% VGPRs: 49668 -> 49620 (-0.10%) Latency: 24014661 -> 24044601 (+0.12%); split: -0.04%, +0.16% InvThroughput: 4182482 -> 4183708 (+0.03%); split: -0.12%, +0.15% VClause: 17698 -> 17695 (-0.02%) SClause: 25214 -> 25213 (-0.00%) Copies: 81474 -> 81396 (-0.10%); split: -0.79%, +0.69% Branches: 24722 -> 24650 (-0.29%); split: -0.36%, +0.07% PreSGPRs: 43338 -> 43291 (-0.11%); split: -0.22%, +0.11% VALU: 652975 -> 649760 (-0.49%); split: -0.50%, +0.00% SALU: 153961 -> 153797 (-0.11%); split: -0.72%, +0.61% VOPD: 10650 -> 10684 (+0.32%); split: +0.38%, -0.07% Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>	2026-03-02 15:24:36 +00:00
Georg Lehmann	ef6f5377da	nir/opt_algebraic: remove fcmp+fneg patterns that are cleaned up earlier No Foz-DB changes, as expected. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>	2026-03-02 15:24:36 +00:00
Georg Lehmann	a5334ec239	nir/opt_algebraic: generalize late fcmp(fneg(a), const) patterns No reason just to do this for 1.0. Foz-DB Navi48: Totals from 44 (0.04% of 114655) affected shaders: CodeSize: 111620 -> 111476 (-0.13%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>	2026-03-02 15:24:35 +00:00
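
The iadd_sat fix in da57fbfb07 ("Use INT_MIN instead of INT_MAX for underflow") can be illustrated with a small C sketch of signed saturating addition. This is not Mesa's constant-folding code: the function name iadd_sat32 and the widen-to-64-bit approach are assumptions chosen only to show which bound applies in each direction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from Mesa): 32-bit signed saturating add. */
int32_t iadd_sat32(int32_t a, int32_t b)
{
    /* Widen to 64 bits so the exact sum is always representable. */
    int64_t sum = (int64_t)a + (int64_t)b;

    if (sum > INT32_MAX)
        return INT32_MAX;   /* overflow clamps to the largest value */
    if (sum < INT32_MIN)
        return INT32_MIN;   /* underflow clamps to the smallest value,
                             * not to INT32_MAX */
    return (int32_t)sum;
}

/* Small self-check covering both saturation directions. */
void iadd_sat32_selfcheck(void)
{
    assert(iadd_sat32(INT32_MAX, 1) == INT32_MAX);
    assert(iadd_sat32(INT32_MIN, -1) == INT32_MIN);
    assert(iadd_sat32(-5, 3) == -2);
}
```

Returning INT32_MAX in the underflow branch would be the kind of mistake the commit message describes: a sum of two large negative values would fold to a large positive constant.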

7205 commits