fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 07:18:06 +02:00

Author	SHA1	Message	Date
Georg Lehmann	0b51ed736d	glsl: reset fp_math_ctrl when changing it per alu I missed that the fp_math_ctrl is otherwise only reset at the next assignment. What a strange IR. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40291>	2026-03-13 07:13:09 +00:00
Georg Lehmann	624313d35d	nir/opt_algebraic: lower ninf fisfinite correctly Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40291>	2026-03-13 07:13:09 +00:00
Georg Lehmann	15eadc1253	nir/lower_frexp: preserve fp_math_ctrl Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40291>	2026-03-13 07:13:09 +00:00
Faith Ekstrand	f2f792996d	Revert "nir: Add a type parameter to nir_lower_point_size()" This reverts commit `6ee4ea5ea3`. Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Acked-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38681>	2026-03-12 22:59:13 +00:00
Faith Ekstrand	ceacec4cc9	nir: Allow 8-bit vertex output stores These can never come from the API but there's a few cases where panvk wants them. Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Acked-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38681>	2026-03-12 22:59:13 +00:00
Mike Blumenkrantz	3dbb7e896d	mesa/st: fix unlower_io_to_vars to work with mesh shaders cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15034 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15040 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37408>	2026-03-12 22:02:57 +00:00
Mike Blumenkrantz	e604a8f617	nir: fix nir_is_io_compact for mesh shaders cc: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37408>	2026-03-12 22:02:57 +00:00
Daivik Bhatia	33092de196	nir: Handle format swizzles for OOB image loads When masking out of bounds image loads, we previously returned a vector of all zeros. However, for robustImageAccess2, depending on the format, some components such as the alpha channel in an RGB format should evaluate to 1. This corrects the replacement value based on the format swizzle. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39430>	2026-03-12 19:14:24 +00:00
Georg Lehmann	9219c6bc31	nir/gather_info: use nir_intrinsic_has_io_semantics Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40338>	2026-03-12 17:00:25 +00:00
Georg Lehmann	eb111bca2c	nir/opt_load_store_vectorize: use nir_intrinsic_has_align_mul Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40338>	2026-03-12 17:00:25 +00:00
Lorenzo Rossi	75425f36dc	nir/opt_varyings: Skip code-motion for upconversions Code-motion should not move back upconversions without any other instruction, that would only increase memory pressure without any significant performance benefit (conversions are usually cheap). This should also help lowering mediump varyings early by not reversing their work. Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40273>	2026-03-11 23:52:10 +00:00
Mary Guillemard	73dba1e151	nir, nvk, nak: Add base to isbewr_nv and isberd_nv On SM86+, we can use a 16-bit unsigned offset along side the register for it. This adds a new base indice that will be used for it, integration with nir_opt_offsets and a lowering pass to get ride of the base on unsupported generations. Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39716>	2026-03-11 19:41:34 +00:00
Mary Guillemard	6a8d09972e	nir: Add isbewr_nv intrinsic and extends isberd_nv Adds a new intrinsic allowing to do raw write in the various ISBE spaces where attributes are stored. This also adapt isberd_nv to map to what we have since SM70+. This will be used to support mesh shaders. Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39716>	2026-03-11 19:41:33 +00:00
Georg Lehmann	769606e2e6	nir/opt_fp_math_ctrl: handle input/output no_signed_zero flag Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40323>	2026-03-11 16:47:15 +00:00
Georg Lehmann	0d747eee88	nir: add no_signed_zero flag to io semantics Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40323>	2026-03-11 16:47:15 +00:00
Georg Lehmann	26f5a6d6cc	nir: fix nir_intrinsic_copy_const_indices for large indices Fixes: `4ba581887e` ("nir: support intrinsic indicies larger than 32 bits") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40323>	2026-03-11 16:47:15 +00:00
Faith Ekstrand	5de5987678	nir,panfrost: Move lower_bool_to_bitsize to panfrost It's the only driver that uses the pass so it may as well go there. Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40307>	2026-03-10 20:54:44 +00:00
Faith Ekstrand	3fd471dca5	nir/lower_bool_to_bitsize: Make all bN_csel sources match Previously, we assumed that the selector for bcsel could be whatever, regardless of the bit sizes of the data and we'd just fix it in the back-end. This works okay for scalars but falls over the moment we vectorize because all our vector handling assumes bit sizes match. Since matching bit sizes is what the hardware wants anyway, it's better to do the right thing in NIR and hope copy-propagation can fold in conversions if needed. Unfortunately, copy prop isn't that smart yet so this does hurt a bit: Instrs: 1193679 -> 1198086 (+0.37%); split: -0.06%, +0.43% CodeSize: 11915136 -> 11950592 (+0.30%); split: -0.05%, +0.34% Full: 160985 -> 160941 (-0.03%); split: -0.04%, +0.01% Estimated normalized CVT cycles: 4456.938557000181 -> 4480.876069000186 (+0.54%); split: -0.13%, +0.67% Estimated normalized SFU cycles: 6350.9375 -> 6392.21875 (+0.65%) Estimated normalized Load/Store cycles: 205773.0 -> 205795.0 (+0.01%) Maximum number of threads: 12864 -> 12863 (-0.01%) Number of spill instructions: 22487 -> 22489 (+0.01%) Number of fill instructions: 52179 -> 52219 (+0.08%) Hurt shaders: google-meet-clvk/BgBlur google-meet-clvk/Relight parallel-rdp/small_subgroup parallel-rdp/small_uber_subgroup The proper solution here is to teach copy-prop about this stuff so that it can propagate swizzles into ALU ops when they're supported: https://gitlab.freedesktop.org/panfrost/mesa/-/issues/265 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14945 Cc: mesa-stable Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40307>	2026-03-10 20:54:43 +00:00
Lionel Landwerlin	f508c6acbb	brw/nir: improve shader_indirect_data_intel handling Use is_scalar to know if we can do transpose loading. Also enable vectorization if 2 intrinsics share the same source (it means the only difference is the base). Fixes: `e14d6b535c` ("brw/nir: add new intrinsics to load data from the indirect address") Tested-by: Felix DeGrood <felix.j.degrood@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40308>	2026-03-10 18:24:04 +00:00
Georg Lehmann	452025f75e	nir: add free bits in nir_io_semantics for future use Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40299>	2026-03-10 07:46:22 +00:00
Georg Lehmann	a25f00eaed	nir: merge xfb and xfb2 into one 64bit intrinsic index Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40299>	2026-03-10 07:46:22 +00:00
Georg Lehmann	4ba581887e	nir: support intrinsic indicies larger than 32 bits Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40299>	2026-03-10 07:46:21 +00:00
Georg Lehmann	abfd6a4df9	nir: don't assume indicies are always 32bit when accessing them as raw data Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40299>	2026-03-10 07:46:20 +00:00
Georg Lehmann	aa831b6690	nir/opt_algebraic: skip more redundant alignment iand Useful for smaller/larger loads. Also there is no reason to be bitsize specific here if we use an signed constant. Foz-DB Navi48: Totals from 8 (0.01% of 114655) affected shaders: Instrs: 7629 -> 7612 (-0.22%) CodeSize: 40772 -> 40692 (-0.20%) Latency: 54880 -> 54944 (+0.12%) InvThroughput: 8879 -> 8880 (+0.01%); split: -0.08%, +0.09% VALU: 4029 -> 4027 (-0.05%); split: -0.15%, +0.10% SALU: 1260 -> 1249 (-0.87%) Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40292>	2026-03-10 06:57:50 +00:00
Karol Herbst	9d90cbc314	nak: add input predicate to load_global_nv and OpLd This is new in SM75 (Turing). Let's use it because it allows us to get rid of the if/else around bound checked global loads. There are some changes in fossils, but it seems that's mostly due to CFG optimizations doing things a bit differently? Totals: CodeSize: 9442152688 -> 9442133184 (-0.00%); split: -0.00%, +0.00% Static cycle count: 6120910991 -> 6120907718 (-0.00%); split: -0.00%, +0.00% Spills to reg: 184789 -> 184810 (+0.01%) Fills from reg: 223831 -> 223860 (+0.01%); split: -0.00%, +0.01% Totals from 334 (0.03% of 1163204) affected shaders: CodeSize: 22020752 -> 22001248 (-0.09%); split: -0.10%, +0.01% Static cycle count: 26582978 -> 26579705 (-0.01%); split: -0.01%, +0.00% Spills to reg: 3110 -> 3131 (+0.68%) Fills from reg: 3401 -> 3430 (+0.85%); split: -0.03%, +0.88% Reviewed-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Acked-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40272>	2026-03-10 00:10:05 +00:00
Georg Lehmann	6936282bd3	nir/opt_algebraic: remove min(a, >= 1.0) before fsat Foz-DB Navi48: Totals from 86 (0.08% of 114655) affected shaders: Instrs: 217553 -> 217408 (-0.07%); split: -0.07%, +0.01% CodeSize: 1159992 -> 1159380 (-0.05%); split: -0.06%, +0.01% Latency: 1657600 -> 1657533 (-0.00%); split: -0.01%, +0.00% InvThroughput: 203205 -> 203178 (-0.01%); split: -0.02%, +0.00% SClause: 5245 -> 5244 (-0.02%) Copies: 13726 -> 13716 (-0.07%); split: -0.14%, +0.07% VALU: 130151 -> 130039 (-0.09%); split: -0.09%, +0.00% SALU: 26476 -> 26474 (-0.01%); split: -0.02%, +0.01% Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40281>	2026-03-09 21:11:25 +00:00
Georg Lehmann	108a4d4341	nir: create more fsat using range analysis Foz-DB Navi48: Totals from 5922 (5.17% of 114655) affected shaders: Instrs: 5188307 -> 5184193 (-0.08%); split: -0.09%, +0.01% CodeSize: 27852544 -> 27843252 (-0.03%); split: -0.05%, +0.01% Latency: 28723967 -> 28714268 (-0.03%); split: -0.04%, +0.01% InvThroughput: 4745002 -> 4742298 (-0.06%); split: -0.07%, +0.01% VClause: 68649 -> 68650 (+0.00%) SClause: 103932 -> 103917 (-0.01%); split: -0.02%, +0.00% Copies: 244683 -> 244706 (+0.01%); split: -0.01%, +0.02% PreSGPRs: 272361 -> 272362 (+0.00%); split: -0.00%, +0.00% VALU: 3248960 -> 3245520 (-0.11%); split: -0.11%, +0.00% SALU: 516784 -> 516796 (+0.00%); split: -0.01%, +0.01% VOPD: 8910 -> 8895 (-0.17%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40281>	2026-03-09 21:11:25 +00:00
Alyssa Rosenzweig	edccd06a0b	nir/lower_subgroups: fix boolean clustered reductions It is legal to have a cluster size larger than the subgroup/ballot size, but our lowering would blow up in this case due to the nir_ishl_imm overflowing in the lowering. Fortunately, this is easy to handle. Fixes sub_group_clustered_reduce_logical_and() Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40224>	2026-03-09 14:50:37 +00:00
Kenneth Graunke	952bf55483	nir: Fix divergence of Intel URB input/output handle intrinsics Tessellation evaluation shaders have a single convergent URB handle (for the common patch data) used by all lanes. Every other stage's IO handles have separate handles in each lane. Thanks to Alyssa Rosenzweig for catching this bug. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40280>	2026-03-09 02:38:59 +00:00
Georg Lehmann	7c217e540c	nir: add a pass to optimize fp_math_ctrl Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40098>	2026-03-07 08:16:27 +01:00
Georg Lehmann	f474e9853e	nir: add fp class analysis tests Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:45 +00:00
Georg Lehmann	4885e5cf3a	nir: remove more fsat using range analysis Foz-DB Navi48: Totals from 3018 (3.65% of 82636) affected shaders: MaxWaves: 69274 -> 69280 (+0.01%) Instrs: 7165414 -> 7157581 (-0.11%); split: -0.12%, +0.01% CodeSize: 38890212 -> 38823132 (-0.17%); split: -0.18%, +0.00% VGPRs: 228672 -> 228624 (-0.02%) Latency: 64789026 -> 64784877 (-0.01%); split: -0.01%, +0.00% InvThroughput: 11805156 -> 11802642 (-0.02%); split: -0.02%, +0.00% VClause: 136900 -> 136886 (-0.01%); split: -0.03%, +0.02% SClause: 150135 -> 150130 (-0.00%); split: -0.01%, +0.01% Copies: 574690 -> 574894 (+0.04%); split: -0.03%, +0.06% Branches: 187169 -> 187086 (-0.04%); split: -0.04%, +0.00% PreSGPRs: 190074 -> 190067 (-0.00%); split: -0.00%, +0.00% PreVGPRs: 189564 -> 189538 (-0.01%); split: -0.02%, +0.00% VALU: 3955188 -> 3949411 (-0.15%); split: -0.15%, +0.00% SALU: 1114659 -> 1114729 (+0.01%); split: -0.02%, +0.03% SMEM: 231080 -> 231077 (-0.00%); split: -0.00%, +0.00% VOPD: 116150 -> 116180 (+0.03%); split: +0.04%, -0.02% Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:45 +00:00
Georg Lehmann	506bb5a609	nir/search_helpers: use fp class analysis more Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:45 +00:00
Georg Lehmann	a9e75d8ee4	nir: remove nir_analyze_fp_range Use fp class analysis instead. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	eb431efc19	nir/search_helpers: switch to fp class analysis Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	58799c4e7c	nir/gather_tcs_info: use nir_analyze_fp_class directly The information around positive one helps in theory. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	0ecf2c322e	nir: add fp class analysis for fround_even Foz-DB Navi48: Totals from 383 (0.33% of 114655) affected shaders: MaxWaves: 9806 -> 9808 (+0.02%) Instrs: 502508 -> 501762 (-0.15%); split: -0.16%, +0.01% CodeSize: 2711404 -> 2707604 (-0.14%); split: -0.15%, +0.01% VGPRs: 24360 -> 24348 (-0.05%) Latency: 2068105 -> 2066817 (-0.06%); split: -0.07%, +0.01% InvThroughput: 370962 -> 370081 (-0.24%) VClause: 7045 -> 7041 (-0.06%) SClause: 10551 -> 10559 (+0.08%); split: -0.08%, +0.15% Copies: 29135 -> 29117 (-0.06%); split: -0.12%, +0.05% Branches: 17333 -> 17328 (-0.03%) PreSGPRs: 21511 -> 21510 (-0.00%) PreVGPRs: 18555 -> 18545 (-0.05%) VALU: 274445 -> 273874 (-0.21%); split: -0.21%, +0.00% SALU: 78819 -> 78779 (-0.05%); split: -0.07%, +0.02% VMEM: 10918 -> 10913 (-0.05%) SMEM: 17662 -> 17656 (-0.03%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	7509b4a199	nir: add fp class analysis for fsub Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	d8734e5453	nir: add fp class analysis for shadow compare Foz-DB Navi48: Totals from 145 (0.18% of 82636) affected shaders: Instrs: 280871 -> 280729 (-0.05%) CodeSize: 1545724 -> 1545488 (-0.02%); split: -0.02%, +0.00% Latency: 10840265 -> 10840216 (-0.00%); split: -0.00%, +0.00% InvThroughput: 2093707 -> 2093646 (-0.00%) SClause: 4483 -> 4481 (-0.04%) VALU: 188142 -> 188039 (-0.05%) SALU: 22238 -> 22236 (-0.01%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	6d3a279a3b	nir: add fp class analysis for some intrinsics I also tried ddx/ddy, but that was not worth it. Foz-DB Navi48: Totals from 1019 (1.23% of 82636) affected shaders: Instrs: 516459 -> 515700 (-0.15%); split: -0.17%, +0.02% CodeSize: 2712428 -> 2707008 (-0.20%); split: -0.21%, +0.01% VGPRs: 70152 -> 70140 (-0.02%) Latency: 1799198 -> 1795926 (-0.18%); split: -0.19%, +0.00% InvThroughput: 233497 -> 232628 (-0.37%); split: -0.37%, +0.00% VClause: 15315 -> 15346 (+0.20%); split: -0.11%, +0.31% Copies: 30009 -> 30035 (+0.09%); split: -0.06%, +0.14% VALU: 305519 -> 304727 (-0.26%); split: -0.27%, +0.01% SALU: 45855 -> 45854 (-0.00%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	73bce23f65	nir: add fp class analysis for flog2 Foz-DB Navi48: Totals from 230 (0.28% of 82636) affected shaders: Instrs: 599005 -> 598615 (-0.07%); split: -0.09%, +0.02% CodeSize: 3110528 -> 3103136 (-0.24%); split: -0.24%, +0.00% Latency: 3661526 -> 3663241 (+0.05%); split: -0.01%, +0.05% InvThroughput: 526561 -> 526487 (-0.01%); split: -0.01%, +0.00% Copies: 33735 -> 33820 (+0.25%); split: -0.06%, +0.31% VALU: 378034 -> 377904 (-0.03%); split: -0.03%, +0.00% SALU: 65156 -> 65045 (-0.17%); split: -0.19%, +0.02% Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	81e272aa1d	nir: add fp class analysis for sin/cos Foz-DB Navi48: Totals from 264 (0.32% of 82636) affected shaders: CodeSize: 1688676 -> 1688672 (-0.00%) Latency: 510773 -> 510772 (-0.00%) InvThroughput: 138569 -> 138568 (-0.00%) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	5a298f3560	nir: rewrite fp range analysis as a fp class analysis Knowing if a value is not larger than one helps proving finite results of fmul/fadd and will allow skipping/creating more fsat. Knowing that a value is larger than one helps proving non zero results of fmul. Separating positive and negative zero also has advantages when signed zero correctness is required. Foz-DB Navi48: Totals from 1344 (1.63% of 82636) affected shaders: Instrs: 5319389 -> 5312280 (-0.13%); split: -0.14%, +0.01% CodeSize: 29702516 -> 29665684 (-0.12%); split: -0.13%, +0.01% Latency: 40694344 -> 40694545 (+0.00%); split: -0.01%, +0.02% InvThroughput: 7481192 -> 7480403 (-0.01%); split: -0.02%, +0.01% VClause: 121947 -> 121946 (-0.00%); split: -0.00%, +0.00% SClause: 104972 -> 104923 (-0.05%); split: -0.05%, +0.00% Copies: 371098 -> 371092 (-0.00%); split: -0.02%, +0.02% Branches: 122929 -> 122919 (-0.01%); split: -0.01%, +0.00% PreSGPRs: 82506 -> 82510 (+0.00%); split: -0.00%, +0.01% PreVGPRs: 79175 -> 79168 (-0.01%) VALU: 2906718 -> 2904777 (-0.07%); split: -0.07%, +0.00% SALU: 726256 -> 723454 (-0.39%); split: -0.39%, +0.00% VMEM: 205021 -> 205016 (-0.00%) SMEM: 163972 -> 163916 (-0.03%) VOPD: 303354 -> 303298 (-0.02%); split: +0.02%, -0.04% Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	32b5719a9f	nir/opt_algebraic: add is_not_uint_zero for b2i16(uge) pattern More fallout from `f2a59fdea6`. is_not_zero now always returns whether the result is a floating point zero. When combined with the fp denorm handling that will be added to floating point range analysis, this is false for many sensible integer values. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:44 +00:00
Georg Lehmann	ab773fc5d4	nir/opt_algebraic: fix frsq clamp pattern This is not NaN correct. And also make the pattern 32bit only because the constant is hard coded FLT_MAX. Fixes: `780b5c1037` ("nir/algebraic: Simplify some Inf and NaN avoidance code") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:42 +00:00
Georg Lehmann	ba30de1f97	nir/opt_algebraic: remove pattern that skips iabs with range analysis Fixes: `f2a59fdea6` ("nir: remove non float nir_analyse_range support") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39987>	2026-03-07 05:01:41 +00:00
Paulo Zanoni	728b58f97c	vtn_bindgen2: limit the nir_opt_peephole_select optimization The way it is, this optimization is too aggressive and may generate code way worse than the original. Remove it from here, so drivers consuming the generated SPIR-V will be able to make their own more-informed decisions later. Let's follow the same strategy of nir_load_liblc.c and just set the limit to 0. For indirect copies in Anv (not merged yet), block compressed formats require some expensive divisions, so I put them all inside 'if' statements that should never run on normal formats. This optimization made us always run all the divisions all the time, tanking the performance of the shader on small copies. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40020>	2026-03-07 00:26:30 +00:00
Caio Oliveira	da57fbfb07	nir: Fix constant folding for iadd_sat Use INT_MIN instead of INT_MAX for underflow. Fixes: `cc4b50b023` ("nir/opcodes: use u_overflow to fix incorrect checks") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pelloux@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40252>	2026-03-06 22:26:07 +00:00
Emma Anholt	2ec8ecd7de	nir: Do NIR_DEBUG=print under a lock. With most Vulkan engines doing multithreaded compiles, NIR_DEBUG=print has been a frustrating racy mess. Take a lock when we're doing per-pass printing, so that the output is coherent. This unfortunately single-threads the compiler process itself in that case, but when you're NIR_DEBUG=printing, that's probably not a big deal. An assert is introduced to make sure that nobody nests NIR_PASS() in a way that would break printing. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40126>	2026-03-06 19:50:38 +00:00
Alyssa Rosenzweig	1c1c119d7b	nir/lower_io: handle Intel URB intrinsics useful to query these too, they're kinda like load_ssbo/store_ssbo. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40181>	2026-03-06 13:28:32 +00:00
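
The iadd_sat constant-folding fix listed above (da57fbfb07) says to use INT_MIN rather than INT_MAX for underflow. A minimal sketch of that saturating behaviour follows. It is a standalone illustration, not Mesa's code: the helper name `iadd_sat32` is hypothetical, and the 64-bit widening is just one simple way to detect overflow for 32-bit operands.

```c
#include <stdint.h>

/* Hypothetical helper, not part of Mesa: 32-bit signed saturating add. */
static int32_t iadd_sat32(int32_t a, int32_t b)
{
    /* Compute the sum in 64 bits so the addition itself cannot overflow. */
    int64_t sum = (int64_t)a + (int64_t)b;

    if (sum > INT32_MAX)
        return INT32_MAX;   /* positive overflow clamps to the maximum */
    if (sum < INT32_MIN)
        return INT32_MIN;   /* negative overflow clamps to the minimum */
    return (int32_t)sum;
}
```

Under these assumptions, adding two large negative values clamps to INT32_MIN, which is the case the commit message describes as underflow.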

11833 commits