fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 13:28:09 +02:00

Author	SHA1	Message	Date
Ian Romanick	1e691e68e2	nir/algebraic: Optimize bfi with odd-valued mask to bitfield_select shader-db: Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown) total instructions in shared programs: 17181254 -> 17181046 (<.01%) instructions in affected programs: 35834 -> 35626 (-0.58%) helped: 130 / HURT: 2 total cycles in shared programs: 888543370 -> 888554248 (<.01%) cycles in affected programs: 7443984 -> 7454862 (0.15%) helped: 95 / HURT: 87 fossil-db: Lunar Lake Totals: Instrs: 233260196 -> 233259474 (-0.00%); split: -0.00%, +0.00% Cycle count: 32754567116 -> 32754515890 (-0.00%); split: -0.00%, +0.00% Max live registers: 71738442 -> 71738398 (-0.00%); split: -0.00%, +0.00% Totals from 6842 (0.87% of 790721) affected shaders: Instrs: 5566926 -> 5566204 (-0.01%); split: -0.01%, +0.00% Cycle count: 512487046 -> 512435820 (-0.01%); split: -0.20%, +0.19% Max live registers: 1100656 -> 1100612 (-0.00%); split: -0.00%, +0.00% Meteor Lake and DG2 had similar results. (Meteor Lake shown) Totals: Instrs: 264071212 -> 264066944 (-0.00%); split: -0.00%, +0.00% Cycle count: 26552458051 -> 26553286277 (+0.00%); split: -0.00%, +0.01% Spill count: 530380 -> 530084 (-0.06%) Fill count: 613416 -> 612900 (-0.08%) Scratch Memory Size: 20089856 -> 20075520 (-0.07%) Max live registers: 46558852 -> 46558811 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 8034616 -> 8034584 (-0.00%) Totals from 6653 (0.73% of 905545) affected shaders: Instrs: 5750844 -> 5746576 (-0.07%); split: -0.08%, +0.00% Cycle count: 416414845 -> 417243071 (+0.20%); split: -0.20%, +0.40% Spill count: 1953 -> 1657 (-15.16%) Fill count: 3556 -> 3040 (-14.51%) Scratch Memory Size: 92160 -> 77824 (-15.56%) Max live registers: 566003 -> 565962 (-0.01%); split: -0.01%, +0.00% Max dispatch width: 55768 -> 55736 (-0.06%) No shader-db or fossil-db changes on any previous Intel platforms. 
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37186>	2025-10-10 17:25:11 +00:00
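A minimal sketch of why the rewrite in 1e691e68e2 is sound, assuming the usual reading of the two opcodes: `bfi(mask, insert, base)` shifts `insert` left by the number of trailing zero bits in `mask` before merging, and `bitfield_select(mask, a, b)` picks bits of `a` where `mask` is set and bits of `b` elsewhere. The Python models below are illustrative assumptions, not Mesa code. When the mask is odd its lowest bit is set, so the shift is zero and the two operations coincide.

```python
import random

MASK32 = 0xFFFFFFFF


def bfi(mask, insert, base):
    """Illustrative 32-bit model of bfi (assumed semantics, not Mesa code)."""
    if mask == 0:
        return base
    # Count trailing zeros of the mask: isolate the lowest set bit first.
    shift = (mask & -mask).bit_length() - 1
    return ((base & ~mask) | ((insert << shift) & mask)) & MASK32


def bitfield_select(mask, a, b):
    """Illustrative model: bits of a where mask is 1, bits of b elsewhere."""
    return ((mask & a) | (~mask & b)) & MASK32


def agree_for_odd_masks(trials=1000, seed=0):
    """Check bfi == bitfield_select on random inputs with an odd mask."""
    rng = random.Random(seed)
    for _ in range(trials):
        mask = rng.getrandbits(32) | 1  # force the low bit: odd mask
        insert = rng.getrandbits(32)
        base = rng.getrandbits(32)
        if bfi(mask, insert, base) != bitfield_select(mask, insert, base):
            return False
    return True
```

With an even mask such as `0xF0` the two models differ, because `bfi` shifts the inserted value into place and `bitfield_select` does not.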
Ian Romanick	f7939f2fdc	nir/range_analysis: Handle bfi and bitfield_select in get_alu_uub I noticed some things related to this while implementing support for bitfield_select / BFN in BRW. shader-db: Lunar Lake total instructions in shared programs: 17183140 -> 17183128 (<.01%) instructions in affected programs: 3830 -> 3818 (-0.31%) helped: 6 / HURT: 0 total cycles in shared programs: 889936934 -> 889936056 (<.01%) cycles in affected programs: 253758 -> 252880 (-0.35%) helped: 4 / HURT: 2 No shader-db changes on any other Intel platform. fossil-db: Lunar Lake Totals: Instrs: 233285343 -> 233284796 (-0.00%); split: -0.00%, +0.00% Cycle count: 32756777978 -> 32756399804 (-0.00%); split: -0.00%, +0.00% Max live registers: 71738646 -> 71738626 (-0.00%) Non SSA regs after NIR: 67837900 -> 67837902 (+0.00%) Totals from 177 (0.02% of 790723) affected shaders: Instrs: 389849 -> 389302 (-0.14%); split: -0.14%, +0.00% Cycle count: 356341872 -> 355963698 (-0.11%); split: -0.11%, +0.01% Max live registers: 39364 -> 39344 (-0.05%) Non SSA regs after NIR: 70453 -> 70455 (+0.00%) Meteor Lake, DG2, and Ice Lake had similar results. (Meteor Lake shown) Totals: Instrs: 264095611 -> 264095358 (-0.00%) Cycle count: 26555705299 -> 26554303407 (-0.01%); split: -0.01%, +0.00% Fill count: 613233 -> 613231 (-0.00%) Totals from 123 (0.01% of 905547) affected shaders: Instrs: 334830 -> 334577 (-0.08%) Cycle count: 326531667 -> 325129775 (-0.43%); split: -0.65%, +0.22% Fill count: 4145 -> 4143 (-0.05%) Tiger Lake and Skylake had similar results. 
(Tiger Lake shown) Totals: Instrs: 269733849 -> 269733590 (-0.00%) Cycle count: 25240548036 -> 25241435039 (+0.00%); split: -0.00%, +0.01% Totals from 123 (0.01% of 903812) affected shaders: Instrs: 338617 -> 338358 (-0.08%) Cycle count: 326605644 -> 327492647 (+0.27%); split: -0.13%, +0.40% Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37186>	2025-10-10 17:25:08 +00:00
Ian Romanick	aa53735b66	nir/algebraic: Prefer bfi over bitfield_select for bitfield_insert Intel platforms will soon implement both bfi and bitfield_select. bfi is more efficient for bitfield_insert. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37186>	2025-10-10 17:25:08 +00:00
Ian Romanick	08ec408061	nir/algebraic: Optimize f2u of negative value to zero The eliminated SENDs are from a single app that has a bunch of fragment shaders with a sequence like: con 32 %495 = fmul! %203.i, %1 (0.000000) con 32 %496 = ffma! %203.j, %1 (0.000000), %495 con 32 %497 = ffma! %203.k, %1 (0.000000), %496 con 32 %498 = ffma! %203.l, %1 (0.000000), %497 con 32 %499 = @load_reloc_const_intel (param_idx=1, base=0) con 32 %500 = @load_reloc_const_intel (param_idx=0, base=0) con 32 %501 = f2u32 %498 con 32 %502 = umin %501, %172 (0x4) con 32 %503 = ishl %502, %172 (0x4) con 32 %504 = load_const (0x00000040 = 64) con 32 %505 = umin %503, %504 (0x40) con 32 %506 = iadd %500, %505 The `f2u` is replaced with 0, and that makes the `ffma` dot-product sequence be unused. Since it is unused, most of the preceding block gets eliminated. A lot of instructions after the `f2u` are also eliminated by other algebraic optimizations. Most importantly, %203 is the result of a `load_ubo_uniform_block_intel` that is eliminated. No shader-db changes on any Intel platform. fossil-db: All Intel platforms had similar results.
(Lunar Lake shown) Totals: Instrs: 919895603 -> 919804051 (-0.01%); split: -0.01%, +0.00% Send messages: 40892036 -> 40887569 (-0.01%) Cycle count: 99176770712 -> 99174971806 (-0.00%); split: -0.00%, +0.00% Max live registers: 190030365 -> 190030367 (+0.00%) Max dispatch width: 47415040 -> 47415024 (-0.00%) Non SSA regs after NIR: 228872538 -> 228863608 (-0.00%); split: -0.00%, +0.00% Totals from 2234 (0.11% of 1955134) affected shaders: Instrs: 1989743 -> 1898191 (-4.60%); split: -4.60%, +0.00% Send messages: 44179 -> 39712 (-10.11%) Cycle count: 25416114 -> 23617208 (-7.08%); split: -7.08%, +0.00% Max live registers: 367357 -> 367359 (+0.00%) Max dispatch width: 39184 -> 39168 (-0.04%) Non SSA regs after NIR: 471173 -> 462243 (-1.90%); split: -1.90%, +0.00% Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37186>	2025-10-10 17:25:08 +00:00
Ian Romanick	5667459ff1	nir/algebraic: Don't introduce undefined behavior in f2u conversion If the source -1.0 < x < 0.0, simply removing the ftrunc will introduce undefined behavior. By chance of how at least Intel and NVIDIA GPUs implement f2u, this has Just Worked. No shader-db changes on any Intel platform. fossil-db: Lunar Lake Totals: Instrs: 913264354 -> 913264366 (+0.00%) Cycle count: 104953995530 -> 104953996854 (+0.00%) Max live registers: 189266026 -> 189266058 (+0.00%) Non SSA regs after NIR: 227779417 -> 227779369 (-0.00%) Totals from 24 (0.00% of 1984794) affected shaders: Instrs: 4669 -> 4681 (+0.26%) Cycle count: 50610 -> 51934 (+2.62%) Max live registers: 1222 -> 1254 (+2.62%) Non SSA regs after NIR: 1174 -> 1126 (-4.09%) Meteor Lake, DG2, Tiger Lake, and Ice Lake had similar results. (Meteor Lake shown) Totals: Instrs: 1001288026 -> 1001288038 (+0.00%) Cycle count: 92813392671 -> 92813392791 (+0.00%) Max live registers: 121935383 -> 121935399 (+0.00%) Max dispatch width: 19949928 -> 19949912 (-0.00%) Totals from 2 (0.00% of 2284670) affected shaders: Instrs: 1380 -> 1392 (+0.87%) Cycle count: 18940 -> 19060 (+0.63%) Max live registers: 136 -> 152 (+11.76%) Max dispatch width: 32 -> 16 (-50.00%) No fossil-db changes on Skylake. Suggested-by: Georg Lehmann Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37186>	2025-10-10 17:25:07 +00:00
Ian Romanick	4338f7d033	nir/algebraic: Remove useless ftrunc inside f2i/f2u Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37186>	2025-10-10 17:25:07 +00:00
Ian Romanick	c49d6e0480	nir/algebraic: Elide range clamping of f2u sources There are no shader-db changes on ELK platforms because those platforms don't support 8- or 16-bit integer types. v2: Restrict patterns generated such that the integer limits are exactly representable in the specified floating point format. With the exception of the value 0, this requires that float_sz > int_sz. This had no impact on shader-db or fossil-db on any Intel platform. Noticed by Georg. v3: Add a missing is_a_number. shader-db: All Intel platforms had similar results. (Lunar Lake shown) total cycles in shared programs: 889936056 -> 889934082 (<.01%) cycles in affected programs: 65806 -> 63832 (-3.00%) helped: 2 / HURT: 0 fossil-db: Lunar Lake Totals: Instrs: 233284796 -> 233282917 (-0.00%); split: -0.00%, +0.00% Cycle count: 32756399804 -> 32754972188 (-0.00%); split: -0.01%, +0.00% Spill count: 519861 -> 519813 (-0.01%) Fill count: 663650 -> 663626 (-0.00%); split: -0.01%, +0.01% Max live registers: 71738626 -> 71738696 (+0.00%) Non SSA regs after NIR: 67837902 -> 67837648 (-0.00%) Totals from 1236 (0.16% of 790723) affected shaders: Instrs: 2134504 -> 2132625 (-0.09%); split: -0.09%, +0.01% Cycle count: 604922278 -> 603494662 (-0.24%); split: -0.48%, +0.25% Spill count: 16509 -> 16461 (-0.29%) Fill count: 32760 -> 32736 (-0.07%); split: -0.22%, +0.15% Max live registers: 250112 -> 250182 (+0.03%) Non SSA regs after NIR: 302368 -> 302114 (-0.08%) Meteor Lake, DG2, and Tiger Lake had similar results. 
(Meteor Lake shown) Totals: Instrs: 264095370 -> 264094056 (-0.00%); split: -0.00%, +0.00% Cycle count: 26554146277 -> 26553027268 (-0.00%); split: -0.01%, +0.01% Spill count: 530603 -> 530615 (+0.00%) Fill count: 613231 -> 613273 (+0.01%) Max live registers: 46559041 -> 46559087 (+0.00%) Totals from 1237 (0.14% of 905547) affected shaders: Instrs: 2262517 -> 2261203 (-0.06%); split: -0.07%, +0.01% Cycle count: 518219799 -> 517100790 (-0.22%); split: -0.59%, +0.37% Spill count: 17518 -> 17530 (+0.07%) Fill count: 32273 -> 32315 (+0.13%) Max live registers: 128360 -> 128406 (+0.04%) Ice Lake and Skylake had similar results. (Ice Lake shown) Totals: Instrs: 269849640 -> 269848198 (-0.00%); split: -0.00%, +0.00% Cycle count: 26718329643 -> 26718289020 (-0.00%); split: -0.00%, +0.00% Max live registers: 46878430 -> 46878462 (+0.00%) Totals from 1233 (0.14% of 905427) affected shaders: Instrs: 2324225 -> 2322783 (-0.06%); split: -0.06%, +0.00% Cycle count: 531467501 -> 531426878 (-0.01%); split: -0.11%, +0.10% Max live registers: 130782 -> 130814 (+0.02%) Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37186>	2025-10-10 17:25:07 +00:00
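The v2 note in c49d6e0480 says the integer limits must be exactly representable in the floating-point format, which (apart from 0) requires `float_sz > int_sz`. A small sketch of that constraint for 32-bit floats, using a `struct` round trip to emulate float32 rounding. The helper names are hypothetical and only illustrate the arithmetic; they are not taken from the Mesa sources.

```python
import struct


def roundtrip_f32(x):
    """Round a Python float (double) to the nearest float32 and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def uint_max_exact_in_f32(bits):
    """True if the largest unsigned `bits`-wide integer survives float32."""
    limit = float((1 << bits) - 1)  # exact in a double for bits <= 53
    return roundtrip_f32(limit) == limit
```

Under these assumptions, 255 and 65535 survive the round trip, while 4294967295 rounds up to 4294967296.0, so a clamp against the 32-bit limit expressed in float32 would not match the true integer limit.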
Ian Romanick	986086c846	nir: Add saturating float to integer conversion opcodes v2: Add a comment around has_f2[ui]_sat explaining which opcodes it enables. Suggested by Georg. Cast u_uintN_max and friends to double in nir_opcodes.py. This ensures that an exact conversion is made. Eliminate duplicate conversions from half float to double. Both noticed by Georg. v3: Apply "NaN should be zero" fix suggested by Georg. Co-authored-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37186>	2025-10-10 17:25:05 +00:00
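A rough model of what a saturating float-to-integer conversion computes, assuming the behavior implied by 986086c846: out-of-range inputs clamp to the integer limits and NaN converts to zero (the v3 fix). The function names `f2u_sat` and `f2i_sat` are placeholders for this sketch, not the NIR opcode definitions.

```python
import math


def f2u_sat(x, bits=32):
    """Saturating float -> unsigned conversion (illustrative model)."""
    umax = (1 << bits) - 1
    if math.isnan(x):
        return 0  # "NaN should be zero"
    if x <= 0.0:
        return 0  # negatives (and -inf) clamp to the lower limit
    if x >= float(umax):
        return umax  # large values (and +inf) clamp to the upper limit
    return int(x)  # in range: truncate toward zero


def f2i_sat(x, bits=32):
    """Saturating float -> signed conversion (illustrative model)."""
    imin = -(1 << (bits - 1))
    imax = (1 << (bits - 1)) - 1
    if math.isnan(x):
        return 0
    if x <= float(imin):
        return imin
    if x >= float(imax):
        return imax
    return int(x)
```

The explicit NaN branch matters because every ordered comparison with NaN is false, so a plain min/max clamp would otherwise let NaN fall through to the truncation step.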
Lionel Landwerlin	301b71a19f	compiler: add an access flag for intel EU fusion Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Alyssa Anne Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37394>	2025-10-10 11:19:39 +00:00
Lionel Landwerlin	c7ac46a1d8	nir/lower_io: add get_io_index_src_number support for image intrinsics Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Anne Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37394>	2025-10-10 11:19:39 +00:00
Lionel Landwerlin	ca1533cd03	nir/divergence: add a new mode to cover fused threads on Intel HW The Intel Gfx12.x generation of GPU has an architecture feature called EU fusion in which 2 subgroups run in lock step. A typical case where this happens is a compute shader with 1x1x1 local workgroup size and a dispatch command of 2x1x1. In that case 2 threads will be run in lock step for each workgroup. This has been a source of trouble in the backend because one subgroup can run with all lanes disabled, requiring care for SEND messages using the NoMask flag (execution regardless of the lane mask). We found out that other things are happening when 2 subgroups run together: - the HW will use the surface/sampler handle from only one subgroup - the HW will use the sampler header from only one subgroup So one of the fused subgroups can access the wrong surface/sampler if the value is different between the 2 subgroups, and that can happen even with subgroup uniform values. Fortunately we can flag SEND instructions to disable the fusion behavior (most likely at a performance cost). This change introduces a new divergence mode that tries to compute things divergent between subgroups so that we can flag instructions accordingly. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37394>	2025-10-10 11:19:39 +00:00
Simon Perretta	79923115e7	nir/unlower_io_to_vars: keep io bases intact when keeping intrinsics nir_recompute_io_bases will modify i/o intrinsics, which is not the expected behaviour when the keep_intrinsics flag is set. Fixes: `83aecc8f3f` ("mesa/st, nir: commonize unlower_io_to_vars pass") Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37725>	2025-10-10 11:53:24 +01:00
Job Noorman	6d59a3e3e7	nir/lower_alu: use Knuth's Algorithm M for [iu]mul_high This significantly simplifies the handling of signed numbers as the same code path can handle signed and unsigned numbers by simply using ishr instead of ushr for some of the shifts. For both cases, the number of additions and shifts are also reduced. Note that LLVM uses the same algorithm. fossil-db stats for Turnip: Totals from 4849 (2.94% of 164705) affected shaders: MaxWaves: 52318 -> 52332 (+0.03%); split: +0.04%, -0.02% Instrs: 5262458 -> 5218922 (-0.83%); split: -0.87%, +0.05% CodeSize: 10831900 -> 10655170 (-1.63%); split: -1.64%, +0.01% NOPs: 829481 -> 836010 (+0.79%); split: -0.95%, +1.74% MOVs: 176187 -> 173788 (-1.36%); split: -3.27%, +1.91% COVs: 104096 -> 86543 (-16.86%); split: -16.87%, +0.01% Full: 90434 -> 90158 (-0.31%); split: -0.33%, +0.03% (ss): 131091 -> 130866 (-0.17%); split: -0.87%, +0.70% (sy): 55550 -> 55769 (+0.39%); split: -0.92%, +1.32% (ss)-stall: 406003 -> 407194 (+0.29%); split: -1.10%, +1.39% (sy)-stall: 1668213 -> 1678082 (+0.59%); split: -1.31%, +1.90% Preamble Instrs: 1105270 -> 1067290 (-3.44%); split: -3.50%, +0.06% Constlen: 423776 -> 423560 (-0.05%) Last helper: 1038202 -> 1035540 (-0.26%); split: -0.42%, +0.16% Last baryf: 38908 -> 38632 (-0.71%) Subgroup size: 336640 -> 336832 (+0.06%) Cat0: 916209 -> 922848 (+0.72%); split: -0.87%, +1.59% Cat1: 282813 -> 262845 (-7.06%); split: -7.49%, +0.43% Cat2: 2198715 -> 2183012 (-0.71%); split: -0.72%, +0.01% Cat3: 1390914 -> 1376421 (-1.04%) Cat7: 123127 -> 123116 (-0.01%); split: -0.24%, +0.23% Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37793>	2025-10-10 05:31:17 +00:00
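The idea behind 6d59a3e3e7 can be sketched as a half-word schoolbook multiplication in the style of Knuth's Algorithm M. This is an illustrative Python model following the well-known formulation of that algorithm, not the code from the Mesa pass. Python's `>>` is an arithmetic shift on negative integers and a logical shift on non-negative ones, so one routine covers both cases once the operands are interpreted as signed or unsigned, which mirrors the ishr/ushr point in the commit message. The wrapper names are hypothetical.

```python
def mul_high_halfwords(u, v):
    """High 32 bits of u * v, computed from 16-bit halves."""
    u0, u1 = u & 0xFFFF, u >> 16  # low half is always non-negative
    v0, v1 = v & 0xFFFF, v >> 16  # high half carries the sign, if any
    w0 = u0 * v0
    t = u1 * v0 + (w0 >> 16)
    w1 = t & 0xFFFF
    w2 = t >> 16
    w1 = u0 * v1 + w1
    return u1 * v1 + w2 + (w1 >> 16)


def umul_high32(a, b):
    """Unsigned variant: operands are 32-bit patterns in [0, 2**32)."""
    return mul_high_halfwords(a & 0xFFFFFFFF, b & 0xFFFFFFFF)


def to_signed32(x):
    """Reinterpret a 32-bit pattern as a two's-complement signed value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def imul_high32(a, b):
    """Signed variant: same routine, operands interpreted as signed."""
    return mul_high_halfwords(to_signed32(a), to_signed32(b))
```

Both wrappers agree with the reference `(a * b) >> 32` computed on the full-width product, for example `umul_high32(0xFFFFFFFF, 0xFFFFFFFF)` and `imul_high32(-1, -1)`.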
Job Noorman	18f69890d1	nir: add nir_shr builder Sometimes we need to select between ishr/ushr based on some condition; this builder makes this less verbose. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37793>	2025-10-10 05:31:17 +00:00
Emma Anholt	d01aae2fb1	nir: Add a shader bisect tool. When you're trying to figure out what shader some NIR pass broke, use nir_shader_bisect_select() to decide between NIR pass behaviors, and then nir_shader_bisect.py will help you automatically bisect down to which source_blake3 is at fault. Once it's identified, it prints you a C call you can use for selecting that shader specifically, which you can use for continuing on in your debugging. On a test I was looking at, this took 10 steps to bisect 134 shaders down to the source_blake3 of the NIR shader in question. This idea is heavily lifted from Job Noorman's ir3_shader_bisect. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37468>	2025-10-09 17:56:30 +00:00
Alyssa Rosenzweig	c1d75c6e51	treewide: use BITSET_CALLOC Via Coccinelle patch: @@ expression count; type T; @@ -calloc(BITSET_WORDS(count), sizeof(T)) +BITSET_CALLOC(count) @@ expression count; type T; @@ -calloc(sizeof(T), BITSET_WORDS(count)) +BITSET_CALLOC(count) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37779>	2025-10-09 12:29:55 +00:00
Romaric Jodin	cb86341829	meson: remove '--outdir' argument in script Usage of '--outdir' argument in python scripts makes it very complicated for tools like ninja-to-soong to generate the Android equivalent build file. This is because the option is less clear on what will be generated. Instead, change it for '--out' where we give the full path of the file to generate. This has the good point of deduplicating the locations of the file name to have it only in 'meson.build'. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37741>	2025-10-08 20:51:20 +00:00
Marek Olšák	3fe651f607	nir: remove load_smem_amd replaced by load_global_amd + ACCESS_SMEM_AMD Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36936>	2025-10-08 08:54:11 +00:00
Daniel Schürmann	7593667b0a	nir/divergence_analysis: check ACCESS_SMEM_AMD Revert "nir/divergence: make smem load_global_amd uniform" This reverts commit `2d0f93631c`. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36936>	2025-10-08 08:53:55 +00:00
Daniel Schürmann	cacb390ec9	nir/load_store_vectorize: Fix parsing offsets through u2u64 Fixes: `cfba417316` ('nir/load_store_vectorize: optimize accesses with u2u64(ishl.nuw(iadd))') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36936>	2025-10-08 08:53:51 +00:00
Rhys Perry	8fba196164	nir: assume non-atomic loads don't tear Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36602>	2025-10-07 17:41:30 +00:00
Rhys Perry	0dd09a292b	nir: add ACCESS_ATOMIC This is so that passes and backends can tell if a coherent load/store is atomic or not, instead of having to assume it could be either. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36602>	2025-10-07 17:41:30 +00:00
Samuel Pitoiset	e868e8d946	nir: adjust nir_tex_instr_need_sampler() for AMD FMASK instructions These instructions don't need a sampler. This doesn't fix anything now because this helper isn't used yet, but it will help for descriptor heap. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37720>	2025-10-07 15:22:47 +00:00
Georg Lehmann	84f26ed117	nir: optimize atomic isub if supported Foz-DB Navi48: Totals from 1 (0.00% of 80287) affected shaders: Instrs: 1641 -> 1637 (-0.24%) CodeSize: 8472 -> 8456 (-0.19%) Latency: 19132 -> 19131 (-0.01%) InvThroughput: 9566 -> 9565 (-0.01%) Copies: 126 -> 125 (-0.79%) VALU: 565 -> 563 (-0.35%) SALU: 439 -> 438 (-0.23%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37702>	2025-10-07 14:07:56 +00:00
Georg Lehmann	b0d3db3733	nir: add atomic isub Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37702>	2025-10-07 14:07:56 +00:00
Lionel Landwerlin	94f8d0072d	nir: add pass to propagate image format to intrinsics Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Anne Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36773>	2025-10-07 08:54:26 +00:00
Lionel Landwerlin	0922a0dd50	nir/lower_tex: remove unused options Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Anne Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37692>	2025-10-03 20:19:03 +00:00
Lionel Landwerlin	97dde5bc10	nir/lower_tex: add a callback to lower txd ops Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Anne Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37692>	2025-10-03 20:19:02 +00:00
Daniel Schürmann	0e3bc3d8c0	nir/opt_offsets: call allow_offset_wrap() for try_fold_shared2() This prevents applying wrapping offsets on GFX6. Fixes: `e1a692f74b` ('nir/opt_offsets: allow for unsigned wraps when folding load/store_shared2_amd offsets') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37667>	2025-10-03 07:54:12 +00:00
Kenneth Graunke	25cb6dfbf7	nir: Add load_simd_width_intel to divergence analysis For some reason we missed adding this. This prevents some asserts from triggering when I call divergence analysis at certain points in an upcoming patch. Cc: mesa-stable Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36750>	2025-09-30 19:44:02 +00:00
sjfricke	05ea82a766	nir: Fix gnu-empty-initializer warning Found with clang 14 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37615>	2025-09-30 19:09:31 +00:00
Ella Stanforth	082e6369f9	nir: add v3d specific intrinsic normalised to float conversion Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35820>	2025-09-30 12:48:42 +00:00
Simon Perretta	a1acd6f8d1	pvr, pco: add primitive support for VK_KHR_robustness2.nullDescriptor Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37512>	2025-09-30 12:15:54 +00:00
Simon Perretta	2a7ebf2ae0	nir/lower_alpha: extend to support dynamic a2c Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37512>	2025-09-30 12:15:53 +00:00
Simon Perretta	6dc5e1e109	pco: fully support Vulkan 1.2 image atomics Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37512>	2025-09-30 12:15:48 +00:00
Aitor Camacho	06dbd4c33c	nir: Set cursor in lower_sampler_lod_bias Reviewed-by: Alyssa Anne Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37494>	2025-09-29 23:24:52 +00:00
Simon Perretta	b111b8a844	pvr, pco: implement prerequisites for sampleRateShading - Implement load_interpolated_input and friends. - Optimize load_barycentric_* cases that can be simplified. - Initial support for non-standard sample locations. Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Frank Binns <frank.binns@imgtec.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37540>	2025-09-27 23:45:54 +01:00
Simon Perretta	83aecc8f3f	mesa/st, nir: commonize unlower_io_to_vars pass Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37540>	2025-09-27 23:45:54 +01:00
Aleksi Sapon	8949473023	nir: Fix nir.h MSVC compilation for C++ source files This kind of C initializer is not accepted by MSVC in C++ mode. Fixed: `75292ae7` ("nir: Fix gnu-empty-initializer warning ") Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37604>	2025-09-26 18:25:22 +00:00
Georg Lehmann	46a4569c22	nir/opt_undef: prefer 0 over NaN for pack_half_2x16_rtz_split Using NaN doesn't usually allow any extra optimizations, and 0 is an inline constant on AMD hw where this opcode is used with undef for fragment shader exports. Foz-DB GFX1201: Totals from 889 (1.11% of 80287) affected shaders: Instrs: 1676365 -> 1676348 (-0.00%) CodeSize: 8827040 -> 8821760 (-0.06%) Latency: 13346728 -> 13346699 (-0.00%) InvThroughput: 1799283 -> 1799262 (-0.00%) Copies: 108125 -> 108102 (-0.02%) VALU: 974875 -> 974852 (-0.00%) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37552>	2025-09-26 15:11:26 +00:00
Aleksi Sapon	75292ae7e4	nir: Fix gnu-empty-initializer warning This also causes a build error on older MSVC. Fixes: `75381670` ("nir,rusticl: NIR_PASS/nir_pass! validation fixes and improvements") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37569>	2025-09-25 18:14:22 +00:00
Daniel Schürmann	e1a692f74b	nir/opt_offsets: allow for unsigned wraps when folding load/store_shared2_amd offsets Totals from 131 (0.16% of 79839) affected shaders: (Navi48) Instrs: 217026 -> 216541 (-0.22%); split: -0.24%, +0.01% CodeSize: 1150136 -> 1146772 (-0.29%); split: -0.31%, +0.02% Latency: 4225732 -> 4225549 (-0.00%); split: -0.01%, +0.00% InvThroughput: 840231 -> 839823 (-0.05%); split: -0.05%, +0.00% VClause: 3815 -> 3816 (+0.03%) Copies: 15414 -> 15358 (-0.36%); split: -0.38%, +0.02% PreSGPRs: 6322 -> 6323 (+0.02%) PreVGPRs: 6064 -> 6062 (-0.03%) VALU: 117317 -> 116873 (-0.38%); split: -0.40%, +0.02% SALU: 25384 -> 25331 (-0.21%); split: -0.22%, +0.02% Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37453>	2025-09-24 14:28:24 +00:00
Rhys Perry	7538167096	nir: add NIR_DEBUG=progress_validation Fails if a shader was changed but the pass didn't report progress. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35069>	2025-09-24 08:20:28 +00:00
Rhys Perry	706ba80057	nir: fix NIR_DEBUG=extended_validation This broke after divergence became metadata because the divergence analysis pass does not support all instructions. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35069>	2025-09-24 08:20:28 +00:00
Rhys Perry	026e2527bf	nir/opt_if: rewrite progress reporting and metadata invalidation This would unconditionally invalidate all metadata except nir_metadata_control_flow and then invalidate that if opt_if_safe_cf_list and opt_if_regs_cf_list made no progress. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35069>	2025-09-24 08:20:27 +00:00
Rhys Perry	da23b17c8b	nir/opt_if: fix progress reporting with multiple function impls Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35069>	2025-09-24 08:20:27 +00:00
Rhys Perry	12ee2b0fd4	nir: fix progress reporting in nir_io_add_const_offset_to_base Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35069>	2025-09-24 08:20:27 +00:00
Pierre-Eric Pelloux-Prayer	81f3a5a035	nir/opcodes: remove invalid comment "\b" is interpreted by python which results in an invalid char being written to the C file. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37331>	2025-09-23 09:09:55 +02:00
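The problem described in 81f3a5a035 is easy to reproduce in isolation: in an ordinary Python string literal, `\b` is the backspace escape (U+0008), so any C text generated from such a string ends up containing a control character. The comment text below is made up for illustration and is not the comment that was removed.

```python
# A C comment embedded in a normal Python string: "\b" becomes backspace.
template_plain = "/* matches \b at a word edge */"

# The same text as a raw string keeps the backslash and the letter b.
template_raw = r"/* matches \b at a word edge */"


def has_control_char(text):
    """True if the text contains the backspace control character."""
    return "\x08" in text
```

Under this reading, the raw-string form is one character longer than the plain one, and only the plain one trips `has_control_char`.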
Pierre-Eric Pelloux-Prayer	cc4b50b023	nir/opcodes: use u_overflow to fix incorrect checks Operands of an addition will be promoted to int making the a+b<a kind of checks ineffective. Use u_overflow.h helpers to perform the check correctly. The commit would be simpler if it used __typeof__ like so: util_add_check_overflow(__typeof__(src0), src0, src1) But typeof only became a standard in C23 so this commit instead extends nir_opcodes a bit to allow opcodes that need the dest_type to get it. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dylan Baker <dylan.c.baker@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37331>	2025-09-23 09:09:55 +02:00
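The failure mode behind cc4b50b023 can be modelled without C: when narrow unsigned operands are promoted to a wider `int` before the addition, the sum never wraps, so the classic `a + b < a` test can never fire. The sketch below emulates that for 8-bit values. The function names are hypothetical and are not the `u_overflow.h` helpers the commit refers to.

```python
def naive_check_after_promotion(a, b):
    """Emulates `a + b < a` on uint8 operands promoted to int.

    The sum is computed at full width and never wraps, so for
    non-negative operands this is always False.
    """
    return (a + b) < a


def naive_check_with_wrap(a, b, bits=8):
    """The check the author presumably intended: wrap first, then compare."""
    mask = (1 << bits) - 1
    return ((a + b) & mask) < a


def add_overflows(a, b, bits=8):
    """Width-aware check that does not rely on wrapping at all."""
    return a > ((1 << bits) - 1) - b
```

For `a = 200`, `b = 100` the promoted check reports no overflow even though 300 does not fit in 8 bits, while the other two checks both flag it.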
Simon Perretta	7b7fb811ab	pvr, pco: switch to clc load/store sr and idfwdf shaders Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37439>	2025-09-22 14:52:05 +01:00


6708 commits