fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-06-10 16:28:27 +02:00

Author	SHA1	Message	Date
Kenneth Graunke	d89a0b486a	jay: Implement dual color blending (but require SIMD16) It's mildly tempting to reuse the src0_alpha source for color1 since the two features should never overlap, but for now we add an extra optional source. We require SIMD16 for now as we only have SIMD16 messages. Eventually, we're likely to want to support SIMD32 with 2x16 sends, but this gets us going for now. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41872>	2026-06-03 15:23:20 +00:00
Marek Olšák	e5723a61f2	ac/nir: add a new pass ac_nir_lower_sample_mask_in This covers all the optimal lowering cases of sample_mask_in. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41768>	2026-06-02 20:38:05 +00:00
Ian Romanick	dcfc90a8fc	nir/algebraic: Convert bcsel of addition to addition of b2i or b2f Recent changes to continue handling in loops results in many cases of loop { ... if (...) { do_continue = true; // was continue; } i = do_continue ? i : i + 1; } I noticed this while investigating mesa#15154. Unfortunately, this doesn't fix the performance regressions noted in that issue. One fragment shader in XCOM: Enemy Unknown doesn't like this change. :( v2: Drop _nsz from a couple bcsel patterns where it is not needed. Suggested by Georg. v3: Drop ~ from the last two fadd patterns. Suggested by Georg. Update expected checksum for plot3d-v2.trace on many platforms. shader-db: All Iris platforms had similar results. (Lunar Lake shown) total instructions in shared programs: 17089936 -> 17086837 (-0.02%) instructions in affected programs: 864928 -> 861829 (-0.36%) helped: 696 / HURT: 110 total cycles in shared programs: 864096306 -> 863913752 (-0.02%) cycles in affected programs: 345726340 -> 345543786 (-0.05%) helped: 620 / HURT: 196 total spills in shared programs: 3318 -> 3319 (0.03%) spills in affected programs: 14 -> 15 (7.14%) helped: 0 / HURT: 1 total fills in shared programs: 1604 -> 1606 (0.12%) fills in affected programs: 28 -> 30 (7.14%) helped: 0 / HURT: 1 total sends in shared programs: 876852 -> 876850 (<.01%) sends in affected programs: 6 -> 4 (-33.33%) helped: 2 / HURT: 0 fossil-db: Lunar Lake Totals: Instrs: 914468779 -> 914215874 (-0.03%); split: -0.03%, +0.00% CodeSize: 12885732160 -> 12881939568 (-0.03%); split: -0.04%, +0.01% Cycle count: 100100279922 -> 100096866800 (-0.00%); split: -0.05%, +0.04% Spill count: 3459786 -> 3459693 (-0.00%); split: -0.01%, +0.01% Fill count: 4909835 -> 4909177 (-0.01%); split: -0.04%, +0.03% Max live registers: 191819298 -> 191822052 (+0.00%); split: -0.00%, +0.00% Max dispatch width: 48511264 -> 48510608 (-0.00%); split: +0.00%, -0.00% Non SSA regs after NIR: 136334891 -> 136301926 (-0.02%); split: -0.03%, +0.00% Totals 
from 37416 (1.87% of 2003390) affected shaders: Instrs: 53346249 -> 53093344 (-0.47%); split: -0.48%, +0.01% CodeSize: 775396384 -> 771603792 (-0.49%); split: -0.60%, +0.11% Cycle count: 32275003526 -> 32271590404 (-0.01%); split: -0.14%, +0.13% Spill count: 569304 -> 569211 (-0.02%); split: -0.05%, +0.03% Fill count: 620240 -> 619582 (-0.11%); split: -0.31%, +0.21% Max live registers: 6712048 -> 6714802 (+0.04%); split: -0.01%, +0.05% Max dispatch width: 893344 -> 892688 (-0.07%); split: +0.10%, -0.17% Non SSA regs after NIR: 7191473 -> 7158508 (-0.46%); split: -0.49%, +0.03% Meteor Lake and DG2 had similar results. (Meteor Lake shown) Totals: Instrs: 985625036 -> 985366432 (-0.03%); split: -0.03%, +0.00% CodeSize: 16446268768 -> 16442606864 (-0.02%); split: -0.03%, +0.01% Cycle count: 91278956920 -> 91272371300 (-0.01%); split: -0.07%, +0.06% Spill count: 3713935 -> 3714003 (+0.00%); split: -0.00%, +0.00% Fill count: 5001514 -> 5001259 (-0.01%); split: -0.03%, +0.02% Max live registers: 120736970 -> 120738919 (+0.00%); split: -0.00%, +0.00% Max dispatch width: 37827808 -> 37829472 (+0.00%); split: +0.01%, -0.00% Non SSA regs after NIR: 160606595 -> 160573270 (-0.02%); split: -0.02%, +0.00% Totals from 38664 (1.71% of 2265137) affected shaders: Instrs: 53621392 -> 53362788 (-0.48%); split: -0.49%, +0.01% CodeSize: 932994544 -> 929332640 (-0.39%); split: -0.52%, +0.13% Cycle count: 24442489628 -> 24435904008 (-0.03%); split: -0.25%, +0.22% Spill count: 550952 -> 551020 (+0.01%); split: -0.02%, +0.03% Fill count: 525010 -> 524755 (-0.05%); split: -0.27%, +0.23% Max live registers: 3594805 -> 3596754 (+0.05%); split: -0.01%, +0.07% Max dispatch width: 510928 -> 512592 (+0.33%); split: +0.47%, -0.14% Non SSA regs after NIR: 7652247 -> 7618922 (-0.44%); split: -0.46%, +0.03% Tiger Lake, Ice Lake, and Skylake had similar results. 
(Tiger Lake shown) Totals: Instrs: 997905938 -> 997771670 (-0.01%); split: -0.01%, +0.00% CodeSize: 13990460928 -> 13988346016 (-0.02%); split: -0.02%, +0.00% Cycle count: 83465002175 -> 83456829524 (-0.01%); split: -0.02%, +0.01% Spill count: 3815020 -> 3814879 (-0.00%); split: -0.01%, +0.00% Fill count: 6561078 -> 6560768 (-0.00%); split: -0.01%, +0.00% Max live registers: 121468149 -> 121468160 (+0.00%); split: -0.00%, +0.00% Max dispatch width: 37914400 -> 37914624 (+0.00%); split: +0.00%, -0.00% Non SSA regs after NIR: 155941530 -> 155944033 (+0.00%); split: -0.00%, +0.00% Totals from 27771 (1.22% of 2273117) affected shaders: Instrs: 31224666 -> 31090398 (-0.43%); split: -0.44%, +0.01% CodeSize: 450250800 -> 448135888 (-0.47%); split: -0.57%, +0.10% Cycle count: 15045135658 -> 15036963007 (-0.05%); split: -0.13%, +0.08% Spill count: 406812 -> 406671 (-0.03%); split: -0.05%, +0.01% Fill count: 391210 -> 390900 (-0.08%); split: -0.10%, +0.02% Max live registers: 2592759 -> 2592770 (+0.00%); split: -0.02%, +0.02% Max dispatch width: 383888 -> 384112 (+0.06%); split: +0.23%, -0.17% Non SSA regs after NIR: 4221402 -> 4223905 (+0.06%); split: -0.01%, +0.07% Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41871>	2026-06-02 17:44:14 +00:00
Ian Romanick	daa38c1972	nir/opt_if: Merge if-statements with inverted conditions Cases like if (x) { ... } else { ... } if (!x) { ... } else { ... } should be merged. I don't know why Ice Lake is affected differently by this commit. v2: Add implementation of srcs_equal_or_logical_inverse after bad rebase. That's what I get for rushing out an MR right before lunch. Noticed by Georg. shader-db: Lunar Lake No changes. All other Iris platforms had similar results. (Meteor Lake shown) total cycles in shared programs: 882310108 -> 882311504 (<.01%) cycles in affected programs: 74306 -> 75702 (1.88%) helped: 4 HURT: 2 helped stats (abs) min: 2.0 max: 38.0 x̄: 11.00 x̃: 2 helped stats (rel) min: 0.02% max: 0.29% x̄: 0.09% x̃: 0.02% HURT stats (abs) min: 720.0 max: 720.0 x̄: 720.00 x̃: 720 HURT stats (rel) min: 5.27% max: 5.27% x̄: 5.27% x̃: 5.27% 95% mean confidence interval for cycles value: -163.75 629.08 95% mean confidence interval for cycles %-change: -1.21% 4.61% Inconclusive result (value mean confidence interval includes 0). fossil-db: All Intel platforms except Ice Lake had similar results. 
(Lunar Lake shown) Totals: Instrs: 914554534 -> 914546744 (-0.00%); split: -0.00%, +0.00% CodeSize: 12887129264 -> 12886823808 (-0.00%); split: -0.00%, +0.00% Send messages: 40220826 -> 40219429 (-0.00%); split: -0.00%, +0.00% Cycle count: 100101810976 -> 100101804762 (-0.00%); split: -0.00%, +0.00% Spill count: 3459811 -> 3459786 (-0.00%); split: -0.00%, +0.00% Fill count: 4909877 -> 4909835 (-0.00%); split: -0.00%, +0.00% Max live registers: 191837229 -> 191838000 (+0.00%); split: -0.00%, +0.00% Max dispatch width: 48514400 -> 48514336 (-0.00%) Non SSA regs after NIR: 136346777 -> 136343948 (-0.00%); split: -0.00%, +0.00% Totals from 1937 (0.10% of 2003486) affected shaders: Instrs: 3013550 -> 3005760 (-0.26%); split: -0.39%, +0.13% CodeSize: 43169072 -> 42863616 (-0.71%); split: -0.81%, +0.10% Send messages: 183171 -> 181774 (-0.76%); split: -0.82%, +0.06% Cycle count: 126864798 -> 126858584 (-0.00%); split: -0.67%, +0.67% Spill count: 7354 -> 7329 (-0.34%); split: -0.45%, +0.11% Fill count: 5547 -> 5505 (-0.76%); split: -0.88%, +0.13% Max live registers: 296895 -> 297666 (+0.26%); split: -0.04%, +0.30% Max dispatch width: 41856 -> 41792 (-0.15%) Non SSA regs after NIR: 545672 -> 542843 (-0.52%); split: -1.15%, +0.63% Ice Lake Totals: Instrs: 996341606 -> 996312120 (-0.00%); split: -0.00%, +0.00% CodeSize: 12563695936 -> 12563195200 (-0.00%); split: -0.00%, +0.00% Send messages: 45911343 -> 45909063 (-0.00%); split: -0.00%, +0.00% Cycle count: 82819362995 -> 82818778468 (-0.00%); split: -0.00%, +0.00% Spill count: 2935451 -> 2935452 (+0.00%); split: -0.00%, +0.00% Fill count: 5034267 -> 5034281 (+0.00%); split: -0.00%, +0.00% Max live registers: 124672355 -> 124672961 (+0.00%); split: -0.00%, +0.00% Max dispatch width: 41330808 -> 41330672 (-0.00%) Non SSA regs after NIR: 160790466 -> 160785863 (-0.00%); split: -0.01%, +0.00% Totals from 2163 (0.09% of 2327905) affected shaders: Instrs: 4164788 -> 4135302 (-0.71%); split: -0.80%, +0.09% CodeSize: 53351344 -> 
52850608 (-0.94%); split: -0.95%, +0.01% Send messages: 271164 -> 268884 (-0.84%); split: -0.84%, +0.00% Cycle count: 145818114 -> 145233587 (-0.40%); split: -0.66%, +0.26% Spill count: 7819 -> 7820 (+0.01%); split: -0.32%, +0.33% Fill count: 7191 -> 7205 (+0.19%); split: -0.57%, +0.76% Max live registers: 192403 -> 193009 (+0.31%); split: -0.08%, +0.40% Max dispatch width: 34728 -> 34592 (-0.39%) Non SSA regs after NIR: 570874 -> 566271 (-0.81%); split: -1.49%, +0.68% Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41871>	2026-06-02 17:44:14 +00:00
Ian Romanick	e8cef4725d	nir/opt_if: use nir_def_replace() instead of nir_def_rewrite_uses() Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41871>	2026-06-02 17:44:13 +00:00
Ian Romanick	4a37fda884	nir: Use nir_instr_remove_v in nir_def_replace The non _v version sets up and returns a nir_cursor that isn't used. Skip that work by calling nir_instr_remove_v directly. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41871>	2026-06-02 17:44:13 +00:00
Pavel Ondračka	f6b06ea3de	nir/algebraic: prevent ffract optimization on lowered ffloor ffloor(a) is lowered as a - ffract(a). dEQP expects that for example ffloor(a) == 1.0 for every a between 1.0 and 2.0. This worked fine, but the new ffract(a + b(is_integral)) -> ffract(a) rule broke this. Specifically, dEQP-GLES2.functional.shaders.struct.uniform.equal_fragment checks that ffloor(a + 1.0) == 1.0 for every a between 0.0 and 1.0. However this is not exactly true once the ffract(a + 1.0) is lowered to ffract(a). Prevent this by marking ffract from ffloor lowering as exact so that the recently introduced ffract(a + b(is_integral)) -> ffract(a) rule does not trigger. Fixes: `c6aaafa3` ("nir: add lowering for ffloor") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15562 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41882>	2026-06-02 12:03:09 +00:00
Mary Guillemard	b95dbc64bf	nir,nak: Add match_any_nv NVIDIA hardware has an instruction allowing you to retrieve the mask of active threads matching the same source value as the current invocation. This is going to be used by shared memory lowering for mesh / task stages on NVK. Signed-off-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Tested-by: Thomas H.P. Andersen <phomes@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27196>	2026-06-02 10:34:31 +00:00
Mary Guillemard	90d963d353	nir/nir_format_convert: Add missing u2f32 in nir_format_unpack_r9g9b9e5 Fix "dEQP-VK.api.copy_and_blit..image_to_image.all_formats.color.2d_to_1d..e5b9g9r9_ufloat_pack32.*" on HK. Signed-off-by: Mary Guillemard <mary@mary.zone> Fixes: `5f5f4474f6` ("nir: Add a format unpack helper and tests") Reviewed-by: Janne Grunau <j@jannau.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41929>	2026-06-01 20:28:44 +00:00
Karol Herbst	d25e7e330f	nir/lower_alu: fix lower_fminmax_signed_zero for denorms When both inputs are denorms, the bcsel picks the integer min/max result, which does not flush denorms and therefore might return the wrong result. Fixes OpenCL fmin/fmax on asahi. Fixes: `d238d766c6` ("nir: add lower_fminmax_signed_zero") Reviewed-by: Mary Guillemard <mary@mary.zone> Reviewed-by: Janne Grunau <j@jannau.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41386>	2026-06-01 13:43:01 +00:00
Konstantin Seurer	f48f681fb5	nir: Duplicate the name in nir_def_set_name nir_sweep expects that nir_instr_debug_info::variable_name is owned by nir_instr_debug_info. Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40706>	2026-05-31 13:31:55 +02:00
Karol Herbst	87b5340831	nir/opt_dead_write_vars: cache is_entrypoint of the function ends_program calls into nir_cf_node_get_function repeatedly to fetch the same function and to check whether we are inside an entry point or not. But we already got the information higher up the chain so use that instead. nir_cf_node_get_function is quite expensive, because it follows pointers through the tree. Speeds up compilation of more complex shaders by quite a bit. I am seeing a 66% cut of compilation time spent in e.g. llama-bench. Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41891>	2026-05-29 22:58:00 +00:00
Georg Lehmann	dea444f80f	nir/deref: consider atomics that store derefs as complex use src[1] or src[2] would mean that the atomic uses the deref as data for the op, we only want to allow address source uses. Fixes: `bb311ce370` ("nir: Allow atomics as non-complex uses for var-splitting passes") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41818>	2026-05-28 18:58:33 +00:00
Karol Herbst	8dc4e8094e	nir/opt_algebraic: add missing fmadz lowering for lower_fmulz_with_abs_min Fixes: `32e91a7467` ("nir: add new float multiply-add opcodes") Suggested-by: Georg Lehmann <dadschoorse@gmail.com> Acked-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41723>	2026-05-27 16:28:48 +00:00
Rhys Perry	b1429caab3	nir,ac/nir,aco: add load_global_tr_amd Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41653>	2026-05-27 14:44:59 +00:00
Rhys Perry	b982e71084	nir: add load_global_transpose_amd Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41653>	2026-05-27 14:44:59 +00:00
Rhys Perry	57498eca83	nir: add load_deref_transpose_amd Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41653>	2026-05-27 14:44:59 +00:00
Rhys Perry	6229e89fa8	nir: make cmat_muladd_amd a subgroup intrinsic It's a subgroup op. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41653>	2026-05-27 14:44:59 +00:00
Rhys Perry	81925d7f41	nir/algebraic: optimize ishl(iadd(iadd(iadd(a, #b), c), d), #e) This improves combining of constants offsets into memory accesses in dEQP-VK.compute.pipeline.cooperative_matrix.khr_a.subgroupscope.mul.float16_float16.buffer.colmajor.linear fossil-db (gfx1201): Totals from 121 (0.06% of 208640) affected shaders: Instrs: 204278 -> 204199 (-0.04%); split: -0.06%, +0.03% CodeSize: 1110856 -> 1110076 (-0.07%); split: -0.10%, +0.03% VGPRs: 7620 -> 7680 (+0.79%); split: -0.16%, +0.94% Latency: 1225169 -> 1225067 (-0.01%); split: -0.02%, +0.01% InvThroughput: 191629 -> 191580 (-0.03%); split: -0.03%, +0.01% SClause: 5732 -> 5731 (-0.02%) Copies: 16358 -> 16356 (-0.01%); split: -0.02%, +0.01% PreSGPRs: 5715 -> 5711 (-0.07%) PreVGPRs: 5907 -> 5905 (-0.03%) VALU: 112808 -> 112742 (-0.06%); split: -0.06%, +0.00% SALU: 27121 -> 27113 (-0.03%) fossil-db (gfx1201, dEQP-VK.compute.pipeline.cooperative_matrix.*): Totals from 198 (12.20% of 1623) affected shaders: Instrs: 13011 -> 11584 (-10.97%) CodeSize: 90188 -> 77920 (-13.60%) VGPRs: 3456 -> 2724 (-21.18%) Latency: 144421 -> 142553 (-1.29%) InvThroughput: 11158 -> 10608 (-4.93%) Copies: 1119 -> 1117 (-0.18%) PreSGPRs: 1954 -> 1857 (-4.96%) PreVGPRs: 1675 -> 1354 (-19.16%) VALU: 4894 -> 3476 (-28.97%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41653>	2026-05-27 14:44:59 +00:00
Rhys Perry	c3db34a525	nir/algebraic: optimize ishl(iadd(ishl, ishl)) This reduces arithmetic for cooperative matrix loads: v_mbcnt_lo_u32_b32 v0, -1, 0 v_and_b32_e32 v1, 15, v0 v_lshrrev_b32_e32 v0, 4, v0 v_lshlrev_b32_e32 v1, 4, v1 v_lshl_add_u32 v0, v0, 3, v1 v_lshlrev_b32_e32 v0, 1, v0 -> v_mbcnt_lo_u32_b32 v0, -1, 0 v_and_b32_e32 v1, -16, v0 v_and_b32_e32 v0, 15, v0 v_lshl_add_u32 v0, v0, 5, v1 fossil-db (gfx1201): Totals from 38 (0.02% of 208640) affected shaders: Instrs: 42234 -> 42181 (-0.13%) CodeSize: 232656 -> 232384 (-0.12%) Latency: 128807 -> 128759 (-0.04%) InvThroughput: 20860 -> 20850 (-0.05%) VALU: 23035 -> 23013 (-0.10%) SALU: 4790 -> 4784 (-0.13%) fossil-db (gfx1201, dEQP-VK.compute.pipeline.cooperative_matrix.*): Totals from 44 (2.71% of 1623) affected shaders: Instrs: 46834 -> 46802 (-0.07%) CodeSize: 287536 -> 287272 (-0.09%) Latency: 100960 -> 100918 (-0.04%); split: -0.10%, +0.06% InvThroughput: 21808 -> 21796 (-0.06%) VALU: 19336 -> 19328 (-0.04%) SALU: 10790 -> 10782 (-0.07%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41653>	2026-05-27 14:44:59 +00:00
Samuel Pitoiset	8c9995e7fa	nir: add nir_lower_abort Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41651>	2026-05-27 06:37:03 +00:00
Samuel Pitoiset	f431d6bc87	nir: add new intrinsics for SPV_KHR_abort Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41651>	2026-05-27 06:37:03 +00:00
Marek Olšák	7f2130c86e	nir/opt_algebraic: add more ffract/ffloor/ftrunc/f2u/f2i patterns Totals from 1390 (0.69% of 202429) affected shaders: MaxWaves: 33336 -> 33348 (+0.04%) Instrs: 4101809 -> 4095218 (-0.16%); split: -0.17%, +0.01% CodeSize: 22973700 -> 22944812 (-0.13%); split: -0.13%, +0.00% VGPRs: 95592 -> 95460 (-0.14%); split: -0.15%, +0.01% SpillSGPRs: 2910 -> 2913 (+0.10%) Latency: 27815305 -> 27807064 (-0.03%); split: -0.06%, +0.03% InvThroughput: 4563067 -> 4555622 (-0.16%); split: -0.18%, +0.02% VClause: 98544 -> 98570 (+0.03%); split: -0.04%, +0.06% SClause: 91148 -> 91149 (+0.00%); split: -0.00%, +0.01% Copies: 324008 -> 324028 (+0.01%); split: -0.10%, +0.10% Branches: 99085 -> 99084 (-0.00%); split: -0.00%, +0.00% PreSGPRs: 70920 -> 70734 (-0.26%); split: -0.27%, +0.00% PreVGPRs: 78288 -> 78190 (-0.13%); split: -0.15%, +0.03% VALU: 2123606 -> 2117766 (-0.28%); split: -0.28%, +0.00% SALU: 621757 -> 621671 (-0.01%); split: -0.02%, +0.00% VMEM: 163395 -> 163387 (-0.00%); split: -0.01%, +0.00% SMEM: 140374 -> 140376 (+0.00%) VOPD: 258332 -> 258264 (-0.03%); split: +0.04%, -0.07% Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41455>	2026-05-25 20:02:30 +00:00
Thong Thai	931dba218e	nir: Only build NIR headers when with_gfx_compute is false Signed-off-by: Thong Thai <thong.thai@amd.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41493>	2026-05-25 15:44:12 +00:00
Marek Olšák	1b45a8aee2	radv: select frag_coord_xy and pixel_coord conditionally based on dynamic state the code explains it Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> (shader parts) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41689>	2026-05-25 13:38:08 +00:00
Marek Olšák	a5ba7694b5	nir/opt_frag_coord_to_pixel_coord: factor out helper nir_all_uses_of_float_are_integer to be used in a new RADV pass Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41689>	2026-05-25 13:38:05 +00:00
Timur Kristóf	b0b61a4bf8	nir/divergence: Consider ttmp_register_amd and load_scalar_arg_amd as workgroup divergent These are SGPR inputs, so they are uniform in subgroups but may have different values in different subgroups. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41584>	2026-05-25 12:29:27 +00:00
Timur Kristóf	dd5b6f3940	nir/divergence: Consider uniformity of read_invocation across subgroups These intrinsics are generally divergent between different subgroups, but they can be uniform when all their sources are also uniform. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41584>	2026-05-25 12:29:27 +00:00
Timur Kristóf	5b385b703b	nir/divergence: Consider ACCESS_SMEM_AMD divergence across subgroups AMD SMEM instructions are always uniform within a subgroup, but they may be divergent across subgroups, ie. each subgroup may have a different value from the same SMEM instruction. This needs to be considered for divergence across subgroups as well as for vertex divergence, because vertices of the same primitive may be split between different waves. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41584>	2026-05-25 12:29:27 +00:00
Georg Lehmann	a92d0356eb	nir: separate ffmaz from has_fmulz There is no hardware which supports ffmaz with denorms. We also need this to be separate because there is AMD hardware with ffma but not ffmaz. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41649>	2026-05-25 11:50:38 +00:00
Valentine Burley	190ce8280f	meson: Add Soong compatibility compiler flags to Vulkan drivers Suggested by @gurchetansingh. Android's Soong build system treats several compiler warnings as errors by default: https://android.googlesource.com/platform/build/soong/+/27f57506/cc/config/global.go/#218 To catch these issues in Mesa, introduce `soong_compat_c_args` and `soong_compat_cpp_args` with the following flags treated as errors: -D_LIBCPP_ENABLE_THREAD_SAFETY_ANNOTATIONS -Werror=date-time -Werror=gnu-alignof-expression -Werror=ignored-qualifiers -Werror=implicit-fallthrough -Werror=int-conversion -Werror=missing-prototypes -Werror=pragma-pack -Werror=pragma-pack-suspicious-include -Werror=sizeof-array-div -Werror=string-plus-int -Werror=unreachable-code-loop-increment These compatibility flags are added to the meson configurations for ANV, Gfxstream, Lavapipe, PanVK, Turnip, and Venus. Signed-off-by: Valentine Burley <valentine.burley@collabora.com> Acked-by: Emma Anholt <emma@anholt.net> Reviewed-by: Gurchetan Singh <gurchetan.singh.foss@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41644>	2026-05-22 07:09:49 +00:00
Caio Oliveira	c8914985c4	compiler: Support more than 255 cols/rows in cmat descriptions This struct was initially packed to fit in a slot in NIR intrinsics indices. Nowadays NIR supports larger indices and cooperative matrix has extensions that allow it to go beyond the existing limit. This patch changes the struct to be larger and remove the manual bit packing. The hash table change is to use the specialized version for u64 keys that's available in src/util. Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41691>	2026-05-21 21:47:03 +00:00
Caio Oliveira	7b286abe33	nir: Add print for other cmat_description slots Fixes: `102d7409ef` ("nir: Add convert_cmat_intel intrinsic") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41690>	2026-05-21 19:23:12 +00:00
Kenneth Graunke	35622f165f	jay, nir: Make a dispatch_mask_intel intrinsic jay is trying to use the fragment shader dispatch mask for helper invocation lowering, but it was using load_sample_mask_in for that (now load_coverage_mask_intel). But this isn't the MSAA coverage mask, the two are different payload fields. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41688>	2026-05-21 15:34:46 +00:00
Kenneth Graunke	6c142f7edc	jay: Implement sample mask writes Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41688>	2026-05-21 15:34:46 +00:00
Kenneth Graunke	b01d286083	jay: Move render target store payload/descriptor construction to backend Constructing the render target store payload is more complex than we can reasonably handle at the NIR level. The main reason is that samplemask and stencil are packed 16-bit and 8-bit parameters, respectively, which are intermixed with other values that are 32-bit. In SIMD32 mode, the packed sub-32-bit values take up fewer registers than normal values. Currently we also don't specialize the NIR for each FS dispatch width, and we can't construct the message descriptor without knowing it. So, we alter nir_intrinsic_store_render_target_intel to take each of the expected parameters - colour, depth, stencil, samplemask, src0_alpha, and discard predicate. We construct the payloads and descriptors in the backend. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41688>	2026-05-21 15:34:46 +00:00
Karol Herbst	273204e24e	nir: add uniform address to nvidia IO intrinsics Adding the zero constants has a minor impact on stats due to some unlucky interactions with nir_opt_cse, opt_instr_sched_prepass and assign_regs. Totals from 61 (0.01% of 1212873) affected shaders: CodeSize: 1044720 -> 1047472 (+0.26%); split: -0.00%, +0.27% Static cycle count: 1198932 -> 1198490 (-0.04%); split: -0.07%, +0.04% Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39384>	2026-05-20 17:23:33 +00:00
Karol Herbst	32fd51687d	nir: add nir_intrinsic_cmat_load_shared_nv to nir_get_io_offset_src_number Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39384>	2026-05-20 17:23:32 +00:00
Caio Oliveira	992b35704e	nir/instr_set: Consider normalization when calculating hash nir_instrs_equal normalizes some indices, but hash_intrinsic wasn't normalizing them. Reorganize the code so both do it using the same helper. Fixes: `b2bc57551a` ("nir/instr_set: allow cse with fp_math_ctrl mismatches for intrinsics") Assisted-by: Pi coding agent (GPT-5.5) Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41606>	2026-05-20 05:24:21 +00:00
Karol Herbst	e9c1cce35f	nir: remove ffma_old Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>	2026-05-19 18:13:42 +00:00
Karol Herbst	e1aaaf4ed0	nir: make lowering use new ffma opcodes Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>	2026-05-19 18:13:41 +00:00
Karol Herbst	109d93dd98	nir: update ffma helpers to use new opcodes Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>	2026-05-19 18:13:41 +00:00
Karol Herbst	aeea2e7c1f	nir: add fmad_or_ffma helpers and use it in lower_double_ops We skip emitting ffma_weak here, because otherwise we'd require a lowering loop with opt_algebraic and lower_double_ops and this way it's also cheaper. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>	2026-05-19 18:13:40 +00:00
Karol Herbst	688e5cda94	nir/tests: use ffma_weak Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>	2026-05-19 18:13:31 +00:00
Karol Herbst	b7094546f4	nir: duplicate old ffma opts where necessary for new multadd ones Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>	2026-05-19 18:13:30 +00:00
Karol Herbst	86007ae1ad	nir: handle new multadd opcodes in helpers Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>	2026-05-19 18:13:30 +00:00
Karol Herbst	68dc336af7	nir: handle new multadd opcodes in lowerings and opts Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>	2026-05-19 18:13:30 +00:00
Karol Herbst	bb2b7c58fc	nir/opt_algebraic: add fmad and ffma_weak lowering rules Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>	2026-05-19 18:13:28 +00:00
Karol Herbst	9ffa7c826f	nir/tests: handle new multadd opcodes Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>	2026-05-19 18:13:28 +00:00
Karol Herbst	7ce841cb71	nir: validate new float_mul_add options Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>	2026-05-19 18:13:28 +00:00

7546 commits
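
Note on f6b06ea3de: the rounding problem described in that commit message can be reproduced with a small binary32 emulation. The sketch below is illustrative only and is not code from Mesa. The helper names `f32`, `ffract`, `ffloor_lowered` and `ffloor_after_rewrite` are hypothetical, and it assumes round-to-nearest-even float32 arithmetic, emulated here by rounding Python doubles through the `struct` module.

```python
import math
import struct


def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def ffract(x: float) -> float:
    """Fractional part x - floor(x), rounded to binary32."""
    return f32(x - math.floor(x))


def ffloor_lowered(x: float) -> float:
    """ffloor(x) using the lowering quoted in the commit: x - ffract(x)."""
    return f32(x - ffract(x))


def ffloor_after_rewrite(a: float, b: float) -> float:
    """ffloor(a + b) after ffract(a + b) was rewritten to ffract(a).

    b is assumed to be integral, as the rewrite rule requires.
    """
    t = f32(a + b)              # the addition itself still rounds
    return f32(t - ffract(a))   # but ffract no longer sees the rounded sum


# Assumed test input: small enough that a + 1.0 rounds down to exactly 1.0
# in binary32, yet large enough to matter when subtracted from 1.0.
a = f32(3 * 2.0 ** -26)

exact_path = ffloor_lowered(f32(a + 1.0))
rewritten_path = ffloor_after_rewrite(a, 1.0)

print("lowered ffloor(a + 1.0):      ", exact_path)
print("with ffract(a + 1.0)->ffract(a):", rewritten_path)
print("results agree:", exact_path == rewritten_path)
```

For this input the unmodified lowering returns exactly 1.0, while the rewritten form returns the next float32 below 1.0. That one-ulp difference is the kind of mismatch the commit avoids by marking the ffract produced by the ffloor lowering as exact.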