fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-23 19:50:11 +01:00

Author	SHA1	Message	Date
Antonio Ospite	ddf2aa3a4d	build: avoid redefining unreachable() which is standard in C23 In the C23 standard unreachable() is now a predefined function-like macro in <stddef.h> See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in And this causes build errors when building for C23: ----------------------------------------------------------------------- In file included from ../src/util/log.h:30, from ../src/util/log.c:30: ../src/util/macros.h:123:9: warning: "unreachable" redefined 123 \| #define unreachable(str) \ \| ^~~~~~~~~~~ In file included from ../src/util/macros.h:31: /usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition 456 \| #define unreachable() (__builtin_unreachable ()) \| ^~~~~~~~~~~ ----------------------------------------------------------------------- So don't redefine it with the same name, but use the name UNREACHABLE() to also signify it's a macro. Using a different name also makes sense because the behavior of the macro was extending the one of __builtin_unreachable() anyway, and it also had a different signature, accepting one argument, compared to the standard unreachable() with no arguments. This change improves the chances of building mesa with the C23 standard, which for instance is the default in recent AOSP versions. All the instances of the macro, including the definition, were updated with the following command line: git grep -l '[^_]unreachable(' -- "src/**" \| sort \| uniq \| \ while read file; \ do \ sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \ done && \ sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>	2025-07-31 17:49:42 +00:00
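The rename in ddf2aa3a4d depends on a capture group: the sed expression keeps whatever character precedes `unreachable(` as long as it is not an underscore, which is what leaves `__builtin_unreachable(` untouched. A minimal sketch of the same substitution, using Python's `re` module in place of sed; the sample lines are hypothetical and not taken from the Mesa tree:

```python
import re

# Python spelling of the sed expression used in the commit:
#   s/\([^_]\)unreachable(/\1UNREACHABLE(/g
# The group captures the single character before "unreachable(" (anything
# except "_") so that \1 can put it back unchanged.
PATTERN = re.compile(r"([^_])unreachable\(")


def rename_unreachable(line: str) -> str:
    """Rename unreachable( to UNREACHABLE( unless it follows an underscore."""
    return PATTERN.sub(r"\1UNREACHABLE(", line)


# Hypothetical sample lines, for illustration only.
samples = [
    '   unreachable("bad enum");',
    "#define unreachable(str) \\",
    "   __builtin_unreachable();",
]

for line in samples:
    print(rename_unreachable(line))
```

Because the pattern requires one preceding character, a call that starts at the very first column of a line would not be matched by this expression.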
jhananit	debd903a00	intel: Update all NIR_PASS_V to NIR_PASS Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35889>	2025-07-14 19:25:52 +00:00
Ian Romanick	5adab50283	brw/nir: Use nir_opt_reassociate_matrix_mul This needs to be called before intel_nir_opt_peephole_ffma, so I arbitrarily decided to call it right before. All Intel platforms had similar results. (Lunar Lake shown) total instructions in shared programs: 17120227 -> 17118227 (-0.01%) instructions in affected programs: 5854 -> 3854 (-34.16%) helped: 51 / HURT: 0 total cycles in shared programs: 895497762 -> 894733940 (-0.09%) cycles in affected programs: 4603518 -> 3839696 (-16.59%) helped: 95 / HURT: 21 LOST: 1 GAINED: 0 Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35925>	2025-07-09 19:28:49 +00:00
Daniel Schürmann	2c51a8870d	nir: add nir_vectorize_cb callback parameter to nir_lower_phis_to_scalar() Similar to nir_lower_alu_width(), the callback can return the desired number of components for a phi, or 0 for no lowering. The previous behavior of nir_lower_phis_to_scalar() with lower_all=true can be elicited via nir_lower_all_phis_to_scalar() while the previous behavior with lower_all=false now corresponds to nir_lower_phis_to_scalar() with NULL callback. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mel Henning <mhenning@darkrefraction.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>	2025-07-08 15:33:59 +00:00
Alyssa Rosenzweig	d31cb824df	treewide: use VARYING_BIT_* Via Coccinelle patch generated by the following Python: varys = [ "POS", "COL0", "COL1", "FOGC", "TEX0", "TEX1", "TEX2", "TEX3", "TEX4", "TEX5", "TEX6", "TEX7", "PSIZ", "BFC0", "BFC1", "EDGE", "CLIP_VERTEX", "CLIP_DIST0", "CLIP_DIST1", "CULL_DIST0", "CULL_DIST1", "PRIMITIVE_ID", "PRIMITIVE_COUNT", "LAYER", "VIEWPORT", "FACE", "PRIMITIVE_SHADING_RATE", "PNTC", "TESS_LEVEL_OUTER", "TESS_LEVEL_INNER", "PRIMITIVE_INDICES", "BOUNDING_BOX0", "BOUNDING_BOX1", "VIEWPORT_MASK", "CULL_PRIMITIVE" ] t = """ @@ @@ -(1 << VARYING_SLOT_${V}) +VARYING_BIT_${V} @@ @@ -BITFIELD_BIT(VARYING_SLOT_${V}) +VARYING_BIT_${V} @@ @@ -(1ull << VARYING_SLOT_${V}) +VARYING_BIT_${V} @@ @@ -BITFIELD64_BIT(VARYING_SLOT_${V}) +VARYING_BIT_${V} """ for v in varys: from mako.template import Template print(Template(t).render(V = v)) Closes: #13453 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> [panfrost, common] Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [broadcom] Reviewed-by: Corentin Noël <corentin.noel@collabora.com> [virgl] Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> [zink] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35917>	2025-07-04 19:01:04 +00:00
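The generator script in d31cb824df is flattened onto one line in this log. The sketch below shows the same idea in runnable form, with two assumptions: it uses the standard library's `string.Template` (which shares the `${V}` placeholder syntax) instead of Mako, and the line breaks inside each Coccinelle rule are a reconstruction. Only three of the varying names are used, to keep the output short.

```python
from string import Template

# Subset of the varying names listed in the commit message.
VARYINGS = ["POS", "COL0", "PSIZ"]

# One Coccinelle rule per spelling of "the bit for this varying slot".
# The line layout is a reconstruction; the log shows the rules flattened.
RULES = Template("""\
@@
@@
-(1 << VARYING_SLOT_${V})
+VARYING_BIT_${V}

@@
@@
-BITFIELD_BIT(VARYING_SLOT_${V})
+VARYING_BIT_${V}

@@
@@
-(1ull << VARYING_SLOT_${V})
+VARYING_BIT_${V}

@@
@@
-BITFIELD64_BIT(VARYING_SLOT_${V})
+VARYING_BIT_${V}
""")


def render_rules(names):
    """Return the semantic patch text covering every name in `names`."""
    return "\n".join(RULES.substitute(V=name) for name in names)


patch = render_rules(VARYINGS)
print(patch)
```

Each varying yields four rules, one for each of the four bit-mask spellings that the commit replaces with `VARYING_BIT_*`.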
Lionel Landwerlin	fcf4401824	brw: handle wa_18019110168 with independent shader compilation Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>	2025-06-28 05:55:35 +00:00
Lionel Landwerlin	bc8d18aee2	brw: make a helper for vertex attribute offset computation Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>	2025-06-28 05:55:34 +00:00
Lionel Landwerlin	f0f4f9c566	brw: fix vertex attribute offset computation The formula uses scalar indices (4bytes), not slots (16bytes). We also incorrectly passed a scalar (vertex case) & slot (mesh case) offset in the push constants. Use slots instead so that the value is smaller and we can pack more stuff into fs_msaa_flags. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `18bbcf9a63` ("intel: introduce new VUE layout for separate compiled shader with mesh") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>	2025-06-28 05:55:31 +00:00
Marek Olšák	1754507d49	nir: rename nir_lower_io_to_temporaries -> nir_lower_io_vars_to_temporaries Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:54 +00:00
Marek Olšák	1e03827c77	nir: rename nir_lower_io_arrays_to_elements -> nir_lower_io_array_vars_to_elements same for *_no_indirects Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:54 +00:00
Marek Olšák	12df9b3def	nir: rename nir_vectorize_tess_levels -> nir_lower_tess_level_array_vars_to_vec Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:50 +00:00
Marek Olšák	2aa94caf82	nir: rename nir_lower_io_to_vector -> nir_opt_vectorize_io_vars Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:50 +00:00
Marek Olšák	439d805291	nir: rename nir_lower_io_to_scalar_early -> nir_lower_io_vars_to_scalar Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:49 +00:00
Georg Lehmann	9da23499ff	compiler: add float8 glsl types e4m3fn: 8bit floating point format with 4bit exponent, 3bit mantissa and no infinities (finite only) e5m2: 8bit floating point format with 5bit exponent, 2bit mantissa and with infinities. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>	2025-06-23 07:59:24 +00:00
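To make the two formats in 9da23499ff concrete, here is a small decoding sketch. It assumes the commonly used FP8 encodings: exponent bias of 7 for e4m3fn and 15 for e5m2, and for e4m3fn a single NaN pattern (all exponent and mantissa bits set) in place of infinities. The function names are illustrative and are not part of Mesa.

```python
import math


def decode_float8(byte: int, exp_bits: int, man_bits: int, has_inf: bool) -> float:
    """Decode one 8-bit float: 1 sign bit, exp_bits exponent, man_bits mantissa."""
    assert 0 <= byte <= 0xFF and 1 + exp_bits + man_bits == 8
    sign = -1.0 if byte >> 7 else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    bias = (1 << (exp_bits - 1)) - 1
    exp_max = (1 << exp_bits) - 1
    man_max = (1 << man_bits) - 1

    if has_inf and exp == exp_max:
        # IEEE-like: top exponent encodes infinity (mantissa 0) or NaN.
        return sign * math.inf if man == 0 else math.nan
    if not has_inf and exp == exp_max and man == man_max:
        # Finite-only format: a single NaN pattern, no infinities (assumed).
        return math.nan
    if exp == 0:
        # Zero and subnormals: no implicit leading one.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1.0 + man / (1 << man_bits)) * 2.0 ** (exp - bias)


def decode_e4m3fn(byte: int) -> float:
    return decode_float8(byte, exp_bits=4, man_bits=3, has_inf=False)


def decode_e5m2(byte: int) -> float:
    return decode_float8(byte, exp_bits=5, man_bits=2, has_inf=True)


print(decode_e4m3fn(0x7E), decode_e5m2(0x7B), decode_e5m2(0x7C))
```

Under these assumptions e4m3fn trades range for precision (largest finite value 448), while e5m2 keeps infinities and reaches 57344.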
Rohan Garg	e103afe7be	brw: run the nir_opt_offsets pass and set the maximum offset size Perf A/B testing on DG2: no changes Perf A/B testing on BMG: +2.1% Blackops3, +1.5% Cyberpunk DG2 stats (mostly insignificant): Assassins Creed Valhalla: Totals from 1169 (55.67% of 2100) affected shaders: Instrs: 509237 -> 509215 (-0.00%) Cycle count: 30614325 -> 30607419 (-0.02%); split: -0.03%, +0.00% Non SSA regs after NIR: 83434 -> 85909 (+2.97%) Blackops 3: Totals from 1045 (64.63% of 1617) affected shaders: Instrs: 527312 -> 527310 (-0.00%) Cycle count: 496912222 -> 496902846 (-0.00%); split: -0.00%, +0.00% Non SSA regs after NIR: 106883 -> 109095 (+2.07%) Cyberpunk: Totals from 706 (56.03% of 1260) affected shaders: Instrs: 345976 -> 345974 (-0.00%); split: -0.00%, +0.00% Cycle count: 9775138 -> 9775472 (+0.00%); split: -0.00%, +0.00% Max live registers: 40295 -> 40297 (+0.00%) Non SSA regs after NIR: 93245 -> 94718 (+1.58%) Fortnite: Totals from 4210 (55.98% of 7521) affected shaders: Instrs: 2205471 -> 2205469 (-0.00%) Cycle count: 91451040 -> 91450956 (-0.00%); split: -0.00%, +0.00% Non SSA regs after NIR: 952354 -> 961664 (+0.98%) LNL stats (notable changes): Assassins Creed Valhalla: Totals from 1684 (83.57% of 2015) affected shaders: Instrs: 774305 -> 764501 (-1.27%); split: -1.27%, +0.01% Cycle count: 58845842 -> 58699250 (-0.25%); split: -0.98%, +0.73% Spill count: 625 -> 638 (+2.08%) Fill count: 1490 -> 1503 (+0.87%) Scratch Memory Size: 41984 -> 44032 (+4.88%) Max live registers: 196424 -> 197561 (+0.58%); split: -0.10%, +0.68% Blackops 3: Totals from 1125 (76.53% of 1470) affected shaders: Instrs: 781749 -> 773275 (-1.08%); split: -1.08%, +0.00% Subgroup size: 22896 -> 22912 (+0.07%) Cycle count: 659864454 -> 654641032 (-0.79%); split: -1.10%, +0.31% Max live registers: 116772 -> 116854 (+0.07%); split: -0.01%, +0.08% Non SSA
regs after NIR: 172648 -> 168260 (-2.54%); split: -2.55%, +0.01% Control: Totals from 378 (51.50% of 734) affected shaders: Instrs: 148184 -> 147544 (-0.43%) Cycle count: 6905200 -> 6913366 (+0.12%); split: -0.30%, +0.42% Max live registers: 41271 -> 41281 (+0.02%) Non SSA regs after NIR: 44964 -> 43868 (-2.44%); split: -2.45%, +0.01% Cyberpunk: Totals from 1141 (92.46% of 1234) affected shaders: Instrs: 636744 -> 629333 (-1.16%) Subgroup size: 24256 -> 24272 (+0.07%) Cycle count: 24952258 -> 24801298 (-0.60%); split: -1.39%, +0.78% Max live registers: 125848 -> 126855 (+0.80%); split: -0.00%, +0.80% Non SSA regs after NIR: 127399 -> 119837 (-5.94%); split: -5.95%, +0.02% Fortnite: Totals from 5497 (83.52% of 6582) affected shaders: Instrs: 4072831 -> 4041852 (-0.76%); split: -0.77%, +0.01% Subgroup size: 103296 -> 103312 (+0.02%) Cycle count: 133046874 -> 132789242 (-0.19%); split: -0.67%, +0.48% Spill count: 7218 -> 7254 (+0.50%); split: -0.33%, +0.83% Fill count: 11724 -> 11749 (+0.21%); split: -0.34%, +0.55% Scratch Memory Size: 591872 -> 599040 (+1.21%) Max live registers: 816530 -> 818522 (+0.24%); split: -0.01%, +0.26% Non SSA regs after NIR: 1610296 -> 1560284 (-3.11%); split: -3.11%, +0.00% Hitman3: Totals from 4713 (92.39% of 5101) affected shaders: Instrs: 2731598 -> 2698224 (-1.22%) Cycle count: 186422098 -> 185472640 (-0.51%); split: -1.12%, +0.61% Spill count: 3244 -> 3242 (-0.06%) Fill count: 9937 -> 9933 (-0.04%) Max live registers: 585035 -> 589801 (+0.81%); split: -0.00%, +0.82% Non SSA regs after NIR: 347681 -> 324314 (-6.72%); split: -6.73%, +0.01% Hogwarts Legacy: Totals from 930 (59.81% of 1555) affected shaders: Instrs: 464146 -> 459526 (-1.00%); split: -1.00%, +0.01% Subgroup size: 19104 -> 19120 (+0.08%) Cycle count: 24062460 -> 24078964 (+0.07%); split: -0.49%, +0.56% Spill count: 2068 -> 1964 (-5.03%); split: -5.22%, +0.19% Fill count: 2342 -> 2205 (-5.85%); split: -6.40%, +0.56% Scratch Memory Size: 147456 -> 141312 (-4.17%) Max live 
registers: 112384 -> 112787 (+0.36%); split: -0.08%, +0.44% Non SSA regs after NIR: 80293 -> 79161 (-1.41%); split: -1.72%, +0.32% Metro Exodus: Totals from 29755 (78.62% of 37846) affected shaders: Instrs: 11495578 -> 11492951 (-0.02%); split: -0.02%, +0.00% Subgroup size: 644688 -> 644704 (+0.00%) Cycle count: 301572068 -> 301548054 (-0.01%); split: -0.03%, +0.02% Max live registers: 3369504 -> 3370454 (+0.03%); split: -0.00%, +0.03% Non SSA regs after NIR: 2476561 -> 2396090 (-3.25%); split: -3.27%, +0.02% Red Dead Redemption 2: Totals from 4161 (78.61% of 5293) affected shaders: Instrs: 2428782 -> 2409032 (-0.81%); split: -0.82%, +0.00% Subgroup size: 85344 -> 85360 (+0.02%) Cycle count: 8514984142 -> 8533415324 (+0.22%); split: -0.02%, +0.23% Spill count: 4659 -> 4674 (+0.32%); split: -0.02%, +0.34% Fill count: 11236 -> 11231 (-0.04%); split: -0.19%, +0.14% Scratch Memory Size: 398336 -> 397312 (-0.26%) Max live registers: 473946 -> 475798 (+0.39%); split: -0.08%, +0.47% Non SSA regs after NIR: 616820 -> 567706 (-7.96%); split: -8.09%, +0.12% Rise Of The Tomb Raider: Totals from 68 (46.58% of 146) affected shaders: Instrs: 28209 -> 27801 (-1.45%) Subgroup size: 1584 -> 1600 (+1.01%) Cycle count: 16182992 -> 16249364 (+0.41%); split: -0.97%, +1.38% Max live registers: 7320 -> 7296 (-0.33%); split: -0.38%, +0.05% Non SSA regs after NIR: 8438 -> 8207 (-2.74%); split: -2.82%, +0.08% Spiderman Remastered: Totals from 6403 (93.87% of 6821) affected shaders: Instrs: 5662713 -> 5597949 (-1.14%); split: -1.28%, +0.14% Cycle count: 282861519016 -> 279806958122 (-1.08%); split: -1.26%, +0.18% Spill count: 61150 -> 60754 (-0.65%); split: -1.13%, +0.48% Fill count: 162597 -> 163190 (+0.36%); split: -0.84%, +1.21% Scratch Memory Size: 5834752 -> 5804032 (-0.53%); split: -0.70%, +0.18% Max live registers: 901926 -> 903820 (+0.21%); split: -0.01%, +0.22% Non SSA regs after NIR: 555053 -> 521016 (-6.13%); split: -6.14%, +0.01% Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>	2025-06-22 10:55:24 +00:00
Lionel Landwerlin	16fca611d7	nir: add new intel ssbo intrinsics Similar to ir3 ones, to optimize offsets in the backend. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>	2025-06-22 10:55:23 +00:00
Lionel Landwerlin	1d8382b88e	brw: enable more lowering for bitfield manipulation at non 32bit sizes Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35381>	2025-06-11 14:09:56 +00:00
Christian Gmeiner	41f2da1a6e	treewide: Do not use NIR_PASS_V for nir_divergence_analysis(..) Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35131>	2025-05-23 21:19:25 +00:00
Caleb Callaway	e7454f5318	intel/debug: shader dump filter v2: Fixes filtering for various brw shader dump logic Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35061>	2025-05-23 19:57:02 +00:00
Lionel Landwerlin	df15968813	anv/brw: stop turning load_push_constants into load_uniform Those intrinsics have different semantics in particular with regards to divergence. Turning one into the other without invalidating the divergence information breaks NIR validation. But also the conversion means we get artificially less convergent values in the shaders. So just handle load_push_constants in the backend and stop changing things in Anv. Fixes a bunch of tests in dEQP-VK.descriptor_indexing.* dEQP-VK.pipeline.*.push_constant.graphics_pipeline.dynamic_index_* Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34546>	2025-05-22 07:49:20 +00:00
Iván Briano	27a2f6d1ff	brw: add lowering passes for FS barycentric inputs Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34445>	2025-05-20 20:57:59 +00:00
Iván Briano	acdd30a9da	brw: check if the FS needs vertex_attributes_bypass to be set Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34445>	2025-05-20 20:57:58 +00:00
Lionel Landwerlin	18bbcf9a63	intel: introduce new VUE layout for separate compiled shader with mesh Mesh shaders have per vertex block in URB pretty much identical to the VUE format. Let's just reuse that concept to do all of our layout in the payload attribute registers. This will ensure that we have consistent VUE layout between Mesh & non-Mesh pipelines. We need a new way of laying out the VUE though as we have to accommodate a HW constraint of maximum (per-primitive + per-vertex) of 32 varyings. This means we cannot have 2 locations in the payload for things like PrimitiveID which can come from either the per-primitive or the per-vertex block. The new layout places the PrimitiveID at the end of the per-vertex attributes and shrinks the delivery dynamically if the mesh stage is active. The shader is compiled with a MOV_INDIRECT to read the PrimitiveID from the right location in the attributes. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>	2025-05-08 06:48:35 +00:00
Lionel Landwerlin	6230f3029f	brw: fix brw_nir_move_interpolation_to_top In a case like this : block_0: %5 = ... %6 = ... block_1: %7 = load_interpolated_input %5, %6 The current logic would move load_interpolated_input to block_0 before %5 but not move %5 & %6 which are sources of that instruction. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>	2025-05-08 06:48:34 +00:00
Lionel Landwerlin	dd1ef73aae	brw: use newer NIR constructs nir_shader_intrinsics_pass() & NIR_PASS() Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>	2025-05-08 06:48:34 +00:00
Lionel Landwerlin	b64f237dc4	brw: move helper to brw_nir.c Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>	2025-05-08 06:48:34 +00:00
Caio Oliveira	a38960e8f3	brw, nir: Use glsl_base_type instead of nir_alu_type for @dpas_intel This will allow including types that don't have a nir_alu_type equivalent, like bfloat16. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>	2025-04-29 16:29:37 +00:00
Rohan Garg	9e5d7eb88d	compiler/types: add a bfloat16 type Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>	2025-04-29 16:29:36 +00:00
Faith Ekstrand	436f175187	intel/compiler: Use nir_split_conversions() Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34266>	2025-04-07 17:45:21 -05:00
Ian Romanick	e210b79ce3	brw/nir: Lower fsign again after last call to brw_nir_optimize No shader-db or fossil-db changes on any Intel platform. Fixes: `13332c23` ("intel/brw: Unconditionally run optimizations after nir_opt_uniform_subgroup") Closes: #12888 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34251>	2025-04-02 01:59:49 +00:00
Ian Romanick	ca95cb8178	brw: Fix typo in comment Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34251>	2025-04-02 01:59:49 +00:00
Lionel Landwerlin	4346210ae6	brw: move texture offset packing to NIR That way we can deal with upcoming non constant values for VK_KHR_maintenance8. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33138>	2025-03-29 02:15:18 +00:00
Lionel Landwerlin	67ae49dede	intel: move lower_texture to brw Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33138>	2025-03-29 02:15:18 +00:00
Lionel Landwerlin	86773b2ba6	brw: don't lower tg4 offsets without LOD The problem this fixes is currently hidden because of the order in which we run nir_lower_tex & intel_nir_lower_texture. The issue is that nir_lower_tex removes the LOD source in some cases and the second run of nir_lower_tex can add it back. This is also only needed on Gfx12.5+ if the LOD is present. Finally move all of the texture lowering to the postprocess phase. No need to run this multiple times. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33138>	2025-03-29 02:15:18 +00:00
Sagar Ghuge	191d1e7345	intel/compiler: Don't lower 64bit data memory access on LSC Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34189>	2025-03-28 03:07:56 +00:00
Matt Turner	0a63d629fe	intel/compiler: Use unreachable instead of assert(!"...") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34014>	2025-03-13 20:11:10 +00:00
Lionel Landwerlin	1835bf3520	brw: avoid calling lower_indirect_derefs multiple times Lowering the indirect derefs multiple times leads to very inefficient shaders because of all the control flow inserted. In particular on some DGC tests with mesh shaders, the tests can spin for 1hour on an i7 and still not complete compilation. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33809>	2025-03-09 20:52:01 +00:00
Sagar Ghuge	1bfe2571f5	intel/compiler: Lower sample index into coord for MSRT messages Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32690>	2025-03-07 23:06:14 +00:00
Sagar Ghuge	bea9d79cb9	intel/compiler: Add support for MSAA typed load/store messages Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32690>	2025-03-07 23:06:14 +00:00
Alyssa Rosenzweig	9a58a8257e	treewide: Switch to nir_progress Via the Coccinelle patch at the end of the commit message, followed by sed -ie 's/progress = progress \| /progress \|=/g' $(git grep -l 'progress = prog') ninja -C ~/mesa/build clang-format cd ~/mesa/src/compiler/nir && clang-format -i *.c agxfmt @@ identifier prog; expression impl, metadata; @@ -if (prog) { -nir_metadata_preserve(impl, metadata); -} else { -nir_metadata_preserve(impl, nir_metadata_all); -} -return prog; +return nir_progress(prog, impl, metadata); @@ expression prog_expr, impl, metadata; @@ -if (prog_expr) { -nir_metadata_preserve(impl, metadata); -return true; -} else { -nir_metadata_preserve(impl, nir_metadata_all); -return false; -} +bool progress = prog_expr; +return nir_progress(progress, impl, metadata); @@ identifier prog; expression impl, metadata; @@ -nir_metadata_preserve(impl, prog ? (metadata) : nir_metadata_all); -return prog; +return nir_progress(prog, impl, metadata); @@ identifier prog; expression impl, metadata; @@ -nir_metadata_preserve(impl, prog ? 
(metadata) : nir_metadata_all); +nir_progress(prog, impl, metadata); @@ expression impl, metadata; @@ -nir_metadata_preserve(impl, metadata); -return true; +return nir_progress(true, impl, metadata); @@ expression impl; @@ -nir_metadata_preserve(impl, nir_metadata_all); -return false; +return nir_no_progress(impl); @@ identifier other_prog, prog; expression impl, metadata; @@ -if (prog) { -nir_metadata_preserve(impl, metadata); -} else { -nir_metadata_preserve(impl, nir_metadata_all); -} -other_prog \|= prog; +other_prog = other_prog \| nir_progress(prog, impl, metadata); @@ identifier prog; expression impl, metadata; @@ -if (prog) { -nir_metadata_preserve(impl, metadata); -} else { -nir_metadata_preserve(impl, nir_metadata_all); -} +nir_progress(prog, impl, metadata); @@ identifier other_prog, prog; expression impl, metadata; @@ -if (prog) { -nir_metadata_preserve(impl, metadata); -other_prog = true; -} else { -nir_metadata_preserve(impl, nir_metadata_all); -} +other_prog = other_prog \| nir_progress(prog, impl, metadata); @@ expression prog_expr, impl, metadata; identifier prog; @@ -if (prog_expr) { -nir_metadata_preserve(impl, metadata); -prog = true; -} else { -nir_metadata_preserve(impl, nir_metadata_all); -} +bool impl_progress = prog_expr; +prog = prog \| nir_progress(impl_progress, impl, metadata); @@ identifier other_prog, prog; expression impl, metadata; @@ -if (prog) { -other_prog = true; -nir_metadata_preserve(impl, metadata); -} else { -nir_metadata_preserve(impl, nir_metadata_all); -} +other_prog = other_prog \| nir_progress(prog, impl, metadata); @@ expression prog_expr, impl, metadata; identifier prog; @@ -if (prog_expr) { -prog = true; -nir_metadata_preserve(impl, metadata); -} else { -nir_metadata_preserve(impl, nir_metadata_all); -} +bool impl_progress = prog_expr; +prog = prog \| nir_progress(impl_progress, impl, metadata); @@ expression prog_expr, impl, metadata; @@ -if (prog_expr) { -nir_metadata_preserve(impl, metadata); -} else { 
-nir_metadata_preserve(impl, nir_metadata_all); -} +bool impl_progress = prog_expr; +nir_progress(impl_progress, impl, metadata); @@ identifier prog; expression impl, metadata; @@ -nir_metadata_preserve(impl, metadata); -prog = true; +prog = nir_progress(true, impl, metadata); @@ identifier prog; expression impl, metadata; @@ -if (prog) { -nir_metadata_preserve(impl, metadata); -} -return prog; +return nir_progress(prog, impl, metadata); @@ identifier prog; expression impl, metadata; @@ -if (prog) { -nir_metadata_preserve(impl, metadata); -} +nir_progress(prog, impl, metadata); @@ expression impl; @@ -nir_metadata_preserve(impl, nir_metadata_all); +nir_no_progress(impl); @@ expression impl, metadata; @@ -nir_metadata_preserve(impl, metadata); +nir_progress(true, impl, metadata); squashme! sed -ie 's/progress = progress \| /progress \|=/g' $(git grep -l 'progress = prog') Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33722>	2025-02-26 15:19:53 +00:00
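The first Coccinelle rule in 9a58a8257e states the contract that every other rule reduces to: when a pass made progress, keep only the metadata it says it preserved; otherwise keep everything; then return the progress flag. The following is a toy Python model of that contract, not Mesa's C implementation. The `Metadata` flags, the `Impl` class and the `*_model` function names are hypothetical stand-ins.

```python
from enum import Flag, auto


class Metadata(Flag):
    """Hypothetical stand-in for a few metadata kinds."""
    NONE = 0
    BLOCK_INDEX = auto()
    DOMINANCE = auto()
    LIVE_DEFS = auto()
    ALL = BLOCK_INDEX | DOMINANCE | LIVE_DEFS


class Impl:
    """Hypothetical function implementation tracking which metadata is valid."""

    def __init__(self):
        self.valid_metadata = Metadata.ALL


def metadata_preserve_model(impl, preserved):
    # Anything not listed as preserved is considered invalidated.
    impl.valid_metadata &= preserved


def progress_model(made_progress, impl, preserved):
    # Mirrors the replaced pattern:
    #   if (prog) preserve(impl, metadata); else preserve(impl, all);
    #   return prog;
    metadata_preserve_model(impl, preserved if made_progress else Metadata.ALL)
    return made_progress


def no_progress_model(impl):
    # Nothing changed, so every piece of metadata stays valid.
    return progress_model(False, impl, Metadata.NONE)


impl = Impl()
print(no_progress_model(impl), impl.valid_metadata)
print(progress_model(True, impl, Metadata.BLOCK_INDEX), impl.valid_metadata)
```

Folding the branch into one helper is what lets the semantic patch collapse each if/else block in the old code into a single return statement.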
Georg Lehmann	f26069fdd9	nir: replace nir_opt_conditional_discard with nir_opt_peephole_select Foz-DB Navi21: Totals from 118 (0.15% of 79377) affected shaders: Instrs: 208001 -> 207355 (-0.31%); split: -0.33%, +0.01% CodeSize: 1080428 -> 1078432 (-0.18%); split: -0.20%, +0.02% SpillSGPRs: 202 -> 211 (+4.46%) Latency: 1923508 -> 1919093 (-0.23%); split: -0.62%, +0.39% InvThroughput: 407475 -> 407081 (-0.10%); split: -0.12%, +0.02% SClause: 7050 -> 7033 (-0.24%); split: -0.31%, +0.07% Copies: 12156 -> 11821 (-2.76%); split: -3.04%, +0.28% PreSGPRs: 8198 -> 8331 (+1.62%); split: -0.02%, +1.65% PreVGPRs: 7628 -> 7528 (-1.31%) VALU: 155747 -> 155657 (-0.06%); split: -0.06%, +0.00% SALU: 18295 -> 17782 (-2.80%); split: -2.98%, +0.18% SMEM: 10521 -> 10519 (-0.02%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33590>	2025-02-20 21:59:17 +00:00
Georg Lehmann	ca8147edbe	nir/peephole_select: add options struct Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33590>	2025-02-20 21:59:16 +00:00
Lionel Landwerlin	f19c5f4fcc	brw: use meaningful io locations for system values Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32418>	2025-02-13 14:36:15 +00:00
Lionel Landwerlin	bae9344baf	brw: port vs input to lower_64bit_to_32_new Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32418>	2025-02-13 14:36:15 +00:00
Daniel Schürmann	175c06e5cd	intel: switch to nir_metadata_divergence Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30814>	2025-02-13 10:08:43 +00:00
Francisco Jerez	80b2355b39	intel/brw: Allow specifying a required subgroup size for fragment shaders. On older hardware the "use_rep_send" compile parameter was being implicitly used to request the compilation of the SIMD16 variant of clear pixel shaders that require it due to hardware restrictions. However starting on Gfx12+ this flag is never set since replicated data clears are no longer supported, but BLORP still implicitly relies on the SIMD16 variant being generated even though there's no way for BLORP to explicitly request it. This doesn't cause much of a problem right now since brw_compile_fs() typically generates a SIMD16 kernel unless the SIMD8 kernel spills or SIMD debugging flags are enabled, but it won't work reliably on Xe3+ since we'll start using SIMD32 more aggressively. In order to avoid these issues use the standard required subgroup_size parameter from shader_info to signal that the SIMD16 variant of the shader is needed by the caller. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32664>	2025-01-29 23:39:32 +00:00
Daniel Schürmann	f3be7ce01b	nir/from_ssa: only consider divergence if requested This pass used to unconditionally use divergence information which forced the caller to either call divergence_analysis or ensure that the divergence is properly reset. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33009>	2025-01-23 01:31:23 +00:00
Kenneth Graunke	21636ff9fa	brw: Align and combine constant-offset UBO loads in NIR The hope here is to replace our backend handling for loading whole cachelines at a time from UBOs into NIR-based handling, which plays nicely with the NIR load/store vectorizer. Rounding down offsets to multiples of 64B allows us to globally CSE UBO loads across basic blocks. This is really useful. However, blindly rounding down the offset to a multiple of 64B can trigger anti-patterns where...a single unaligned memory load could have hit all the necessary data, but rounding it down split it into two loads. By moving this to NIR, we gain more control of the interplay between nir_opt_load_store_vectorize and this rebasing and CSE'ing. The backend can then simply load between nir_def_{first,last}_component_read() and trust that our NIR has the loads blockified appropriately. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32888>	2025-01-10 22:44:09 +00:00
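The anti-pattern described in the commit above (rounding an offset down to a 64B multiple can split one unaligned load into two) can be illustrated with a small sketch. The helper names `round_down_to_block` and `blocks_touched` are hypothetical and are not Mesa functions; the 64-byte block size comes from the commit message, and `size` is assumed to be non-zero.

```c
#include <stdint.h>

/* Block size taken from the commit message ("multiples of 64B"). */
#define BLOCK_SIZE 64u

/* Hypothetical helper: round a byte offset down to the start of its block. */
static uint32_t round_down_to_block(uint32_t offset)
{
    return offset & ~(BLOCK_SIZE - 1u);
}

/* Hypothetical helper: number of 64B blocks covered by the byte range
 * [offset, offset + size). Assumes size > 0. */
static uint32_t blocks_touched(uint32_t offset, uint32_t size)
{
    uint32_t first = round_down_to_block(offset);
    uint32_t last = round_down_to_block(offset + size - 1u);
    return (last - first) / BLOCK_SIZE + 1u;
}
```

For instance, a 16-byte load at offset 56 covers bytes 56 to 71 and therefore straddles two blocks once rebased, while the same load at offset 64 stays within one.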
Kenneth Graunke	35f175301d	brw: Fix vectorizer hole_size condition after signedness change Marek recently changed hole_size to be signed, rather than unsigned. A negative hole_size means that the two loads overlap - and thus are prime candidates to be combined. My original hole_size handling was: if hole_size > 4 * (8 - low->num_components) then don't vectorize For non-overlapping loads, this worked: NIR's largest vector is vec16, and if low was already a vec16, combining it with anything would exceed that, so it'd never be considered. That meant low would always be a vec8 or less, so (8 - low->num_components) was a positive number. Now that we see overlapping loads, we can see a vec16 low, vec4 high, and also a negative hole size, giving us fun comparisons like: -16 > 4 * (8 - 16) => -16 > -32 => true, don't vectorize Which is absolutely the wrong thing to do, because the high load's data is entirely included within the former load's data. The idea here was to make sure the second load would be able to pack at least one component into the first's V8 result. But even this isn't the best, because...even if it's simply adjacent, doing one V16 load is more efficient than requesting two back to back V8 loads. So, we just simplify down to a static check: if there's an entire V8 of hole, don't vectorize. This already won't happen because the core pass has max_hole set to 28 bytes (7 32-bit components), but that could change based on the needs of other drivers, so let's be defensive. 
fossil-db results on Alchemist: Instrs: 161533978 -> 161295137 (-0.15%); split: -0.20%, +0.05% Subgroup size: 8092544 -> 8092568 (+0.00%) Send messages: 7915233 -> 7844503 (-0.89%); split: -0.94%, +0.05% Cycle count: 16577700697 -> 16702609256 (+0.75%); split: -0.59%, +1.35% Spill count: 72338 -> 67226 (-7.07%); split: -7.36%, +0.29% Fill count: 134058 -> 125980 (-6.03%); split: -6.83%, +0.80% Scratch Memory Size: 4092928 -> 3786752 (-7.48%); split: -7.53%, +0.05% Max live registers: 33031460 -> 32945994 (-0.26%); split: -0.27%, +0.01% Max dispatch width: 5778384 -> 5778536 (+0.00%); split: +0.26%, -0.26% Non SSA regs after NIR: 179809505 -> 152735471 (-15.06%); split: -15.08%, +0.03% Fixes: `c21bc65ba7` ("nir/opt_load_store_vectorize: make hole_size signed to indicate overlapping loads") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32932>	2025-01-08 00:19:54 +00:00
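The comparison walked through in the commit above can be reproduced with a minimal sketch. The function names `old_rejects` and `new_rejects` are hypothetical, not Mesa's. The first mirrors the original condition quoted in the message. The second is one reading of "an entire V8 of hole", assuming 32-bit components so that a V8 is 32 bytes; the exact comparison used in the driver is not shown in the message.

```c
#include <stdbool.h>

/* Original condition as quoted in the commit message.
 * Returns true when the two loads should NOT be vectorized. */
static bool old_rejects(int hole_size, int low_num_components)
{
    return hole_size > 4 * (8 - low_num_components);
}

/* Simplified static check (assumption: 32-bit components, so an entire
 * V8 of hole is 8 * 4 = 32 bytes). Returns true when the loads should
 * NOT be vectorized. */
static bool new_rejects(int hole_size)
{
    return hole_size >= 8 * 4;
}
```

With a vec16 low load and an overlap of 16 bytes (`hole_size` of -16), the old condition evaluates -16 > -32 and rejects the pair, while the static check accepts it. A 28-byte hole, the core pass's stated `max_hole`, is also accepted under this reading.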
Marek Olšák	c21bc65ba7	nir/opt_load_store_vectorize: make hole_size signed to indicate overlapping loads A negative hole size means the loads overlap. This will be used by drivers to handle overlapping loads in the callback easily. Reviewed-by: Mel Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32699>	2025-01-01 00:03:55 +00:00

445 commits