fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-23 22:00:13 +01:00

Author	SHA1	Message	Date
Marek Olšák	439d805291	nir: rename nir_lower_io_to_scalar_early -> nir_lower_io_vars_to_scalar Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>	2025-06-26 18:20:49 +00:00
Ian Romanick	b83f618fb2	brw: Fully write temporary destinations Consider an innocuous instruction like: and(1) v250:UD, g0.3<0,1,0>:UD, 4294967264u NoMask group0 If register allocation decides to spill v250, it will see this instruction and say, "Oh no! The other components of v250 aren't set, so I'd better add a fill before that instruction!" But it gets even worse than that... if register coalesce decided to merge two of these, the live range gets massively extended because the writes don't fully initialize the value. This causes the need to spill these registers in the first place. Changing that instruction to SIMD16 on Xe2 or SIMD8 on other platforms alleviates these issues. shader-db: Lunar Lake total instructions in shared programs: 17118324 -> 17113191 (-0.03%) instructions in affected programs: 93701 -> 88568 (-5.48%) helped: 42 / HURT: 6 total cycles in shared programs: 895422566 -> 895079488 (-0.04%) cycles in affected programs: 30111338 -> 29768260 (-1.14%) helped: 35 / HURT: 40 total spills in shared programs: 3588 -> 3304 (-7.92%) spills in affected programs: 285 -> 1 (-99.65%) helped: 10 / HURT: 0 total fills in shared programs: 2218 -> 1663 (-25.02%) fills in affected programs: 556 -> 1 (-99.82%) helped: 10 / HURT: 0 Meteor Lake, DG2, Tiger Lake, and Ice Lake had similar results. 
(Meteor Lake shown) total instructions in shared programs: 20059218 -> 20053563 (-0.03%) instructions in affected programs: 96938 -> 91283 (-5.83%) helped: 43 / HURT: 6 total cycles in shared programs: 884174588 -> 883536475 (-0.07%) cycles in affected programs: 22105268 -> 21467155 (-2.89%) helped: 35 / HURT: 27 total spills in shared programs: 5032 -> 4679 (-7.02%) spills in affected programs: 355 -> 2 (-99.44%) helped: 12 / HURT: 0 total fills in shared programs: 4782 -> 4113 (-13.99%) fills in affected programs: 671 -> 2 (-99.70%) helped: 12 / HURT: 0 Skylake total instructions in shared programs: 19097658 -> 19097665 (<.01%) instructions in affected programs: 14202 -> 14209 (0.05%) helped: 0 / HURT: 5 total cycles in shared programs: 862058109 -> 862058267 (<.01%) cycles in affected programs: 3450244 -> 3450402 (<.01%) helped: 7 / HURT: 11 fossil-db: Lunar Lake Totals: Cycle count: 31439652246 -> 31439652272 (+0.00%) Totals from 2 (0.00% of 707091) affected shaders: Cycle count: 2602 -> 2628 (+1.00%) No other Intel platforms had any fossil-db changes. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35721>	2025-06-26 17:59:47 +00:00
Eric Engestrom	99e8d804bf	intel/compiler tests: fix variable type for getopt_long() return value `getopt_long()` returns an `int`, not a `char`; putting the value in a `char` before comparing it to `-1` was making the comparison always fail, resulting in the invalid codepath taken that then fails with: option `-' is invalid: ignored cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34756>	2025-06-23 08:26:29 +00:00
Eric Engestrom	f545f9eed4	intel/compiler tests: fix "is there something after the options" check cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34756>	2025-06-23 08:26:29 +00:00
Eric Engestrom	729922cdae	intel/compiler tests: fix path-to-string conversion cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34756>	2025-06-23 08:26:29 +00:00
Eric Engestrom	de6ab1beda	intel/compiler tests: rewrite subprocess handling in run-test.py `subprocess.Popen()` returns immediately, and the subprocess might not have finished by the time `stdout` is read on the next line, spuriously failing the tests. `subprocess.check_output()` makes sure the output is available before returning, solving this issue; it additionally raises an error if the subprocess failed, giving a better error than a failed diff later in the script. cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34756>	2025-06-23 08:26:29 +00:00
Georg Lehmann	9da23499ff	compiler: add float8 glsl types e4m3fn: 8bit floating point format with 4bit exponent, 3bit mantissa and no infinities (finite only) e5m2: 8bit floating point format with 5bit exponent, 2bit mantissa and with infinities. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>	2025-06-23 07:59:24 +00:00
Rohan Garg	e103afe7be	brw: run the nir_opt_offsets pass and set the maximum offset size Perf A/B testing on DG2: no changes Perf A/B testing on BMG: +2.1% Blackops3, +1.5% Cyberpunk DG2 stats (mostly insignificant): Assassins Creed Valhalla: Totals from 1169 (55.67% of 2100) affected shaders: Instrs: 509237 -> 509215 (-0.00%) Cycle count: 30614325 -> 30607419 (-0.02%); split: -0.03%, +0.00% Non SSA regs after NIR: 83434 -> 85909 (+2.97%) Blackops 3: Totals from 1045 (64.63% of 1617) affected shaders: Instrs: 527312 -> 527310 (-0.00%) Cycle count: 496912222 -> 496902846 (-0.00%); split: -0.00%, +0.00% Non SSA regs after NIR: 106883 -> 109095 (+2.07%) Cyberpunk: Totals from 706 (56.03% of 1260) affected shaders: Instrs: 345976 -> 345974 (-0.00%); split: -0.00%, +0.00% Cycle count: 9775138 -> 9775472 (+0.00%); split: -0.00%, +0.00% Max live registers: 40295 -> 40297 (+0.00%) Non SSA regs after NIR: 93245 -> 94718 (+1.58%) Fortnite: Totals from 4210 (55.98% of 7521) affected shaders: Instrs: 2205471 -> 2205469 (-0.00%) Cycle count: 91451040 -> 91450956 (-0.00%); split: -0.00%, +0.00% Non SSA regs after NIR: 952354 -> 961664 (+0.98%) LNL stats (notable changes): Assassins Creed Valhalla: Totals from 1684 (83.57% of 2015) affected shaders: Instrs: 774305 -> 764501 (-1.27%); split: -1.27%, +0.01% Cycle count: 58845842 -> 58699250 (-0.25%); split: -0.98%, +0.73% Spill count: 625 -> 638 (+2.08%) Fill count: 1490 -> 1503 (+0.87%) Scratch Memory Size: 41984 -> 44032 (+4.88%) Max live registers: 196424 -> 197561 (+0.58%); split: -0.10%, +0.68% Blackops 3: Totals from 1125 (76.53% of 1470) affected shaders: Instrs: 781749 -> 773275 (-1.08%); split: -1.08%, +0.00% Subgroup size: 22896 -> 22912 (+0.07%) Cycle count: 659864454 -> 654641032 (-0.79%); split: -1.10%, +0.31% Max live registers: 116772 -> 116854 (+0.07%); split: -0.01%, +0.08% Non SSA 
regs after NIR: 172648 -> 168260 (-2.54%); split: -2.55%, +0.01% Control: Totals from 378 (51.50% of 734) affected shaders: Instrs: 148184 -> 147544 (-0.43%) Cycle count: 6905200 -> 6913366 (+0.12%); split: -0.30%, +0.42% Max live registers: 41271 -> 41281 (+0.02%) Non SSA regs after NIR: 44964 -> 43868 (-2.44%); split: -2.45%, +0.01% Cyberpunk: Totals from 1141 (92.46% of 1234) affected shaders: Instrs: 636744 -> 629333 (-1.16%) Subgroup size: 24256 -> 24272 (+0.07%) Cycle count: 24952258 -> 24801298 (-0.60%); split: -1.39%, +0.78% Max live registers: 125848 -> 126855 (+0.80%); split: -0.00%, +0.80% Non SSA regs after NIR: 127399 -> 119837 (-5.94%); split: -5.95%, +0.02% Fortnite: Totals from 5497 (83.52% of 6582) affected shaders: Instrs: 4072831 -> 4041852 (-0.76%); split: -0.77%, +0.01% Subgroup size: 103296 -> 103312 (+0.02%) Cycle count: 133046874 -> 132789242 (-0.19%); split: -0.67%, +0.48% Spill count: 7218 -> 7254 (+0.50%); split: -0.33%, +0.83% Fill count: 11724 -> 11749 (+0.21%); split: -0.34%, +0.55% Scratch Memory Size: 591872 -> 599040 (+1.21%) Max live registers: 816530 -> 818522 (+0.24%); split: -0.01%, +0.26% Non SSA regs after NIR: 1610296 -> 1560284 (-3.11%); split: -3.11%, +0.00% Hitman3: Totals from 4713 (92.39% of 5101) affected shaders: Instrs: 2731598 -> 2698224 (-1.22%) Cycle count: 186422098 -> 185472640 (-0.51%); split: -1.12%, +0.61% Spill count: 3244 -> 3242 (-0.06%) Fill count: 9937 -> 9933 (-0.04%) Max live registers: 585035 -> 589801 (+0.81%); split: -0.00%, +0.82% Non SSA regs after NIR: 347681 -> 324314 (-6.72%); split: -6.73%, +0.01% Hogwarts Legacy: Totals from 930 (59.81% of 1555) affected shaders: Instrs: 464146 -> 459526 (-1.00%); split: -1.00%, +0.01% Subgroup size: 19104 -> 19120 (+0.08%) Cycle count: 24062460 -> 24078964 (+0.07%); split: -0.49%, +0.56% Spill count: 2068 -> 1964 (-5.03%); split: -5.22%, +0.19% Fill count: 2342 -> 2205 (-5.85%); split: -6.40%, +0.56% Scratch Memory Size: 147456 -> 141312 (-4.17%) Max live 
registers: 112384 -> 112787 (+0.36%); split: -0.08%, +0.44% Non SSA regs after NIR: 80293 -> 79161 (-1.41%); split: -1.72%, +0.32% Metro Exodus: Totals from 29755 (78.62% of 37846) affected shaders: Instrs: 11495578 -> 11492951 (-0.02%); split: -0.02%, +0.00% Subgroup size: 644688 -> 644704 (+0.00%) Cycle count: 301572068 -> 301548054 (-0.01%); split: -0.03%, +0.02% Max live registers: 3369504 -> 3370454 (+0.03%); split: -0.00%, +0.03% Non SSA regs after NIR: 2476561 -> 2396090 (-3.25%); split: -3.27%, +0.02% Red Dead Redemption 2: Totals from 4161 (78.61% of 5293) affected shaders: Instrs: 2428782 -> 2409032 (-0.81%); split: -0.82%, +0.00% Subgroup size: 85344 -> 85360 (+0.02%) Cycle count: 8514984142 -> 8533415324 (+0.22%); split: -0.02%, +0.23% Spill count: 4659 -> 4674 (+0.32%); split: -0.02%, +0.34% Fill count: 11236 -> 11231 (-0.04%); split: -0.19%, +0.14% Scratch Memory Size: 398336 -> 397312 (-0.26%) Max live registers: 473946 -> 475798 (+0.39%); split: -0.08%, +0.47% Non SSA regs after NIR: 616820 -> 567706 (-7.96%); split: -8.09%, +0.12% Rise Of The Tomb Raider: Totals from 68 (46.58% of 146) affected shaders: Instrs: 28209 -> 27801 (-1.45%) Subgroup size: 1584 -> 1600 (+1.01%) Cycle count: 16182992 -> 16249364 (+0.41%); split: -0.97%, +1.38% Max live registers: 7320 -> 7296 (-0.33%); split: -0.38%, +0.05% Non SSA regs after NIR: 8438 -> 8207 (-2.74%); split: -2.82%, +0.08% Spiderman Remastered: Totals from 6403 (93.87% of 6821) affected shaders: Instrs: 5662713 -> 5597949 (-1.14%); split: -1.28%, +0.14% Cycle count: 282861519016 -> 279806958122 (-1.08%); split: -1.26%, +0.18% Spill count: 61150 -> 60754 (-0.65%); split: -1.13%, +0.48% Fill count: 162597 -> 163190 (+0.36%); split: -0.84%, +1.21% Scratch Memory Size: 5834752 -> 5804032 (-0.53%); split: -0.70%, +0.18% Max live registers: 901926 -> 903820 (+0.21%); split: -0.01%, +0.22% Non SSA regs after NIR: 555053 -> 521016 (-6.13%); split: -6.14%, +0.01% Signed-off-by: Rohan Garg <rohan.garg@intel.com> 
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>	2025-06-22 10:55:24 +00:00
Rohan Garg	8a5e062e5e	brw: store the buffer offset for load/store intrinsics This will later be encoded by the backend into the LSC extended descriptor message. Reworks: * Sagar: Add nir_intrinsic_ssbo_atomic_swap Signed-off-by: Rohan Garg <rohan.garg@intel.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>	2025-06-22 10:55:24 +00:00
Rohan Garg	0186113640	brw: encode the offset into the message descriptor for Xe2 Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>	2025-06-22 10:55:24 +00:00
Rohan Garg	937d37f0b1	brw: introduce MEMORY_LOGICAL_ADDRESS_OFFSET to encode address offsets Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>	2025-06-22 10:55:24 +00:00
Lionel Landwerlin	d5a58364b1	brw: add new helper for immediate integer register with type Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>	2025-06-22 10:55:24 +00:00
Lionel Landwerlin	16fca611d7	nir: add new intel ssbo intrinsics Similar to ir3 ones, to optimize offsets in the backend. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>	2025-06-22 10:55:23 +00:00
Lionel Landwerlin	ba119c73c6	intel: replace RANGE_BASE by BASE for uniform block loads We're not currently using RANGE_BASE and we'll use BASE for offset optimizations on Xe2+. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>	2025-06-22 10:55:23 +00:00
Lionel Landwerlin	098249ba66	brw: print descriptor & extended descriptors Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>	2025-06-22 10:55:22 +00:00
Emma Anholt	cd981e27f7	intel/elk: Move wpos_w setup right into nir_intrinsic_load_frag_w. Given that the intrinsic will be CSEed at the NIR level, we don't need to preemptively set it up at the top of the shader. No change in HSW shader-db. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>	2025-06-18 23:11:43 +00:00
Emma Anholt	269fbcb144	intel/elk: Use pixel_z for gl_FragCoord.z on pre-gen6. Unless I've seriously missed something, we have the Z in the payload (which we can always request if we need access to it and it's not already passed to us due to other WM IZ settings). total instructions in shared programs: 4408303 -> 4408186 (<.01%) instructions in affected programs: 1164 -> 1047 (-10.05%) total cycles in shared programs: 142485036 -> 142484566 (<.01%) cycles in affected programs: 26820 -> 26350 (-1.75%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>	2025-06-18 23:11:43 +00:00
Emma Anholt	dc55b47a58	intel/elk: Move pre-gen6 smooth interpolation 1/w multiply to NIR. NIR catches that if you're just doing something like adding two smooth inputs, we can do the multiply once on the result instead of on each input. BRW shader-db results: total instructions in shared programs: 4409146 -> 4408303 (-0.02%) instructions in affected programs: 800761 -> 799918 (-0.11%) total cycles in shared programs: 143203198 -> 142485036 (-0.50%) cycles in affected programs: 79081682 -> 78363520 (-0.91%) total sends in shared programs: 363044 -> 363042 (<.01%) sends in affected programs: 33 -> 31 (-6.06%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>	2025-06-18 23:11:42 +00:00
Emma Anholt	fb9b2261a1	intel/elk: Move pre-gen6 gl_FragCoord.w -> interpolation lowering to NIR. BRW shader-db: total instructions in shared programs: 4409143 -> 4409146 (<.01%) instructions in affected programs: 330 -> 333 (0.91%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>	2025-06-18 23:11:41 +00:00
Emma Anholt	17ab39fbf8	intel/elk: Fix some tabs in gen4 URB setup. This formatted terribly in my editor, just use spaces. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>	2025-06-18 23:11:40 +00:00
Emma Anholt	9d7a016ed1	intel/elk: Retire the global float pixel_x/y values. Nothing used them any more. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>	2025-06-18 23:11:40 +00:00
Emma Anholt	e1bf014b6e	intel/elk: Reduce this->pixel_x/y usage in gfx4 interp setup. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>	2025-06-18 23:11:40 +00:00
Emma Anholt	241bc5da70	intel/elk: Use the pixel_coord UW x/y values for noncoherent FB reads. No need to force generating the float cast just to turn it back to an int. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>	2025-06-18 23:11:39 +00:00
Emma Anholt	1134cdc198	intel/elk: Lower load_frag_coord to load_{pixel_coord,frag_coord_z/w} in NIR. This moves some conversions to NIR that may get eliminated, and also distinguishes gl_FragCoord.z/w loads at the shader info level so we don't need to flag uses_src_depth/uses_src_w when only gl_FragCoord.xy get used (as is typical). This reduces thread payload setup on many shaders. Also, interestingly, blorp shaders stop reserving space for z/w despite not putting them in the payload (since PS_EXTRA isn't filled out for z/w). HSW shader-db is noise: total instructions in shared programs: 9942649 -> 9942997 (<.01%) instructions in affected programs: 143167 -> 143515 (0.24%) total cycles in shared programs: 314768862 -> 314299112 (-0.15%) cycles in affected programs: 62951452 -> 62481702 (-0.75%) LOST: 44 GAINED: 26 Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>	2025-06-18 23:11:39 +00:00
Emma Anholt	88f1656133	intel/elk: Save the UW pixel x/y as a temp. This will be used for representing gl_FragCoord in NIR and reducing payload registers pushed. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>	2025-06-18 23:11:38 +00:00
Emma Anholt	5222c35924	intel/elk: Save the UW pixel x/y as a temp on gfx6+. This will be used for representing gl_FragCoord in NIR and reducing payload registers pushed. HSW results: total instructions in shared programs: 9940636 -> 9948574 (0.08%) instructions in affected programs: 852560 -> 860498 (0.93%) total cycles in shared programs: 314804525 -> 314900080 (0.03%) cycles in affected programs: 39786599 -> 39882154 (0.24%) LOST: 5 GAINED: 11 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>	2025-06-18 23:11:38 +00:00
Emma Anholt	af74abd68c	intel/fs: Don't bother checking if load_frag_coord uses interpolation. This was leftover dead code from `4bb6e6817e` ("intel: Use a system value for gl_FragCoord") -- the sysval doesn't do any interpolation and doesn't have sources that could use a barycentric. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>	2025-06-18 23:11:37 +00:00
Emma Anholt	0bf114736a	intel: Use the common NIR lowering for fquantize2f16. This generates one extra instruction to set the rounding mode to RTE due to f2f16_rtne in the lowering. This changes the result for fquantize2f16(65505.0) from 65536 to 65504, which fixes SPIR-V conformance for this value: If Value is positive with a magnitude too large to represent as a 16-bit floating-point value, the result is positive infinity. If Value is negative with a magnitude too large to represent as a 16-bit floating-point value, the result is negative infinity. SPIR-V doesn't specify whether this overflow check is before or after rounding, but IEEE specifies rounding first, which is what produces our 65504. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25552>	2025-06-18 22:45:08 +00:00
Lionel Landwerlin	1d8382b88e	brw: enable more lowering for bitfield manipulation at non 32bit sizes Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35381>	2025-06-11 14:09:56 +00:00
Paulo Zanoni	12192f6489	brw: properly decode TGL_PIPE_SCALAR Source: BSpec "Instruction Fields" page (56701), SWSB field. Credits to Caio Oliveira here, since he was helping me while we found this issue together. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35395>	2025-06-09 22:21:13 +00:00
Dave Airlie	870b8717b2	Revert "hasvk/elk: stop turning load_push_constants into load_uniform" This reverts commit `b036d2ded2`. This seems to break gtk4 and other stuff. Cc: mesa-stable (taking ack from Lionel saying we should revert) Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35407>	2025-06-09 09:20:19 +10:00
llyyr	c8bd9ac789	brw: don't unconditionally print message on instance creation This would cause Mesa to print this message even if an Intel GPU is just being enumerated by a Vulkan application. For example, `vulkaninfo --summary`. Fixes: `52f73db5b7` ("brw: implement read without format lowering") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35396>	2025-06-07 13:59:22 +00:00
Caio Oliveira	80fb555718	brw: Fix MAD instruction usage in spilling logic The intention here is to build a SIMD8 value, that will be expanded as needed -- just like the SHL/ADD case, but with a single instruction. Found when the was triggering invalid MAD with SIMD32 (that gets compressed) and with overlapping destination and source and which would cause conflict when divided into two SIMD16. Fixes: `338273dedd` ("brw/reg_allocate: Optimize spill offset calculation using integer MAD") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35302>	2025-06-06 15:31:50 +00:00
Lionel Landwerlin	52f73db5b7	brw: implement read without format lowering Load the format enum and then just go through a series of : if format == R16G16B16A16_UNORM color = lower_r32g32_uint_tor_r16g16b16a16_unorm(color) else if format == R16G16B16A16_SNORM ... For Gfx12.5, there is no in-shader conversion. For Gfx12/11, the in-shader conversion covers the following formats : - ISL_FORMAT_R10G10B10A2_UNORM - ISL_FORMAT_R10G10B10A2_UINT - ISL_FORMAT_R11G11B10_FLOAT For Gfx9, the following formats : - ISL_FORMAT_R16G16B16A16_UNORM - ISL_FORMAT_R16G16B16A16_SNORM - ISL_FORMAT_R10G10B10A2_UNORM - ISL_FORMAT_R10G10B10A2_UINT - ISL_FORMAT_R8G8B8A8_UNORM - ISL_FORMAT_R8G8B8A8_SNORM - ISL_FORMAT_R16G16_UNORM - ISL_FORMAT_R16G16_SNORM - ISL_FORMAT_R11G11B10_FLOAT - ISL_FORMAT_R8G8_UNORM - ISL_FORMAT_R8G8_SNORM - ISL_FORMAT_R16_UNORM - ISL_FORMAT_R16_SNORM - ISL_FORMAT_R8_UNORM - ISL_FORMAT_R8_SNORM Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22524>	2025-06-06 12:28:42 +00:00
Lionel Landwerlin	79498a0849	brw: fix brw_nir_fs_needs_null_rt helper In `9b42215e0d` ("iris: ensure null render target for specific cases") I wrongly assumed that writing gl_SampleMask would only happen in multisampled cases. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `9b42215e0d` ("iris: ensure null render target for specific cases") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13292 Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35313>	2025-06-04 10:10:38 +00:00
Lionel Landwerlin	a51d061c00	brw: don't generate invalid instructions `0e3e5146cf` ("intel/brw: Use correct instruction for value change check when coalescing") enabled some new cases that exposed a pre-existing bug that would turn something like this : mul.sat(16) %789:F, %787:F, %788:F mov.g.f0.0(16) %790:F, %789:F (+f0.0) sel(16) %800:UD, %790:UD, 0u into this : mul.sat(16) %790:F, %787:F, %788:F mov.g.f0.0(16) null:F, null<8,8,1>:F (+f0.0) sel(16) %800:UD, %790:UD, 0u The mov[] array can contain the same instruction because it's repeated for each REG_SIZE writes and a SIMD16 instruction will write 2 REG_SIZE. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `0e3e5146cf` ("intel/brw: Use correct instruction for value change check when coalescing") Cc: mesa-stable Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35276>	2025-06-04 06:08:26 +00:00
Caio Oliveira	2bb9b94c4c	brw/disasm: Don't print src1 information for SEND gather There's always only the ARF scalar register source, so don't bother printing other information that won't be used. Matches the assembler code. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35297>	2025-06-03 22:52:39 +00:00
Sviatoslav Peleshko	0e3e5146cf	intel/brw: Use correct instruction for value change check when coalescing When we have partial VGRF MOVs with offsets, we will reach `channels_remaining == 0` with `inst` that is not writing the whole VGRF. Currently, even though we check `can_coalesce_vars()` for each offset separately, it will always check if the dst value is not changed only for the offset from the instruction that satisfied the `channels_remaining == 0` condition. Instead, we should remember and use the correct instruction for each written offset separately. Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10916 Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35062>	2025-06-01 17:37:10 +00:00
Lionel Landwerlin	f0e18c475b	intel: remove GRL/intel-clc Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35227>	2025-05-29 20:17:13 +00:00
Matt Turner	37016468a5	intel/compiler: Align human-readable send message info This fprintf() was added in commit `cce3bea2a7` ("i965/disasm: Align send instruction meta-information with dst.")) to align the human-readable send message info (e.g. "render MsgDesc: RT write ...") with the destination register on the previous line. Two months later we disabled printing the instruction offset in commit `662f1ccc24` ("i965: Disable hex offset printing in disassembly."), thereby unaligning the human-readable send message info for the next 11 years. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35077>	2025-05-28 21:54:40 +00:00
Caleb Callaway	52db0e1480	intel/compiler: fix SHA generation for shader replace Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35140>	2025-05-27 22:57:19 +00:00
Christian Gmeiner	41f2da1a6e	treewide: Do not use NIR_PASS_V for nir_divergence_analysis(..) Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35131>	2025-05-23 21:19:25 +00:00
Caleb Callaway	e7454f5318	intel/debug: shader dump filter v2: Fixes filtering for various brw shader dump logic Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35061>	2025-05-23 19:57:02 +00:00
Sushma Venkatesh Reddy	6d226ceca1	intel/compiler: Call brw_try_override_assembly independent of debug flag Previously, brw_try_override_assembly was only called when a debug flag was enabled. However, during investigations involving workloads such as Steam games, enabling the debug flag results in excessive NIR and ISA output to stderr, making debugging more difficult. This change ensures that brw_try_override_assembly is called when the INTEL_SHADER_ASM_READ_PATH is set, regardless of the debug flag. This improves usability in scenarios where minimal debug output is desired. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35115>	2025-05-22 21:45:38 +00:00
Lionel Landwerlin	b036d2ded2	hasvk/elk: stop turning load_push_constants into load_uniform Those intrinsics have different semantics in particular with regards to divergence. Turning one into the other without invalidating the divergence information breaks NIR validation. But also the conversion means we get artificially less convergent values in the shaders. So just handle load_push_constants in the backend and stop changing things in Hasvk. Fixes a bunch of tests in dEQP-VK.descriptor_indexing.* dEQP-VK.pipeline..push_constant.graphics_pipeline.dynamic_index_ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34546>	2025-05-22 07:49:20 +00:00
Lionel Landwerlin	df15968813	anv/brw: stop turning load_push_constants into load_uniform Those intrinsics have different semantics in particular with regards to divergence. Turning one into the other without invalidating the divergence information breaks NIR validation. But also the conversion means we get artificially less convergent values in the shaders. So just handle load_push_constants in the backend and stop changing things in Anv. Fixes a bunch of tests in dEQP-VK.descriptor_indexing.* dEQP-VK.pipeline..push_constant.graphics_pipeline.dynamic_index_ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34546>	2025-05-22 07:49:20 +00:00
Sushma Venkatesh Reddy	524733a990	intel/compiler: Centralize type stomping logic for Gen12.5 restrictions This patch improves code readability by centralizing the type stomping logic for Gen12.5 region restrictions in `brw_lower_alu_restrictions`. It removes redundant comments and ensures type consistency assertions in `brw_broadcast`, `generate_mov_indirect`, and `generate_shuffle`. Thank you Ken for guiding me on this. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35006>	2025-05-22 06:46:18 +00:00
Iván Briano	27a2f6d1ff	brw: add lowering passes for FS barycentric inputs Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34445>	2025-05-20 20:57:59 +00:00
Iván Briano	8ee14e5291	brw/anv: add provoking vertex to fs_msaa_flags This will be necessary to select the right value for flat inputs in fragment shaders when fragment shader barycentrics are in use. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34445>	2025-05-20 20:57:58 +00:00
Iván Briano	acdd30a9da	brw: check if the FS needs vertex_attributes_bypass to be set Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34445>	2025-05-20 20:57:58 +00:00
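
Note on de6ab1beda (run-test.py subprocess handling): the following is a minimal Python sketch of the pattern that commit message describes, not the actual contents of run-test.py. The helper name `run_and_capture` and the child commands are hypothetical stand-ins for the real test binary. `subprocess.check_output()` blocks until the child exits, returns its captured stdout, and raises `subprocess.CalledProcessError` on a non-zero exit status.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd to completion and return its stdout as text.

    check_output() waits for the child to exit before returning, and
    raises CalledProcessError if the exit status is non-zero.
    """
    return subprocess.check_output(cmd, text=True)


# Hypothetical stand-in for the real test binary: a child Python process.
ok_cmd = [sys.executable, "-c", "print('hello')"]
output = run_and_capture(ok_cmd)

# A failing child surfaces as an exception here, rather than as a
# confusing diff failure later in the script.
bad_cmd = [sys.executable, "-c", "import sys; sys.exit(3)"]
try:
    run_and_capture(bad_cmd)
    failed_status = None
except subprocess.CalledProcessError as err:
    failed_status = err.returncode
```

Because the call only returns once the child has exited, there is no window in which stdout can be read before the output exists.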

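
Note on 9da23499ff (float8 types): the commit message gives only the bit widths and whether infinities exist. The Python sketch below decodes one byte of each format under additional assumptions taken from the OCP 8-bit floating point convention rather than from this log: an IEEE-style exponent bias (7 for e4m3fn, 15 for e5m2), subnormals at exponent 0, and for e4m3fn a single NaN pattern (all exponent and mantissa bits set). The function names are hypothetical and are not Mesa APIs.

```python
import math


def decode_float8(byte, exp_bits, man_bits, has_inf):
    """Decode an 8-bit float with 1 sign bit, exp_bits and man_bits.

    Assumes bias = 2**(exp_bits - 1) - 1 and IEEE-style subnormals.
    """
    bias = (1 << (exp_bits - 1)) - 1
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    exp_max = (1 << exp_bits) - 1
    man_max = (1 << man_bits) - 1

    if has_inf and exp == exp_max:
        # IEEE-style: top exponent encodes infinity (man == 0) or NaN.
        return sign * math.inf if man == 0 else math.nan
    if not has_inf and exp == exp_max and man == man_max:
        # Finite-only format (assumed): only the all-ones pattern is NaN.
        return math.nan
    if exp == 0:
        # Subnormal: no implicit leading one.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)


def decode_e4m3fn(byte):
    return decode_float8(byte, exp_bits=4, man_bits=3, has_inf=False)


def decode_e5m2(byte):
    return decode_float8(byte, exp_bits=5, man_bits=2, has_inf=True)
```

Under these assumptions e4m3fn reuses the top exponent for ordinary values, so its largest finite value is 448, while e5m2 reserves the top exponent for infinities and NaNs and tops out at 57344.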
4373 commits
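
Note on 0bf114736a (fquantize2f16): the rounding-before-overflow behaviour described in that commit message can be reproduced with a small Python sketch. It uses the `struct` module's `"e"` (IEEE binary16) format, which rounds to nearest even, as a stand-in for the NIR lowering; it is not the Mesa implementation. The function name `quantize_to_f16` is hypothetical, and denormal handling is not modelled. The largest finite binary16 value is 65504 and the next step up would be 65536, so the rounding midpoint is 65520.

```python
import math
import struct


def quantize_to_f16(x):
    """Round a float to the nearest binary16 value and widen it back.

    Rounding happens first; only a value that still does not fit after
    rounding becomes a signed infinity.
    """
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses values that round past the largest finite half.
        return math.copysign(math.inf, x)


# 65505.0 is below the 65520 midpoint, so it rounds down to a finite value.
largest_finite = quantize_to_f16(65505.0)

# 65520.0 is exactly the midpoint; ties-to-even rounds it up and overflows.
overflowed = quantize_to_f16(65520.0)
```

With rounding applied before the overflow check, 65505.0 quantizes to 65504.0, the value the commit message reports.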