fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-25 00:00:11 +01:00

Author	SHA1	Message	Date
Kenneth Graunke	df836ee895	brw: Drop unused defines Nothing uses these. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33297>	2025-02-08 01:07:22 +00:00
Caio Oliveira	650ec7169d	intel/brw: Add SHADER_OPCODE_SEND_GATHER Starting in Xe3, there's a variant of SEND that take the register numbers from the ARF scalar register, and don't require them to be contiguous. The new opcode added here represents that kind of SEND. To make the original sources still reachable, we keep them around during the IR, just ignoring them at generator time. This allow software scoreboard to properly reason the dependencies without trying to decode the contents of ARF scalar register being used. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <None> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32410>	2025-01-30 04:43:58 +00:00
Lionel Landwerlin	8ac7802ac8	brw: move final send lowering up into the IR Because we do emit the final send message form in code generation, a lot of emissions look like this : add(8) vgrf0, u0, 0x100 mov(1) a0.1, vgrf0 # emitted by the generator send(8) ..., a0.1 By moving address register manipulation in the IR, we can get this down to : add(1) a0.1, u0, 0x100 send(8) ..., a0.1 This reduce register pressure around some send messages by 1 vgrf. All lost shaders in the below results are fragment SIMD32, due to the throughput estimator. If turned off, we loose no SIMD32 shaders with this change. DG2 results: Assassin's Creed Valhalla: Totals from 2044 (96.87% of 2110) affected shaders: Instrs: 852879 -> 832044 (-2.44%); split: -2.45%, +0.00% Subgroup size: 23832 -> 23824 (-0.03%) Cycle count: 53345742 -> 52144277 (-2.25%); split: -5.08%, +2.82% Spill count: 729 -> 554 (-24.01%); split: -28.40%, +4.39% Fill count: 2005 -> 1256 (-37.36%) Scratch Memory Size: 25600 -> 19456 (-24.00%); split: -32.00%, +8.00% Max live registers: 116765 -> 115058 (-1.46%) Max dispatch width: 19152 -> 18872 (-1.46%); split: +0.21%, -1.67% Cyberpunk 2077: Totals from 1181 (93.43% of 1264) affected shaders: Instrs: 667192 -> 663615 (-0.54%); split: -0.55%, +0.01% Subgroup size: 13016 -> 13032 (+0.12%) Cycle count: 17383539 -> 17986073 (+3.47%); split: -0.93%, +4.39% Spill count: 12 -> 8 (-33.33%) Fill count: 9 -> 6 (-33.33%) Dota2: Totals from 173 (11.59% of 1493) affected shaders: Cycle count: 274403 -> 280817 (+2.34%); split: -0.01%, +2.34% Max live registers: 5787 -> 5779 (-0.14%) Max dispatch width: 1344 -> 1152 (-14.29%) Hitman3: Totals from 5072 (95.39% of 5317) affected shaders: Instrs: 2879952 -> 2841804 (-1.32%); split: -1.32%, +0.00% Cycle count: 153208505 -> 165860401 (+8.26%); split: -2.22%, +10.48% Spill count: 3942 -> 3200 (-18.82%) Fill count: 10158 -> 8846 (-12.92%) Scratch Memory Size: 257024 -> 223232 (-13.15%) Max live registers: 328467 -> 324631 (-1.17%) Max dispatch width: 43928 -> 42768 (-2.64%); split: +0.09%, -2.73% Fortnite: Totals from 360 (4.82% of 7472) affected shaders: Instrs: 778068 -> 777925 (-0.02%) Subgroup size: 3128 -> 3136 (+0.26%) Cycle count: 38684183 -> 38734579 (+0.13%); split: -0.06%, +0.19% Max live registers: 50689 -> 50658 (-0.06%) Hogwarts Legacy: Totals from 1376 (84.00% of 1638) affected shaders: Instrs: 758810 -> 749727 (-1.20%); split: -1.23%, +0.03% Cycle count: 27778983 -> 28805469 (+3.70%); split: -1.42%, +5.12% Spill count: 2475 -> 2299 (-7.11%); split: -7.47%, +0.36% Fill count: 2677 -> 2445 (-8.67%); split: -9.90%, +1.23% Scratch Memory Size: 99328 -> 89088 (-10.31%) Max live registers: 84969 -> 84671 (-0.35%); split: -0.58%, +0.23% Max dispatch width: 11848 -> 11920 (+0.61%) Metro Exodus: Totals from 92 (0.21% of 43072) affected shaders: Instrs: 262995 -> 262968 (-0.01%) Cycle count: 13818007 -> 13851266 (+0.24%); split: -0.01%, +0.25% Max live registers: 11152 -> 11140 (-0.11%) Red Dead Redemption 2 : Totals from 451 (7.71% of 5847) affected shaders: Instrs: 754178 -> 753811 (-0.05%); split: -0.05%, +0.00% Cycle count: 3484078523 -> 3484111965 (+0.00%); split: -0.00%, +0.00% Max live registers: 42294 -> 42185 (-0.26%) Spiderman Remastered: Totals from 6820 (98.02% of 6958) affected shaders: Instrs: 6921500 -> 6747933 (-2.51%); split: -4.16%, +1.65% Cycle count: 234400692460 -> 236846720707 (+1.04%); split: -0.20%, +1.25% Spill count: 72971 -> 72622 (-0.48%); split: -8.08%, +7.61% Fill count: 212921 -> 198483 (-6.78%); split: -12.37%, +5.58% Scratch Memory Size: 3491840 -> 3410944 (-2.32%); split: -12.05%, +9.74% Max live registers: 493149 -> 487458 (-1.15%) Max dispatch width: 56936 -> 56856 (-0.14%); split: +0.06%, -0.20% Strange Brigade: Totals from 3769 (91.21% of 4132) affected shaders: Instrs: 1354476 -> 1321474 (-2.44%) Cycle count: 25351530 -> 25339190 (-0.05%); split: -1.64%, +1.59% Max live registers: 199057 -> 193656 (-2.71%) Max dispatch width: 30272 -> 30240 (-0.11%) Witcher 3: Totals from 25 (2.40% of 1041) affected shaders: Instrs: 24621 -> 24606 (-0.06%) Cycle count: 2218793 -> 2217503 (-0.06%); split: -0.11%, +0.05% Max live registers: 1963 -> 1955 (-0.41%) LNL results: Assassin's Creed Valhalla: Totals from 1928 (98.02% of 1967) affected shaders: Instrs: 856107 -> 835756 (-2.38%); split: -2.48%, +0.11% Subgroup size: 41264 -> 41280 (+0.04%) Cycle count: 64606590 -> 62371700 (-3.46%); split: -5.57%, +2.11% Spill count: 915 -> 669 (-26.89%); split: -32.79%, +5.90% Fill count: 2414 -> 1617 (-33.02%); split: -36.62%, +3.60% Scratch Memory Size: 62464 -> 44032 (-29.51%); split: -36.07%, +6.56% Max live registers: 205483 -> 202192 (-1.60%) Cyberpunk 2077: Totals from 1177 (96.40% of 1221) affected shaders: Instrs: 682237 -> 678931 (-0.48%); split: -0.51%, +0.03% Subgroup size: 24912 -> 24944 (+0.13%) Cycle count: 24355928 -> 25089292 (+3.01%); split: -0.80%, +3.81% Spill count: 8 -> 3 (-62.50%) Fill count: 6 -> 3 (-50.00%) Max live registers: 126922 -> 125472 (-1.14%) Dota2: Totals from 428 (32.47% of 1318) affected shaders: Instrs: 89355 -> 89740 (+0.43%) Cycle count: 1152412 -> 1152706 (+0.03%); split: -0.52%, +0.55% Max live registers: 32863 -> 32847 (-0.05%) Fortnite: Totals from 5354 (81.72% of 6552) affected shaders: Instrs: 4135059 -> 4239015 (+2.51%); split: -0.01%, +2.53% Cycle count: 132557506 -> 132427302 (-0.10%); split: -0.75%, +0.65% Spill count: 7144 -> 7234 (+1.26%); split: -0.46%, +1.72% Fill count: 12086 -> 12403 (+2.62%); split: -0.73%, +3.35% Scratch Memory Size: 600064 -> 604160 (+0.68%); split: -1.02%, +1.71% Hitman3: Totals from 4912 (97.09% of 5059) affected shaders: Instrs: 2952124 -> 2916824 (-1.20%); split: -1.20%, +0.00% Cycle count: 179985656 -> 189175250 (+5.11%); split: -2.44%, +7.55% Spill count: 3739 -> 3136 (-16.13%) Fill count: 10657 -> 9564 (-10.26%) Scratch Memory Size: 373760 -> 318464 (-14.79%) Max live registers: 597566 -> 589460 (-1.36%) Hogwarts Legacy: Totals from 1471 (96.33% of 1527) affected shaders: Instrs: 748749 -> 766214 (+2.33%); split: -0.71%, +3.05% Cycle count: 33301528 -> 34426308 (+3.38%); split: -1.30%, +4.68% Spill count: 3278 -> 3070 (-6.35%); split: -8.30%, +1.95% Fill count: 4553 -> 4097 (-10.02%); split: -10.85%, +0.83% Scratch Memory Size: 251904 -> 217088 (-13.82%) Max live registers: 168911 -> 168106 (-0.48%); split: -0.59%, +0.12% Metro Exodus: Totals from 18356 (49.81% of 36854) affected shaders: Instrs: 7559386 -> 7621591 (+0.82%); split: -0.01%, +0.83% Cycle count: 195240612 -> 196455186 (+0.62%); split: -1.22%, +1.84% Spill count: 595 -> 546 (-8.24%) Fill count: 1604 -> 1408 (-12.22%) Max live registers: 2086937 -> 2086933 (-0.00%) Red Dead Redemption 2: Totals from 4171 (79.31% of 5259) affected shaders: Instrs: 2619392 -> 2719587 (+3.83%); split: -0.00%, +3.83% Subgroup size: 86416 -> 86432 (+0.02%) Cycle count: 8542836160 -> 8531976886 (-0.13%); split: -0.65%, +0.53% Fill count: 12949 -> 12970 (+0.16%); split: -0.43%, +0.59% Scratch Memory Size: 401408 -> 385024 (-4.08%) Spiderman Remastered: Totals from 6639 (98.94% of 6710) affected shaders: Instrs: 6877980 -> 6800592 (-1.13%); split: -3.11%, +1.98% Cycle count: 282183352210 -> 282100051824 (-0.03%); split: -0.62%, +0.59% Spill count: 63147 -> 64218 (+1.70%); split: -7.12%, +8.82% Fill count: 184931 -> 175591 (-5.05%); split: -10.81%, +5.76% Scratch Memory Size: 5318656 -> 5970944 (+12.26%); split: -5.91%, +18.17% Max live registers: 918240 -> 906604 (-1.27%) Strange Brigade: Totals from 3675 (92.24% of 3984) affected shaders: Instrs: 1462231 -> 1429345 (-2.25%); split: -2.25%, +0.00% Cycle count: 37404050 -> 37345292 (-0.16%); split: -1.25%, +1.09% Max live registers: 361849 -> 351265 (-2.92%) Witcher 3: Totals from 13 (46.43% of 28) affected shaders: Instrs: 593 -> 660 (+11.30%) Cycle count: 28302 -> 28714 (+1.46%) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28199>	2025-01-11 08:41:42 +00:00
Lionel Landwerlin	b110b06447	brw: introduce a new register type for the address register We want to reuse the brw::nr field as a virtual address register identifer. So we can't use brw::file=ARF brw::nr=ADDRESS. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28199>	2025-01-11 08:41:42 +00:00
Kenneth Graunke	7ce66e2b61	brw: Add a new MEMORY_MODE_CONSTANT option This will translate to HDC Constant Cache loads or LSC UGM loads. On LSC, MEMORY_MODE_UNTYPED would be fine, but for HDC we need to distinguish between the regular and constant cache data ports. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32888>	2025-01-10 22:44:09 +00:00
Caio Oliveira	93dfe504f2	intel/brw: Add SHADER_OPCODE_READ_FROM_CHANNEL and LIVE_CHANNEL Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32412>	2024-12-14 11:38:14 -08:00
Caio Oliveira	f8c7348468	intel/brw: Add assembly support for ARF scalar register And the SEND gather variant that uses a scalar register as its only source. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32236>	2024-12-13 02:18:15 +00:00
Caio Oliveira	46e9fe6981	intel/brw: Add TGL_PIPE_SCALAR value Add the enum value for the (in-order) scalar pipe. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32236>	2024-12-13 02:18:15 +00:00
Caio Oliveira	abe41b1d2c	intel/compiler: Use #pragma once instead of header guards Acked-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32534>	2024-12-11 19:47:44 +00:00
Caio Oliveira	8474dc853d	intel/brw: Add SHADER_OPCODE_QUAD_SWAP For the horizontal, vertical and diagonal variants. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31053>	2024-11-22 00:27:01 +00:00
Caio Oliveira	2bd7592b0b	intel/brw: Add SHADER_OPCODE_BALLOT Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31052>	2024-11-21 19:32:59 +00:00
Caio Oliveira	019770f026	intel/brw: Add SHADER_OPCODE_VOTE_* Add opcodes for VOTE_ALL, VOTE_ANY and VOTE_EQUAL. The first two are also used for the quad variants. Move their lowering from NIR conversion to brw_lower_subgroup_ops. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31029>	2024-10-19 02:44:20 +00:00
Caio Oliveira	0ba1159b0a	intel/brw: Add SHADER_OPCODE_*_SCAN Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30496>	2024-10-11 06:40:29 +00:00
Caio Oliveira	9537b62759	intel/brw: Add SHADER_OPCODE_REDUCE Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30496>	2024-10-11 06:40:29 +00:00
Lionel Landwerlin	b45ce7d43e	brw: move null_rt control up a layer We'll want to tune this setting based on other parameters. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Backport-to: 24.2 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31196>	2024-09-23 15:56:02 +00:00
Kenneth Graunke	02482604e5	intel/brw: Delete old-style surface and A64 message opcodes These have now been replaced by the MEMORY_*_LOGICAL opcodes. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Acked-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>	2024-09-12 20:54:36 +00:00
Kenneth Graunke	d5f38be713	intel/brw: Introduce new MEMORY_*_LOGICAL opcodes This is a new unified set of opcodes for memory access loosely patterned after the new LSC-style data port messages introduced on Alchemist GPUs. Rather than creating an opcode for every type of memory access, it has only three opcodes: load, store, and atomic. It has various sources to indicate the rest: - Binding type (raw pointer, pointer to surface state, or BT index) - Address size (A64, A32, A16) - Data size (bit size, number of components) - Opcode (atomic opcode, or LOAD/STORE vs. LOAD_CMASK/STORE_CMASK) - Mode (typed vs. untyped vs. shared-local vs. scratch) - Address (and its dimensionality) - Data (0 for loads, 1 for stores, 2 for atomics) - Whether we want block access Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>	2024-09-12 20:54:36 +00:00
Caio Oliveira	31dfb04fd3	intel/brw: Remove long register file names The long names were originally meant to map to the HW encoding but nowadays the actual encoding values depend on gfx version, whether instruction is 3src, etc. Suggested by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30704>	2024-08-25 22:08:14 +00:00
Caio Oliveira	6bdf2de4d2	intel/brw: Remove unused ARF values and helpers These were used by old Gfx versions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30704>	2024-08-25 22:08:14 +00:00
Caio Oliveira	72b687abb4	intel/brw: Make BAD_FILE the zero value for brw_reg_file Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30704>	2024-08-25 22:08:14 +00:00
Caio Oliveira	e7179232c9	intel/brw: Move encoding of Gfx11 3-src inside the inst helpers Create specific helper for register file encoding and handle it there. Use ad-hoc structs to let the macro take optional named arguments. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30704>	2024-08-25 22:08:14 +00:00
Kenneth Graunke	d22d6d814d	intel/brw: Fix Xe2+ SWSB encoding/decoding for DPAS instructions SBID SET can only be used on SEND, SENDC, or DPAS instructions. The existing code was handling SET for SEND/SENDC, but was using the wrong encoding for DPAS. Add a new case to handle that and make it clear that the existing code is only for SEND/SENDC. While here, rewrite the encoder to use 2-bit binary immediates shifted up into the mode [9:8] field, rather than pre-shifted hex values. This matches the documentation better and is a little easier to follow. On the decode side, we were incorrectly decoding MATH instructions. Because they're marked is_unordered, we were hitting the SEND/SENDC decoding, which is incorrect for MATH. Fixes 22 cooperative matrix tests on Lunar Lake. Huge thanks to Paulo Zanoni for bisecting failures to one of my commits, then analyzing shaders and experimenting to discover that the failure was really an unrelated bug, just being provoked by different choices of registers. His work narrowing the problem down made it much easier to discover and fix this bug. Backport-to: 24.2 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30705>	2024-08-20 19:09:37 +00:00
Kenneth Graunke	89f9a6e10b	intel/brw: Pass opcode to brw_swsb_encode/decode We're going to need to handle encoding/decoding differently for DPAS vs. SEND/SENDC vs. other instructions. Pass the opcode so we can figure out the encodings for each type of instruction. Backport-to: 24.2 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30705>	2024-08-20 19:09:37 +00:00
Kenneth Graunke	84219892ad	intel/brw: Make gl_SubgroupInvocation lane index loading SSA Our code to initialize gl_SubgroupInvocation uses multiple instructions some of which are partial writes. This makes it difficult to analyze expressions involving gl_SubgroupInvocation, which appear very frequently in compute shaders. To make this easier, we add a new virtual opcode which initializes a full VGRF to the value of gl_SubgroupInvocation. (We also expand it to UD for SIMD8 so there are not partial write issues.) We then lower it to the original code later on in compilation, after we've done the bulk of our optimizations. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>	2024-06-18 09:02:25 +00:00
Kenneth Graunke	f04bb49465	intel/brw: Delete SAD2 and SADA2 opcodes These were removed with Icelake. While they technically still exist on Skylake, which this compiler supports, we have never used these opcodes in the 14 years we could have done so. So just scrap them. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29665>	2024-06-10 16:47:50 -07:00
Lionel Landwerlin	d8b78924c5	brw: use a single virtual opcode to read ARF registers In `2c65d90bc8` I forgot to add the new SHADER_OPCODE_READ_MASK_REG opcode to the list of barrier instruction in the scheduler. Let's just use a single opcode for all ARF registers that need special scoreboarding and put the register as source (nicer for the debug output). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `2c65d90bc8` ("intel/brw: ensure find_live_channel don't access arch register without sync") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29446>	2024-05-31 20:22:27 +00:00
Lionel Landwerlin	2c65d90bc8	intel/brw: ensure find_live_channel don't access arch register without sync Another architecture register that requires some care before reading. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `49ee3ae9e8` ("intel/compiler: Lower FIND_[LAST_]LIVE_CHANNEL in IR on Gfx8+") Tested-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29319>	2024-05-24 07:26:17 +00:00
Kenneth Graunke	d5b8cec7a2	intel/brw: Replace FS_OPCODE_LINTERP with BRW_OPCODE_PLN We no longer support the old LINE+MAC lowering, and we already lower this to MAD in NIR on Gfx11+, so the LINTERP virtual opcode always corresponds the PLN. The only catch is that LINTERP's operands are reversed from PLN, so we have to switch them. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28705>	2024-04-16 02:14:49 +00:00
Kenneth Graunke	12b0e03bd2	intel/brw: Use SHADER_OPCODE_SEND for coherent framebuffer reads We already have a logical opcode and lower to what is basically a send instruction. We just weren't using SHADER_OPCODE_SEND, instead having extra redundant infrastructure for no real gain. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28705>	2024-04-16 02:14:49 +00:00
Francisco Jerez	189422de1b	intel/brw/xe2+: Update encoding of FB write descriptor message control. Ref: bspec: 65209, 63908 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Kenneth Graunke	97bf3d3b2d	intel/brw: Replace CS_OPCODE_CS_TERMINATE with SHADER_OPCODE_SEND There's no need for special handling here, it's just a send message with a trivial g0 header and descriptor. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27924>	2024-03-05 11:16:20 +00:00
Kenneth Graunke	ad37622a8f	intel/brw: Delete legacy texture opcodes We first generate the logical opcodes, and these days fully lower to SHADER_OPCODE_SEND. In the past, we lowered to a non-logical variant and handled that in the generator. These days, we were just using the non-logical opcodes as an awkward intermediate opcode change during the lowering...which isn't really necessary at all. This patch eliminates them by using the original logical opcodes. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27908>	2024-03-01 22:19:51 +00:00
Kenneth Graunke	45a5e4c0c4	intel/brw: Delete SHADER_OPCODE_TXF_UMS Nothing seems to generate this anymore. I guess we always use CMS. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27908>	2024-03-01 22:19:51 +00:00
Kenneth Graunke	601ef12467	intel/brw: Delete SHADER_OPCODE_TXF_CMS[_LOGICAL] We always use the wide variant (_W) on hardware this compiler supports. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27908>	2024-03-01 22:19:50 +00:00
Rohan Garg	73d98848fa	intel/compiler: Xe2+ can do URB load/store with a byte offset Thanks to Ken for suggesting this URB refactoring change and pointing out that the LSC can operate on the byte offset granularity. This should fix the geometry shader test cases where we have more than 32 vertices since previously we were failing to write the correct control data bits because of incorrect write mask. Shader-db results for Xe2: total instructions in shared programs: 153475 -> 153437 (-0.02%) instructions in affected programs: 1374 -> 1336 (-2.77%) helped: 11 HURT: 0 helped stats (abs) min: 3 max: 5 x̄: 3.45 x̃: 3 helped stats (rel) min: 1.67% max: 4.92% x̄: 3.23% x̃: 2.70% 95% mean confidence interval for instructions value: -3.92 -2.99 95% mean confidence interval for instructions %-change: -4.10% -2.36% Instructions are helped. total loops in shared programs: 140 -> 140 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 16002649 -> 16002329 (<.01%) cycles in affected programs: 9174 -> 8854 (-3.49%) helped: 11 HURT: 0 helped stats (abs) min: 22 max: 38 x̄: 29.09 x̃: 32 helped stats (rel) min: 2.62% max: 5.54% x̄: 3.78% x̃: 3.85% 95% mean confidence interval for cycles value: -33.56 -24.62 95% mean confidence interval for cycles %-change: -4.48% -3.08% Cycles are helped. total spills in shared programs: 52 -> 52 (0.00%) spills in affected programs: 0 -> 0 helped: 0 HURT: 0 total fills in shared programs: 94 -> 94 (0.00%) fills in affected programs: 0 -> 0 helped: 0 HURT: 0 total sends in shared programs: 4240 -> 4240 (0.00%) sends in affected programs: 0 -> 0 helped: 0 HURT: 0 LOST: 0 GAINED: 0 Rework: (Sagar) - Adjust offset/indirect offset calculation. - Add shader-db results - Always calculate dword index - Drop changes for indirect writes Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27602>	2024-03-01 16:11:30 +00:00
Kenneth Graunke	eebd24680c	intel/brw: Delete SINCOS Only existed on Gfx4-5. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27872>	2024-02-29 18:00:14 +00:00
Kenneth Graunke	a18030305c	intel/brw: Delete SIMD4x2 URB opcodes Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27872>	2024-02-29 18:00:14 +00:00
Kenneth Graunke	288b966e3e	intel/brw: Delete legacy SFIDs This involves a little rework in the assembler to treat "math" as an opcode token rather than an SFID. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27872>	2024-02-29 18:00:14 +00:00
Kenneth Graunke	afae5e78ca	intel/brw: Delete more unused defines Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27872>	2024-02-29 18:00:14 +00:00
Kenneth Graunke	3202f3fdbe	intel/brw: Delete enum brw_urb_write_flags This was used by the vec4 backend. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27872>	2024-02-29 18:00:14 +00:00
Caio Oliveira	8f3c52c1da	intel/brw: Remove MRF type Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>	2024-02-28 05:45:39 +00:00
Caio Oliveira	5c93a0e125	intel/brw: Remove Gfx8- remaining opcodes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>	2024-02-28 05:45:39 +00:00
Caio Oliveira	071e9f49f1	intel/brw: Remove F16TO32 and F32TO16 opcodes These are done with MOVs and appropriate types in Gfx9+. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>	2024-02-28 05:45:38 +00:00
Caio Oliveira	c621f75e7b	intel/brw: Remove now unused vec4-only opcodes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>	2024-02-28 05:45:38 +00:00
Ian Romanick	8fb37ef985	intel/fs: Add fast path for ballot(true) This doesn't help very much now. A later commit adds a NIR optimization pass, tentatively called nir_opt_uniform_subgroup, that converts many kinds of subgroup operations to things involving bitCount(ballot(true)). This commit makes a huge difference in the results of that later commit. No shader-db changes on any Intel platform. Fossil-db results: All Intel platforms had similar results. (Ice Lake shown) Totals: Instrs: 165558033 -> 165557519 (-0.00%) Cycles: 15156188362 -> 15156178922 (-0.00%); split: -0.00%, +0.00% Totals from 299 (0.05% of 656117) affected shaders: Instrs: 88293 -> 87779 (-0.58%) Cycles: 3709498 -> 3700058 (-0.25%); split: -0.28%, +0.03% v2: Rebase on splitting ELK from BRW. Remove devinfo->ver >= 8 check. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>	2024-02-27 08:37:46 -08:00
Sagar Ghuge	6f0ab5e4d5	intel/compiler: Add texture gather offset LOD/Bias message support v2: (Ian) - Space formatting on conditional statement Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27447>	2024-02-27 00:22:46 +00:00
Sagar Ghuge	79af0ac29a	intel/compiler: Add gather4_i/l/[_c]/b sampler message v2: (Ian) - Format comment Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27447>	2024-02-27 00:22:46 +00:00
Ian Romanick	68da9e4dff	intel/compiler/xe2: Set SIMD mode for sampler messages Since SIMD8 no longer exists, the SIMD modes enums have different names and different values. v2 (Francisco Jerez): Rebase on `07b9bfacc7` ("intel/compiler: Move logical-send lowering to a separate file"). v3: Update brw_disasm.c with SIMD descriptions. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>	2024-02-02 02:39:10 +00:00
Ian Romanick	78e7f7b377	intel/compiler/xe2: Use new sample_*_mlod messages Note: a future commit will expand the sampler message type to the 6 bits used on Xe2. v2 (Francisco Jerez): Rebase on `07b9bfacc7` ("intel/compiler: Move logical-send lowering to a separate file"). v3: Drop XE2_SAMPLER_MESSAGE_SAMPLE_BIAS_MLOD as it does not actually exist. This resulted in some bigger changes in brw_disasm.c. Noticed by Sagar. v4: Now that XE2_SAMPLER_MESSAGE_SAMPLE_MLODc conflicts with GFX7_SAMPLER_MESSAGE_SAMPLE_GATHER4_PO_C, the determination of min_lod_is_first must include devinfo->ver or previous platforms will break. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>	2024-02-02 02:39:09 +00:00
Francisco Jerez	f974eacab3	intel/compiler/xe2: Fix for the removal of most predication modes. Reworks: * Remove changes to fixup_nomask workaround since it applies only for Gfx12 family. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26860>	2024-01-12 20:18:03 +00:00

1 2 3 4

172 commits