Kenneth Graunke
84139470a5
intel/brw: Use VEC for emit_unzip()
...
Helps make SIMD-split code more SSA-friendly.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28971 >
2024-04-30 17:16:54 -07:00
Kenneth Graunke
545bb8fb6f
intel/brw: Replace type_sz and brw_reg_type_to_size with brw_type_size_*
...
Both of these helpers do the same thing. We now have brw_type_size_bits
and brw_type_size_bytes and can use whichever makes sense in that place.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847 >
2024-04-25 11:41:48 +00:00
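As a rough illustration of the two helpers named in the commit above, here is a minimal Python sketch. The type table is a hypothetical stand-in for the real register-type enum in Mesa's brw headers, and the lowercase function names are illustrative only, not Mesa's C API.

```python
# Hypothetical size table (in bits) for a few register types. This is an
# illustrative stand-in, not the real brw_reg_type enum from Mesa.
TYPE_SIZE_BITS = {
    "UB": 8, "B": 8,
    "HF": 16, "UW": 16, "W": 16,
    "F": 32, "UD": 32, "D": 32,
    "DF": 64, "UQ": 64, "Q": 64,
}


def type_size_bits(reg_type: str) -> int:
    """Size of one element of reg_type, in bits."""
    return TYPE_SIZE_BITS[reg_type]


def type_size_bytes(reg_type: str) -> int:
    """Size of one element of reg_type, in bytes."""
    return type_size_bits(reg_type) // 8
```

Byte sizes tend to suit offset and stride arithmetic, while bit sizes read more naturally in checks such as "is this a 64-bit type".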
Kenneth Graunke
f523bfcf90
intel/brw: Reindent after shortening BRW_REGISTER_TYPE_* to BRW_TYPE_*
...
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847 >
2024-04-25 11:41:48 +00:00
Kenneth Graunke
873fcdff38
intel/brw: Stop using long BRW_REGISTER_TYPE enum names
...
s/BRW_REGISTER_TYPE/BRW_TYPE/g
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847 >
2024-04-25 11:41:48 +00:00
Kenneth Graunke
d5b8cec7a2
intel/brw: Replace FS_OPCODE_LINTERP with BRW_OPCODE_PLN
...
We no longer support the old LINE+MAC lowering, and we already lower
this to MAD in NIR on Gfx11+, so the LINTERP virtual opcode always
corresponds to PLN. The only catch is that LINTERP's operands are
reversed from PLN, so we have to switch them.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28705 >
2024-04-16 02:14:49 +00:00
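The operand swap described in the commit above can be sketched as follows. The `Inst` class and `linterp_to_pln` function are hypothetical names invented for this illustration; the sketch only models the reversal of the two sources and says nothing about which source holds which value.

```python
from dataclasses import dataclass


@dataclass
class Inst:
    """Toy instruction: an opcode name plus an ordered list of sources."""
    opcode: str
    srcs: list


def linterp_to_pln(inst: Inst) -> Inst:
    """Rewrite a two-source LINTERP as PLN with the sources reversed."""
    assert inst.opcode == "LINTERP" and len(inst.srcs) == 2
    first, second = inst.srcs
    return Inst("PLN", [second, first])
```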
Caio Oliveira
dae9795628
intel/brw: Remove vestiges of sources on IF opcode, only valid on Gfx6
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28379 >
2024-03-29 22:44:01 +00:00
Rohan Garg
467ee9d27a
intel/brw: Xe2+ can do SIMD16 for extended math on HF types
...
BSpec 56797:
Math operation rules when half-floats are used on both source and
destination operands and both source and destinations are packed.
The execution size must be 16.
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27235 >
2024-03-28 19:53:40 +00:00
Rohan Garg
c4b38c717d
intel/brw: account for sources when determining if an operation uses half floats
...
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27235 >
2024-03-28 19:53:40 +00:00
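The two commits above can be read together as a width rule for extended math. The sketch below is one way to model it, under stated assumptions: Xe2 is represented as version 20, half-float math is assumed to be limited to SIMD8 before Xe2, and `MathInst`, `uses_half_float`, and `max_math_width` are hypothetical names rather than Mesa functions.

```python
from dataclasses import dataclass, field

XE2_VER = 20  # assumption: Xe2 is modelled as version 20 in this sketch


@dataclass
class MathInst:
    """Toy extended-math instruction with destination and source types."""
    dst_type: str
    src_types: list = field(default_factory=list)
    exec_size: int = 16


def uses_half_float(inst: MathInst) -> bool:
    """True if the destination or any source is HF (sources count too)."""
    return inst.dst_type == "HF" or any(t == "HF" for t in inst.src_types)


def max_math_width(ver: int, inst: MathInst) -> int:
    """Largest SIMD width this sketch allows for the instruction."""
    if uses_half_float(inst) and ver < XE2_VER:
        return min(8, inst.exec_size)   # assumed pre-Xe2 limit for HF math
    return min(16, inst.exec_size)      # Xe2+ may run HF math at SIMD16
```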
Ian Romanick
cd70e49394
intel/brw: Allow SIMD16 F and HF type conversion moves
...
On DG2, the lowering generated for these MOV instructions is
**awful**. The original SIMD16 MOV
{ 18} 67: mov(16) vgrf54+0.0:HF, vgrf46+0.0:F NoMask group0
is lowered to SIMD8 MOVs:
{ 18} 118: mov(8) vgrf54+0.0:HF, vgrf46+0.0:F NoMask group0
{ 18} 119: mov(8) vgrf54+0.16:HF, vgrf46+1.0:F NoMask group8
These MOVs violate Gfx12.5 region restrictions, so these are further
lowered:
{ 17} 119: mov(8) vgrf83<2>:HF, vgrf46+0.0:F NoMask group0
{ 19} 120: mov(8) vgrf54+0.0:UW, vgrf83<2>:UW NoMask group0
{ 19} 122: mov(8) vgrf84<2>:HF, vgrf46+1.0:F NoMask group8
{ 19} 123: mov(8) vgrf54+0.16:UW, vgrf84<2>:UW NoMask group8
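The register offsets in the first lowering step follow from simple arithmetic, which the sketch below reproduces. It assumes a 32-byte GRF (as on DG2) and reads `+R.B` as "R whole registers plus B bytes"; `half_operand` and `split_mov16` are hypothetical helpers, and the output omits the leading `{ 18} 118:` annotations from the dump.

```python
REG_SIZE_BYTES = 32  # assumption: 32-byte GRF, as on DG2

TYPE_BYTES = {"HF": 2, "F": 4}


def half_operand(vgrf: int, reg_type: str, group: int) -> str:
    """Format the operand for the SIMD8 half that starts at channel `group`."""
    byte_offset = group * TYPE_BYTES[reg_type]
    reg, sub = divmod(byte_offset, REG_SIZE_BYTES)
    return f"vgrf{vgrf}+{reg}.{sub}:{reg_type}"


def split_mov16(dst_vgrf: int, dst_type: str, src_vgrf: int, src_type: str) -> list:
    """Split one SIMD16 MOV into two SIMD8 MOVs covering channels 0-7 and 8-15."""
    return [
        f"mov(8) {half_operand(dst_vgrf, dst_type, g)}, "
        f"{half_operand(src_vgrf, src_type, g)} NoMask group{g}"
        for g in (0, 8)
    ]
```

Eight HF channels occupy 16 bytes, so the second half of the destination starts mid-register, while eight F channels fill a whole register, so the second half of the source starts at the next one.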
The shader-db and fossil-db results are nothing to get excited
about. However, the effect on vk_cooperative_matrix_perf is substantial. In one subtest
shader: shaders/shmemfp16.spv
cooperativeMatrixProps = 8x8x16 A = float16_t B = float16_t C = float16_t D = float16_t scope = subgroup
TILE_M=128 TILE_N=128, TILE_K=32 BLayout=0
performance on my DG2 improved by ~60% due to a MASSIVE reduction in spills and fills:
-Native code for unnamed compute shader (null) (src_hash 0x00000000) (sha1 c6a41b1c4e7aa2da327a39a70ed36c822a4b172f)
-SIMD32 shader: 32484 instructions. 1 loops. 1893868 cycles. 737:1820 spills:fills, 442 sends, scheduled with mode none. Promoted 1 constants. Compacted 519744 to 492224 bytes (5%)
- START B0 (20782 cycles)
+Native code for unnamed compute shader (null) (src_hash 0x00000000) (sha1 621e960daad5b5579b176717f24a315e7ea560a1)
+SIMD32 shader: 23918 instructions. 1 loops. 1089894 cycles. 432:1166 spills:fills, 442 sends, scheduled with mode none. Promoted 1 constants. Compacted 382688 to 353232 bytes (8%)
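For reference, the relative changes in the before/after statistics above can be recomputed with a small helper. `pct_change` is a hypothetical name; the numbers are copied from the two SIMD32 lines.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# (before, after) pairs taken from the two "SIMD32 shader" lines above.
stats = {
    "instructions": (32484, 23918),
    "cycles": (1893868, 1089894),
    "spills": (737, 432),
    "fills": (1820, 1166),
}

for name, (before, after) in stats.items():
    print(f"{name}: {before} -> {after} ({pct_change(before, after):+.2f}%)")
```

This works out to roughly 26% fewer instructions, 42% fewer estimated cycles, 41% fewer spills, and 36% fewer fills for this one shader.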
shader-db:
All Gfx9 and later platforms had similar results. (Meteor Lake shown)
total instructions in shared programs: 19656270 -> 19653981 (-0.01%)
instructions in affected programs: 61810 -> 59521 (-3.70%)
helped: 116 / HURT: 0
total cycles in shared programs: 823368888 -> 823375854 (<.01%)
cycles in affected programs: 1165284 -> 1172250 (0.60%)
helped: 51 / HURT: 57
fossil-db:
DG2 and Meteor Lake had similar results. (Meteor Lake shown)
*** Shaders only in 'before' results are ignored:
fossil-db/steam-dxvk/total_war_warhammer3/2a3ed2ca632a7cb7/fs.32,
fossil-db/steam-dxvk/total_war_warhammer3/18b9d4a3b1961616/fs.32,
fossil-db/steam-dxvk/total_war_warhammer3/04ac9f3146a6db19/fs.32,
fossil-db/steam-dxvk/total_war_warhammer3/f37ebec6aa1b379a/fs.32,
fossil-db/steam-dxvk/total_war_warhammer3/255c987feb0d4310/fs.32, and 25
more
from 1 apps: fossil-db/steam-dxvk/total_war_warhammer3
Totals:
Instrs: 160946537 -> 160928389 (-0.01%); split: -0.01%, +0.00%
Cycles: 14125908620 -> 14125873958 (-0.00%); split: -0.00%, +0.00%
Totals from 1002 (0.15% of 652134) affected shaders:
Instrs: 411261 -> 393113 (-4.41%); split: -4.41%, +0.00%
Cycles: 16676735 -> 16642073 (-0.21%); split: -0.48%, +0.27%
Tiger Lake
Totals:
Instrs: 164511816 -> 164497202 (-0.01%); split: -0.01%, +0.00%
Cycles: 13801675722 -> 13801629397 (-0.00%); split: -0.00%, +0.00%
Subgroup size: 7955168 -> 7955152 (-0.00%)
Send messages: 8544494 -> 8544486 (-0.00%)
Totals from 997 (0.15% of 651454) affected shaders:
Instrs: 460820 -> 446206 (-3.17%); split: -3.17%, +0.00%
Cycles: 16265514 -> 16219189 (-0.28%); split: -0.84%, +0.56%
Subgroup size: 17552 -> 17536 (-0.09%)
Send messages: 26045 -> 26037 (-0.03%)
Ice Lake
Totals:
Instrs: 165504747 -> 165489970 (-0.01%); split: -0.01%, +0.00%
Cycles: 15145244554 -> 15145149627 (-0.00%); split: -0.00%, +0.00%
Subgroup size: 8107032 -> 8107016 (-0.00%)
Send messages: 8598680 -> 8598672 (-0.00%)
Spill count: 45427 -> 45423 (-0.01%)
Fill count: 74749 -> 74747 (-0.00%)
Totals from 1125 (0.17% of 656115) affected shaders:
Instrs: 521676 -> 506899 (-2.83%); split: -2.83%, +0.00%
Cycles: 19555434 -> 19460507 (-0.49%); split: -0.59%, +0.10%
Subgroup size: 21616 -> 21600 (-0.07%)
Send messages: 28623 -> 28615 (-0.03%)
Spill count: 603 -> 599 (-0.66%)
Fill count: 1362 -> 1360 (-0.15%)
Skylake
*** Shaders only in 'after' results are ignored:
fossil-db/steam-native/red_dead_redemption2/cef460b80bad8485/fs.16,
fossil-db/steam-native/red_dead_redemption2/cd5fe081e2e5529d/fs.16
from 1 apps: fossil-db/steam-native/red_dead_redemption2
Totals:
Instrs: 141607617 -> 141593776 (-0.01%); split: -0.01%, +0.00%
Cycles: 14257812441 -> 14257661671 (-0.00%); split: -0.00%, +0.00%
Subgroup size: 7743752 -> 7743736 (-0.00%)
Send messages: 7552728 -> 7552720 (-0.00%)
Spill count: 43660 -> 43661 (+0.00%)
Fill count: 71301 -> 71303 (+0.00%)
Totals from 1017 (0.16% of 636964) affected shaders:
Instrs: 392454 -> 378613 (-3.53%); split: -3.53%, +0.00%
Cycles: 16622974 -> 16472204 (-0.91%); split: -1.04%, +0.13%
Subgroup size: 19840 -> 19824 (-0.08%)
Send messages: 23021 -> 23013 (-0.03%)
Spill count: 484 -> 485 (+0.21%)
Fill count: 1155 -> 1157 (+0.17%)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28281 >
2024-03-21 15:12:58 -07:00
Francisco Jerez
efc0601ddf
intel/brw/xe2+: Double allowed SIMD width of FB write SEND messages.
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306 >
2024-03-20 15:46:44 -07:00
Lionel Landwerlin
4df58ef503
intel/fs: bump max simd size of some messages for xe2
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28191 >
2024-03-15 03:01:53 +00:00
Kenneth Graunke
edf14f4b7c
intel/brw: Unindent code after previous change
...
I kept things indented in the previous patch to make the diffs easier to
read, but there's no reason to continue doing so.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27959 >
2024-03-05 12:03:31 +00:00
Kenneth Graunke
4c10613625
intel/brw: Remove SIMD lowering to a larger SIMD size
...
On Gfx4, we had to emulate SIMD8 texturing with SIMD16 for some message
types. This ceased to be a thing with Gfx5 and hasn't come up again.
So, we can simply assert that we are truly "SIMD splitting", and assume
that the lowered size is smaller than the original instruction size.
This avoids some mental complexity as we can always think of the split
instructions as taking apart, operating on, and recombining subsets of
the original values.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27959 >
2024-03-05 12:03:31 +00:00
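The "take apart, operate on, recombine" view from the commit above can be sketched on plain lists. This is a toy model, not the pass itself: `simd_split` is a hypothetical name, each list element stands for one SIMD channel, and `op` stands for whatever the split instruction computes on a narrower group.

```python
def simd_split(values: list, lowered_width: int, op) -> list:
    """Apply `op` to consecutive groups of `lowered_width` channels."""
    width = len(values)
    # True "SIMD splitting": the lowered size is smaller than the original.
    assert lowered_width < width and width % lowered_width == 0

    result = []
    for start in range(0, width, lowered_width):
        chunk = values[start:start + lowered_width]  # take apart
        result.extend(op(chunk))                     # operate and recombine
    return result
```

For a per-channel operation, the split result matches what the full-width operation would have produced.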
Caio Oliveira
337641cfcc
intel/compiler: Fix SIMD lowering when instruction needs a larger SIMD
...
lower_simd_width() can encounter an instruction that needs a larger
SIMD; for example, SHADER_OPCODE_TXS_LOGICAL on Gfx4 needs at least
SIMD16. In this case the builder needs to be at least as large as
max_width, otherwise the group() setup will assert.
Turns out this did not assert before "by accident", since it was
relying on the default fs_visitor builder that had a dispatch width of 64,
a bogus placeholder value, expected not to be used.
However, when we changed the code to remove that builder (and the bogus
value), the pass started creating a new builder with the shader's
dispatch_width, which works fine except when we want to "lower" the SIMD
above the shader dispatch width. The fix is to also consider the already
calculated max_width when creating the builder.
Fixes: 5b8ec015f2 ("intel/compiler: Don't use fs_visitor::bld in remaining places")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10338
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27782 >
2024-03-01 22:54:57 +00:00
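The fix described in the commit above amounts to sizing the builder by the larger of two widths. The sketch below is a toy model under stated assumptions: `Builder` and `make_lowering_builder` are hypothetical, and the check inside `group()` only imitates the kind of assertion the commit message mentions.

```python
class Builder:
    """Toy stand-in for the IR builder; it only tracks a dispatch width."""

    def __init__(self, dispatch_width: int):
        self.dispatch_width = dispatch_width

    def group(self, width: int, index: int) -> "Builder":
        # Hypothetical check: the requested group must fit in this builder.
        assert width * (index + 1) <= self.dispatch_width, "group exceeds builder width"
        return Builder(width)


def make_lowering_builder(shader_dispatch_width: int, max_width: int) -> Builder:
    """Size the builder so that a group of max_width channels always fits."""
    return Builder(max(shader_dispatch_width, max_width))
```

With a SIMD8 shader and an instruction that needs SIMD16, a builder sized by the dispatch width alone would trip the check, while the one sized by the maximum does not.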
Kenneth Graunke
ad37622a8f
intel/brw: Delete legacy texture opcodes
...
We first generate the logical opcodes, and these days fully lower to
SHADER_OPCODE_SEND. In the past, we lowered to a non-logical variant
and handled that in the generator. More recently, the non-logical
opcodes only served as an awkward intermediate step during the
lowering, which isn't really necessary at all.
This patch eliminates them by using the original logical opcodes.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27908 >
2024-03-01 22:19:51 +00:00
Kenneth Graunke
45a5e4c0c4
intel/brw: Delete SHADER_OPCODE_TXF_UMS
...
Nothing seems to generate this anymore. I guess we always use CMS.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27908 >
2024-03-01 22:19:51 +00:00
Kenneth Graunke
601ef12467
intel/brw: Delete SHADER_OPCODE_TXF_CMS[_LOGICAL]
...
We always use the wide variant (_W) on hardware this compiler supports.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27908 >
2024-03-01 22:19:50 +00:00
Caio Oliveira
c793644ce9
intel/brw: Remove Gfx8- code from SIMD lowering
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691 >
2024-02-28 05:45:38 +00:00
Caio Oliveira
071e9f49f1
intel/brw: Remove F16TO32 and F32TO16 opcodes
...
These are done with MOVs and appropriate types in Gfx9+.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691 >
2024-02-28 05:45:38 +00:00
Sagar Ghuge
6f0ab5e4d5
intel/compiler: Add texture gather offset LOD/Bias message support
...
v2: (Ian)
- Space formatting on conditional statement
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27447 >
2024-02-27 00:22:46 +00:00
Sagar Ghuge
79af0ac29a
intel/compiler: Add gather4_i/l/[_c]/b sampler message
...
v2: (Ian)
- Format comment
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27447 >
2024-02-27 00:22:46 +00:00
Caio Oliveira
c25803880e
intel/brw: Move lower_simd_width to its own file
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887 >
2024-02-26 20:54:25 +00:00