fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 07:00:12 +01:00

Author	SHA1	Message	Date
Caio Oliveira	ff44f4d278	intel/brw: Update outdated comments Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32536>	2025-02-11 09:13:28 +00:00
Caio Oliveira	228aba779f	intel/brw: Rename brw_inst_* helpers to brw_eu_inst_* Acked-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32643>	2024-12-30 17:16:15 +00:00
Caio Oliveira	06ccaad5f1	intel/brw: Rename brw_compact_inst to brw_eu_compact_inst Consistent with brw_eu_inst. Acked-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32643>	2024-12-30 17:16:15 +00:00
Caio Oliveira	3c3f4a1235	intel/brw: Rename brw_inst to brw_eu_inst Free the old name for the BRW IR instruction. Acked-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32643>	2024-12-30 17:16:15 +00:00
Francisco Jerez	43d59c6186	intel/brw/xe3+: Relax SEND EOT register assignment restrictions. These restrictions have been removed from the hardware. Make the code enforcing and validating them conditional. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32758>	2024-12-20 14:03:15 -08:00
Caio Oliveira	f308be16a0	intel/brw: Add validation for some Xe2 register regioning restrictions Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28636>	2024-12-14 02:15:18 +00:00
Caio Oliveira	6a5a316312	intel/brw: Extract format enum in EU validation code Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28636>	2024-12-14 02:15:18 +00:00
Caio Oliveira	57b703cec3	intel/brw: Skip some regioning EU validation for Vx1 and VxH modes Skip the ones that check the VertStride -- which is set to a special value in those modes. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28636>	2024-12-14 02:15:18 +00:00
Caio Oliveira	5420c027e6	intel/brw: Add validation for ARF scalar register Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32236>	2024-12-13 02:18:15 +00:00
Caio Oliveira	7acd84da51	intel/brw: Consider if SEND is gather variant when setting ex_desc SEND instructions of gather variant will use the upcoming ARF scalar register. They use only Src0 and reuse the bits of Src1.Length (part of ex_desc). Src1.Length is (implicitly) defined as 0. Adapt the helper functions to take the new variant into account when manipulating ex_desc. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32236>	2024-12-13 02:18:15 +00:00
Caio Oliveira	1d704af515	intel/brw: Fix decoding of cond_modifier and saturate in EU validation These fields are only valid in certain formats, so set them accordingly. Note the check if !is_send is used because FORMAT_BASIC is reused for SEND/SENDS in some platforms. If we start to see more cases like that, we can create a new FORMAT for it. The cond_modifier is trickier because on top of that, it is not valid for 64-bit immediates in some platforms. Found when EU validation complained about moving 64-bit immediates with higher bits. Fixes: `e4440df2d8` ("intel/brw: Add pred/cmod/sat to brw_hw_decoded_inst") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32287>	2024-11-22 21:15:46 +00:00
Kenneth Graunke	33cd5a49f1	brw/validate: Return an error for Align16 access mode on Icelake+ Gfx11+ doesn't support Align16 instructions anymore - only Align1 mode. Bailing early for Align16 is important so that brw_hw_decode_inst doesn't try to read Align16 related instruction fields on generations where they no longer exist (which could trigger assertions). Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31834>	2024-10-25 20:31:44 +00:00
Caio Oliveira	13d99979d2	intel/brw: Remove the remaining DO_SRC macro from EU validation Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>	2024-10-11 04:13:48 +00:00
Caio Oliveira	f1036da345	intel/brw: Add vstride/width/hstride to brw_hw_decoded_inst Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>	2024-10-11 04:13:48 +00:00
Caio Oliveira	2251748aad	intel/brw: Add dst/srcs register numbers to brw_hw_decoded_inst Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>	2024-10-11 04:13:48 +00:00
Caio Oliveira	808b8b65b6	intel/brw: Add abs/negate to brw_hw_decoded_inst Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>	2024-10-11 04:13:48 +00:00
Caio Oliveira	f6dbb72219	intel/brw: Add dst/src0 address_mode to brw_hw_decoded_inst Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>	2024-10-11 04:13:48 +00:00
Caio Oliveira	e4440df2d8	intel/brw: Add pred/cmod/sat to brw_hw_decoded_inst Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>	2024-10-11 04:13:48 +00:00
Caio Oliveira	be70d1f9b1	intel/brw: Add dst/srcs type to brw_hw_decoded_inst Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>	2024-10-11 04:13:48 +00:00
Caio Oliveira	e0ba4ca166	intel/brw: Add dst/srcs reg file to brw_hw_decoded_inst Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>	2024-10-11 04:13:48 +00:00
Caio Oliveira	3db1c3fc0e	intel/brw: Add access_mode to brw_hw_decoded_inst Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>	2024-10-11 04:13:48 +00:00
Caio Oliveira	3dc1f64e51	intel/brw: Add exec_size to brw_hw_decoded_inst Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>	2024-10-11 04:13:48 +00:00
Caio Oliveira	853fe03470	intel/brw: Add has_dst to brw_hw_decoded_inst Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>	2024-10-11 04:13:48 +00:00
Caio Oliveira	c394eb3111	intel/brw: Add num_sources to brw_hw_decoded_inst Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>	2024-10-11 04:13:48 +00:00
Caio Oliveira	9cdb90e787	intel/brw: Add opcode to brw_hw_decoded_inst Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>	2024-10-11 04:13:48 +00:00
Caio Oliveira	76e177d87d	intel/brw: Create a struct to hold a decoded brw_inst in eu_validation For now it contains only the "raw" brw_inst. Later patches will add useful fields to it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>	2024-10-11 04:13:48 +00:00
Caio Oliveira	382bd4ce36	intel/brw: Add ERROR helper variant that returns to EU validation Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>	2024-10-11 04:13:48 +00:00
Caio Oliveira	e1b74407bb	intel/brw: Only validate GRF boundary crossing restriction for GRFs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31294>	2024-09-24 03:39:05 +00:00
Caio Oliveira	31dfb04fd3	intel/brw: Remove long register file names The long names were originally meant to map to the HW encoding but nowadays the actual encoding values depend on gfx version, whether instruction is 3src, etc. Suggested by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30704>	2024-08-25 22:08:14 +00:00
Jordan Justen	58469620d3	intel/brw/validate: Convert access mask to be grf based Our validation code doesn't need to know which bytes are accessed. It only needs to know which grfs were accessed by an element. This also helps to easily handle the Xe2 register size change. Backport-to: 24.2 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28479>	2024-08-02 22:18:51 +00:00
Jordan Justen	e62606b2ec	intel/brw/validate: Update dst grf crossing check for Xe2 Rework: * Update grf_size_shift calculation (s-b Ken) Backport-to: 24.2 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28479>	2024-08-02 22:18:51 +00:00
Jordan Justen	f2800deacb	intel/brw/validate: Simplify grf span validation check by not using a mask Previously this check would create a mask of the bytes used in the grf, and then shift the mask. This worked well when there was 32 bytes in the register because a 64-bit uint64_t could easily detect that bytes were used in the next regiter. (The next register was the high 32-bits of the `access_mask` variable.) With Xe2, the register size becomes 64 bytes, meaning this strategy doesn't work. Instead of a mask, we can just check to see if more than 1 grfs are used during each loop iteration. (Suggested by Ken.) This will make it easier to extend for Xe2 in a follow on commit. Verified this with dEQP-VK.subgroups.arithmetic.compute.subgroupexclusivemul_u64vec4_requiredsubgroupsize on Xe2, which otherwise would cause the program to fail to validate because it assumed a grf was 32 bytes. Backport-to: 24.2 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28479>	2024-08-02 22:18:51 +00:00
Caio Oliveira	8ba8e33c39	intel/brw: Simplify @file annotations Doxygen documentation says > If the file name is omitted (i.e. the line after \file is left > blank) then the documentation block that contains the \file command will > belong to the file it is located in. so we can omit the filename itself when using the annotation. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30168>	2024-07-22 22:48:03 +00:00
Kenneth Graunke	1e69ec3b8d	intel/brw: Add a lower_csel pass and allow building it for all types We can do CSEL on F, HF, W, and D on Gfx11+. Gfx9 can only do F. We can lower unsupported types to CMP+CSEL, allowing us to use CSEL in the IR and not worry about the limitations. Rework: (Sagar) - Update validation pass for CSEL Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29316>	2024-07-01 19:06:31 +00:00
Kenneth Graunke	f04bb49465	intel/brw: Delete SAD2 and SADA2 opcodes These were removed with Icelake. While they technically still exist on Skylake, which this compiler supports, we have never used these opcodes in the 14 years we could have done so. So just scrap them. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29665>	2024-06-10 16:47:50 -07:00
Ian Romanick	504b742b83	intel/brw: Update CSEL source type validation Gfx9 can only have F, but newer GPUs can have F, HF, D, or W. The source and destination types must still match in size. v2: Simplify the float vs integer logic. Suggested by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095>	2024-05-14 01:28:20 +00:00
Kenneth Graunke	545bb8fb6f	intel/brw: Replace type_sz and brw_reg_type_to_size with brw_type_size_* Both of these helpers do the same thing. We now have brw_type_size_bits and brw_type_size_bytes and can use whichever makes sense in that place. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>	2024-04-25 11:41:48 +00:00
Kenneth Graunke	007d891239	intel/brw: Use newer brw_type_is_* shorter names Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>	2024-04-25 11:41:48 +00:00
Kenneth Graunke	873fcdff38	intel/brw: Stop using long BRW_REGISTER_TYPE enum names s/BRW_REGISTER_TYPE/BRW_TYPE/g Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>	2024-04-25 11:41:48 +00:00
Kenneth Graunke	9d8f2c4421	intel/brw: Rework BRW_REGISTER_TYPE's representation semantics In ancient days, we directly used the hardware register type encodings throughout the compiler. As more GPU generations came out, encodings shifted, and we moved to an abstract enum that we could encode/decode to a particular GPU's hardware encoding. But there was no particular meaning behind any particular value. One downside to this approach is that we end up with switch statements galore. Want to know a type's size? Switch. Convert a unsigned type to a signed one? Switch. Get a type with the same base type, but different bit size? Switch. This is both inefficient and inconvenient. In contrast, nir_alu_type takes a nicer approach - the type encoding has certain bits representing the base type, and others encoding the size of the type. Switching base types or sizes is a simple matter of masking out the relevant field and substituting a different one. Tigerlake's encoding adopts a similar approach: two bits represent the size as a 2-bit unsigned number n, where the bit size is (8 * 2^n). Two more bits represent the base type. Past encodings were a bit ad hoc as new data types were added over time, but Gfx12 is organized (mostly). This patch converts our brw_reg_type enum over to a new system that's patterned after the Tigerlake style (for easy conversion) while deviating in a few ways that make our vector immediate type size handling simpler. Should we add additional base types, we're likely to continue deviating. Still, converting is much simpler. Type size calculations (which are performed all the time) are now a simple mask and shift, instead of a switch. We also adopt the name BRW_TYPE_* instead of BRW_REGISTER_TYPE_* because it's much shorter and easier to type. Similarly, we create new helper functions named brw_type_* for working with these types, with a cleaner naming convention. Legacy names still exist but will we dropped over the next few patches as pieces get cleaned up. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>	2024-04-25 11:41:48 +00:00
Kenneth Graunke	c45e235df5	intel/brw: Drop NF type support Icelake removed the PLN instruction for interpolating fragment shader inputs, instead adding a special "Native Float" (NF) data type which was a 66-bit floating point data type that could only be used with the accumulator. On Tigerlake, they dropped NF support in favor of just doing the interpolation with MAD instructions. We stopped using NF years ago (commit `9ea90aae1e`), instead just using the fs_visitor::lower_linterp() pass to emit MADs. Since this existed only for a short time, and had very limited utility, we drop it from the compiler. One downside is that we can no longer disassemble Icelake shaders containing NF types properly, but I doubt anyone really minds. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>	2024-04-25 11:41:48 +00:00
Ian Romanick	0e817ba548	intel/brw/xe2+: Implement Wa 22016140776 HF sources to math instructions cannot be scalar. This is very similar to an old Gfx6 restriction on POW, so let's fix it in a similar way. As an extra bit of saftey, lower any occurances that might slip through in brw_fs_lower_regioning. The primary change is to prevent copy propagation from violating the restriction. With that change, nothing should be able to generate these invalid source strides. The modification to fs_visitor::validate should detect potential problems sooner rather than later. Previous attempts to implement this Wa when emitting the math instruction (in brw_eu_emit.c gfx6_math) didn't work for several reasons. The lowering happens after the SWSB pass, so the scoreboarding was incorrect (thanks to Curro for finding that). In addition, the lowering happens after register allocation, so it's impossible to allocate a non-scalar register to expand the scalar value. Fixes 113 tests in the dEQP-VK.spirv_assembly.* group on LNL. v2: Add changes to brw_fs_lower_regioning. Suggested by Curro. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28480>	2024-04-04 21:04:09 -07:00
Rohan Garg	3d68dd78d0	intel/eu/validate: Allow SIMD16 for mixed mode float operations on xe2+ Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>	2024-04-01 00:00:03 +00:00
Ian Romanick	6d85f7129a	intel/brw/xe2+: DPAS must be SIMD16 now Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>	2024-03-29 21:12:32 +00:00
Ian Romanick	cd70e49394	intel/brw: Allow SIMD16 F and HF type conversion moves On DG2, the lowering generated for these MOV instructions is awful. The original SIMD16 MOV { 18} 67: mov(16) vgrf54+0.0:HF, vgrf46+0.0:F NoMask group0 is lowered to SIMD8 MOVs: { 18} 118: mov(8) vgrf54+0.0:HF, vgrf46+0.0:F NoMask group0 { 18} 119: mov(8) vgrf54+0.16:HF, vgrf46+1.0:F NoMask group8 These MOVs violate Gfx12.5 region restrictions, so these are further lowered: { 17} 119: mov(8) vgrf83<2>:HF, vgrf46+0.0:F NoMask group0 { 19} 120: mov(8) vgrf54+0.0:UW, vgrf83<2>:UW NoMask group0 { 19} 122: mov(8) vgrf84<2>:HF, vgrf46+1.0:F NoMask group8 { 19} 123: mov(8) vgrf54+0.16:UW, vgrf84<2>:UW NoMask group8 The shader-db and fossil-db results are nothing to get excited about. However, the affect on vk_cooperative_matrix_perf is substantial. In one subtest shader: shaders/shmemfp16.spv cooperativeMatrixProps = 8x8x16 A = float16_t B = float16_t C = float16_t D = float16_t scope = subgroup TILE_M=128 TILE_N=128, TILE_K=32 BLayout=0 performance on my DG2 improved by ~60% due to a MASSIVE reduction in spills and fills: -Native code for unnamed compute shader (null) (src_hash 0x00000000) (sha1 c6a41b1c4e7aa2da327a39a70ed36c822a4b172f) -SIMD32 shader: 32484 instructions. 1 loops. 1893868 cycles. 737:1820 spills:fills, 442 sends, scheduled with mode none. Promoted 1 constants. Compacted 519744 to 492224 bytes (5%) - START B0 (20782 cycles) +Native code for unnamed compute shader (null) (src_hash 0x00000000) (sha1 621e960daad5b5579b176717f24a315e7ea560a1) +SIMD32 shader: 23918 instructions. 1 loops. 1089894 cycles. 432:1166 spills:fills, 442 sends, scheduled with mode none. Promoted 1 constants. Compacted 382688 to 353232 bytes (8%) shader-db: All Gfx9 and later platforms had similar results. (Meteor Lake shown) total instructions in shared programs: 19656270 -> 19653981 (-0.01%) instructions in affected programs: 61810 -> 59521 (-3.70%) helped: 116 / HURT: 0 total cycles in shared programs: 823368888 -> 823375854 (<.01%) cycles in affected programs: 1165284 -> 1172250 (0.60%) helped: 51 / HURT: 57 fossil-db: DG2 and Meteor Lake had similar results. (Meteor Lake shown) * Shaders only in 'before' results are ignored: fossil-db/steam-dxvk/total_war_warhammer3/2a3ed2ca632a7cb7/fs.32, fossil-db/steam-dxvk/total_war_warhammer3/18b9d4a3b1961616/fs.32, fossil-db/steam-dxvk/total_war_warhammer3/04ac9f3146a6db19/fs.32, fossil-db/steam-dxvk/total_war_warhammer3/f37ebec6aa1b379a/fs.32, fossil-db/steam-dxvk/total_war_warhammer3/255c987feb0d4310/fs.32, and 25 more from 1 apps: fossil-db/steam-dxvk/total_war_warhammer3 Totals: Instrs: 160946537 -> 160928389 (-0.01%); split: -0.01%, +0.00% Cycles: 14125908620 -> 14125873958 (-0.00%); split: -0.00%, +0.00% Totals from 1002 (0.15% of 652134) affected shaders: Instrs: 411261 -> 393113 (-4.41%); split: -4.41%, +0.00% Cycles: 16676735 -> 16642073 (-0.21%); split: -0.48%, +0.27% Tiger Lake Totals: Instrs: 164511816 -> 164497202 (-0.01%); split: -0.01%, +0.00% Cycles: 13801675722 -> 13801629397 (-0.00%); split: -0.00%, +0.00% Subgroup size: 7955168 -> 7955152 (-0.00%) Send messages: 8544494 -> 8544486 (-0.00%) Totals from 997 (0.15% of 651454) affected shaders: Instrs: 460820 -> 446206 (-3.17%); split: -3.17%, +0.00% Cycles: 16265514 -> 16219189 (-0.28%); split: -0.84%, +0.56% Subgroup size: 17552 -> 17536 (-0.09%) Send messages: 26045 -> 26037 (-0.03%) Ice Lake Totals: Instrs: 165504747 -> 165489970 (-0.01%); split: -0.01%, +0.00% Cycles: 15145244554 -> 15145149627 (-0.00%); split: -0.00%, +0.00% Subgroup size: 8107032 -> 8107016 (-0.00%) Send messages: 8598680 -> 8598672 (-0.00%) Spill count: 45427 -> 45423 (-0.01%) Fill count: 74749 -> 74747 (-0.00%) Totals from 1125 (0.17% of 656115) affected shaders: Instrs: 521676 -> 506899 (-2.83%); split: -2.83%, +0.00% Cycles: 19555434 -> 19460507 (-0.49%); split: -0.59%, +0.10% Subgroup size: 21616 -> 21600 (-0.07%) Send messages: 28623 -> 28615 (-0.03%) Spill count: 603 -> 599 (-0.66%) Fill count: 1362 -> 1360 (-0.15%) Skylake * Shaders only in 'after' results are ignored: fossil-db/steam-native/red_dead_redemption2/cef460b80bad8485/fs.16, fossil-db/steam-native/red_dead_redemption2/cd5fe081e2e5529d/fs.16 from 1 apps: fossil-db/steam-native/red_dead_redemption2 Totals: Instrs: 141607617 -> 141593776 (-0.01%); split: -0.01%, +0.00% Cycles: 14257812441 -> 14257661671 (-0.00%); split: -0.00%, +0.00% Subgroup size: 7743752 -> 7743736 (-0.00%) Send messages: 7552728 -> 7552720 (-0.00%) Spill count: 43660 -> 43661 (+0.00%) Fill count: 71301 -> 71303 (+0.00%) Totals from 1017 (0.16% of 636964) affected shaders: Instrs: 392454 -> 378613 (-3.53%); split: -3.53%, +0.00% Cycles: 16622974 -> 16472204 (-0.91%); split: -1.04%, +0.13% Subgroup size: 19840 -> 19824 (-0.08%) Send messages: 23021 -> 23013 (-0.03%) Spill count: 484 -> 485 (+0.21%) Fill count: 1155 -> 1157 (+0.17%) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28281>	2024-03-21 15:12:58 -07:00
Ian Romanick	66dc6e07f5	intel/brw: Fix handling of accumulator register numbers Folks, there's more than one accumulator. In general, when the register file is ARF, the upper 4 bits of the register number specify which ARF, and the lower 4 bits specify which one of that ARF. This can be further partitioned by the subregister number. This is already mostly handled correctly for flags register, but lots of places wanted to check the register number for equality with BRW_ARF_ACCUMULATOR. If acc1 is ever specified, that won't work. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28281>	2024-03-21 15:12:54 -07:00
Jordan Justen	6922f421f4	intel/compiler: nib_ctrl no longer exists on Xe2+ Ref: `cfb34dc695` ("intel/eu/validate: Validate that the ExecSize is a factor of chosen ChanOff") Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28191>	2024-03-15 03:01:53 +00:00
Ian Romanick	93478c095e	intel/compiler: Enforce 64-bit RepCtrl restriction in eu_validate For some reason, this wasn't always caught in fs_visitor::validate. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27552>	2024-03-12 21:31:30 +00:00
Kenneth Graunke	a18030305c	intel/brw: Delete SIMD4x2 URB opcodes Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27872>	2024-02-29 18:00:14 +00:00
Caio Oliveira	8f3c52c1da	intel/brw: Remove MRF type Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>	2024-02-28 05:45:39 +00:00

1 2 3

137 commits