Commit graph

137 commits

Author SHA1 Message Date
Caio Oliveira
ff44f4d278 intel/brw: Update outdated comments
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32536>
2025-02-11 09:13:28 +00:00
Caio Oliveira
228aba779f intel/brw: Rename brw_inst_* helpers to brw_eu_inst_*
Acked-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32643>
2024-12-30 17:16:15 +00:00
Caio Oliveira
06ccaad5f1 intel/brw: Rename brw_compact_inst to brw_eu_compact_inst
Consistent with brw_eu_inst.

Acked-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32643>
2024-12-30 17:16:15 +00:00
Caio Oliveira
3c3f4a1235 intel/brw: Rename brw_inst to brw_eu_inst
Free the old name for the BRW IR instruction.

Acked-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32643>
2024-12-30 17:16:15 +00:00
Francisco Jerez
43d59c6186 intel/brw/xe3+: Relax SEND EOT register assignment restrictions.
These restrictions have been removed from the hardware.  Make the code
enforcing and validating them conditional.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32758>
2024-12-20 14:03:15 -08:00
Caio Oliveira
f308be16a0 intel/brw: Add validation for some Xe2 register regioning restrictions
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28636>
2024-12-14 02:15:18 +00:00
Caio Oliveira
6a5a316312 intel/brw: Extract format enum in EU validation code
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28636>
2024-12-14 02:15:18 +00:00
Caio Oliveira
57b703cec3 intel/brw: Skip some regioning EU validation for Vx1 and VxH modes
Skip the ones that check the VertStride -- which is set to a special
value in those modes.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28636>
2024-12-14 02:15:18 +00:00
Caio Oliveira
5420c027e6 intel/brw: Add validation for ARF scalar register
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32236>
2024-12-13 02:18:15 +00:00
Caio Oliveira
7acd84da51 intel/brw: Consider if SEND is gather variant when setting ex_desc
SEND instructions of gather variant will use the upcoming ARF scalar
register.  They use only Src0 and reuse the bits of Src1.Length (part of
ex_desc).  Src1.Length is (implicitly) defined as 0.

Adapt the helper functions to take the new variant into account when
manipulating ex_desc.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32236>
2024-12-13 02:18:15 +00:00
Caio Oliveira
1d704af515 intel/brw: Fix decoding of cond_modifier and saturate in EU validation
These fields are only valid in certain formats, so set them accordingly.
Note the check if !is_send is used because FORMAT_BASIC is reused for
SEND/SENDS in some platforms.  If we start to see more cases like that,
we can create a new FORMAT for it.

The cond_modifier is trickier because on top of that, it is not valid
for 64-bit immediates in some platforms.  Found when EU validation
complained about moving 64-bit immediates with higher bits.

Fixes: e4440df2d8 ("intel/brw: Add pred/cmod/sat to brw_hw_decoded_inst")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32287>
2024-11-22 21:15:46 +00:00
Kenneth Graunke
33cd5a49f1 brw/validate: Return an error for Align16 access mode on Icelake+
Gfx11+ doesn't support Align16 instructions anymore - only Align1 mode.

Bailing early for Align16 is important so that brw_hw_decode_inst
doesn't try to read Align16 related instruction fields on generations
where they no longer exist (which could trigger assertions).

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31834>
2024-10-25 20:31:44 +00:00
Caio Oliveira
13d99979d2 intel/brw: Remove the remaining DO_SRC macro from EU validation
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>
2024-10-11 04:13:48 +00:00
Caio Oliveira
f1036da345 intel/brw: Add vstride/width/hstride to brw_hw_decoded_inst
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>
2024-10-11 04:13:48 +00:00
Caio Oliveira
2251748aad intel/brw: Add dst/srcs register numbers to brw_hw_decoded_inst
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>
2024-10-11 04:13:48 +00:00
Caio Oliveira
808b8b65b6 intel/brw: Add abs/negate to brw_hw_decoded_inst
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>
2024-10-11 04:13:48 +00:00
Caio Oliveira
f6dbb72219 intel/brw: Add dst/src0 address_mode to brw_hw_decoded_inst
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>
2024-10-11 04:13:48 +00:00
Caio Oliveira
e4440df2d8 intel/brw: Add pred/cmod/sat to brw_hw_decoded_inst
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>
2024-10-11 04:13:48 +00:00
Caio Oliveira
be70d1f9b1 intel/brw: Add dst/srcs type to brw_hw_decoded_inst
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>
2024-10-11 04:13:48 +00:00
Caio Oliveira
e0ba4ca166 intel/brw: Add dst/srcs reg file to brw_hw_decoded_inst
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>
2024-10-11 04:13:48 +00:00
Caio Oliveira
3db1c3fc0e intel/brw: Add access_mode to brw_hw_decoded_inst
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>
2024-10-11 04:13:48 +00:00
Caio Oliveira
3dc1f64e51 intel/brw: Add exec_size to brw_hw_decoded_inst
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>
2024-10-11 04:13:48 +00:00
Caio Oliveira
853fe03470 intel/brw: Add has_dst to brw_hw_decoded_inst
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>
2024-10-11 04:13:48 +00:00
Caio Oliveira
c394eb3111 intel/brw: Add num_sources to brw_hw_decoded_inst
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>
2024-10-11 04:13:48 +00:00
Caio Oliveira
9cdb90e787 intel/brw: Add opcode to brw_hw_decoded_inst
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>
2024-10-11 04:13:48 +00:00
Caio Oliveira
76e177d87d intel/brw: Create a struct to hold a decoded brw_inst in eu_validation
For now it contains only the "raw" brw_inst.  Later patches will add
useful fields to it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>
2024-10-11 04:13:48 +00:00
Caio Oliveira
382bd4ce36 intel/brw: Add ERROR helper variant that returns to EU validation
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31296>
2024-10-11 04:13:48 +00:00
Caio Oliveira
e1b74407bb intel/brw: Only validate GRF boundary crossing restriction for GRFs
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31294>
2024-09-24 03:39:05 +00:00
Caio Oliveira
31dfb04fd3 intel/brw: Remove long register file names
The long names were originally meant to map to the HW encoding but
nowadays the actual encoding values depend on gfx version, whether
instruction is 3src, etc.

Suggested by Ken.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30704>
2024-08-25 22:08:14 +00:00
Jordan Justen
58469620d3 intel/brw/validate: Convert access mask to be grf based
Our validation code doesn't need to know which bytes are accessed. It
only needs to know which grfs were accessed by an element. This also
helps to easily handle the Xe2 register size change.

Backport-to: 24.2
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28479>
2024-08-02 22:18:51 +00:00
Jordan Justen
e62606b2ec intel/brw/validate: Update dst grf crossing check for Xe2
Rework:
 * Update grf_size_shift calculation (s-b Ken)

Backport-to: 24.2
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28479>
2024-08-02 22:18:51 +00:00
Jordan Justen
f2800deacb intel/brw/validate: Simplify grf span validation check by not using a mask
Previously this check would create a mask of the bytes used in the
grf, and then shift the mask. This worked well when there was 32 bytes
in the register because a 64-bit uint64_t could easily detect that
bytes were used in the next regiter. (The next register was the high
32-bits of the `access_mask` variable.)

With Xe2, the register size becomes 64 bytes, meaning this strategy
doesn't work. Instead of a mask, we can just check to see if more than
1 grfs are used during each loop iteration. (Suggested by Ken.) This
will make it easier to extend for Xe2 in a follow on commit.

Verified this with
dEQP-VK.subgroups.arithmetic.compute.subgroupexclusivemul_u64vec4_requiredsubgroupsize
on Xe2, which otherwise would cause the program to fail to validate
because it assumed a grf was 32 bytes.

Backport-to: 24.2
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28479>
2024-08-02 22:18:51 +00:00
Caio Oliveira
8ba8e33c39 intel/brw: Simplify @file annotations
Doxygen documentation says

> If the file name is omitted (i.e. the line after \file is left
> blank) then the documentation block that contains the \file command will
> belong to the file it is located in.

so we can omit the filename itself when using the annotation.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30168>
2024-07-22 22:48:03 +00:00
Kenneth Graunke
1e69ec3b8d intel/brw: Add a lower_csel pass and allow building it for all types
We can do CSEL on F, HF, *W, and *D on Gfx11+.  Gfx9 can only do F.

We can lower unsupported types to CMP+CSEL, allowing us to use CSEL
in the IR and not worry about the limitations.

Rework: (Sagar)
- Update validation pass for CSEL

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29316>
2024-07-01 19:06:31 +00:00
Kenneth Graunke
f04bb49465 intel/brw: Delete SAD2 and SADA2 opcodes
These were removed with Icelake.  While they technically still exist on
Skylake, which this compiler supports, we have never used these opcodes
in the 14 years we could have done so.  So just scrap them.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29665>
2024-06-10 16:47:50 -07:00
Ian Romanick
504b742b83 intel/brw: Update CSEL source type validation
Gfx9 can only have F, but newer GPUs can have F, HF, *D, or *W. The
source and destination types must still match in size.

v2: Simplify the float vs integer logic. Suggested by Ken.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095>
2024-05-14 01:28:20 +00:00
Kenneth Graunke
545bb8fb6f intel/brw: Replace type_sz and brw_reg_type_to_size with brw_type_size_*
Both of these helpers do the same thing.  We now have brw_type_size_bits
and brw_type_size_bytes and can use whichever makes sense in that place.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
007d891239 intel/brw: Use newer brw_type_is_* shorter names
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
873fcdff38 intel/brw: Stop using long BRW_REGISTER_TYPE enum names
s/BRW_REGISTER_TYPE/BRW_TYPE/g

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
9d8f2c4421 intel/brw: Rework BRW_REGISTER_TYPE's representation semantics
In ancient days, we directly used the hardware register type encodings
throughout the compiler.  As more GPU generations came out, encodings
shifted, and we moved to an abstract enum that we could encode/decode
to a particular GPU's hardware encoding.  But there was no particular
meaning behind any particular value.

One downside to this approach is that we end up with switch statements
galore.  Want to know a type's size?  Switch.  Convert a unsigned type
to a signed one?  Switch.  Get a type with the same base type, but
different bit size?  Switch.  This is both inefficient and inconvenient.

In contrast, nir_alu_type takes a nicer approach - the type encoding has
certain bits representing the base type, and others encoding the size of
the type.  Switching base types or sizes is a simple matter of masking
out the relevant field and substituting a different one.

Tigerlake's encoding adopts a similar approach: two bits represent the
size as a 2-bit unsigned number n, where the bit size is (8 * 2^n).
Two more bits represent the base type.  Past encodings were a bit ad hoc
as new data types were added over time, but Gfx12 is organized (mostly).

This patch converts our brw_reg_type enum over to a new system that's
patterned after the Tigerlake style (for easy conversion) while
deviating in a few ways that make our vector immediate type size
handling simpler.  Should we add additional base types, we're likely
to continue deviating.  Still, converting is much simpler.

Type size calculations (which are performed all the time) are now a
simple mask and shift, instead of a switch.

We also adopt the name BRW_TYPE_* instead of BRW_REGISTER_TYPE_* because
it's much shorter and easier to type.  Similarly, we create new helper
functions named brw_type_* for working with these types, with a cleaner
naming convention.  Legacy names still exist but will we dropped over
the next few patches as pieces get cleaned up.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
c45e235df5 intel/brw: Drop NF type support
Icelake removed the PLN instruction for interpolating fragment shader
inputs, instead adding a special "Native Float" (NF) data type which
was a 66-bit floating point data type that could only be used with the
accumulator.  On Tigerlake, they dropped NF support in favor of just
doing the interpolation with MAD instructions.

We stopped using NF years ago (commit 9ea90aae1e),
instead just using the fs_visitor::lower_linterp() pass to emit MADs.

Since this existed only for a short time, and had very limited utility,
we drop it from the compiler.  One downside is that we can no longer
disassemble Icelake shaders containing NF types properly, but I doubt
anyone really minds.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Ian Romanick
0e817ba548 intel/brw/xe2+: Implement Wa 22016140776
HF sources to math instructions cannot be scalar. This is very similar
to an old Gfx6 restriction on POW, so let's fix it in a similar way.

As an extra bit of saftey, lower any occurances that might slip through
in brw_fs_lower_regioning.

The primary change is to prevent copy propagation from violating the
restriction. With that change, nothing should be able to generate these
invalid source strides. The modification to fs_visitor::validate should
detect potential problems sooner rather than later.

Previous attempts to implement this Wa when emitting the math
instruction (in brw_eu_emit.c gfx6_math) didn't work for several
reasons. The lowering happens after the SWSB pass, so the scoreboarding
was incorrect (thanks to Curro for finding that). In addition, the
lowering happens after register allocation, so it's impossible to
allocate a non-scalar register to expand the scalar value.

Fixes 113 tests in the dEQP-VK.spirv_assembly.* group on LNL.

v2: Add changes to brw_fs_lower_regioning. Suggested by Curro.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28480>
2024-04-04 21:04:09 -07:00
Rohan Garg
3d68dd78d0 intel/eu/validate: Allow SIMD16 for mixed mode float operations on xe2+
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>
2024-04-01 00:00:03 +00:00
Ian Romanick
6d85f7129a intel/brw/xe2+: DPAS must be SIMD16 now
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>
2024-03-29 21:12:32 +00:00
Ian Romanick
cd70e49394 intel/brw: Allow SIMD16 F and HF type conversion moves
On DG2, the lowering generated for these MOV instructions is
**awful**. The original SIMD16 MOV

    { 18}   67: mov(16) vgrf54+0.0:HF, vgrf46+0.0:F NoMask group0

is lowered to SIMD8 MOVs:

    { 18}  118: mov(8) vgrf54+0.0:HF, vgrf46+0.0:F NoMask group0
    { 18}  119: mov(8) vgrf54+0.16:HF, vgrf46+1.0:F NoMask group8

These MOVs violate Gfx12.5 region restrictions, so these are further
lowered:

    { 17}  119: mov(8) vgrf83<2>:HF, vgrf46+0.0:F NoMask group0
    { 19}  120: mov(8) vgrf54+0.0:UW, vgrf83<2>:UW NoMask group0
    { 19}  122: mov(8) vgrf84<2>:HF, vgrf46+1.0:F NoMask group8
    { 19}  123: mov(8) vgrf54+0.16:UW, vgrf84<2>:UW NoMask group8

The shader-db and fossil-db results are nothing to get excited
about. However, the affect on vk_cooperative_matrix_perf is substantial. In one subtest

shader: shaders/shmemfp16.spv

cooperativeMatrixProps = 8x8x16   A = float16_t B = float16_t C = float16_t D = float16_t scope = subgroup
TILE_M=128 TILE_N=128, TILE_K=32 BLayout=0

performance on my DG2 improved by ~60% due to a MASSIVE reduction in spills and fills:

-Native code for unnamed compute shader (null) (src_hash 0x00000000) (sha1 c6a41b1c4e7aa2da327a39a70ed36c822a4b172f)
-SIMD32 shader: 32484 instructions. 1 loops. 1893868 cycles. 737:1820 spills:fills, 442 sends, scheduled with mode none. Promoted 1 constants. Compacted 519744 to 492224 bytes (5%)
-   START B0 (20782 cycles)
+Native code for unnamed compute shader (null) (src_hash 0x00000000) (sha1 621e960daad5b5579b176717f24a315e7ea560a1)
+SIMD32 shader: 23918 instructions. 1 loops. 1089894 cycles. 432:1166 spills:fills, 442 sends, scheduled with mode none. Promoted 1 constants. Compacted 382688 to 353232 bytes (8%)

shader-db:

All Gfx9 and later platforms had similar results. (Meteor Lake shown)
total instructions in shared programs: 19656270 -> 19653981 (-0.01%)
instructions in affected programs: 61810 -> 59521 (-3.70%)
helped: 116 / HURT: 0

total cycles in shared programs: 823368888 -> 823375854 (<.01%)
cycles in affected programs: 1165284 -> 1172250 (0.60%)
helped: 51 / HURT: 57

fossil-db:

DG2 and Meteor Lake had similar results. (Meteor Lake shown)
*** Shaders only in 'before' results are ignored:
fossil-db/steam-dxvk/total_war_warhammer3/2a3ed2ca632a7cb7/fs.32,
fossil-db/steam-dxvk/total_war_warhammer3/18b9d4a3b1961616/fs.32,
fossil-db/steam-dxvk/total_war_warhammer3/04ac9f3146a6db19/fs.32,
fossil-db/steam-dxvk/total_war_warhammer3/f37ebec6aa1b379a/fs.32,
fossil-db/steam-dxvk/total_war_warhammer3/255c987feb0d4310/fs.32, and 25
more
from 1 apps: fossil-db/steam-dxvk/total_war_warhammer3

Totals:
Instrs: 160946537 -> 160928389 (-0.01%); split: -0.01%, +0.00%
Cycles: 14125908620 -> 14125873958 (-0.00%); split: -0.00%, +0.00%

Totals from 1002 (0.15% of 652134) affected shaders:
Instrs: 411261 -> 393113 (-4.41%); split: -4.41%, +0.00%
Cycles: 16676735 -> 16642073 (-0.21%); split: -0.48%, +0.27%

Tiger Lake
Totals:
Instrs: 164511816 -> 164497202 (-0.01%); split: -0.01%, +0.00%
Cycles: 13801675722 -> 13801629397 (-0.00%); split: -0.00%, +0.00%
Subgroup size: 7955168 -> 7955152 (-0.00%)
Send messages: 8544494 -> 8544486 (-0.00%)

Totals from 997 (0.15% of 651454) affected shaders:
Instrs: 460820 -> 446206 (-3.17%); split: -3.17%, +0.00%
Cycles: 16265514 -> 16219189 (-0.28%); split: -0.84%, +0.56%
Subgroup size: 17552 -> 17536 (-0.09%)
Send messages: 26045 -> 26037 (-0.03%)

Ice Lake
Totals:
Instrs: 165504747 -> 165489970 (-0.01%); split: -0.01%, +0.00%
Cycles: 15145244554 -> 15145149627 (-0.00%); split: -0.00%, +0.00%
Subgroup size: 8107032 -> 8107016 (-0.00%)
Send messages: 8598680 -> 8598672 (-0.00%)
Spill count: 45427 -> 45423 (-0.01%)
Fill count: 74749 -> 74747 (-0.00%)

Totals from 1125 (0.17% of 656115) affected shaders:
Instrs: 521676 -> 506899 (-2.83%); split: -2.83%, +0.00%
Cycles: 19555434 -> 19460507 (-0.49%); split: -0.59%, +0.10%
Subgroup size: 21616 -> 21600 (-0.07%)
Send messages: 28623 -> 28615 (-0.03%)
Spill count: 603 -> 599 (-0.66%)
Fill count: 1362 -> 1360 (-0.15%)

Skylake
*** Shaders only in 'after' results are ignored:
fossil-db/steam-native/red_dead_redemption2/cef460b80bad8485/fs.16,
fossil-db/steam-native/red_dead_redemption2/cd5fe081e2e5529d/fs.16
from 1 apps: fossil-db/steam-native/red_dead_redemption2

Totals:
Instrs: 141607617 -> 141593776 (-0.01%); split: -0.01%, +0.00%
Cycles: 14257812441 -> 14257661671 (-0.00%); split: -0.00%, +0.00%
Subgroup size: 7743752 -> 7743736 (-0.00%)
Send messages: 7552728 -> 7552720 (-0.00%)
Spill count: 43660 -> 43661 (+0.00%)
Fill count: 71301 -> 71303 (+0.00%)

Totals from 1017 (0.16% of 636964) affected shaders:
Instrs: 392454 -> 378613 (-3.53%); split: -3.53%, +0.00%
Cycles: 16622974 -> 16472204 (-0.91%); split: -1.04%, +0.13%
Subgroup size: 19840 -> 19824 (-0.08%)
Send messages: 23021 -> 23013 (-0.03%)
Spill count: 484 -> 485 (+0.21%)
Fill count: 1155 -> 1157 (+0.17%)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28281>
2024-03-21 15:12:58 -07:00
Ian Romanick
66dc6e07f5 intel/brw: Fix handling of accumulator register numbers
Folks, there's more than one accumulator. In general, when the
register file is ARF, the upper 4 bits of the register number specify
which ARF, and the lower 4 bits specify which one of that ARF. This
can be further partitioned by the subregister number.

This is already mostly handled correctly for flags register, but lots
of places wanted to check the register number for equality with
BRW_ARF_ACCUMULATOR. If acc1 is ever specified, that won't work.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28281>
2024-03-21 15:12:54 -07:00
Jordan Justen
6922f421f4 intel/compiler: nib_ctrl no longer exists on Xe2+
Ref: cfb34dc695 ("intel/eu/validate: Validate that the ExecSize is a factor of chosen ChanOff")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28191>
2024-03-15 03:01:53 +00:00
Ian Romanick
93478c095e intel/compiler: Enforce 64-bit RepCtrl restriction in eu_validate
For some reason, this wasn't always caught in fs_visitor::validate.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27552>
2024-03-12 21:31:30 +00:00
Kenneth Graunke
a18030305c intel/brw: Delete SIMD4x2 URB opcodes
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27872>
2024-02-29 18:00:14 +00:00
Caio Oliveira
8f3c52c1da intel/brw: Remove MRF type
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>
2024-02-28 05:45:39 +00:00