Commit graph

3001 commits

Author SHA1 Message Date
Ian Romanick
6f237a23c7 intel/compiler: Track lower_dpas flag in brw_get_compiler_config_value
This user-settable flag affects compiler output, so it should be tracked
in the cache hash.

Fixes: 3756f60558 ("intel/fs: DPAS lowering")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Lionel Landwerlin
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26993>
2024-01-18 19:20:12 +00:00
Ian Romanick
2741c6464c intel/compiler: Use u_foreach_bit64 in brw_get_compiler_config_value
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Lionel Landwerlin
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26993>
2024-01-18 19:20:12 +00:00
Ian Romanick
951e08fc18 intel/compiler: Disable DPAS instructions on MTL
Reviewed-by: Mark Janes <markjanes@swizzler.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3756f60558 ("intel/fs: DPAS lowering")
Closes: #10376
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26993>
2024-01-18 19:20:12 +00:00
Vinson Lee
73835874a8 intel/disasm: Remove duplicate variable reg_file
Fix defects reported by Coverity Scan.

Evaluation order violation (EVALUATION_ORDER)
write_write_typo: In reg_file = reg_file = brw_inst_dpas_3src_dst_reg_file(devinfo, inst),
reg_file is written twice with the same value.

Fixes: 1c92dad5cb ("intel/disasm: Disassembly support for DPAS")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27056>
2024-01-15 07:46:12 +00:00
Caio Oliveira
1a31970946 intel/compiler/xe2: Implement instruction compaction for DPAS.
These use different tables but map to the same bits, so it is just
a matter of picking the right tables for the instruction.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26860>
2024-01-12 20:18:03 +00:00
Francisco Jerez
6e56a4b474 intel/compiler/xe2: Fix for the removal of AccWrCtrl.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26860>
2024-01-12 20:18:03 +00:00
Francisco Jerez
7f39e51dd5 intel/compiler/xe2: Add extra flag registers.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26860>
2024-01-12 20:18:03 +00:00
Francisco Jerez
f974eacab3 intel/compiler/xe2: Fix for the removal of most predication modes.
Reworks:
* Remove changes to fixup_nomask workaround since it applies only for
  Gfx12 family.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26860>
2024-01-12 20:18:03 +00:00
Francisco Jerez
f79123e1d9 intel/compiler/xe2: Fix for NibCtrl field removal.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26860>
2024-01-12 20:18:03 +00:00
Francisco Jerez
7db3f0b1c1 intel/compiler/xe2: Implement instruction compaction.
Reworks:
* Handle DPAS in has_3src_unmapped_bits.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26860>
2024-01-12 20:18:03 +00:00
Francisco Jerez
57ba9c176c intel/compiler/xe2: Implement codegen of compact instructions.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26860>
2024-01-12 20:18:03 +00:00
Francisco Jerez
d8ba1d63bc intel/compiler: Add assume() checks to brw_compact_inst_(set_)bits().
Similar to the preconditions of brw_inst_(set_)bits().

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26860>
2024-01-12 20:18:03 +00:00
Francisco Jerez
4a24f49b57 intel/compiler/xe2: Implement codegen of three-source instructions.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26860>
2024-01-12 20:18:03 +00:00
Francisco Jerez
e10e7d5aa3 intel/compiler/xe2: Implement codegen of indirect immediates.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26860>
2024-01-12 20:18:03 +00:00
Francisco Jerez
294bdbb253 intel/compiler/xe2: Implement codegen of 2-source instruction operands.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26860>
2024-01-12 20:18:03 +00:00
Francisco Jerez
72bbfa8e8d intel/compiler/xe2: Implement codegen of general instruction controls.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26860>
2024-01-12 20:18:03 +00:00
Francisco Jerez
066e6c6234 intel/compiler/xe2: Add Xe2 bounds to FF() macro.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26860>
2024-01-12 20:18:03 +00:00
Francisco Jerez
ae29ffb637 intel/eu/gfx12.5+: Don't fail validation with ARF register restriction error for indirect addressing.
The "file" field doesn't exist for indirect operands, so it contains
garbage.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26994>
2024-01-12 00:20:38 +00:00
Francisco Jerez
32b3ea3c3d intel/eu/validate: SEND instructions don't have immediate encodings on Gen12+.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26994>
2024-01-12 00:20:38 +00:00
Francisco Jerez
dfb034853a intel/fs: Use full 32-bit sample masks when immediate.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26994>
2024-01-12 00:20:38 +00:00
Sviatoslav Peleshko
98665e024f intel/tools/i965_asm: Handle sync instruction
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25657>
2024-01-09 11:35:52 +00:00
Sviatoslav Peleshko
cfb34dc695 intel/eu/validate: Validate that the ExecSize is a factor of chosen ChanOff
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25657>
2024-01-09 11:35:52 +00:00
Sviatoslav Peleshko
dbf6f0291a intel/fs: Set group 0 for Wa_14010017096 MOV instruction
We always set exec size to 16 for this MOV, but the execution group remains
from the previous emitted instruction. This can cause emitting a group
which violates PRM restriction for ChanOff: "The execution size (ExecSize)
must be a factor of the chosen offset."

Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25657>
2024-01-09 11:35:52 +00:00
Sviatoslav Peleshko
173a991405 intel/disasm: Print src1_len correctly depending on ExDesc type
There are two "Src1.Length" with different formats in "send" description
in the PRMs. One is part of ExMsgDesc, is relevant for LSC SFIDs, and
exists if [ExDesc.IsReg]==false. The other is just a 5-bit immediate,
is relevant for other SFIDs too, and exists if ([ExDesc.IsReg]==true)
AND ([ExBSO]==true).

Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25657>
2024-01-09 11:35:52 +00:00
Sviatoslav Peleshko
b5c0b90402 intel/compiler: Set flag reg to 0 when disabling predication
Having the reg set with predication disabled shouldn't cause any problems
during the execution. But when decompiling such instruction the flag won't
be shown in the output, so the recompiling will cause
functionally-identical but binary-different code. Fixing this makes
disasm/asm testing easier.

Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25657>
2024-01-09 11:35:52 +00:00
Sviatoslav Peleshko
a129e136de intel/disasm: Print half-float values instead of placeholder
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25657>
2024-01-09 11:35:52 +00:00
Sviatoslav Peleshko
4f41c44df2 intel/compiler: Add variable to dump binaries of all compiled shaders
This can be useful for testing i965_disasm and i965_asm by comparing
bin -> asm -> bin results.

Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25657>
2024-01-09 11:35:51 +00:00
Caio Oliveira
ef88a20d96 intel/compiler: Use INTEL_DEBUG=cs to ask for brw_compiler output
This removes output like

```
CS SIMD16 shader: 2790 inst, 0 loops, 24804 cycles, 166:106 spills:fills, 35 sends,
  scheduled with mode top-down, Promoted 1 constants, compacted 44640 to 41424 bytes.
```

from the default builds.  Like other debug output in intel_clc, they can
re-enabled with INTEL_DEBUG=cs.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26939>
2024-01-09 01:26:41 +00:00
Lionel Landwerlin
4b30b46ffd intel/fs: fix depth compute state for unchanged depth layout
There is no VK CTS exercising this case. If there was we would run
into hangs as noticed in
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26876

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26923>
2024-01-08 17:28:12 +00:00
Caio Oliveira
77f4f3112d intel/fs: Use linear allocator in fs_live_variables
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25670>
2024-01-04 23:06:07 +00:00
Caio Oliveira
b5cd91501d intel/fs: Use linear allocator in opt_copy_propagation
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25670>
2024-01-04 23:06:07 +00:00
Caio Oliveira
6d2503e935 intel/fs: Only allocate acp_entry if we are adding one
In practice it seems we are always entering here, haven't looked
in detail whether at this point we could just assert.  But for now
only allocate a new acp_entry if we are going to add it.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25670>
2024-01-04 23:06:07 +00:00
Sagar Ghuge
96e0d979a7 intel/fs: Check fs_visitor instance before using it
On Xe2+, we don't build the SIMD8 shader so this check makes sure we
don't execute the uninitialized invocations.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26886>
2024-01-04 22:24:07 +00:00
Dave Airlie
56a72e014f intel/compiler: reemit boolean resolve for inverted if on gen5
Gen5 adds some boolean conversion instructions after nir emits,
but that nir srcs don't line up with them, so reemit the boolean
conversion if we reemit the inot.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 31b5f5a51f ("nir/opt_if: Simplify if's with general conditions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26782>
2024-01-04 21:27:23 +00:00
Dave Airlie
8f73cc802c intel/compiler: revert part of "Move earlier scheduler code that is not mode-specific"
This removed a bunch of calls from the vec4 code that aren't called anywhere else.

Bring back the bits that were removed.

Fixes glxgears on gen5

Fixes: 81594d0db1 ("intel/compiler: Move earlier scheduler code that is not mode-specific")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26862>
2024-01-04 00:38:38 +00:00
Dave Airlie
37366fef68 intel/compiler: fix release build unused variable.
This is only used in an assert.

Fixes: 158ac265df ("intel/fs: Make helpers for saving/restoring instruction order")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26863>
2024-01-03 23:52:11 +00:00
Daniel Schürmann
a3ed36da1a treewide: replace calls to nir_opt_trivial_continues() with nir_opt_loop()
Totals from 850 (1.11% of 76636) affected shaders: (RADV, GFX11)
MaxWaves: 18134 -> 18130 (-0.02%)
Instrs: 3011298 -> 3008585 (-0.09%); split: -0.17%, +0.08%
CodeSize: 15836804 -> 15841972 (+0.03%); split: -0.09%, +0.12%
VGPRs: 63580 -> 63604 (+0.04%)
SpillSGPRs: 966 -> 1148 (+18.84%); split: -0.83%, +19.67%
Latency: 36102291 -> 30186144 (-16.39%); split: -16.41%, +0.02%
InvThroughput: 9058100 -> 7011821 (-22.59%); split: -22.61%, +0.02%
VClause: 65369 -> 65364 (-0.01%); split: -0.03%, +0.02%
SClause: 100309 -> 100305 (-0.00%); split: -0.04%, +0.04%
Copies: 335658 -> 336472 (+0.24%); split: -0.70%, +0.94%
Branches: 110806 -> 108945 (-1.68%); split: -1.94%, +0.26%
PreSGPRs: 73476 -> 73934 (+0.62%); split: -0.25%, +0.87%
PreVGPRs: 58809 -> 58840 (+0.05%); split: -0.01%, +0.06%

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24940>
2024-01-03 20:48:04 +00:00
Yonggang Luo
8665ce27bc intel: Use ALIGN_POT instead of ALIGN inside macro define
These macro define is compute from literals, so use ALIGN_POT instead of ALIGN function
so that it's can be computed at compile time

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26864>
2024-01-03 12:46:10 +00:00
Mark Janes
188c349e51 intel: remove workaround for preproduction DG2 steppings
DG2_G10 was released with stepping C0.
DG2_G11 was released with stepping B1.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26845>
2024-01-02 16:06:37 -08:00
Ian Romanick
2e75d71c1f intel/cmat: Generate better code for nir_intrinsic_cmat_insert
When the source destination index is a constant, we can avoid generating
a lot of the intermediate code. At the very least, this makes initial
NIR dumps much easier to read.

v2: Simplify tracking of dst_index. Suggested by Caio.

Suggested-by: Caio
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
2023-12-29 20:28:54 -08:00
Ian Romanick
7bfbeb79a7 anv: Set COMPUTE_WALKER systolic mode enable flag
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
2023-12-29 20:28:54 -08:00
Ian Romanick
6b14da33ad intel/fs: nir: Add nir_intrinsic_dpas_intel
v2: Fix parameter order in nir_intrinsic_dpas_intel to DPAS conversion.

v3: Fix float16 destination DPAS on DG2.

v4: Use nir_component_mask(...) instead of 0xffff. Suggested by Caio.

v5: Rebase on !26323.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
2023-12-29 20:28:43 -08:00
Ian Romanick
3756f60558 intel/fs: DPAS lowering
Implements integer dot product lowering both with and without
DP4A. Implements half-float dot product lowering.

There are a couple FINISHME comments describing future optimizations.

v2: Add a brw_compiler::lower_dpas flag to track when the lowering
should be applied.

v3: Use is_null() instead of checking file != ARF. Suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
2023-12-29 20:27:15 -08:00
Ian Romanick
3cb9625539 intel/fs: Fix scoreboarding for DPAS
v2: Remove all mention of DPASW. Suggested by Curro and Caio.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
2023-12-29 20:27:15 -08:00
Ian Romanick
eb1f19d7bf intel/compiler: Validation for DPAS instructions
v2: s/regiser/register/g in messages. Noticed by Caio. Add more context
to the sub-byte precision error message. Suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
2023-12-29 20:27:15 -08:00
Ian Romanick
1c92dad5cb intel/disasm: Disassembly support for DPAS
v2: Fix regioning in src[012]_dpas_3src. Noticed by Caio. Treat DPAS as
unordered. Suggested by Curro.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
2023-12-29 20:27:13 -08:00
Ian Romanick
e666872c75 intel/compiler: Initial bits for DPAS instruction
v2: Add brw_ir_performance.cpp and brw_fs_generator.cpp changes. Fix
overlapping register allocation (via has_source_and_destination_hazard). Fix
incorrect destination register file encoding.

v3: Prevent lower_regioning from trying to "fix" DPAS sources.

v4: Add instruction latency information for scheduling and perf
estimates.

v5: Remove all mention of DPASW. Suggested by Curro and Caio. Update
the comment in fs_inst::has_source_and_destination_hazard. Suggested
by Caio.

v6: Add some comments near the src2 calculation in
fs_inst::size_read. Suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
2023-12-29 20:24:16 -08:00
Ian Romanick
3a35f8b29b intel/cmat: Lower cmat_load and cmat_store
v2: Add support for non-constant stride.

v3: Explain B matrices (a little bit) in
get_slice_type_from_desc. Suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
2023-12-29 20:24:16 -08:00
Ian Romanick
502be565da intel/cmat: Add lowering for cmat_bitcast
v2: Use nir_component_mask(...) instead of 0xffff. Assert that source
and destination are same size. Both suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
2023-12-29 20:24:15 -08:00
Ian Romanick
7303315a8b intel/cmat: Enable packed formats for scalar ops
v2: Use nir_pack_bits and nir_unpack_bits to simplify coop_scalar
handling. This saved 13 lines of code.

v3: Allow packing factor 2 and packing factor 1 elements be stored in
16-bit integers.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
2023-12-29 20:24:15 -08:00