We always set exec size to 16 for this MOV, but the execution group remains
from the previous emitted instruction. This can cause emitting a group
which violates PRM restriction for ChanOff: "The execution size (ExecSize)
must be a factor of the chosen offset."
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25657>
There are two "Src1.Length" with different formats in "send" description
in the PRMs. One is part of ExMsgDesc, is relevant for LSC SFIDs, and
exists if [ExDesc.IsReg]==false. The other is just a 5-bit immediate,
is relevant for other SFIDs too, and exists if ([ExDesc.IsReg]==true)
AND ([ExBSO]==true).
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25657>
Having the reg set with predication disabled shouldn't cause any problems
during the execution. But when decompiling such instruction the flag won't
be shown in the output, so the recompiling will cause
functionally-identical but binary-different code. Fixing this makes
disasm/asm testing easier.
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25657>
This can be useful for testing i965_disasm and i965_asm by comparing
bin -> asm -> bin results.
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25657>
This removes output like
```
CS SIMD16 shader: 2790 inst, 0 loops, 24804 cycles, 166:106 spills:fills, 35 sends,
scheduled with mode top-down, Promoted 1 constants, compacted 44640 to 41424 bytes.
```
from the default builds. Like other debug output in intel_clc, they can
re-enabled with INTEL_DEBUG=cs.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26939>
Make the build less chatty. The current warnings are about certain
capabilities not being fully supported, which we don't care for these
particular kernels.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26939>
We can limit the AUX-TT requirements to formats supporting CCS.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26890>
Meteorlake shipped with the b0 stepping. Remove fixes for hardware
bugs that were corrected prior to the platform release.
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26898>
INTEL_NEEDS_WA macros are valid when a workaround applies to all
platforms which have the GFX_VERx10 versions for the workaround.
Some workarounds were fixed at a stepping after the platform release.
If a workaround applies partially to any platform, then GFX_VERx10
cannot be used to correctly apply the workaround.
This change invalidates INTEL_NEEDS_WA_16014538804 and
INTEL_NEEDS_WA_22014412737, which were fixed for MTL platforms at
stepping b0. The run-time checks were already present for all uses of
these macros. Updating the poisoned macros to INTEL_WA_{num}_GFX_VER
compiles out the run-time checks on platforms where they cannot apply.
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26898>
In practice it seems we are always entering here, haven't looked
in detail whether at this point we could just assert. But for now
only allocate a new acp_entry if we are going to add it.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25670>
On Xe2+, we don't build the SIMD8 shader so this check makes sure we
don't execute the uninitialized invocations.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26886>
Gen5 adds some boolean conversion instructions after nir emits,
but that nir srcs don't line up with them, so reemit the boolean
conversion if we reemit the inot.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 31b5f5a51f ("nir/opt_if: Simplify if's with general conditions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26782>
This removed a bunch of calls from the vec4 code that aren't called anywhere else.
Bring back the bits that were removed.
Fixes glxgears on gen5
Fixes: 81594d0db1 ("intel/compiler: Move earlier scheduler code that is not mode-specific")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26862>
These macro define is compute from literals, so use ALIGN_POT instead of ALIGN function
so that it's can be computed at compile time
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26864>
align is a function and when we want use it, the align variable will shadow it
So replace it with other names
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26864>
Xe2 allows route of LD messages from Sampler to LSC to improve
performance when some restrictions are met.
BSpec: 57023
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26801>
The indirect BO of clear color is also removed along with clear value
address and its enabling.
Other delta in struct RENDER_SURFACE_STATE are deferred to their
functional enabling changes.
Signed-off-by: Zhang, Jianxun <jianxun.zhang@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26801>
This will allow us to use it in Xe2 genxml.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26801>
This will allow us to use it in Xe2 genxml.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26801>
On Xe2+, PIPELINE_SELECT is getting deprecated (Bspec 55860), as a
result we don't have to do the stalling flushes while switching between
different pipelines.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26637>
When the source destination index is a constant, we can avoid generating
a lot of the intermediate code. At the very least, this makes initial
NIR dumps much easier to read.
v2: Simplify tracking of dst_index. Suggested by Caio.
Suggested-by: Caio
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
Gfx12.5 (DG2) will use DPAS instructions to accelerate the
implementation. Earlier platforms will use equivalent discrete
instructions (basically subgroup operations). Gfx12 (Tigerlake) will use
DP4A for 8-bit integer matrix multiplication. Older platforms, which
lack DP4A, will use a suboptimal instruction sequence. There is plenty
of room for improvement here.
On DG2 (Gfx12.5) gets the following results from the CTS:
Test run totals:
Passed: 1642/13982 (11.7%)
Failed: 0/13982 (0.0%)
Not supported: 12340/13982 (88.3%)
Warnings: 0/13982 (0.0%)
Waived: 0/13982 (0.0%)
On DG2 (Gfx12.5) with forced lowering, Raptor Lake (Gfx12) and Ice Lake
(Gfx11):
Test run totals:
Passed: 1662/13982 (11.9%)
Failed: 0/13982 (0.0%)
Not supported: 12320/13982 (88.1%)
Warnings: 0/13982 (0.0%)
Waived: 0/13982 (0.0%)
The difference in the number of tests run is due to
saturatingAccumulation not being set on DG2 when DPAS is used. There is
a comment in "intel/dev: Advertise integer configs with
saturatingAccumulation too" that explains how this could be added should
the need arise.
v2: Prefix type names with INTEL_CMAT_. Suggested by Lionel.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
VUID-RuntimeSpirv-saturatingAccumulation-08983 says:
For OpCooperativeMatrixMulAddKHR, the SaturatingAccumulation
cooperative matrix operand must be present if and only if
VkCooperativeMatrixPropertiesKHR::saturatingAccumulation is VK_TRUE.
As a result, we have to advertise integer configs both with and without
this flag set.
v2: Prefix type names with INTEL_CMAT_. Suggested by Lionel.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
The commit is a little ugly. The definition of anv_fixup_subgroup_size
is moved before the added call site. In addition, the bit starting at
the "Cooperative matrix extension requires..." comment is added.
v2: Dramatic simplification of SIMD selection. Suggested by Caio.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
Set the flag on compute shaders when the application has enabled the
cooperative matrix feature. We might still want to enable this only when
DPAS is actually used. The current method is based on many suggestions
from Lionel.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>