Commit graph

12995 commits

Author SHA1 Message Date
Lina Versace
9f2e62e2d7 anv: Fix feature pipelineProtectedAccess
We enable VK_EXT_pipeline_protected_access only if
anv_physical_device::has_protected_contexts. Therefore we should do the
same for vk_features::pipelineProtectedAccess.

Fixes: 0b5408f ("anv: expose VK_EXT_pipeline_protected_access")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32206>
(cherry picked from commit 56116c4da5)
2024-11-21 09:13:28 -08:00
Lionel Landwerlin
0bf0f66c9e anv: prevent access to destroyed vk_sync objects post submission
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 36ea90a361 ("anv: Convert to the common sync and submit framework")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12145
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
(cherry picked from commit 9b779068c3)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
2024-11-19 14:29:11 -08:00
Francisco Jerez
f35c690b12 intel/fs/xe2: Fix up subdword integer region restriction with strided byte src and packed byte dst.
This fixes a corner case of the LNL sub-dword integer restrictions
that wasn't being detected by has_subdword_integer_region_restriction(),
specifically:

> if(Src.Type==Byte && Dst.Type==Byte && Dst.Stride==1 && W!=2) {
>    // ...
>    if(Src.Stride == 2) && (Src.UniformStride) && (Dst.SubReg%32  ==  Src.SubReg/2 ) { Allowed }
>    // ...
> }

All the other restrictions that require agreement between the SubReg
number of source and destination only affect sources with a stride
greater than a dword, which is why
has_subdword_integer_region_restriction() was returning false except
when "byte_stride(srcs[i]) >= 4" evaluated to true, but as implied by
the pseudocode above, in the particular case of a packed byte
destination, the restriction applies for source strides as narrow as
2B.

The form of the equation that relates the subreg numbers is consistent
with the existing calculations in brw_fs_lower_regioning (see
required_src_byte_offset()), we just need to enable lowering for this
corner case, and change lower_dst_region() to call lower_instruction()
recursively, since some of the cases where we break this restriction
are copy instructions introduced by brw_fs_lower_regioning() itself
trying to lower other instructions with byte destinations.

This fixes some Vulkan CTS test-cases that were hitting these
restrictions with byte data types.

Fixes: 217d412360 ("intel/fs/gfx20+: Implement sub-dword integer regioning restrictions.")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
2024-11-19 14:28:55 -08:00
Lionel Landwerlin
7dc34f1147 anv: fix missing push constant reallocation
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 62d96a6546 ("anv: add dirty tracking for push constant data")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12151
(cherry picked from commit 8845255881)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
2024-11-18 09:40:16 -08:00
Kenneth Graunke
8e45bd6365 brw: Fix try_rebuild_source's ult32/ushr handling to use unsigned types
We were accidentally doing a signed integer comparison here for ult32,
or a sign-extending shift for ushr.

One notable bit of fallout was that load_global_uniform_block_intel
address calculations broke on platforms that don't have native 64-bit
integer support, as the iadd64 lowering for "do I need to carry?" was
using ult32...and performing the wrong comparison.  We spotted this in
Borderlands 3 on Alchemist once we turned on other optimizations.

Thanks to Lionel Landwerlin for helping spot the problem!

Fixes: c7b312ad45 ("brw: factor out source extraction for rematerialization")
Fixes: 339630ab05 ("brw: enable A64 loads source rematerialization")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 5848035443)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
2024-11-18 09:40:15 -08:00
Lionel Landwerlin
d857c4a418 anv: fix incorrect aspect flag for depth/stencil formats
We're asking if compression is supported and
anv_formats_ccs_e_compatible() is assuming color aspect.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0317c44872 ("anv: add VK_EXT_host_image_copy support")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12155
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
(cherry picked from commit 431f353bfe)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
2024-11-18 09:40:12 -08:00
Iván Briano
232c6b2d8e anv: remove unused/misleading/wrong parameters from the RT trampoline
Since the shader parameters are passed as inline data, push constants
are no longer used and so, not actually set on dispatch. But the
nr_params = 4 was still making the shader emit the code to load them,
causing page faults on simulation, and would also on HW if we didn't
always have a scratch page set.

The uses_inline_data parameter will be set from brw_compile_cs(), called
shortly after this point, so we don't need it here.

The subgroup_size is misleading, as we don't actually require that size
and the code that checks for it isn't even running for this shader.

Fixes: 97b17aa0b1 ("brw/nir: rework inline_data_intel to work with compute")

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12152

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit d32a26b3e6)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
2024-11-18 09:39:58 -08:00
Lionel Landwerlin
9c55d78353 brw: allocate physical register sizes for spilling
All of the spilling code should work with physical register units
because for example SEND messages will expect a physical register as
destination.

So always allocate a full physical register for the spilled/unspilled
values and adjust the offsets of the registers to physical sizes too.

Cc: mesa-stable
Fixes: aa494cba ("brw: align spilling offsets to physical register sizes")
Closes: mesa/mesa#11967

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Found-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit a21cd8c5b6)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32261>
2024-11-14 09:31:39 -08:00
Lionel Landwerlin
97d974a3ad anv: update shader descriptor resource limits
Some limits got stuck to the old binding table limits. Those don't
apply anymore since EXT_descriptor_indexing was implemented.

Fixes: 6e230d7607 ("anv: Implement VK_EXT_descriptor_indexing")
Fixes: 96c33fb027 ("anv: enable direct descriptors on platforms with extended bindless offset")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
(cherry picked from commit d6acb56f11)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-13 08:30:56 -08:00
Matt Turner
739c3615ce anv: Align anv_descriptor_pool::host_mem
Otherwise anv_descriptor_set is accessed through an unaligned pointer,
which is undefined behavior in C.

```
anv_descriptor_set.c:1620:17: runtime error: member access within misaligned address 0x61900002c2b5
               for type 'struct anv_descriptor_set', which requires 8 byte alignment 0x61900002c2b5
```

Fixes: 2570a58bcd ("anv: Implement descriptor pools")
(cherry picked from commit a2c4a34303)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-12 09:12:17 -08:00
Iván Briano
ea9b3f928d intel/rt: fix ray_query stack address calculation
While the documentation says to use NUM_SIMD_LANES_PER_DSS for the stack
address calculation, what the HW actually uses is
NUM_SYNC_STACKID_PER_DSS. The former may vary depending on the platform,
while the latter is fixed to 2048 for all current platforms.

Fixes: 6c84cbd8c9 ("intel/dev/xe: Set max_eus_per_subslice using topology query")

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit aee04bf4fb)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-12 09:12:11 -08:00
Ian Romanick
7994534fe9 brw/cse: Don't eliminate instructions that write flags
With other changes in my tree, I observed this code from
dEQP-VK.subgroups.vote.compute.subgroupallequal_float have the second
cmp.z removed.

    undef(8) %69:UD
    cmp.z.f0.0(8) %69:F, %37:F, %57+0.0<0>:F
    mov(1) v58+0.0:D, 0d NoMask group0
    (+f0.0) mov(1) v58+0.0:D, -1d NoMask group0
    cmp.nz.f0.0(8) null:D, v58+0.0<0>:D, 0d
    ...
    undef(8) %72:UD
    cmp.z.f0.0(8) %72:F, %37:F, %57+0.0<0>:F
    mov(1) v63+0.0:D, 0d NoMask group0
    (+f0.0) mov(1) v63+0.0:D, -1d NoMask group0

This was also fixed by running dead-code elimination before CSE. That
seems more like avoiding the problem than fixing it, though.

I believe this affects shader-db results because leaving the second
CMP in the shader can give more opportunities for cmod propagation.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: 234c45c929 ("intel/brw: Write a new global CSE pass that works on defs")

shader-db:

All Intel platforms had similar results. (Lunar Lake shown)
total cycles in shared programs: 922097690 -> 922260862 (0.02%)
cycles in affected programs: 3178926 -> 3342098 (5.13%)
helped: 130
HURT: 88
helped stats (abs) min: 2 max: 2194 x̄: 296.71 x̃: 16
helped stats (rel) min: <.01% max: 16.56% x̄: 1.86% x̃: 0.18%
HURT stats (abs)   min: 4 max: 11992 x̄: 2292.55 x̃: 47
HURT stats (rel)   min: 0.04% max: 57.32% x̄: 11.82% x̃: 0.61%
95% mean confidence interval for cycles value: 320.36 1176.63
95% mean confidence interval for cycles %-change: 1.59% 5.73%
Cycles are HURT.

LOST:   2
GAINED: 1

fossil-db:

Lunar Lake, Meteor Lake, Tiger Lake had similar results. (Lunar Lake shown)
Totals:
Instrs: 142022960 -> 142022928 (-0.00%); split: -0.00%, +0.00%
Cycle count: 21995242782 -> 21995384040 (+0.00%); split: -0.00%, +0.00%
Max live registers: 48013385 -> 48013343 (-0.00%)

Totals from 507 (0.09% of 551441) affected shaders:
Instrs: 886191 -> 886159 (-0.00%); split: -0.01%, +0.01%
Cycle count: 69302492 -> 69443750 (+0.20%); split: -0.66%, +0.86%
Max live registers: 94413 -> 94371 (-0.04%)

DG2
Totals:
Instrs: 152856370 -> 152856093 (-0.00%); split: -0.00%, +0.00%
Cycle count: 17237159885 -> 17236804052 (-0.00%); split: -0.00%, +0.00%
Fill count: 150673 -> 150631 (-0.03%)
Max live registers: 31871520 -> 31871476 (-0.00%)

Totals from 506 (0.08% of 633197) affected shaders:
Instrs: 831795 -> 831518 (-0.03%); split: -0.04%, +0.01%
Cycle count: 55578509 -> 55222676 (-0.64%); split: -1.38%, +0.74%
Fill count: 2779 -> 2737 (-1.51%)
Max live registers: 51383 -> 51339 (-0.09%)

Ice Lake and Skylake had similar results. (Ice Lake shown)
Totals:
Instrs: 152017826 -> 152017793 (-0.00%); split: -0.00%, +0.00%
Cycle count: 15180773451 -> 15180761166 (-0.00%); split: -0.00%, +0.00%
Fill count: 106610 -> 106614 (+0.00%)
Max live registers: 32195006 -> 32194966 (-0.00%)

Totals from 411 (0.06% of 637268) affected shaders:
Instrs: 705935 -> 705902 (-0.00%); split: -0.01%, +0.01%
Cycle count: 47830019 -> 47817734 (-0.03%); split: -0.05%, +0.02%
Fill count: 2865 -> 2869 (+0.14%)
Max live registers: 42883 -> 42843 (-0.09%)

(cherry picked from commit 9aba731d03)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-12 09:12:10 -08:00
Ian Romanick
1e792b0933 brw/copy: Don't copy propagate through smaller entry dest size
Copy propagation would incorrectly occur in this code

    mov(16) v4+2.0:UW, u0<0>:UW NoMask
    ...
    mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0

to create

    mov(16) v4+2.0:UW, u0<0>:UW NoMask
    ...
    mov(8) v6+2.0:UD, u0<0>:UD NoMask group0

This has different behavior. I think I just made a mistake when I
changed this condition in e3f502e007.

It seems like this condition could be relaxed to cover cases like (note
the change of destination stride)

    mov(16) v4+2.0<2>:UW, u0<0>:UW NoMask
    ...
    mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0

I'm not sure it's worth it.

No shader-db or fossil-db changes on any Intel platform. Even the code
for the test case mentioned in the original commit did not change.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: e3f502e007 ("intel/fs: Allow copy propagation between MOVs of mixed sizes")
Closes: #12116
(cherry picked from commit 80a5d158ae)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-12 09:12:07 -08:00
Ian Romanick
8f53de4a5d brw/emit: Add correct 3-source instruction assertions for each platform
Specifically, allow two immediate sources for BFE on Gfx12+. I stumbled
on this while trying some stuff with !31852.

v2: Don't be lazy. Add proper assertions for all the things on all the
platforms. Based on a suggestion by Ken.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: 7bed11fbde ("intel/brw: Allow immediates in the BFE instruction on Gfx12+")
(cherry picked from commit c1c09e3c4a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-08 10:03:27 -08:00
Lionel Landwerlin
1ab129ba70 anv: fix extent computation in image->image host copies
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0317c44872 ("anv: add VK_EXT_host_image_copy support")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 3ecf2a0518)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
2024-11-08 10:03:23 -08:00
Felix DeGrood
bf96702985 intel/measure: increase size of filename malloc to account for \0
Corrects regression caused by prior commit that created memory
overwrite by not mallocing enough space for filename string.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32013>
2024-11-06 22:12:29 +00:00
Lionel Landwerlin
0ab2849597 anv: move pipe control debug to anv_util.c
We're going to add more printing.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31928>
2024-11-06 12:20:23 +00:00
Lionel Landwerlin
b5403a4e40 anv: fix indentation
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31928>
2024-11-06 12:20:23 +00:00
Lionel Landwerlin
f9e76e8ca6 anv: add texture cache inval after binding pool update
Cc: mesa-stable
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31928>
2024-11-06 12:20:22 +00:00
Lionel Landwerlin
b3f487bd0d anv: fix even set/reset on blitter engine
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31928>
2024-11-06 12:20:22 +00:00
Matt Turner
5068a6b4ce anv: Set shader_spilling_rate=11
This has the best fossil-db results across in a sweep from 0..15.

fossil-db results on Alderlake:

Instructions in all programs: 152849904 -> 152824116 (-0.0%)
SENDs in all programs: 7677830 -> 7677830 (+0.0%)
Loops in all programs: 48470 -> 48470 (+0.0%)
Cycles in all programs: 11988670382 -> 11987530942 (-0.0%)
Spills in all programs: 42863 -> 41777 (-2.5%)
Fills in all programs: 77114 -> 73044 (-5.3%)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31990>
2024-11-06 02:47:26 +00:00
Kenneth Graunke
22b511ef02 intel: Set shader_spilling_rate=11 in intel_clc
A while back Matt enabled shader_spilling_rate by default for anv.
But intel_clc doesn't use the driconf mechanism that we use there.

The GRL shaders spill a lot, and with us now compiling additional
generations of the shaders, Mesa build time is getting prohibitively
expensive.  By setting this, we drop the time taken for a clean debug
build by approximately 35% on my current laptop.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31993>
2024-11-06 01:57:10 +00:00
José Roberto de Souza
a991935088 anv: Enable perf metrics id set syncronization
Now actually making use of new Xe KMD OA syncronization uAPI.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31283>
2024-11-05 19:25:53 +00:00
José Roberto de Souza
953abc7d1e intel/perf: Add INTEL_PERF_FEATURE_METRIC_SYNC and check if KMD supports it
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31283>
2024-11-05 19:25:53 +00:00
José Roberto de Souza
a38a98c4cb intel/perf: Extend intel_perf_stream_set_metrics_id() to syncronize metrics id changes
Xe KMD added a uAPI to syncronze metrics id changes, so we can make
it wait for all previous workloads in exec_queue and all previous
metrics id changes to finish before start change it again.
This should make Vulkan queries more robust.

So this makes use of intel_bind_timeline to syncronize the metrics id
changes and xe_queue_get_syncobj_for_idle() to syncronize with
exec_queue.

As i915 and some versions of Xe KMD will not support it, this feature
will only be used then intel_bind_timeline parameter is not NULL and
timeline has a valid syncobj id.
At this patch level all callers will set it to NULL, next patch will
add and initialize timeline in ANV when supported by Xe KMD.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31283>
2024-11-05 19:25:53 +00:00
José Roberto de Souza
27fef94851 intel/perf: Add OA support to ARL
ARL has enough differences in OA files to have its own set of files.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31685>
2024-11-05 14:56:49 +00:00
itycodes
10c92cbd39 intel: Fix a typo in intel_device_info.c:has_get_tiling
The structs are of equal size and both ioctls were added at the same
time, so the functionality is equivalent, but it's nonetheless the
incorrect type being passed.

Signed-off-by: tranquillitycodes@proton.me
Fixes: 762e601f77
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31974>
2024-11-05 04:31:50 +01:00
Felix DeGrood
99e8502013 intel/measure: defer file open until first write
Fixes abort on steam.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31938>
2024-11-04 20:25:14 +00:00
Felix DeGrood
f345019830 intel/measure: add nogl feature
Do not trigger INTEL_MEASURE for ogl apps with INTEL_MEASURE=nogl

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31938>
2024-11-04 20:25:14 +00:00
Sviatoslav Peleshko
3a962a28e7 intel/elk_asm: Add BranchCtrl support
We emit it for gfx8, so the assembler should support it too.

Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31747>
2024-11-02 18:01:20 +00:00
Sviatoslav Peleshko
cd4c328408 intel/elk: List all instructions that have BranchCtrl bit
Previously this bit was not clearly documented in PRMs, but gfx12 PRMs
finally list all the instructions where it is present.

Although it's unclear if it's functional for anything other than "if",
"else", and "goto", we probably still should acknowledge its existence
in other instructions.

Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31747>
2024-11-02 18:01:20 +00:00
Sviatoslav Peleshko
445df8d611 intel/brw_asm: Add BranchCtrl support
We emit it for gfx9, so the assembler should support it too.

Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31747>
2024-11-02 18:01:19 +00:00
Sviatoslav Peleshko
aea7366613 intel/brw: List all instructions that have BranchCtrl bit
Previously this bit was not clearly documented in PRMs, but gfx12 PRMs
finally list all the instructions where it is present.

Although it's unclear if it's functional for anything other than "if",
"else", and "goto", we probably still should acknowledge its existence
in other instructions.

Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31747>
2024-11-02 18:01:19 +00:00
Paulo Zanoni
5ca883505e brw: add a NOP in between WHILE instructions on LNL
This is a workaround that is still in progress, see HSD 22020521218.
If we don't have these NOPs we may see rendering corruption or even
GPU hangs.

While we still don't fully understand the issue from the hardware
point of view, let's have this workaround so we can pass CTS and move
things forward. If we need to change this later, we can. Besides, the
impact is minimal. Shaderdb/fossilize report no changes for this
patch.

On our Blackops trace, the lack of this patch causes corruption in fog
rendering (rectangles where fog was supposed to be shown don't show
the fog).

On dEQP-VK.graphicsfuzz.cov-array-copies-loops-with-limiters, without
this patch we get a GPU hang.

Backport-to: 24.2
Testcase: dEQP-VK.graphicsfuzz.cov-array-copies-loops-with-limiters
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11813
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31331>
2024-10-31 23:57:10 +00:00
Jordan Justen
39fab9b240 intel/dev: Set L3 bank count for Xe2+ from Xe KMD
Rather than updating intel_device_info_update_l3_banks(), the Xe KMD
provides this info via the DRM_XE_DEVICE_QUERY_GT_TOPOLOGY query item.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31894>
2024-10-31 18:40:27 +00:00
Lionel Landwerlin
1485b5659a anv: update some of the indirect invalidations
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31915>
2024-10-30 20:39:31 +00:00
Lionel Landwerlin
cb224370b6 anv: avoid L3 fabric flush in pipeline barriers
This bit is not needed for barriers and appears to trigger a
performance regression. So leave it for just for AUX-TT
flushing/invalidation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e3814dee1a ("anv: add plumbing/support for L3 fabric flush")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12090
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31915>
2024-10-30 20:39:31 +00:00
Sagar Ghuge
17096f87c1 intel: Switch to COMPUTE_WALKER_BODY
Stuff COMPUTE_WALKER_BODY in COMPUTER_WALKER in both iris and anv.

This also fixes the tracepoint for ray dispatches. Stuffing
COMPUTE_WALKER_BODY allow us to set the
cmd_buffer->state.last_compute_walker.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31822>
2024-10-29 15:54:43 +00:00
José Roberto de Souza
6a0f2dd44b intel/dev: Fix max_cs_threads value on simulator
intel_device_info_update_after_hwconfig() updates max_cs_threads
based on max_eus_per_subslice and num_thread_per_eu but in some
platforms simulator the hwconfig don't have the
INTEL_HWCONFIG_MAX_NUM_EU_PER_DSS value, causing max_cs_threads to
be set to a wrong value and then causing issues when programing
CFE_STATE with a invalid value.

Fortunately we can also get max_eus_per_subslice from topology query,
so here moving the hwconfig query and
intel_device_info_update_after_hwconfig() call to after topology.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31850>
2024-10-28 21:24:09 +00:00
José Roberto de Souza
6c84cbd8c9 intel/dev/xe: Set max_eus_per_subslice using topology query
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31850>
2024-10-28 21:24:09 +00:00
Nanley Chery
334b368fc9 anv: Allow more fast clear colors for layouts
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9983
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31743>
2024-10-28 17:43:21 +00:00
Nanley Chery
4e17452387 anv: Load fast clear colors more often
If a render area covers an area that is smaller than an attachment's
extent and is not aligned to the CCS block size, we must load the clear
color so that the pixels outside of that area are decompressed with the
right clear color.

Prevents the next patch from causing the following test failure on gfx9:

dEQP-VK.renderpass.suballocation.load_store_op_none.color_load_op_none_store_op_none

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31743>
2024-10-28 17:43:21 +00:00
Nanley Chery
0e6b132a75 anv: Access more colors in fast_clear_memory_range
Store an array of clear values, one for each view format of the image.
Load the clear value based on the view format.

anv_image_msaa_resolve() may override the source or destination with
ISL_FORMAT_UNSUPPORTED, so make anv_image_get_clear_color_addr() handle
that format.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31743>
2024-10-28 17:43:21 +00:00
Nanley Chery
43bc4f4576 anv: Refactor clear color loading functions
Rename the functions and update the parameters in preparation for the
next patch.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31743>
2024-10-28 17:43:21 +00:00
Nanley Chery
0d4f2a2db1 anv: Move code out of loop in anv_CmdClearColorImage
According to the spec, the clear range's aspect will always be the color
aspect.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31743>
2024-10-28 17:43:21 +00:00
Nanley Chery
8f9ed7e932 anv: Prepare dmabufs for clear color arrays
In later commits, we'll rely on the number of view formats used by an
image to determine the size allocated for an array of clear colors in
the aux-state tracking buffer. Having a single view format for dmabufs
with clear color support allows anv to transparently handle this case.

Restrict the number of view formats by explicitly setting the image
format list to incomplete. Secondly, loosen the non-zero clear color
restriction on clear color supporting dmabufs. Those images can support
any clear color even with an incomplete list because we restrict
problematic accesses for the clear color during the negotiation phase.
Lastly, update add_all_surfaces_explicit_layout() to assert that the
sizing of the imported clear color struct meets expectations.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31743>
2024-10-28 17:43:21 +00:00
Nanley Chery
f5f0354447 anv: Add an array of view formats to anv_image
Stores the format list for the image in terms of ISL formats.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31743>
2024-10-28 17:43:20 +00:00
Valentine Burley
e18733300e anv/ci: Remove additive blending fails on ADL
This was a VKCTS bug on earlier version of the CTS.

These tests have been actually passing since the VKCTS was uprevved to
1.3.9.0, which landed a bit before ADL testing in CI was turned on.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31862>
2024-10-27 21:43:18 +00:00
Valentine Burley
3b5e49a7f8 intel/ci: Fix Alder Lake's configuration
There's currently no GL or GLES testing on the iris gallium driver,
and the VKCTS expectations were erroneously listed under iris-*.txt.

Fix the rules set for anv-adl-full, change the GPU_VERSION to anv-adl
and move the expectations around accordingly.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31862>
2024-10-27 21:43:18 +00:00
Iván Briano
13db5fad27 brw: fix task/mesh push constant loading
The InlineData passed to the shader is a fixed size unrelated to the
register size. It happens to match pre-Xe2, but by considering it the
same in Xe2, we ended up reading pushed constants from the wrong place
when they didn't fit in the InlineData.

Fixes: 97b17aa0b1 ("brw/nir: rework inline_data_intel to work with compute")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31856>
2024-10-26 18:12:41 +00:00