Commit graph

12392 commits

Author SHA1 Message Date
Matt Turner
a3714b55f4 intel/elk: Use REG_CLASS_COUNT
Fixes: d44462c08d ("intel/elk: Fork Gfx8- compiler by copying existing code")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30314>
2024-07-25 14:55:09 +00:00
Matt Turner
5e24c21625 intel/brw: Use REG_CLASS_COUNT
Fixes: 5d87f41a54 ("intel/fs/ra: Define REG_CLASS_COUNT constant specifying the number of register classes.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30314>
2024-07-25 14:55:09 +00:00
Paulo Zanoni
dd5362c78a anv/xe: try harder when the vm_bind ioctl fails
From all the many possible errors returned by the vm_bind ioctl, some
can actually happen in the wild when the system is under memory
pressure. Thomas Hellström pointed to us that, due to its asynchronous
nature, the vm_bind ioctl itself has to pin some memory, so if the
number of bind operations passed is too big, there is a probability
that it may run out of memory.

Previously the Kernel would return ENOMEM when this condition
happened.  Since commit e8babb280b5e ("drm/xe: Convert multiple bind
ops into single job") the Kernel has started returning ENOBUFS when it
doesn't have enough memory to do what it wants but thinks we'd succeed
if we tried to do one bind operation at a time (instead of doing
multiple operations in the same ioctl), and ENOMEM in some other
situations. Still-uncommitted commit "drm/xe: Return -ENOBUFS if a
kmalloc fails which is tied to an array of binds" proposes converting
a few more ENOMEM cases no ENOBUFS.

Still, even ENOMEM situations could in theory be possible to recover
from, because if we wait some amount of time, resources that may have
been consuming memory could end up being freed by other threads or
processes, allowing the operations to succeed. So our main idea in
this patch is that we treat both ENOMEM and ENOBUFS in the same way,
so our implementation can work with any xe.ko driver regardless of
having or not having the commits mentioned above.

So in this patch, when we detect the system is under memory pressure
(i.e., the vm_bind() function returns VK_ERROR_OUT_OF_HOST_MEMORY), we
throw away our performance expectations and try to go slowly and
steady. First we wait everything we're supposed to wait (hoping that
this alone could also help to alleviate the memory pressure), and then
we synchronously bind one piece at a time (as this will ensure ENOBUFS
can't be returned), hoping that this won't cause the Kernel to try to
reserve too much memory. All this while also hoping that whatever
thing that may be eating all the memory goes away in the meantime. If
even this fails, we give up and hope the upper layer will be able to
figure out what to do.

This fixes a bunch of LNL failures and flaky tests (as LNL is our
first officially supported xe.ko platform). This can be seen in dEQP
but only if multiple tests are being run parallel. Happens in multiple
tests, some of which may include:

  - dEQP-VK.sparse_resources.image_sparse_binding.2d_array.rgba8_snorm.1024_128_8
  - dEQP-VK.sparse_resources.image_sparse_binding.3d.rgba16_snorm.1024_128_8
  - dEQP-VK.sparse_resources.image_sparse_binding.3d.rgba16ui.512_256_6

I don't ever see these errors when running Alchemist/DG2 with xe.ko.

Fixes: e9f63df2f2 ("intel/dev: Enable LNL PCI IDs without INTEL_FORCE_PROBE")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30276>
2024-07-24 23:18:36 +00:00
Matt Turner
aae82061af intel/clc: Free disk_cache
Fixes: c15bf88f01 ("intel: Add a little OpenCL C compiler binary")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30313>
2024-07-24 20:46:28 +00:00
Matt Turner
1574372de4 intel/clc: Free parsed_spirv_data
This declaration shadowed a variable by the same type and name in an
outer scope. That variable is passed to clc_free_parsed_spirv().

Fixes: 4fd7495c69 ("intel/clc: add ability to output NIR")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30313>
2024-07-24 20:46:28 +00:00
José Roberto de Souza
945564e498 anv: Wait for Xe exec queue to be idle before destroying it
Paulo reported that Vulkan is also affected by the drop of permanent
exec queues in Xe KMD, Iris already have this handling.

So here using the special DRM_IOCTL_XE_EXEC with num_batch_buffer == 0
to get a syncobj signaled when the last DRM_IOCTL_XE_EXEC is completed.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30156>
2024-07-24 16:27:30 +00:00
Yiwei Zhang
6d0273f67a anv: improve vma usage for descriptor buffer
The dynamic visible memory type (or the prior descriptor buffer memory
type) doesn't need special aux-tt alignment or additional ccs space.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30317>
2024-07-23 17:56:20 +00:00
Marek Olšák
b2d32ae246 nir: add nir_intrinsic_load_per_primitive_input, split from io_semantics flag
Instead of having 1 bit in nir_io_semantics indicating a per-primitive
FS input, add a dedicated intrinsic for it.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29895>
2024-07-23 16:13:16 +00:00
Kenneth Graunke
c429d5025e intel/brw: Don't force g1's live range to be the entire program
The idea here was that pixel shader framebuffer writes used the g0 and
g1 thread payload register values to construct the message header.
However, most messages are headerless and don't use either.  There's a
2012-era comment that the simulator at one point had a bug where certain
headerless messages would incorrectly take the values from the g0/g1
register contents rather than using sideband.  But, that was likely
fixed eons ago.  So we really don't need to do this.

Furthermore, there are many more shader stages these days:
- VS: r1 contains output URB handles
- TCS: r1 contains ICP handles
- TES: r1 contains gl_TessCoord.x (r4 contains output URB handles)
- GS: r1 contains output URB handles
- CS: r1 contains LocalID.X on DG2+ but nothing on older hardware
- Task/Mesh: r1 contains LocalID.X
- BS: r1 contains bindless stack handles

Vertex and geometry aren't likely to benefit here because r1 is needed
for their output messages, which are also what terminate the shader.
TES will definitely benefit because we were making a value pointlessly
live for the whole program.  Same for TCS, to a lesser extent.  Compute
prior to DG2 was the worst, as g1 literally has no meaningful content,
so there is no point to keeping it live.

fossil-db on Alchemist shows substantial spill/fill improvements:

   Totals:
   Instrs: 148782351 -> 148741996 (-0.03%); split: -0.03%, +0.01%
   Cycles: 12602907531 -> 12605795191 (+0.02%); split: -0.70%, +0.72%
   Subgroup size: 7518608 -> 7518632 (+0.00%)
   Send messages: 7341727 -> 7341762 (+0.00%)
   Spill count: 54633 -> 52575 (-3.77%)
   Fill count: 104694 -> 100680 (-3.83%)
   Scratch Memory Size: 3375104 -> 3287040 (-2.61%)

   Totals from 301172 (48.21% of 624670) affected shaders:
   Instrs: 95531927 -> 95491572 (-0.04%); split: -0.05%, +0.01%
   Cycles: 9643531593 -> 9646419253 (+0.03%); split: -0.91%, +0.94%
   Subgroup size: 4492512 -> 4492536 (+0.00%)
   Send messages: 4399737 -> 4399772 (+0.00%)
   Spill count: 20034 -> 17976 (-10.27%)
   Fill count: 41530 -> 37516 (-9.67%)
   Scratch Memory Size: 1522688 -> 1434624 (-5.78%)

Assassin's Creed Odyssey in particular has 20% fewer fills.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30146>
2024-07-23 02:26:52 +00:00
Michael Cheng
60c73e09c6 anv: Remove extra hdc_flush from Perfetto
Remove extra reporting of hdc_flush when viewing a Perfetto trace for
anv.

Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30312>
2024-07-23 01:57:59 +00:00
Caio Oliveira
8ba8e33c39 intel/brw: Simplify @file annotations
Doxygen documentation says

> If the file name is omitted (i.e. the line after \file is left
> blank) then the documentation block that contains the \file command will
> belong to the file it is located in.

so we can omit the filename itself when using the annotation.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30168>
2024-07-22 22:48:03 +00:00
Lionel Landwerlin
1908d2c171 anv: split image view from anv_image.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30285>
2024-07-22 18:46:05 +00:00
Lionel Landwerlin
eff01c46d8 anv: split buffer view from anv_image.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30285>
2024-07-22 18:46:05 +00:00
Lionel Landwerlin
f5af56528b anv: split sampler from anv_device.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30285>
2024-07-22 18:46:05 +00:00
Lionel Landwerlin
543c726781 anv: split buffer from anv_device.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30285>
2024-07-22 18:46:05 +00:00
Lionel Landwerlin
c59e8e814a anv: split events from anv_device.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30285>
2024-07-22 18:46:05 +00:00
Lionel Landwerlin
ca51a02e7b anv: split physical_device from anv_device.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30285>
2024-07-22 18:46:05 +00:00
Lionel Landwerlin
c7ecf10c20 anv: split instance from anv_device.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30285>
2024-07-22 18:46:05 +00:00
José Roberto de Souza
69ee1c4b46 anv: Drop useless 'if (total_scratch > 0) {' block in cmd_buffer_ensure_cfe_state()
cmd_buffer_ensure_cfe_state() returns ealier if total_scratch == 0 here:

if (total_scratch <= comp_state->scratch_size)
      return;

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30271>
2024-07-22 18:17:38 +00:00
José Roberto de Souza
de5d767f9a intel/brw: Add a maximum scratch size restriction
Gfx 12.5 moved scratch to a surface and SURFTYPE_SCRATCH has this pitch
restriction:

RENDER_SURFACE_STATE::Surface Pitch
   For surfaces of type SURFTYPE_SCRATCH, valid range of pitch is:
   [63,262143] -> [64B, 256KB]

The pitch of the surface is the scratch size per thread and the surface
should be large enough to accommodate every physical thread.

So here adding a new field to intel_device_info, setting it in
intel_device_info_init_common() so even offline tools can have it set.
And finally adding a check to fail shader compilation if needed
scratch is larger than supported.

This issue can be reproduced in debug builds when running
dEQP-VK.protected_memory.stack.stacksize_1024 on Gfx 12.5 or newer
platforms.

Ref: BSpec 43862 (r52666)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30271>
2024-07-22 18:17:38 +00:00
Paulo Zanoni
c65a76db85 anv/trtt: don't just crash when we can't find device->trtt.queue
Please refer to the big comment this patch introduces.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>
2024-07-22 10:04:34 -07:00
Paulo Zanoni
3ab8ff99fa anv/trtt: fix the process of picking device->trtt.queue
We want to use actual sparse-capable queues as the default
trtt->queue, not copy queues that may have a companion_rcs_batch.
Before this patch, if we expose more than one queue *and* the
application creates a copy queue first, we'll end up setting
trtt->queue as the copy queue, which will GPU hang when we submit the
TR-TT batches as they don't support the pipe_control commands we
issue.

The trtt->queue queue is used for binding/unbinding buffers in code
paths where there's no specific queue coming from user space, such as
when we're creating or destroying a sparse resource.

This is not a problem yet on i915.ko since we are exposing
only a single queue, and it is not a problem for xe.ko since TR-TT is
not the default there. This is also not a problem in applications
that create the render or compute queue first. We plan to expose more
queues when using TR-TT, so this would become a problem without this
patch.

None of VK-GL-CTS seems to exercise that, and none of the Steam games
I tested exercise that as well. I was able to reproduce this issue
using our internal tracing tool.

v2: New implementation that doesn't break when we only have a compute
    queue (Lionel).

Fixes: 04bfe828db ("anv/sparse: allow sparse resouces to use TR-TT as its backend")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>
2024-07-22 10:04:34 -07:00
Paulo Zanoni
5ca224aa0c anv/trtt: make all contexts have the same TR-TT programming
On Gen12 (the oldest we support on Mesa right now for TR-TT) we
started having per-engine TR-TT registers and we are supposed to make
all contexts share the same TR-TT programming.

On LNL+, this is documented in the BSpec page for the TRTT_CNTRL
register (68417), with more details in HSDs 14020454786 and
16022013154.

On Gen12 platforms this information is a little harder to find and
there's a whole trail of HSDs leading up to 1209977595, which links to
the documents that describe the programming. BSpec for TR-TT on Gen12
is very confusing as it still contains registers and other information
from Gen11 that were not removed.

Regarding the additional BLT and COMP registers, please notice that on
the BSpec pages for the TR-TT registers, the "Register Instance"
section only lists the GFX registers as non-privileged. However, the
"User Mode Privileged Commands" lists the other instances of the TR-TT
Regsiters as non-privileged, which matches what we see: there's no
need to put these addresses in the FORCE_TO_NONPRIV registers.

Notice that for now, when TR-TT is being used we only expose a single
queue, so this change effectively does nothing until we start exposing
extra queues. I left that part for later to help bisectability.

v2:
 - s/trtt_init_context_state/trtt_init_queues_state/ (José)
 - pass device as the argument to init_queues_state (José)
v3:
 - use async_submit_end (José)

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>
2024-07-22 10:04:34 -07:00
Paulo Zanoni
6415027d85 anv/trtt: submit a separate batch in anv_trtt_init_context_state()
Having this as a separate batch was the normal behavior until
7da5b1caef ("anv: move trtt submissions over to the
anv_async_submit").

While it certainly sounds better to do everything related to TR-TT
initialization in one batch, we need to revert it back to be a
separate batch (but now using the new anv_async_submit infrastructure)
because we'll want to run this batch on every engine.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>
2024-07-22 10:04:34 -07:00
Paulo Zanoni
abbb4b20f3 anv/trtt: check the return value of anv_trtt_init_context_state()
I haven't seen this happening anywhere, but let's have it for
correctness.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>
2024-07-22 10:04:34 -07:00
Paulo Zanoni
fb9d94f4ed anv/trtt: make genX(init_trtt_context_state) a little more compact
In this series we're going to further change this function, adding a
lot more lines, so this patch should make the next diffs a little
easier to comprehend and review.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>
2024-07-22 10:04:34 -07:00
Paulo Zanoni
6bc9a57173 intel/genxml: add the BLT and COMP_CTX0 versions of the TR-TT registers
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>
2024-07-22 10:04:33 -07:00
Rohan Garg
fe387e14b5 anv: use the WA infrastructure when emitting WA 16013994831
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30295>
2024-07-22 13:43:39 +00:00
Sushma Venkatesh Reddy
2f6919e6c2 intel/clflush: Utilize clflushopt in intel_invalidate_range
On MTL ChromeOS boards, during AI based video conference, we were
observing a lot of overhead from invalidations. Upon debug, it was found
that we were using clflush in this function and that isn't efficient.

With this change, while executing compute workloads like zoo models, we
are getting ~25% performance improvements in a best case scenario.

Rework:
 * Jordan: Call intel_clflushopt_range() rather than
   __builtin_ia32_clflushopt() because intel_mem.c is not compiled
   with -mclflushopt.

Backport-to: 24.1 24.2
Signed-off-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30238>
2024-07-20 16:10:16 +00:00
Francisco Jerez
ff3c3792b4 anv/gfx12.5: Pass non-empty push constant data to PS stage for TBIMR workaround.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10728
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11399
Fixes: 57decad976 ("intel/xehp: Enable TBIMR by default.")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30031>
2024-07-20 01:13:19 +00:00
Francisco Jerez
b98eebbcb2 intel/brw: Implement null push constant workaround.
This implements an undocumented workaround for a hardware bug that
affects draw calls with a pixel shader that has 0 push constant cycles
when TBIMR is enabled, which has been seen to lead to a hang with
Fallout 3 and Metal Gear Rising Revengeance.  This hardware bug has
been reported as HSDES#22020184996 which is still pending a resolution
by the hardware team.  However since this workaround found empirically
has been confirmed to fix the issue reliably and it's relatively
harmless it seems worth checking in already even though no final W/A
number is available nor has the W/A json file been updated.

To avoid the issue we simply pad the push constant payload to be at
least 1 register.  This is enabled via a brw_wm_prog_key since the
driver needs to be in agreement with the compiler on whether the dummy
push constant cycle is present, and it can be avoided in cases where
the driver knows that TBIMR will be disabled (e.g. for BLORP).

Related: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10728
Related: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11399
Fixes: 57decad976 ("intel/xehp: Enable TBIMR by default.")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30031>
2024-07-20 01:13:19 +00:00
Francisco Jerez
bb2513918a intel/dev: Add devinfo flag for TBIMR push constant workaround.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30031>
2024-07-20 01:13:19 +00:00
Jordan Justen
bb8063e1f4 anv/generated_indirect_draws: Adjust xe2 simd32 sends_count_expectation
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e9f63df ('intel/dev: Enable LNL PCI IDs without INTEL_FORCE_PROBE')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30093>
2024-07-19 16:09:06 +00:00
Daniel Stone
e05415a82e format: Generate endian-independent format aliases
Instead of having a hardcoded list of endian-independent format aliases
in the header, generate them from the format definitions.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29649>
2024-07-19 13:50:42 +00:00
Lionel Landwerlin
692e1ab2c1 anv: get rid of the second dynamic state heap
Pretty big change... Sorry for that.

I can't exactly remember why I created 2 heaps. I think it's because I
mistakenly thought the samplers in the binding sampler pointers needed
to be indexed from the binding table. But that's not the case, they
just need to be in the dynamic state heap.

In the future, this change will allow to also allocate buffers for
push constant data in the newly created dynamic_visible_pool which
will be useful on < Gfx12.0 where this is the only place push constant
data can live for compute shaders.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30047>
2024-07-19 12:21:46 +00:00
David Heidelberg
decc040abe intel/debug: allow silencing CL warnings
Useful for CI and users previously aware of the warning.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29691>
2024-07-19 00:24:29 +00:00
Lionel Landwerlin
67b778445a brw: fix uniform rebuild of sources
If you have something like this :

con 32    %66 = @load_reg (%62) (base=0, legacy_fabs=0, legacy_fneg=0)
con 32    %27 = @resource_intel (%22 (0xdeaddead), %66, %67, %17 (0x0)) (desc_set=2, binding=96, resource_intel=0, resource_block_intel=-1)

Just copying the brw_reg in ssa_values[] is not enough for the
load_reg intrinsic. We need to call get_nir_src() to force some logic
to create the register correct.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b8209d69ff ("intel/fs: Add support for new-style registers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30050>
2024-07-18 19:58:46 +00:00
Kenneth Graunke
d630ff1f79 intel/brw: Disallow scalar byte to float conversions on DG2+
I haven't been able to find this restriction mentioned anywhere in the
hardware documentation, but the simulator has code to reject this case
as invalid, and it doesn't appear to work on hardware anymore.

Having lower_regioning() handle this takes care of the issue so we
don't have to worry about generating it in random places.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11489
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30140>
2024-07-18 18:51:35 +00:00
Sushma Venkatesh Reddy
7ca77370d2 anv: Fix I915_PARAM_HAS_CONTEXT_FREQ_HINT check
When I915_PARAM_HAS_CONTEXT_FREQ_HINT is not supported the
intel_ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp) will return -1 and that
will cause i915_gem_get_param() to return false.

val will be different than 1 when not using GuC submission, so we are
forcing val check to ensure this holds good in platforms that doesn't
support GuC submission.

Fixes: d52dd5a9 ("anv/drirc: add option to provide low latency hint")

Signed-off-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30234>
2024-07-18 18:26:38 +00:00
Kenneth Graunke
534f0019d7 intel/brw: Don't mix types for unary extended math instructions
We were generating odd instructions like:

   math inv(8) g93<1>HF g85<8,8,1>HF null<8,8,1>F { align1 1Q @7 $4 };

It's unclear whether the type of the null operand matters, but sometimes
these things don't get ignored properly.  Out of caution, retype the
null source to match the actual operand's type.  It'll at least look
less surprising in assembly dumps.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30193>
2024-07-18 03:25:06 +00:00
Iván Briano
c8d64860ec anv: set MOCS for protected memory when needed
We were missing setting the EncryptedData bit in the MOCS field when
emitting the surface states for protected buffer/images. How this works
on ADL remains a mystery to me.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11313

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30097>
2024-07-17 22:56:51 +00:00
Iván Briano
ece7abb599 anv: get scratch surface from the correct pool
Fixes: 3ccf80f9b1 ("anv: prepare 2 variants of all shader instructions")

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30097>
2024-07-17 22:56:51 +00:00
José Roberto de Souza
0500e35165 intel/dev: Drop writeback_incoherent from Xe2
Xe2 platforms are only supported by Xe KMD that do not support
CPU WB + 0 way coherent.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29950>
2024-07-17 17:41:32 +00:00
José Roberto de Souza
6d77dfa75d intel/dev: Use GPU WB PAT for Xe2 writecombining
So for this entry we want the CPU mapping to be WC but GPU caches
can be WB.
This way GPU don't need to snoop to CPU caches and at the end of
workloads L3 cache is flushed, so CPU access is coherent after get
the signal that workload was finished.

With this the transient(XD) L3 flushes will only affect displayable
buffers.

Ref: Bspec 71582 (r59285)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29950>
2024-07-17 17:41:32 +00:00
José Roberto de Souza
48da8eab55 intel/dev: Add comment documenting the PAT entries
Like said in the past patch, coherency is not needed and there
was a miss understating about caching used by CPU and GPU.
With this new comment it much better explained.

Ref: Bspec 45101 (r51017)
Ref: Bspec 71582 (r59285)
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29950>
2024-07-17 17:41:32 +00:00
José Roberto de Souza
7295e09b53 intel/dev: Drop coherency from intel_device_info_pat_entry
It is not used in run-time so we can drop from the struct.
It might have value as PAT entries documentation but that will be done
in the next patch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29950>
2024-07-17 17:41:32 +00:00
José Roberto de Souza
fa1129540a intel/dev: Add documentation about intel_device_info_pat_entry::mmap
My initial understating was that L3_CACHE_POLICY would be the CPU
caching mode but that has nothing to do with CPU caching, it is the
GPU caching mode.

Due this miss understating we were using a not optimal PAT index that
will be fixed in the next patches, so to avoid such issues in future
adding comments to intel_device_info_pat_entry struct.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29950>
2024-07-17 17:41:32 +00:00
José Roberto de Souza
4173e0f910 intel/dev: Drop DG1 PAT entries
It inherents that table from TGL.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29950>
2024-07-17 17:41:32 +00:00
José Roberto de Souza
178950bf9b anv: Fix return of PAT index for compressed bos for discrete GPUs
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29950>
2024-07-17 17:41:32 +00:00
José Roberto de Souza
4fd7cad05d intel: Rename XE_PERF to XE_OBSERVATION
Xe KMD renamed XE_PERF to XE_OBSERVATION to better match with Intel
specification and avoid confusion.
This uAPI rename will land in the same kernel version that added
the uAPI being renamed.

There is no uAPI change, just renames.

Sync xe_drm.h with 63347fe031e3 ("drm/xe/uapi: Rename xe perf layer as xe observation layer").

Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30027>
2024-07-17 01:00:34 +00:00