Commit graph

5136 commits

Author SHA1 Message Date
Paulo Zanoni
c87f7c13fa anv/sparse: reject binds that are not a multiple of the granularity
From the spec:

  "Resources can be bound at some defined (sparse block) granularity."

  "The sparse block size in bytes for sparse buffers and
   fully-resident images is reported as
   VkMemoryRequirements::alignment. alignment represents both the
   memory alignment requirement and the binding granularity (in bytes)
   for sparse resources."

Not only the upper layer (the Spec) doesn't allow this, the lower
layers (both the vm_bind ioctl and TR-TT) also work on a granularity.
Just check for this case and return an error.

Before this check, what would happen was:
  - for the vm_bind backend, the vm_bind ioctl would fail
  - for the TR-TT backend, we'd understimate l1_binds_capacity and
    fail an assertion, or we'd just silently bind 64kb instead of the
    original size

Currently, some Zink tests such as piglit/arb_sparse_buffer-basic can
trigger this behavior, but we're working to fix Zink for this case
(and that commit may be merged before this one).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26454>
2023-12-06 00:29:58 +00:00
Lionel Landwerlin
7c76125db2 anv: use 2 different buffers for surfaces/samplers in descriptor sets
We had the unfortunate finding on a recent platform to learn that the
bindless sampler heap is not functioning as expected.

Nowhere in the documentation is the size of the heap written down. So
most people assumed that's the max number that we can program (4Gb).

The reality is that it's only 64Mb.

Though it is appearing like it's working properly for the whole 4Gb
range for most apps, this is only because the HW bounds checking
applied is broken. Instead of clamping anything beyong 64Mb, it's only
clamping the last 4Kb of each 64Mb region.

So this heap is useless for us to make a 4Gb region of both sampler &
surface states...

This change essentially turns off the bindless sampler heap on DG2+.

The only location where we can put SAMPLER_STATE elements is the
dynamic state heap. Unfortunately we cannot align the dynamic state
heap with the bindless surface state heap. So the solution is to
allocate sampler & surface states separately, each from the own heap
in the descriptor pool.

We now have to provide 2 sets of offsets for surfaces & samplers.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25897>
2023-12-04 23:06:05 +00:00
Lionel Landwerlin
09a3a93372 anv: set layout printer
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25897>
2023-12-04 23:06:05 +00:00
Lionel Landwerlin
4608de6645 anv: add missing push descriptor flush on ray tracing pipelines
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25897>
2023-12-04 23:06:05 +00:00
Lionel Landwerlin
f26e83b6a4 anv: make a couple of descriptor function private
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25897>
2023-12-04 23:06:05 +00:00
Lionel Landwerlin
1cdadbcdf6 anv: move descriptor set type selection to earlier
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25897>
2023-12-04 23:06:05 +00:00
Lionel Landwerlin
18a1234541 anv: add a sampler state pool
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25897>
2023-12-04 23:06:05 +00:00
Sviatoslav Peleshko
5cb20b5edc anv: Fix MI_ARB_CHECK calls in generated indirect draws optimization
According to PRMs, to use self-modifying code correctly we have to
disable preparser before jumping to the generated commands, and re-enable
it with a first command in that buffer.

Old implementation did it wrong: for both inplace and inring generation
it disabled preparser before running the generation shader, had it
disabled during generation, and re-enabled it just before jumping to
the generated commands.

This usually didn't cause any trouble, because the generation shader and
generated draws are in different BOs, and the jump distance is greater than
the command FIFO depth. But we allocate them from the same pool,
so there are rare cases where the end of the BO with generation commands,
and the beginning of the BO with generated draws are adjacent. In such
cases, the wrong commands might be fetched.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10162
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26427>
2023-12-04 22:02:59 +00:00
Eric Engestrom
680d5fdaf3 anv: update symbols that have become aliases for newer ones
All of these have been renamed in the spec (usually by being promoted);
renamed them in our code too.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26491>
2023-12-04 18:06:57 +00:00
Caio Oliveira
f5d15d6a06 anv/xe2+: Use Region-based Tessellation redistribution
Update to recommended value from BSpec for xe2.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26438>
2023-12-02 02:22:07 +00:00
Marcin Ślusarz
878ca75335 anv: fix minSubgroupSize for xe2
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26437>
2023-12-02 01:55:26 +00:00
Rohan Garg
8cfae77439 anv: enable VK_EXT_depth_range_unrestricted
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26426>
2023-12-01 13:23:54 +00:00
Rohan Garg
80cafa3571 anv: ensure that we clamp only when EXT_depth_range_unrestricted is not enabled
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26426>
2023-12-01 13:23:54 +00:00
Rohan Garg
2e72917923 blorp: set min/max viewport depths to -FLT_MAX/FLT_MAX when EXT_depth_range_unrestricted is enabled
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26426>
2023-12-01 13:23:54 +00:00
Jordan Justen
d95bbf35c9 anv: Set COMPUTE_WALKER Message SIMD field
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26390>
2023-12-01 02:36:12 +00:00
José Roberto de Souza
42dd48e933 anv: Fix vm bind of DRM_XE_VM_BIND_FLAG_NULL
In this case bo is NULL so application was crashing when it was trying
to get the alloc_flags of bo to get the intel_device_info_pat_entry.

Fixes: 1a0d3504d5 ("anv: Fill PAT fields in Xe KMD gem_create and vm_bind uAPIs")
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26430>
2023-11-30 22:22:41 +00:00
Rohan Garg
f3d99e3535 anv: introduce ANV_TIMESTAMP_REWRITE_INDIRECT_DISPATCH
In order to rewrite timestamps for indirect dispatch's, instroduce a
ANV_TIMESTAMP_REWRITE_INDIRECT_DISPATCH that repacks the PostSync field
for a EXECUTE_INDIRECT_DISPATCH.

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26421>
2023-11-30 17:01:45 +00:00
Rohan Garg
9dd49e7a63 anv: memcpy the thread dimentions only when they're on the CPU
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26421>
2023-11-30 17:01:45 +00:00
Rohan Garg
580728564e anv: Emit a EXECUTE_INDIRECT_DISPATCH when available
On newer platforms (Arrowlake and above) we can issue a
EXECUTE_INDIRECT_DISPATCH that allows us to:
  * Skip issuing mi load/store instructions for indirect parameters

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26421>
2023-11-30 17:01:45 +00:00
Rohan Garg
6d4f43f0d6 anv: Emit EXECUTE_INDIRECT_DRAW when available
On newer platforms (Arrowlake and above) we can issue a
EXECUTE_INDIRECT_DRAW that allows us to:
  * Skip issuing mi load/store instructions for indirect parameters
  * Skip doing the indirect draw unroll on the CPU side when the
    appropriate stride is passed

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26421>
2023-11-30 17:01:45 +00:00
Rohan Garg
fa350862e9 anv: refactor kernel dispatch to use new common functions
Refactor the function to use the new common functions introduced for
indirect dispatch previously.

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26421>
2023-11-30 17:01:44 +00:00
Rohan Garg
51d2d9a665 anv: Refactor loading indirect parameters and filling IDD
Refactor out loading the indirect parameters and filling the interface
descriptor data.

Reworks:
 * Jordan: Change anv to use get_interface_descriptor_data which
   returns the IDD struct rather than filling it.

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26421>
2023-11-30 17:01:44 +00:00
Sagar Ghuge
4ebad93c9c anv,hasvk: Use uint32_t for queue family indices
Vulkan API uses uint32_t for the queue family indices.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26387>
2023-11-29 19:07:17 +00:00
José Roberto de Souza
c9e41f25a1 anv: Add heaps for Xe KMD in platforms without LLC
As Xe KMD don't support WB + 0 way coherency, so this are the only two
memory types possible for integrated GPUs without LLC in Xe KMD.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25462>
2023-11-29 14:57:42 +00:00
José Roberto de Souza
1a0d3504d5 anv: Fill PAT fields in Xe KMD gem_create and vm_bind uAPIs
Unlike i915, Xe KMD needs the cache parameter in gem_create
then during vm bind it request the PAT index that matches previous
parameter.
The PAT index selected could have more memory caracteristics that KMD
don't need to know.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25462>
2023-11-29 14:57:42 +00:00
José Roberto de Souza
99ae565af2 anv: Prepare anv_device_get_pat_entry() for discrete GPUs
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25462>
2023-11-29 14:57:42 +00:00
José Roberto de Souza
d491742d19 anv: Add support all possible cached and coherent memory types
This changes allow us to support HOST_COHERENT, HOST_CACHED and
HOST_COHERENT + HOST_CACHED memory types for platforms that has
the PAT uAPI.

Be aware that Xe KMD will not be able to support cached only memory
types, anv_xe_physical_device_init_memory_types() will reflect that
but internal usage should not allocate
VK_MEMORY_PROPERTY_HOST_CACHED_BIT only memory, hence the assert
added.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25462>
2023-11-29 14:57:42 +00:00
José Roberto de Souza
3baab9bb38 anv: Rename ANV_BO_ALLOC_SNOOPED to ANV_BO_ALLOC_HOST_CACHED_COHERENT
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25462>
2023-11-29 14:57:42 +00:00
Tapani Pälli
ec43c20182 anv: implement dummy blit for Wa_16018063123
Insert a dummy blit prior to MI_ARB_CHECK, MI_SEMAPHORE_WAIT,
MI_FLUSH_DW submitted on the copy engine.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26209>
2023-11-29 08:09:06 +00:00
Lionel Landwerlin
7dff232c09 intel/ds: add trace of buffer markers
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14924>
2023-11-29 01:16:22 +00:00
Kenneth Graunke
c8e122a738 anv: Implement rudimentary VK_AMD_buffer_marker support
This provides a basic implementation of VK_AMD_buffer_marker: we can
write the 32-bit markers from within a command buffer.  Unfortunately,
our hardware has several limitations that make this difficult to
implement well:

   1. We don't have insight into when specific stages finish (i.e.
      all geometry shaders are done, but pixel rasterization may
      still be occurring).

   2. We cannot perform pipelined writes of 32-bit values to arbitrary
      memory locations.  PIPE_CONTROL::Write Immediate Value would be
      the obvious way to implement this, but it only supports 64-bit
      values, and the extension doesn't allow us to do that.  We instead
      use MI_STORE_DATA_IMM to write 32-bit values, but this requires
      hard stalls.

Despite those limitations, the extension may still be useful for tools
to debug GPU hangs.  We hope to offer another extension in the future
which offers similar functionality but is more efficient on our GPUs.

v2: Updated by Lionel Landwerlin to fix a number of flushing and
    cache coherency issues with these writes.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14924>
2023-11-29 01:16:22 +00:00
José Roberto de Souza
6a245e4eea intel: Share function to do device query in Xe KMD
A "dance" is required with this uAPI, first we need to ask KMD what is
the size of the giving query id, then memory needs to be allocated to
match that size and then query again with the memory address set and
at this time Xe KMD will copy the query data to memory.

This dance was being duplicated in xe_engine_get_info() and
anv_xe_physical_device_get_parameters() and the next patch will also
use it in Iris, so here adding it common/xe and re-using as much
as possible.

There is one more implementation of this function in intel/dev but
due to how libs are linked intel/dev can't depend on to intel/common.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26325>
2023-11-28 18:17:45 +00:00
Lionel Landwerlin
b18006397b anv: remove heuristic preferring dedicated allocations
This heuristic doesn't show much difference when you have a beafy
processor but on lower end skus, it increase the number of buffers in
the execbuffer ioctl, adding significant overhead in i915.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4cdd3178fb ("anv: Meet CCS alignment reqs with dedicated allocs")
Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26335>
2023-11-28 16:13:11 +00:00
Lionel Landwerlin
7b87e1afbc anv: track & unbind image aux-tt binding
This solves a problem when you have a big memory chunk of which some
regions are bound to images. If the image is destroyed, currently the
aux-tt mapping stays and prevent any new image aux-tt mapping within
that region, until the memory is freed.

This maps & unmaps the aux-tt region at respectively bind & destroy
time, so that the memory chunks can be map through aux-tt.

If there is aliasing of memory to 2 different images, then the first
one "wins" the aux mapping and gets compression support. The second
one doesn´t.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ee6e2bc4a3 ("anv: Place images into the aux-map when safe to do so")
Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26335>
2023-11-28 16:13:11 +00:00
Lionel Landwerlin
b09db9d823 anv: use main image address to determine ccs compatibility
The BO address is not really a good criteria since we can bind an
image at an offset inside a BO.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ee6e2bc4a3 ("anv: Place images into the aux-map when safe to do so")
Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26335>
2023-11-28 16:13:11 +00:00
José Roberto de Souza
7046a9e280 intel: Rename PAT entries
Here renaming the PAT entries to a name that better express each
entry.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25447>
2023-11-23 21:19:18 +00:00
Iván Briano
43cb4cb6dd anv: use the right vertexOffset on CmdDrawMultiIndexed
Fixes: c70ef757e6 ("anv: Use extended parameters on Gen11+")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26327>
2023-11-22 13:11:34 -08:00
Sagar Ghuge
2d3f0a834a anv: Add comment to copy image code block
Anybody will be tempted to factor out the if-else block code since it
looks like duplication but else block actually handles the ycbcr images
where the aspect masks are compatible but don't need to be the same.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26294>
2023-11-22 17:42:43 +00:00
Tapani Pälli
d3e3c30d36 anv: implement Wa_18020335297
Set some state and implement dummy draws whenever viewport pointer
is being reprogrammed.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25987>
2023-11-22 05:23:12 +00:00
Tapani Pälli
418299c120 anv: refactor state emission
Add a helper that only emits hw_state, this makes it easier to modify
dirty state and call helper to emit only wanted state.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25987>
2023-11-22 05:23:12 +00:00
Francisco Jerez
6a810b0ba8 intel: Improve N-way pixel hashing computation to handle pixel pipes with asymmetric processing power.
This reworks the intel_compute_pixel_hash_table_nway() pixel pipe
hashing table computation helper to handle cases where some pixel
pipes have processing power different from the others, this is helpful
for Gfx12.7+ platforms where there are pixel pipes with 1 DSS as well
as pixel pipes with 2 DSSes, which currently can lead to a serious
performance bottleneck in the pixel pipes with lower processing power.

In order to avoid such a load imbalance the
intel_compute_pixel_hash_table_nway() function will now take two pixel
pipe bitsets instead of one: Pixel pipes enabled on both bitsets will
appear with twice the frequency on the table as pixel pipes which only
appear on one bitset.  See the comments below for more details on the
algorithm used to construct a pixel hashing table with the desired
properties.

With this change rendering performance improves by about 25% on a
fused MTL platform -- The list of specific configs this is expected to
show an improvement on is not included here since the list is rather
long and some of the configs may still be embargoed or may never be
productized, but in order to find out whether your Gfx12.7+ device
could be affected by this you can check the output of the
intel_dev_info tool from the Mesa tree and see if there are multiple
"pixel pipe" entries with different DSS count.  That isn't expected to
occur on any DG2 configuration, only on MTL+ platforms, so this change
should have no effect at all on DG2 (it's easy to convince oneself
that it won't since for DG2 mask1 should equal mask2 so mask2 will be
set to zero at the beginning of intel_compute_pixel_hash_table_nway()
and the new swzx[] permutation will be set to the identity).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26266>
2023-11-20 23:48:34 +00:00
José Roberto de Souza
205c5874d4 intel: Sync xe_drm.h
Sync xe_drm.h with commit 3b8183b7efad ("drm/xe/uapi: Be more specific
about the vm_bind prefetch region").

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26238>
2023-11-20 17:57:34 +00:00
Shuicheng Lin
dddab9fa77 intel/xe: Correct DRM_XE_EXEC_QUEUE_SET_PROPERTY's ioctl
DRM_XE_EXEC_QUEUE_SET_PROPERTY is the offset,
while DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY is the real number.

Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26253>
2023-11-18 10:17:45 +00:00
Paulo Zanoni
563678f310 anv/sparse: don't support YCBCR 2x1 compressed formats
Regarding supporting these formats, the spec says:

  "A sparse image created using VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT
   supports all non-compressed color formats with power-of-two element
   size that non-sparse usage supports. Additional formats may also be
   supported and can be queried via
   vkGetPhysicalDeviceSparseImageFormatProperties.
   VK_IMAGE_TILING_LINEAR tiling is not supported."

Regarding the formats themselves, the spec says:
  "VK_FORMAT_B8G8R8G8_422_UNORM specifies a four-component, 32-bit
   format containing a pair of G components, an R component, and a B
   component, collectively encoding a 2×1 rectangle of unsigned
   normalized RGB texel data. One G value is present at each i
   coordinate, with the B and R values shared across both G values and
   thus recorded at half the horizontal resolution of the image. This
   format has an 8-bit B component in byte 0, an 8-bit G component for
   the even i coordinate in byte 1, an 8-bit R component in byte 2,
   and an 8-bit G component for the odd i coordinate in byte 3. This
   format only supports images with a width that is a multiple of two.
   For the purposes of the constraints on copy extents, this format is
   treated as a compressed format with a 2×1 compressed texel block."

Since these formats are to be considered compressed 2x1 blocks and we
don't necessarily have to support non-compressed formats that
non-sparse support, we can claim them as not supported with sparse.

In addition to all of that, if you look at isl_gfx125_filter_tiling()
you'll see that we don't even support Tile64 for these formats, so
sparse residency (i.e., non-opaque image binds) doesn't really make
sense for them yet.

The Vulkan spec defines 4 other YCBCR "2x1 compressed" formats like
the ones we have in this commit, but we don't support them even
without sparse, so there's no reason to check them here.

A recent change in VK-GL-CTS made tests that use these formats go from
unsupported to failures:
  7ecc7716a983 ("Do not use and check for STORAGE image support, when
  it is not used in the test")

This commit "fixes" the following VK-GL-CTS failures (by making them
return NotSupported):
  dEQP-VK.sparse_resources.image_block_shapes.2d.b8g8r8g8_422_unorm.samples_1
  dEQP-VK.sparse_resources.image_block_shapes.2d.g8b8g8r8_422_unorm.samples_1
  dEQP-VK.sparse_resources.image_block_shapes.2d_array.b8g8r8g8_422_unorm.samples_1
  dEQP-VK.sparse_resources.image_block_shapes.2d_array.g8b8g8r8_422_unorm.samples_1

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>
2023-11-17 17:58:29 +00:00
Paulo Zanoni
a0559768db anv: enable sparse by default on i915.ko
On i915.ko we don't have the vm_bind ioctl, so sparse requires TR-TT.
Unfortunately, on gfx < 20 TR-TT is not compatible with non-render
queues, so we have to disable those when sparse is enabled. Notice
that although we don't have TR-TT for non-render queues on gfx >= 20,
vm_bind is the default, and it doesn't have this restriction.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>
2023-11-17 17:58:29 +00:00
Paulo Zanoni
fda5163f34 anv/trtt: properly handle the lifetime of TR-TT batch BOs
We need to wait for the batches to complete before we return the BOs
to the pool. We were previously doing this completely synchronously,
which made the code unnecessarily wait. Now we have a timeline syncobj
that signals completion of the previous BOs, so sometimes we check
where we are in the timeline and then return the BOs that we know are
unused.

This, in addition to the previous patch that made us wait for the
other syncobjs through the execbuf ioctl instead of through the CPU,
makes TR-TT batches at least an order of magnitude faster. Still, I
don't think we'll notice any changes in games's FPS as they don't bind
sparse resources that often.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>
2023-11-17 17:58:29 +00:00
Paulo Zanoni
0f21836272 anv/trtt: add support for queue->sync to the TR-TT batches
At this moment this patch won't buy us anything since we're already
being completely synchronous, but the next patch is going to change
this and so queue->sync will start making sense.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>
2023-11-17 17:58:29 +00:00
Paulo Zanoni
1534ee46b8 anv/trtt: add struct anv_trtt_batch_bo and pass it around
For now it just wraps the bo and size, so there's really no value to
having it. In the next commit we'll add more elements to the struct.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>
2023-11-17 17:58:29 +00:00
Paulo Zanoni
18bd00c024 anv/trtt: don't wait/signal syncobjs using the CPU anymore
Pass them as part of the TR-TT batch. This is what a lot of the
previous commits were building up to.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>
2023-11-17 17:58:28 +00:00
Paulo Zanoni
f2206a0eb1 anv/xe: allow passing extra syncs to xe_exec_process_syncs()
We're going to use this in two different patches.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>
2023-11-17 17:58:28 +00:00