Commit graph

10669 commits

Author SHA1 Message Date
Rohan Garg
f3d99e3535 anv: introduce ANV_TIMESTAMP_REWRITE_INDIRECT_DISPATCH
In order to rewrite timestamps for indirect dispatch's, instroduce a
ANV_TIMESTAMP_REWRITE_INDIRECT_DISPATCH that repacks the PostSync field
for a EXECUTE_INDIRECT_DISPATCH.

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26421>
2023-11-30 17:01:45 +00:00
Rohan Garg
9dd49e7a63 anv: memcpy the thread dimentions only when they're on the CPU
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26421>
2023-11-30 17:01:45 +00:00
Rohan Garg
580728564e anv: Emit a EXECUTE_INDIRECT_DISPATCH when available
On newer platforms (Arrowlake and above) we can issue a
EXECUTE_INDIRECT_DISPATCH that allows us to:
  * Skip issuing mi load/store instructions for indirect parameters

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26421>
2023-11-30 17:01:45 +00:00
Rohan Garg
6d4f43f0d6 anv: Emit EXECUTE_INDIRECT_DRAW when available
On newer platforms (Arrowlake and above) we can issue a
EXECUTE_INDIRECT_DRAW that allows us to:
  * Skip issuing mi load/store instructions for indirect parameters
  * Skip doing the indirect draw unroll on the CPU side when the
    appropriate stride is passed

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26421>
2023-11-30 17:01:45 +00:00
Rohan Garg
7a9e82e82f genxml/12.5: Add the EXECUTE_INDIRECT_DISPATCH instruction
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26421>
2023-11-30 17:01:45 +00:00
Rohan Garg
4229757309 genxml/12.5: Add the EXECUTE_INDIRECT_DRAW instruction
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26421>
2023-11-30 17:01:44 +00:00
Rohan Garg
6e060d99ba intel/dev: Add a bit for when the HW can do a indirect draw/dispatch unroll
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26421>
2023-11-30 17:01:44 +00:00
Rohan Garg
fa350862e9 anv: refactor kernel dispatch to use new common functions
Refactor the function to use the new common functions introduced for
indirect dispatch previously.

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26421>
2023-11-30 17:01:44 +00:00
Rohan Garg
51d2d9a665 anv: Refactor loading indirect parameters and filling IDD
Refactor out loading the indirect parameters and filling the interface
descriptor data.

Reworks:
 * Jordan: Change anv to use get_interface_descriptor_data which
   returns the IDD struct rather than filling it.

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26421>
2023-11-30 17:01:44 +00:00
José Roberto de Souza
b27ca68143 intel/dev: Adjust prefetch_size values for Xe2 engines
Xe2 follows MTL and has different prefetch sizes for different
types of engines.

BSpec: 60223
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26396>
2023-11-30 14:54:04 +00:00
Sagar Ghuge
4ebad93c9c anv,hasvk: Use uint32_t for queue family indices
Vulkan API uses uint32_t for the queue family indices.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26387>
2023-11-29 19:07:17 +00:00
José Roberto de Souza
c9e41f25a1 anv: Add heaps for Xe KMD in platforms without LLC
As Xe KMD don't support WB + 0 way coherency, so this are the only two
memory types possible for integrated GPUs without LLC in Xe KMD.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25462>
2023-11-29 14:57:42 +00:00
José Roberto de Souza
1a0d3504d5 anv: Fill PAT fields in Xe KMD gem_create and vm_bind uAPIs
Unlike i915, Xe KMD needs the cache parameter in gem_create
then during vm bind it request the PAT index that matches previous
parameter.
The PAT index selected could have more memory caracteristics that KMD
don't need to know.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25462>
2023-11-29 14:57:42 +00:00
José Roberto de Souza
99ae565af2 anv: Prepare anv_device_get_pat_entry() for discrete GPUs
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25462>
2023-11-29 14:57:42 +00:00
José Roberto de Souza
05b3967ddc intel: Enable has_set_pat_uapi for Xe
Xe KMD requires that all platforms supported by it set PAT information.
This will be implemented in Iris and ANV in the next patches.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25462>
2023-11-29 14:57:42 +00:00
José Roberto de Souza
500e037661 intel: Add PAT entries for gfx12 and newer
Xe KMD requires PAT for all platforms so here adding PAT entries to
all platforms supported by Xe KMD.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25462>
2023-11-29 14:57:42 +00:00
José Roberto de Souza
d491742d19 anv: Add support all possible cached and coherent memory types
This changes allow us to support HOST_COHERENT, HOST_CACHED and
HOST_COHERENT + HOST_CACHED memory types for platforms that has
the PAT uAPI.

Be aware that Xe KMD will not be able to support cached only memory
types, anv_xe_physical_device_init_memory_types() will reflect that
but internal usage should not allocate
VK_MEMORY_PROPERTY_HOST_CACHED_BIT only memory, hence the assert
added.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25462>
2023-11-29 14:57:42 +00:00
José Roberto de Souza
3baab9bb38 anv: Rename ANV_BO_ALLOC_SNOOPED to ANV_BO_ALLOC_HOST_CACHED_COHERENT
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25462>
2023-11-29 14:57:42 +00:00
Jan Beich
112093f9e2 intel: make CLOCK_BOOTTIME optional for non-Linux
src/intel/common/xe/intel_gem.c:71:9: error: use of undeclared identifier 'CLOCK_BOOTTIME'
   case CLOCK_BOOTTIME:
        ^

Fixes: ae0df368a8 ("intel/common: Add intel_gem_read_correlate_cpu_gpu_timestamp()")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26392>
2023-11-29 10:14:01 +00:00
Jan Beich
5c32c41f65 intel: make CLOCK_TAI optional for non-Linux
src/intel/common/xe/intel_gem.c:72:9: error: use of undeclared identifier 'CLOCK_TAI'
   case CLOCK_TAI:
        ^

Fixes: ae0df368a8 ("intel/common: Add intel_gem_read_correlate_cpu_gpu_timestamp()")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26392>
2023-11-29 10:14:01 +00:00
Tapani Pälli
ec43c20182 anv: implement dummy blit for Wa_16018063123
Insert a dummy blit prior to MI_ARB_CHECK, MI_SEMAPHORE_WAIT,
MI_FLUSH_DW submitted on the copy engine.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26209>
2023-11-29 08:09:06 +00:00
Lionel Landwerlin
7dff232c09 intel/ds: add trace of buffer markers
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14924>
2023-11-29 01:16:22 +00:00
Kenneth Graunke
c8e122a738 anv: Implement rudimentary VK_AMD_buffer_marker support
This provides a basic implementation of VK_AMD_buffer_marker: we can
write the 32-bit markers from within a command buffer.  Unfortunately,
our hardware has several limitations that make this difficult to
implement well:

   1. We don't have insight into when specific stages finish (i.e.
      all geometry shaders are done, but pixel rasterization may
      still be occurring).

   2. We cannot perform pipelined writes of 32-bit values to arbitrary
      memory locations.  PIPE_CONTROL::Write Immediate Value would be
      the obvious way to implement this, but it only supports 64-bit
      values, and the extension doesn't allow us to do that.  We instead
      use MI_STORE_DATA_IMM to write 32-bit values, but this requires
      hard stalls.

Despite those limitations, the extension may still be useful for tools
to debug GPU hangs.  We hope to offer another extension in the future
which offers similar functionality but is more efficient on our GPUs.

v2: Updated by Lionel Landwerlin to fix a number of flushing and
    cache coherency issues with these writes.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14924>
2023-11-29 01:16:22 +00:00
Caio Oliveira
5de5a0d475 intel/compiler: Don't use fs_visitor::bld in thread payload classes
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26301>
2023-11-28 19:53:51 +00:00
Caio Oliveira
2d6240ab14 intel/compiler: Don't use fs_visitor::bld in fs_reg_alloc
Just set up the builder without relying on the pre-existing one.  Moves
one step close to remove bld from fs_visitor.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26301>
2023-11-28 19:53:51 +00:00
Caio Oliveira
f55867b56c intel/compiler: Don't use fs_visitor::bld in tests
Tests create their own fs_builder now.  Moves one step closer to remove
bld from fs_visitor.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26301>
2023-11-28 19:53:51 +00:00
Caio Oliveira
9540259e1c intel/compiler: Prefer ctor/dtors in some Google Tests
Per Google Test FAQ recommendation, prefer consutrctors and destructors
unless there's a need to use SetUp/TearDown.

We will take advantage of this later to initialize an fs_builder.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26301>
2023-11-28 19:53:51 +00:00
José Roberto de Souza
6a245e4eea intel: Share function to do device query in Xe KMD
A "dance" is required with this uAPI, first we need to ask KMD what is
the size of the giving query id, then memory needs to be allocated to
match that size and then query again with the memory address set and
at this time Xe KMD will copy the query data to memory.

This dance was being duplicated in xe_engine_get_info() and
anv_xe_physical_device_get_parameters() and the next patch will also
use it in Iris, so here adding it common/xe and re-using as much
as possible.

There is one more implementation of this function in intel/dev but
due to how libs are linked intel/dev can't depend on to intel/common.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26325>
2023-11-28 18:17:45 +00:00
Lionel Landwerlin
b18006397b anv: remove heuristic preferring dedicated allocations
This heuristic doesn't show much difference when you have a beafy
processor but on lower end skus, it increase the number of buffers in
the execbuffer ioctl, adding significant overhead in i915.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4cdd3178fb ("anv: Meet CCS alignment reqs with dedicated allocs")
Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26335>
2023-11-28 16:13:11 +00:00
Lionel Landwerlin
7b87e1afbc anv: track & unbind image aux-tt binding
This solves a problem when you have a big memory chunk of which some
regions are bound to images. If the image is destroyed, currently the
aux-tt mapping stays and prevent any new image aux-tt mapping within
that region, until the memory is freed.

This maps & unmaps the aux-tt region at respectively bind & destroy
time, so that the memory chunks can be map through aux-tt.

If there is aliasing of memory to 2 different images, then the first
one "wins" the aux mapping and gets compression support. The second
one doesn´t.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ee6e2bc4a3 ("anv: Place images into the aux-map when safe to do so")
Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26335>
2023-11-28 16:13:11 +00:00
Lionel Landwerlin
b09db9d823 anv: use main image address to determine ccs compatibility
The BO address is not really a good criteria since we can bind an
image at an offset inside a BO.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ee6e2bc4a3 ("anv: Place images into the aux-map when safe to do so")
Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26335>
2023-11-28 16:13:11 +00:00
Lionel Landwerlin
7c6faa1efe intel/aux_map: introduce ref count of L1 entries
To implement this feature, we need to do CPU side tracking of all
L3/L2/L1 entries. This does add a little bit of CPU allocations, but
the advantage is that the traversal of the page table tree is faster.
No more need for the linear seach of find_buffer().

With this feature, we can have multiple VkImage bind to the same main
memory address, as long as they share exact same mapping parameters.
The AUX mapping will be removed when the last VkImage is destroyed.

As previously, if the L1 mapping entry parameters don't match, the
mapping fails. Anv handles this nicely by just disabling AUX on the
image.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26335>
2023-11-28 16:13:11 +00:00
Lionel Landwerlin
e22e88f8ce intel/fs: reuse set_predicate()
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26306>
2023-11-28 13:40:07 +00:00
Lionel Landwerlin
83a1657b6c intel/fs: fix incorrect register flag interaction with dynamic interpolator mode
Once NIR code is lowered and a few optimization passes have run, there
might be flag register interactions between instructions quite far
away from one another.

In the following case :

   f0 = and r0, r1
   ...
   fs_interpolate r2, r3
   ...
   if f0
      ...
   endif

If we lower fs_inteporlate while using the f0 register, we completely
garble the value meant for the if block.

To fix this, emit the predication for fs_interpolate in brw_fs_nir.cpp
when doing the NIR translation to the backend IR. This will guarantee
that the flag register interactions are visible to the optimization
passes, avoiding the problem above.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 68027bd38e ("intel/fs: implement dynamic interpolation mode for dynamic persample shaders")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9757
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26306>
2023-11-28 13:40:07 +00:00
Iván Briano
6f9be9a2a0 hasvk: ensure we reapply always pipeline dynamic state in runtime state
Backport of 24631d308c

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26341>
2023-11-27 20:36:07 +00:00
Eric Engestrom
cf510e38a5 intel/ci: fix .hasvk-manual-rules
Fixes: 570acf5655 ("ci: Add a manual full and 1/10th hasvk CTS runs.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26259>
2023-11-27 12:55:18 +00:00
Eric Engestrom
1942073112 intel/perf: fix regex escaping
`\$` is interpreted before being passed to `re.search()`, but luckily
for us the escape is also invalid and because of that, python 3.12+
warns us about it.

Use a raw string instead, so that the `\` is passed untouched to
`re.search()`.

Fixes: aa04b47c6e ("intel/perf: add support for GtSlice/GtSliceXDualsubsliceY variables")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26355>
2023-11-27 11:58:03 +00:00
José Roberto de Souza
7046a9e280 intel: Rename PAT entries
Here renaming the PAT entries to a name that better express each
entry.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25447>
2023-11-23 21:19:18 +00:00
Iván Briano
43cb4cb6dd anv: use the right vertexOffset on CmdDrawMultiIndexed
Fixes: c70ef757e6 ("anv: Use extended parameters on Gen11+")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26327>
2023-11-22 13:11:34 -08:00
Sagar Ghuge
2d3f0a834a anv: Add comment to copy image code block
Anybody will be tempted to factor out the if-else block code since it
looks like duplication but else block actually handles the ycbcr images
where the aspect masks are compatible but don't need to be the same.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26294>
2023-11-22 17:42:43 +00:00
Daniel Schürmann
1179d83a89 nir: remove info.fs.needs_all_helper_invocations
Use info.uses_wide_subgroup_intrinsics instead.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26026>
2023-11-22 11:31:11 +01:00
Tapani Pälli
d3e3c30d36 anv: implement Wa_18020335297
Set some state and implement dummy draws whenever viewport pointer
is being reprogrammed.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25987>
2023-11-22 05:23:12 +00:00
Tapani Pälli
418299c120 anv: refactor state emission
Add a helper that only emits hw_state, this makes it easier to modify
dirty state and call helper to emit only wanted state.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25987>
2023-11-22 05:23:12 +00:00
Caio Oliveira
e8220b9319 intel/compiler: Simplify allocation of NIR related arrays
Those are not reused, so this will be the first and only allocation, so
no need to use the "realloc" variants.

For the fs_reg arrays, there's currently no particular reason to keep
them uninitialized, so zero-initialize them too -- not ideal but better
than random values.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26302>
2023-11-21 18:31:05 +00:00
Francisco Jerez
6a810b0ba8 intel: Improve N-way pixel hashing computation to handle pixel pipes with asymmetric processing power.
This reworks the intel_compute_pixel_hash_table_nway() pixel pipe
hashing table computation helper to handle cases where some pixel
pipes have processing power different from the others, this is helpful
for Gfx12.7+ platforms where there are pixel pipes with 1 DSS as well
as pixel pipes with 2 DSSes, which currently can lead to a serious
performance bottleneck in the pixel pipes with lower processing power.

In order to avoid such a load imbalance the
intel_compute_pixel_hash_table_nway() function will now take two pixel
pipe bitsets instead of one: Pixel pipes enabled on both bitsets will
appear with twice the frequency on the table as pixel pipes which only
appear on one bitset.  See the comments below for more details on the
algorithm used to construct a pixel hashing table with the desired
properties.

With this change rendering performance improves by about 25% on a
fused MTL platform -- The list of specific configs this is expected to
show an improvement on is not included here since the list is rather
long and some of the configs may still be embargoed or may never be
productized, but in order to find out whether your Gfx12.7+ device
could be affected by this you can check the output of the
intel_dev_info tool from the Mesa tree and see if there are multiple
"pixel pipe" entries with different DSS count.  That isn't expected to
occur on any DG2 configuration, only on MTL+ platforms, so this change
should have no effect at all on DG2 (it's easy to convince oneself
that it won't since for DG2 mask1 should equal mask2 so mask2 will be
set to zero at the beginning of intel_compute_pixel_hash_table_nway()
and the new swzx[] permutation will be set to the identity).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26266>
2023-11-20 23:48:34 +00:00
José Roberto de Souza
205c5874d4 intel: Sync xe_drm.h
Sync xe_drm.h with commit 3b8183b7efad ("drm/xe/uapi: Be more specific
about the vm_bind prefetch region").

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26238>
2023-11-20 17:57:34 +00:00
Lionel Landwerlin
f9bab3566b intel/perf: fix querying of configurations
Using the unsized data field is incorrect. The data is located behind
the entire drm_i915_query_perf_config structure.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26285>
2023-11-20 16:00:05 +00:00
Eric Engestrom
4de3ce1f2c ci/piglit: specify only the traces file in the job config
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26278>
2023-11-20 15:23:40 +00:00
Shuicheng Lin
dddab9fa77 intel/xe: Correct DRM_XE_EXEC_QUEUE_SET_PROPERTY's ioctl
DRM_XE_EXEC_QUEUE_SET_PROPERTY is the offset,
while DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY is the real number.

Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26253>
2023-11-18 10:17:45 +00:00
Paulo Zanoni
563678f310 anv/sparse: don't support YCBCR 2x1 compressed formats
Regarding supporting these formats, the spec says:

  "A sparse image created using VK_IMAGE_CREATE_SPARSE_RESIDENCY_BIT
   supports all non-compressed color formats with power-of-two element
   size that non-sparse usage supports. Additional formats may also be
   supported and can be queried via
   vkGetPhysicalDeviceSparseImageFormatProperties.
   VK_IMAGE_TILING_LINEAR tiling is not supported."

Regarding the formats themselves, the spec says:
  "VK_FORMAT_B8G8R8G8_422_UNORM specifies a four-component, 32-bit
   format containing a pair of G components, an R component, and a B
   component, collectively encoding a 2×1 rectangle of unsigned
   normalized RGB texel data. One G value is present at each i
   coordinate, with the B and R values shared across both G values and
   thus recorded at half the horizontal resolution of the image. This
   format has an 8-bit B component in byte 0, an 8-bit G component for
   the even i coordinate in byte 1, an 8-bit R component in byte 2,
   and an 8-bit G component for the odd i coordinate in byte 3. This
   format only supports images with a width that is a multiple of two.
   For the purposes of the constraints on copy extents, this format is
   treated as a compressed format with a 2×1 compressed texel block."

Since these formats are to be considered compressed 2x1 blocks and we
don't necessarily have to support non-compressed formats that
non-sparse support, we can claim them as not supported with sparse.

In addition to all of that, if you look at isl_gfx125_filter_tiling()
you'll see that we don't even support Tile64 for these formats, so
sparse residency (i.e., non-opaque image binds) doesn't really make
sense for them yet.

The Vulkan spec defines 4 other YCBCR "2x1 compressed" formats like
the ones we have in this commit, but we don't support them even
without sparse, so there's no reason to check them here.

A recent change in VK-GL-CTS made tests that use these formats go from
unsupported to failures:
  7ecc7716a983 ("Do not use and check for STORAGE image support, when
  it is not used in the test")

This commit "fixes" the following VK-GL-CTS failures (by making them
return NotSupported):
  dEQP-VK.sparse_resources.image_block_shapes.2d.b8g8r8g8_422_unorm.samples_1
  dEQP-VK.sparse_resources.image_block_shapes.2d.g8b8g8r8_422_unorm.samples_1
  dEQP-VK.sparse_resources.image_block_shapes.2d_array.b8g8r8g8_422_unorm.samples_1
  dEQP-VK.sparse_resources.image_block_shapes.2d_array.g8b8g8r8_422_unorm.samples_1

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25512>
2023-11-17 17:58:29 +00:00