62838 commits

Author SHA1 Message Date
Karol Herbst
7f08036abc rusticl/mesa: pass PIPE_BIND_LINEAR in resource_create_texture_from_user
Host pointer allocations are all laid out linearly, so just tell the drivers
in case they don't assume this implicitly.

Fixes: 71a9af4910 ("rusticl/mem: support read/write/copy ops for images")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25937>
2023-10-28 14:38:28 +02:00
Karol Herbst
398fadf1cf rusticl/device: restrict const max size to 1 << 26 bytes
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25937>
2023-10-28 14:38:25 +02:00
Marek Olšák
276b9b13cf radeonsi: initialize perfetto in the right place
Compute contexts don't execute the second half of the function.

Fixes: a164e147e9 - radeonsi: Add perfetto support in radeonsi
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10043

Tested-by: Mike Lothian <mike@fireburn.co.uk>
Tested-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25873>
2023-10-27 23:03:04 +00:00
Francisco Jerez
57decad976 intel/xehp: Enable TBIMR by default.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25493>
2023-10-27 14:50:42 -07:00
Francisco Jerez
ed9886321c intel/xehp+: Use TBIMR tile box check in order to avoid performance regressions.
This allows the hardware to behave as if TBIMR was disabled until a
polygon is processed which spans at least one tile.  This is a rather
heavy-handed heuristic meant to prevent regressions in heavily
geometry-bound workloads that render large numbers of tiny primitives
much smaller than a TBIMR tile.

A particularly bad example of this was observed in SoTR, where certain
draw calls with a long-running VS and a mostly trivial PS render more
triangles than pixels, filling up the URB and TBIMR batch pretty
quickly, which causes EU utilization to tank (since once the URB has
filled up the parallelism of the VS is limited by the number of
polygons that fit in a TBIMR batch at the completion of each tile
walk, which isn't a lot in relation to the total EU count of a DG2),
and causes the bottleneck to be the rate at which the tile sequencer
performs additional tile passes, each one processing a small number
(<1024 polygons) of the hundreds of thousands of triangles of the
draw call.

Enabling this heuristic seems effective at avoiding that scenario in
SoTR among other titles (e.g. Total War Warhammer 3), but it's a bit
of a compromise since one could imagine cases where TBIMR is helpful
even if the geometry doesn't pass the box check, so a better heuristic
or a driconf rule may be useful in the future.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25493>
2023-10-27 14:50:42 -07:00
Francisco Jerez
f0d24b155b intel/xehp+: Adjust TBIMR batch size based on slice count.
This programs a TBIMR batch size equal to 128 polygons per slice in
order to match the hardware spec recommendation (BSpec 68436).  This
has been confirmed to improve performance slightly relative to the
hardware default batch size of 256 polygons.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25493>
2023-10-27 14:50:42 -07:00
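
The batch-size rule described in f0d24b155b is simple arithmetic. A minimal sketch follows, assuming the batch size is expressed directly in polygons; the constant and function names are hypothetical and this is not the driver's actual register programming.

```rust
/// Polygons per slice recommended for a TBIMR batch, as stated in the
/// commit message (BSpec 68436).
const TBIMR_POLYGONS_PER_SLICE: u32 = 128;

/// Hypothetical helper: TBIMR batch size in polygons for a GPU with
/// `slice_count` slices.
fn tbimr_batch_size(slice_count: u32) -> u32 {
    TBIMR_POLYGONS_PER_SLICE * slice_count
}
```

Under this rule, only a two-slice part ends up with the hardware default of 256 polygons; parts with more slices get proportionally larger batches.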
Francisco Jerez
7cdacaf493 intel/xehp: Adjust TBIMR performance chicken bits.
This enables a couple of TBIMR performance tunables in
CHICKEN_RASTER_2 that default to disabled.  TBIMR fast clip appears to
help slightly with some geometry-bound workloads.  TBIMR open batch
allows the rasterizer to start working immediately on the first tile
of the framebuffer, even before the batch has been closed, which helps
reduce the latency cost of the tile walk.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25493>
2023-10-27 14:50:42 -07:00
Francisco Jerez
d13c81a2c3 iris/xehp: Implement TBIMR tile pass setup and pipeline bandwidth estimation.
This sets up the basic parameters needed for tiled rendering based on
a back-of-the-envelope estimate of the amount of memory used by the
pixel pipeline during the tile pass.  The actual cache footprint of a
tile can vary wildly based on runtime factors which aren't easily
predictable based on static analysis, so this is only intended to
provide a rough approximation within the right order of magnitude.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25493>
2023-10-27 14:48:29 -07:00
Francisco Jerez
694d64188b intel/xehp+: Define driconf option for selectively disabling TBIMR.
This may help debugging performance problems in the possible case that
TBIMR negatively impacts the performance of some application.  It could
also allow applying application-specific band-aid fixes in the XML file
until a more general workaround is implemented.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25493>
2023-10-27 14:48:29 -07:00
Francisco Jerez
da28582eec intel/xehp+: Add dynamic state flags controlling whether TBIMR is enabled during 3D primitives.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25493>
2023-10-27 14:48:29 -07:00
Francisco Jerez
6b9583734b intel/l3: Set up L3FullWayAllocationEnable config if ALL partition has over 126 ways.
L3 configurations with an ALL partition of 128 ways per bank or more
cannot be represented with the normal L3ALLOC partitioning mechanism
since the "All L3 client pool" field would overflow; instead the
L3FullWayAllocationEnable bit has to be set, which causes the whole L3
to be used in a unified cache configuration.

That's precisely the configuration we're currently using on recent
platforms, but previously we were relying on the L3 config tables
being empty and the selected L3 configuration being a NULL pointer to
detect this condition.  This is about to change: the L3 configuration
structure will be defined for gfx12.5+ platforms since they provide
useful information about the cache hierarchy to the drivers.  Instead
of checking whether the pointer is NULL in order to apply a unified L3
cache configuration, use it when there is a single ALL partition
larger than can be represented via L3ALLOC.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25493>
2023-10-27 14:48:28 -07:00
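
The check described in 6b9583734b can be sketched roughly as below. The struct layout, the field names and the number of non-ALL partitions are hypothetical simplifications, and the 126-way limit is taken from the commit title rather than from the hardware documentation.

```rust
/// Hypothetical, simplified L3 configuration: way counts per partition.
struct L3Config {
    /// Ways assigned to the unified ALL partition.
    all_ways: u32,
    /// Ways assigned to every other partition (count chosen arbitrarily).
    other_ways: [u32; 4],
}

/// Assumption from the commit title: an ALL partition with more ways than
/// this cannot be encoded through the normal L3ALLOC mechanism.
const MAX_L3ALLOC_ALL_WAYS: u32 = 126;

/// Returns true when the config consists of a single ALL partition that is
/// too large for L3ALLOC, i.e. when the unified (full way allocation)
/// configuration should be used instead.
fn needs_full_way_allocation(cfg: &L3Config) -> bool {
    let only_all = cfg.other_ways.iter().all(|&ways| ways == 0);
    only_all && cfg.all_ways > MAX_L3ALLOC_ALL_WAYS
}
```

The point of the change is that the decision is derived from the contents of the configuration, so it no longer depends on the configuration pointer being NULL.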
Mike Blumenkrantz
736577871b zink: check for cbuf0 writes before setting A2C
VUID-vkCmdDrawMultiIndexedEXT-alphaToCoverageEnable-08919 requires
a cbuf0 write for A2C to be active

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25938>
2023-10-27 17:33:31 +00:00
Mike Blumenkrantz
d2abb4f975 zink: make (some) vk allocation commands more robust against vram depletion
as has recently been exposed by ci, there are some cases where running
lots of tests simultaneously can temporarily result in depleted vram,
which torpedoes everything

as this scenario is transient (vram will very soon become available again),
it makes more sense to add some retries at fixed intervals to try soldiering
onward instead of exploding and probably blocking a merge

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25938>
2023-10-27 17:33:31 +00:00
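
The retry strategy described in d2abb4f975 could look roughly like the sketch below. It is written in Rust for illustration only (zink itself is C); the error enum, the function name, and the retry count and interval parameters are all hypothetical.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical error type standing in for Vulkan allocation results.
#[derive(Debug, PartialEq, Eq)]
enum AllocError {
    /// Transient: memory is expected to become available again soon.
    OutOfDeviceMemory,
    /// Anything else is treated as fatal and returned immediately.
    Other,
}

/// Calls `alloc` until it succeeds, fails with a non-retryable error, or
/// has been retried `max_retries` times, sleeping `interval` between tries.
fn retry_on_oom<T, F>(mut alloc: F, max_retries: u32, interval: Duration) -> Result<T, AllocError>
where
    F: FnMut() -> Result<T, AllocError>,
{
    let mut attempt = 0;
    loop {
        match alloc() {
            // Only out-of-memory is worth waiting for.
            Err(AllocError::OutOfDeviceMemory) if attempt < max_retries => {
                attempt += 1;
                sleep(interval);
            }
            // Success, a fatal error, or retries exhausted.
            other => return other,
        }
    }
}
```

A fixed interval keeps the worst-case delay easy to reason about (`max_retries * interval`), which matters when the caller is a CI job that should still fail in bounded time.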
Mike Blumenkrantz
f8909e7d55 zink: add more locking for compute pipelines
if multiple contexts are accessing this all at once then this needs
more locking to avoid unsynchronized cache access

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25929>
2023-10-27 15:06:13 +00:00
Karol Herbst
9a3af6e1d8 rusticl/queue: Only take a weak ref to the last Event
This resolves a memory leak when the application drops its last reference
to the queue, but never waits explicitly.

The problem was that the queue was referenced by QueueState::last, and that ref
only gets dropped on a blocking wait. This is problematic as non user
Event objects also hold a ref on the Queue they are created on, therefore
causing a cyclic ref relation.

In order to resolve it, just use a weak reference. A failure of upgrading
the Weak ref is not an issue as in this case we'd only wait on an already
destroyed or processed event. The worker thread already makes sure
everything stays in sync.

Fixes: 5b3ff7e3f3 ("rusticl/queue: overhaul of the queue+event handling")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: @LingMan <18294-LingMan@users.noreply.gitlab.freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25926>
2023-10-27 14:47:23 +00:00
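
The reference cycle described in 9a3af6e1d8, and how a weak reference breaks it, can be illustrated with the following minimal sketch. The `Queue` and `Event` types here are hypothetical stand-ins, not rusticl's real structures, and `wait_last` only reports whether there was an event to wait on.

```rust
use std::sync::{Arc, Mutex, Weak};

/// Hypothetical event: keeps the queue it was created on alive.
struct Event {
    _queue: Arc<Queue>,
}

/// Hypothetical queue: remembers the last event only weakly, so the
/// queue -> event -> queue chain is not a strong cycle.
struct Queue {
    last: Mutex<Weak<Event>>,
}

impl Queue {
    fn new() -> Arc<Queue> {
        Arc::new(Queue { last: Mutex::new(Weak::new()) })
    }

    /// Creates an event on this queue and records it as the last one.
    fn enqueue(self: &Arc<Self>) -> Arc<Event> {
        let ev = Arc::new(Event { _queue: Arc::clone(self) });
        *self.last.lock().unwrap() = Arc::downgrade(&ev);
        ev
    }

    /// Returns true if the last event was still alive. A failed upgrade
    /// means the event was already destroyed, so there is nothing to wait on.
    fn wait_last(&self) -> bool {
        match self.last.lock().unwrap().upgrade() {
            Some(_ev) => true, // a real queue would block on the event here
            None => false,
        }
    }
}
```

If `last` held an `Arc<Event>` instead, dropping the application's handles to the queue and the event would leave the two objects keeping each other alive, which is the leak the commit describes.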
Karol Herbst
01b6ccccc6 zink: lower fisnormal as it requires the Kernel Cap
I didn't check if it's a valid Vulkan SPIR-V opcode, and it turns out it isn't

Fixes: 82eed326f4 ("zink: support more nir opcodes")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25837>
2023-10-27 10:52:55 +00:00
Karol Herbst
e3a0df6468 zink: emit float controls
This is required by OpenCL, which relies on flushing behavior matching the
runtime's advertised features, and also later, once rusticl supports
denorms, to flush them if applications wish to do so.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25837>
2023-10-27 10:52:55 +00:00
Karol Herbst
700a2dc648 zink: alias nir scratch memory by lowering to common bit_size
This aliases each access as required by OpenCL. It's up to the Vulkan
driver to vectorize to wider loads/stores if possible.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25837>
2023-10-27 10:52:55 +00:00
Karol Herbst
ab065d9daa zink: support CLAMP_TO_BORDER with unnormalized coords
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25837>
2023-10-27 10:52:55 +00:00
Karol Herbst
abd8ef84ff rusticl/mem: properly set pipe_image_view::access
Cc: mesa-stable
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25837>
2023-10-27 10:52:55 +00:00
Karol Herbst
694001eef7 rusticl/device: restrict param_max_size further
It's kinda pointless to have it too big, it also causes weird shaders to
be generated and causes stack overflows in `nir_opt_gcm`.

Nothing needs big values here anyway.

Cc: mesa-stable
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25837>
2023-10-27 10:52:54 +00:00
Karol Herbst
9b6ac56d72 rusticl/device: restrict image_buffer_size
It's pointless to advertise more than CL_DEVICE_MAX_MEM_ALLOC_SIZE, and
the CTS also tests against this.

Cc: mesa-stable
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25837>
2023-10-27 10:52:54 +00:00
Mike Blumenkrantz
df74ea7717 zink: unset explicit_xfb_buffer for non-xfb shaders
this catches duplicated xfb when generated geometry shaders are used

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25914>
2023-10-27 00:44:49 +00:00
Mike Blumenkrantz
87e3720b66 aux/u_transfer_helper: set rendertarget bind for msaa staging resource
this matches other resources created with staging blit-like mechanics

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25914>
2023-10-27 00:44:49 +00:00
Mike Blumenkrantz
694ebe8c72 zink: only emit xfb execution mode for last vertex stage
this is otherwise illegal

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25914>
2023-10-27 00:44:49 +00:00
Mike Blumenkrantz
e8b2680045 zink: clamp resolve extents to src/dst geometry
exceeding src/dst extents is illegal

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25914>
2023-10-27 00:44:49 +00:00
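
The clamping described in e8b2680045 amounts to taking the minimum of the requested extent and both image extents. The sketch below is a simplification that assumes zero offsets in both images; the `Extent` type and the function name are hypothetical and do not mirror zink's code.

```rust
/// Hypothetical 2D extent in pixels.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Extent {
    width: u32,
    height: u32,
}

/// Clamps a requested resolve extent so it exceeds neither the source nor
/// the destination image (offsets are assumed to be zero in this sketch).
fn clamp_resolve_extent(requested: Extent, src: Extent, dst: Extent) -> Extent {
    Extent {
        width: requested.width.min(src.width).min(dst.width),
        height: requested.height.min(src.height).min(dst.height),
    }
}
```

With non-zero offsets, the same idea applies to the space remaining after each offset rather than to the full image size.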
Mike Blumenkrantz
009d4a5fda zink: always set VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT for usermem
required by spec

backport-to: 23.3

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25914>
2023-10-27 00:44:49 +00:00
Mike Blumenkrantz
7035b5a8e8 zink: emit SpvCapabilitySampleRateShading with SampleId
required by spec

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25914>
2023-10-27 00:44:49 +00:00
Mike Blumenkrantz
f2fb2df6a3 ci: bump VVL to 1.3.269
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25914>
2023-10-27 00:44:49 +00:00
Faith Ekstrand
d5c310899a nir: Split nir_lower_subgroup_options::lower_vote_eq into two bits
On NVIDIA, we can do a vote_ieq on bool in one hardware op so we don't
want that lowered.  We do want to lower vote_feq and other vote_ieq,
though.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25894>
2023-10-26 23:05:44 +00:00
Mike Blumenkrantz
d1d29d4f40 ci: skip zink vram test
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25624>
2023-10-26 22:31:26 +00:00
Mike Blumenkrantz
9a98d6714d zink: enable unsynchronized texture uploads using staging buffers
by not returning busy for non-HIC unsynchronized texture uploads,
the GL frontend will fall through to directly access the unsynchronized
cmdbuf

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25624>
2023-10-26 22:31:26 +00:00
Mike Blumenkrantz
846a5ea224 zink: add locking for batch refs
this is needed to handle unsynchronized access

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25624>
2023-10-26 22:31:26 +00:00
Mike Blumenkrantz
cd08b070a3 zink: add flag to restrict unsynchronized texture access
this is unset any time a texture is accessed and must be explicitly
re-set to preserve unsynchronized access

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25624>
2023-10-26 22:31:26 +00:00
Mike Blumenkrantz
8ee0d6dd71 zink: add a third cmdbuf for unsynchronized (not reordered) ops
this provides functionality for unsynchronized texture uploads without
HIC support by adding a cmdbuf which can only be accessed directly by
the frontend thread

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25624>
2023-10-26 22:31:26 +00:00
Mike Blumenkrantz
8d0eaf97db zink: rework cmdbuf submission to be more extensible
no functional changes

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25624>
2023-10-26 22:31:26 +00:00
Mike Blumenkrantz
7d0dbdeca2 zink: assert that transfer_dst is available before doing buf2img
the blitter path here was just wishful thinking anyway

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25624>
2023-10-26 22:31:25 +00:00
Mike Blumenkrantz
0b11b41fff zink: barrier_cmdbuf -> reordered_cmdbuf
this is more consistent with the current usage of the cmdbuf

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25624>
2023-10-26 22:31:25 +00:00
Mike Blumenkrantz
00206e01a4 zink: handle unsynchronized image maps from tc
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25624>
2023-10-26 22:31:25 +00:00
Mike Blumenkrantz
9cc06f817c tc: allow unsynchronized texture_subdata calls where possible
if a texture is provably idle, either by never having been used or
by exhaustively checking usage data, a texture subdata can occur
without any synchronization

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25624>
2023-10-26 22:31:25 +00:00
Mike Blumenkrantz
815ed12e3b tc: use strong refs for fb attachment tracking
this is necessary for unsynchronized texture upload tracking

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25624>
2023-10-26 22:31:25 +00:00
Mike Blumenkrantz
b385fa85db tc: add batch usage tagging to threaded_resource
this allows the tc recorder thread to tag resources to determine if
a resource has been previously seen by the current batch

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25624>
2023-10-26 22:31:25 +00:00
Mike Blumenkrantz
39de1ce660 tc: always track fb attachments
this should have no measurable impact on perf

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25624>
2023-10-26 22:31:25 +00:00
Mike Blumenkrantz
6d236917a9 tc: add non-definitive tracking for batch completion
this is useful as a hint for opportunistic optimizations

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25624>
2023-10-26 22:31:25 +00:00
Mike Blumenkrantz
782481c429 zink: add copy box locking
this can technically be accessed by multiple threads, so ensure
access is serialized

backport-to: 23.3

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25924>
2023-10-26 21:13:01 +00:00
Ruijing Dong
09a8cc0d6d radeonsi/vcn: vcn4 encoding interface dummy update
Due to some updates in vcn4 interface, add dummy members for further
development.

Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25898>
2023-10-26 20:25:01 +00:00
Alyssa Rosenzweig
2552ac360d crocus: Support building on non-Intel
Ditto.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Cc: mesa-stable
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25882>
2023-10-26 19:48:19 +00:00
Sil Vilerino
dfb9516026 d3d12: d3d12_video_buffer_create_impl - Fix resource importing
Only align resource dimensions on creation, not when importing existing D3D resource object.
Otherwise importing the resource fails since the resource descriptor does not match the aligned
dimensions passed in the template.

Fixes: 62fded5e4f ("d3d12: Allocate d3d12_video_buffer with higher alignment for compatibility")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25913>
2023-10-26 19:25:16 +00:00
Ganesh Belgur Ramachandra
2f7bc06643 radeonsi: Fix clear-render-target shader for 1darrays in NIR
There are no GL CTS tests for 1darrays, relying on OpenCL CTS instead.
This patch should fix the `clEnqueueFillImage` tests in OpenCL CTS.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25630>
2023-10-26 15:33:18 +00:00
Tapani Pälli
8ffc4bd31c iris: add required PC for Wa_14014966230
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25671>
2023-10-26 11:51:47 +00:00