Commit graph

181809 commits

Author SHA1 Message Date
Rob Clark
2132f95de0 freedreno/a6xx: Fix NV12+UBWC import
Treat R8_G8B8_420_UNORM and NV12 the same, because dri2 frontend doesn't
understand or care about the difference from the sampler PoV.

Fixes: 1e820ac128 ("freedreno: Rework supported-modifiers handling")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26601>
2023-12-08 21:06:05 +00:00
Sagar Ghuge
708d4f59f8 anv: Use RCS cmd buffer if blit src/dest has 3 components
The Blitter engine lacks support for 3-component color formats, so we
fall back to the RCS companion command buffer for the blit operation.

Even though the blitter supports 96-bit formats, it only supports linear
tiling. We can support other types of tiling by falling back to the RCS
companion command buffer.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26300>
2023-12-08 20:44:03 +00:00
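The engine choice described in 708d4f59f8 above can be sketched as a small predicate. This is a minimal illustration only: the enum and function names are hypothetical and are not taken from anv, and the real driver decides based on format descriptions rather than a bare component count.

```c
#include <assert.h>

/* Hypothetical engine identifiers, for illustration only. */
enum sketch_engine {
   SKETCH_ENGINE_BLITTER,
   SKETCH_ENGINE_RCS,
};

/* Pick the RCS companion path when either side of the blit uses a
 * 3-component format, since the commit message says the blitter
 * cannot handle those. */
static enum sketch_engine
sketch_pick_blit_engine(unsigned src_components, unsigned dst_components)
{
   assert(src_components >= 1 && src_components <= 4);
   assert(dst_components >= 1 && dst_components <= 4);

   if (src_components == 3 || dst_components == 3)
      return SKETCH_ENGINE_RCS;

   return SKETCH_ENGINE_BLITTER;
}
```

Checking both source and destination mirrors the "src/dest" wording of the commit title.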
Ian Romanick
87cdcbd7d7 intel/compiler: Verify that DO is alone in the block
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26439>
2023-12-08 20:21:28 +00:00
Ian Romanick
65237f8bbc intel/fs: Don't add MOV instructions to DO blocks in combine constants
There was a subtle bug related to CFG tracking. Namely, some branch
instructions may point *only* to the block after the DO instruction
for the loop. If the MOV instructions are in the DO block, they may not
have liveness properly tracked.

Like in !25132, having the MOV instructions in blocks that might
contain other instructions helps scheduling.

shader-db:

All Broadwell and newer Intel GPUs had similar results (Ice Lake shown)
total cycles in shared programs: 848577248 -> 848557268 (<.01%)
cycles in affected programs: 78256396 -> 78236416 (-0.03%)
helped: 361 / HURT: 18

fossil-db:

All Skylake and newer Intel GPUs had similar results (Ice Lake shown)
Totals:
Cycles: 15021501924 -> 15021372904 (-0.00%); split: -0.00%, +0.00%

Totals from 735 (0.11% of 656080) affected shaders:
Cycles: 676429502 -> 676300482 (-0.02%); split: -0.02%, +0.00%

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26439>
2023-12-08 20:21:28 +00:00
Sil Vilerino
23f07f4942 d3d12: Check video encode codec cap before checking encode profile/level cap
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26598>
2023-12-08 20:04:49 +00:00
Timur Kristóf
1c8c3e5a7a radv: Don't retile DCC on transfer queues.
Instead, the retile will be executed on another queue type
when the image is transitioned to another queue.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25834>
2023-12-08 14:46:17 +00:00
Timur Kristóf
5c30d462b9 radv: Disable HTILE on exclusive images with transfer queues when SDMA doesn't support it.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25834>
2023-12-08 14:46:17 +00:00
Timur Kristóf
1764259ba8 radv: Disable DCC on exclusive images with transfer queue when SDMA doesn't support it.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25834>
2023-12-08 14:46:17 +00:00
Timur Kristóf
89a6b08cba radv: disable HTILE/DCC for concurrent images with transfer queue if unsupported.
DCC and HTILE are only supported by SDMA on GFX10+ (unless disabled by a workaround).

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25834>
2023-12-08 14:46:16 +00:00
Chia-I Wu
ad6b6673be radv: convert a check in radv_get_memory_fd to assert
VUID-VkBindImageMemoryInfo-memory-02628 and
VUID-VkBindImageMemoryInfo-memory-02629 make sure the memory offset is 0
for dedicated allocations.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25964>
2023-12-08 14:21:42 +00:00
Chia-I Wu
8aa62ba240 radv: fix asserts for radv_init_metadata
radv_init_metadata hits several assert failures when the image is
multi-planar.  Make sure we use plane 0.

This change should make no difference in practice.  Also, this is done
only to follow radeonsi.  Since the opaque metadata is mainly for
validations and DCC, and we don't enable DCC for multi-planar images, we
probably don't need to call radv_query_opaque_metadata at all.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25964>
2023-12-08 14:21:42 +00:00
Chia-I Wu
035cf7ab97 radv: fix a typo in radv_image_view_make_descriptor
Only GFX8 and before have legacy_surf_level.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25964>
2023-12-08 14:21:42 +00:00
Chia-I Wu
07f575a8a6 radv: fix VkSubresourceLayout2KHR for multi-planar formats with modifiers
Memory planes and format planes are equivalent for multi-planar formats
with modifiers.  Do not return the DCC info of plane 0.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25964>
2023-12-08 14:21:42 +00:00
Chia-I Wu
8f60ccf969 radv: fix VkDrmFormatModifierProperties2EXT for multi-planar formats
Do not report DCC modifiers for multi-planar formats.  We don't support
DCC for them and drmFormatModifierPlaneCount had incorrect values.

Fix vkGetImageSubresourceLayout for multi-planar images with modifiers.
In that case, memory planes and format planes are equivalent.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25964>
2023-12-08 14:21:42 +00:00
Samuel Pitoiset
90dda31901 radv: simplify disabling MRT compaction for PS epilogs
If the fragment shader isn't compiled, the PS epilog key isn't used
at all with GPL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26563>
2023-12-08 13:52:40 +00:00
Samuel Pitoiset
0cf00390c5 ci: uprev vkd3d-proton to a0ccc383937903f4ca0997ce53e41ccce7f2f2ec
To cover DGC mesh shaders which are only tested as part of vkd3d-proton.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26590>
2023-12-08 11:14:22 +00:00
Yonggang Luo
5bf68ab701 osmesa: Make osmesa.h compatible with Windows SDK's GL.h
glext.h and glcorearb.h already use 'APIENTRY', so do the same for osmesa.h.

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26561>
2023-12-08 09:55:54 +00:00
Dave Airlie
10db6948da nvk/nak: fix regression with shf changes on sm70
The commit "nak: implement SHL and SHR on SM50" caused a regression on
KHR-GL45.gpu_shader_fp64.* using zink.

This fixes the regression, by setting the wrap fields.

Fixes: 00be041ffc ("nak: implement SHL and SHR on SM50")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26586>
2023-12-08 05:30:09 +00:00
Marek Olšák
64b769a102 glthread: add a string table of function names
for printing glthread batches

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
adfab9794e mesa: deduplicate glVertexPointer and glNormalPointer vs DSA error checking
Regular and direct state access functions did the same thing. The new
functions will be used later.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
3a74cdcd91 glthread: pass struct marshal_cmd_DrawElementsUserBuf into Draw directly
Pass the whole structure directly instead of as separate parameters.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
98e42c6efb glapi: only allow deprecated="" on non-aliased functions
Merging deprecated="" of aliased and real functions isn't completely
predictable. The function (real or aliased) that's defined last overwrites
attributes of its alias defined before it.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
61e19c53e7 glthread: don't do "if (COMPAT)" if the function is not in the GL core profile
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
a3992379cb glapi: only expose GL_EXT_direct_state_access functions to GL compatibility
The extension is only exposed in GL compatibility.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
666d53214a glthread: rework type reduction and reduce vertex stride params to 16 bits
- add get_marshal_type(), which reduces type sizes
- rework all places to use the result of get_marshal_type()

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
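As a rough illustration of what reducing a vertex stride parameter to 16 bits in 666d53214a above could look like, the sketch below narrows a 32-bit stride before it is stored in a command structure. The struct and function names are hypothetical, and clamping out-of-range values is an assumption; the commit message does not say how such values are handled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical command layouts: the same call with a full-width and a
 * reduced-width stride field. */
struct sketch_cmd_wide {
   uint16_t cmd_id;
   int32_t stride;
};

struct sketch_cmd_narrow {
   uint16_t cmd_id;
   int16_t stride;
};

/* Narrow a stride to 16 bits. Clamping is an assumption made for this
 * sketch, not something stated in the commit message. */
static int16_t
sketch_pack_stride16(int32_t stride)
{
   if (stride > INT16_MAX)
      return INT16_MAX;
   if (stride < INT16_MIN)
      return INT16_MIN;
   return (int16_t)stride;
}
```

On common ABIs the narrow structure is half the size of the wide one, which is the kind of saving that smaller marshalled commands aim for.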
Marek Olšák
162c890614 glthread: use autogenerated marshal structures for custom functions
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
e9d08bb043 glapi: rename primcount -> instance_count in a few Draw functions
In order to match the marshal structures we already have in the tree.
The next commit will depend on this.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
a02ed8a95f glthread: add option to put autogenerated marshal structures in the header file
This is used when we want to be able to read the calls of autogenerated
functions, or when we want to use the default structure for our custom
marshal functions.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:52 +00:00
Marek Olšák
bdb771b27c glthread: eliminate push/pop calls in PushMatrix+Draw/MultMatrixf+PopMatrix
Viewperf benefits. This implements glPushMatrix marshalling manually and
looks ahead in the unmarshal function to see what the following calls are.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:51 +00:00
Marek Olšák
c3b95d1507 glthread: add a marker at the end of batches indicating the end
Unmarshal calls that "look ahead" in the batch will use it.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:51 +00:00
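The look-ahead idea from bdb771b27c and c3b95d1507 above can be sketched as follows. This is a simplified model under stated assumptions: commands are fixed-size, all names are hypothetical, and only the PushMatrix+Draw+PopMatrix pattern is handled. Real glthread batches use variable-size commands and the commit also covers a MultMatrixf variant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command IDs; the last one is the end-of-batch marker. */
enum sketch_cmd_id {
   SKETCH_CMD_PUSH_MATRIX,
   SKETCH_CMD_DRAW,
   SKETCH_CMD_POP_MATRIX,
   SKETCH_CMD_BATCH_END,
};

struct sketch_cmd {
   enum sketch_cmd_id id;
};

/* Counts of what was actually executed, so the effect is observable. */
struct sketch_stats {
   int pushes, pops, draws;
};

static struct sketch_stats
sketch_run_batch(const struct sketch_cmd *cmds)
{
   struct sketch_stats s = {0, 0, 0};
   assert(cmds != NULL);

   for (size_t i = 0; cmds[i].id != SKETCH_CMD_BATCH_END;) {
      /* Look ahead. The end marker guarantees cmds[i + 1] exists, and
       * cmds[i + 2] is only read when cmds[i + 1] is a draw, so it is
       * at worst the marker itself. */
      if (cmds[i].id == SKETCH_CMD_PUSH_MATRIX &&
          cmds[i + 1].id == SKETCH_CMD_DRAW &&
          cmds[i + 2].id == SKETCH_CMD_POP_MATRIX) {
         s.draws++;   /* draw only; push and pop are skipped */
         i += 3;
         continue;
      }

      switch (cmds[i].id) {
      case SKETCH_CMD_PUSH_MATRIX: s.pushes++; break;
      case SKETCH_CMD_POP_MATRIX:  s.pops++;   break;
      case SKETCH_CMD_DRAW:        s.draws++;  break;
      default: break;
      }
      i++;
   }
   return s;
}
```

Because every batch ends with the marker, the look-ahead needs no separate bounds check, which appears to be the purpose of the marker commit.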
Marek Olšák
5af047d40a mesa: optimize setting the identity matrix
instead of memcpy from a static mutable place ("const" doesn't help
anything here), just set the values directly

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:51 +00:00
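The difference described in 5af047d40a above, copying from a static table versus storing the values directly, can be sketched like this. Both function names are hypothetical and the code is not Mesa's; it only contrasts the two approaches for a 4x4 float matrix.

```c
#include <assert.h>
#include <string.h>

/* The approach the commit moves away from: a static, mutable table that
 * is copied with memcpy. */
static float sketch_identity_table[16] = {
   1.0f, 0.0f, 0.0f, 0.0f,
   0.0f, 1.0f, 0.0f, 0.0f,
   0.0f, 0.0f, 1.0f, 0.0f,
   0.0f, 0.0f, 0.0f, 1.0f,
};

static void
sketch_set_identity_memcpy(float m[16])
{
   assert(m != NULL);
   memcpy(m, sketch_identity_table, sizeof(sketch_identity_table));
}

/* The approach the commit describes: write each element directly, so
 * nothing has to be loaded from memory first. */
static void
sketch_set_identity_direct(float m[16])
{
   assert(m != NULL);
   m[0]  = 1.0f; m[1]  = 0.0f; m[2]  = 0.0f; m[3]  = 0.0f;
   m[4]  = 0.0f; m[5]  = 1.0f; m[6]  = 0.0f; m[7]  = 0.0f;
   m[8]  = 0.0f; m[9]  = 0.0f; m[10] = 1.0f; m[11] = 0.0f;
   m[12] = 0.0f; m[13] = 0.0f; m[14] = 0.0f; m[15] = 1.0f;
}
```

Both functions leave the same bytes in the destination; the direct version simply lets the compiler emit constant stores.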
Marek Olšák
5fb106c253 mesa: skip checking for identity matrix in glMultMatrixf with glthread
glMultMatrixf was doing it. glMatrixMultfEXT is the other user of
matrix_mult that needs to do it before we can skip it here.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:51 +00:00
Marek Olšák
d321b1500b mesa: optimize _mesa_matrix_is_identity
+5% performance in VP13/Sw/teslaTower_shaded

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26548>
2023-12-08 04:25:51 +00:00
Yiwei Zhang
d17ddcc847 venus: dispatch background shader tasks to secondary ring
Summary:
- Add a perf option to force primary ring submission
- Let device own secondary ring(s) for ad-hoc spawn
- For threads where swapchain and command pool are created, track with
  TLS to instruct ring dispatch.
- If the pipeline creation or cache retrieval happens on background
  threads that are not on the hot paths, force synchronous and dispatch to
  the secondary ring after waiting for the primary ring to become current.
- If the pipeline creation or cache retrieval happens on hot path
  threads, dispatch to the primary ring to avoid being blocked by those
  tasks on the secondary ring.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00
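The dispatch rules summarized in d17ddcc847 above can be sketched with a thread-local flag. All names here are hypothetical and not from the venus source; the sketch only models the decision itself (a forced-primary option, hot-path threads tracked via TLS, everything else going to the secondary ring), not the waiting or the submission.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_ring {
   SKETCH_RING_PRIMARY,
   SKETCH_RING_SECONDARY,
};

/* Per-thread marker, assumed to be set on threads that create a
 * swapchain or command pool (the "hot path" threads in the summary). */
static _Thread_local bool sketch_tls_hot_path;

static void
sketch_mark_hot_path_thread(void)
{
   sketch_tls_hot_path = true;
}

/* Choose a ring for a pipeline creation or cache retrieval task issued
 * from the calling thread. */
static enum sketch_ring
sketch_pick_ring(bool force_primary)
{
   if (force_primary || sketch_tls_hot_path)
      return SKETCH_RING_PRIMARY;

   return SKETCH_RING_SECONDARY;
}
```

Keeping the marker in thread-local storage means the choice needs no locking: each thread only ever reads and writes its own flag.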
Yiwei Zhang
5b26bebcf4 venus: add vn_gettid helper
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00
Yiwei Zhang
b170c1a391 venus: switch to vn_ring as the protocol interface - part 3
Sync protocol and fix all the interfaces, otherwise we have to generate
two sets of headers with both interfaces to separate protocol sync and
the driver side adaptation.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00
Yiwei Zhang
5943f70c7a venus: switch to vn_ring as the protocol interface - part 2
Use instance ring as the primary ring of a logical device.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00
Yiwei Zhang
d28ebf7b99 venus: switch to vn_ring as the protocol interface - part 1
No functional change, just preparation for switching from instance
to ring as the interface to the venus protocol headers.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00
Yiwei Zhang
a0ef347a82 venus: add vn_ring_get_id and hide vn_ring internals entirely
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00
Yiwei Zhang
9e38c74139 venus: move the actual ring creation into ring as well
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00
Yiwei Zhang
c779fc9fb1 venus: move ring submission into ring
At first, no behavior change in this CL.

The instance level helper for normal command submission is left to work
with the current venus protocol. Meanwhile, we leave the helper to
submit recorded command buffer inside instance so it can later redirect
to the primary ring.

We've internalized a few ring helpers that no longer need to be exposed.
Besides, indirect submission decision is on per-ring basis since the
ring buffer can vary later.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00
Yiwei Zhang
9229c13a2c venus: move the rest ring belongings into ring
This change only moves the fields without changing the accessors. It's
better to let ring own its own upload cs encoder (which is backed by
shmem array) to avoid lock contention between indirect submissions
across rings.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00
Yiwei Zhang
d1e29b7557 venus: move ring shmem into vn_ring
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00
Yiwei Zhang
3e122014cf venus: relax ring mutex
Now we are able to break up the original lock to allow shmem alloc to be
outside the ring mutex, as long as the reply shmem set is still coupled
with ring submission.

Add and expose vn_instance_reply_shmem_alloc helper which will be used
by rings separately later.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00
Yiwei Zhang
b98d850efd venus: remove command_dropped tracking
The encoder must not be empty by then so switch to an assert. Failing to
get a reply shmem would end up with VK_ERROR_OUT_OF_HOST_MEMORY, thus
there's no need to track either.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00
Yiwei Zhang
90e64564b8 venus: make vn_renderer_shmem_pool thread-safe
This can be thread-safe only because we have dropped seeking command
stream offset, which requires comparing pool shmem to decide conditional
set stream.

This is to prepare for later sharing reply shmem pool across rings.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00
Yiwei Zhang
0273c9cc03 venus: always set reply command stream to avoid seek
More considerations and details here:
- The seek is a bit lighter than set, since it assumes renderer side
  resource being immutable. It does affect perf when Venus is still
  making verbose synchronous calls at runtime (e.g. descriptor set,
  buffer, device memory, etc).
- Seek still requires lock protection as the reply shmem must be
  immutable before the seek and the following cmd are committed to the
  ring.
- Removing seek without doing set requires renderer change to always
  bump the encoder end position according to what the original request
  is instead of being ad-hoc upon what the host driver tells to write.
  The overhead and extra complexity there isn't negligible.
- Further, removing seek requires each ring to track the prior reply
  pool shmem in the multi-ring scenario. Meanwhile, the additional host side
  resource lookup isn't costly as the number of resources is much less
  than the vk object table.
- The nice thing is that we can make the shmem pool thread-safe to be more
  easily shared across rings.

So we just drop it.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00
Yiwei Zhang
70e8d1397e venus: further cleanup vn_relax_init to take instance instead of ring
For multi-ring, later we can just check primary ring alive status.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00
Yiwei Zhang
f6adc60822 venus: refactor to add vn_watchdog
Summary:
- cleanup redundant report_period_us check post 1.0 release
- add vn_watchdog and its accessors
  - vn_watchdog_init
  - vn_watchdog_fini
  - vn_watchdog_acquire
  - vn_watchdog_release
  - vn_watchdog_timeout

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00
Yiwei Zhang
d8b059b01b venus: move ring monitor to instance for sharing across rings
Later we will base off just the primary ring alive status.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26179>
2023-12-08 04:06:37 +00:00