Commit graph

200317 commits

Author SHA1 Message Date
Mike Blumenkrantz
3f90303eeb lavapipe: move workgraph lowering up and delete pipeline param
Konstantin Seurer <konstantin.seurer@gmail.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32931>
2025-01-15 16:12:27 +00:00
Mike Blumenkrantz
4d1ed5d66d lavapipe: fix bitmask type for sampler updating
need 32bit to contain all the bits here

cc: mesa-stable

Konstantin Seurer <konstantin.seurer@gmail.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32931>
2025-01-15 16:12:27 +00:00
Mike Blumenkrantz
e2023474b4 lavapipe: split out bda descriptor function params from struct
no functional changes

Konstantin Seurer <konstantin.seurer@gmail.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32931>
2025-01-15 16:12:27 +00:00
Mike Blumenkrantz
596efeda33 lavapipe: split out sampler init from create
no functional changes

Konstantin Seurer <konstantin.seurer@gmail.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32931>
2025-01-15 16:12:27 +00:00
Mike Blumenkrantz
4db07aeb1c vk/sampler: split out sampler init from create
no functional changes

Konstantin Seurer <konstantin.seurer@gmail.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32931>
2025-01-15 16:12:27 +00:00
Mike Blumenkrantz
caf50d6723 lavapipe: stop storing texture handle for samplers
this is never used

Konstantin Seurer <konstantin.seurer@gmail.com>

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32931>
2025-01-15 16:12:27 +00:00
Nanley Chery
15e23f3781 anv: Limit slow clear heuristic to ACM and prior
It hasn't been tuned for Xe2.

Fixes: 052d7e1a9c ("anv: Slow clear if fast-clear cost is not mitigated")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33035>
2025-01-15 15:43:19 +00:00
Nanley Chery
caf007ff27 anv: Drop can_fast_clear_with_non_zero_color()
This got dropped during a rebase.

Fixes: 35f02d8f36 ("anv: Inline can_fast_clear_with_non_zero_color")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33035>
2025-01-15 15:43:19 +00:00
Lars-Ivar Hesselberg Simonsen
ee4460acf4 panvk: Fix descriptor decode
The expansion of DUMP_CL is missing parenthesis, making the dumping of
descriptors incorrect.

Fixes: 3b69edf825 ("pan/genxml: Enforce explicit packed types on pan_[un]pack")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33040>
2025-01-15 15:05:05 +00:00
Icenowy Zheng
b6c2ea4d99 zink: emit consts as uint only on IMG proprietary drivers
After the SPIR-V generator is optimized to generate multiple constant
types, the shader compiler of Imagination proprietary drivers can no
longer correctly handle these shaders and will bail out.

Handle this as a driver quirk and revert to the old behavior with only
uint constants when IMG proprietary drivers are detected.

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32342>
2025-01-15 14:35:59 +00:00
Boris Brezillon
7755c41b3e panvk/csf: Rework the occlusion query logic to avoid draw flushes
Right now, we have a problem when we flush draws inside a render pass
and we don't have enough information to re-emit the framebuffer/tiler
descriptors.

Turns out the only situations where this happens is when an occlusion
query end happens, but we shouldn't really flush the draws in that case.
What we should do instead is record the OQ in our command buffer, so we
can signal OQ availability when the fragment job is done.

In order to solve that, we add an OQ chain to the command buffer to
track OQs ending inside the render pass. We then walk this chain at
fragment job emission time to signal the syncobjs attached to each
query.

This also simplifies the whole occlusion query synchronization model:
instead of waiting for each syncobj individually, we now wait on
the iterators to make sure all OQs have landed. Thanks to this new
synchronization, we can batch OQ reset/copy operations and make the
command stream a lot shorter when big query ranges are copied/reset.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32973>
2025-01-15 14:07:37 +00:00
Boris Brezillon
ae0534c6cc panvk/csf: Use cs_sr_reg64() instead of cs_reg64() when setting the OQ pointer
We have wrappers distinguishing staging registers from sratch registers,
so let's use cs_sr_reg64() here.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32973>
2025-01-15 14:07:37 +00:00
Boris Brezillon
cc517822e5 panvk/csf: Make all sync operations on the CSG scope
The SYSTEM scope triggers CPU interrupts we don't really need, so let's
use the CSG scope to avoid those. Note that the scope doesn't encode
the visibility aspect, meaning changes to the sync object with a CSG
scope will still be instantly visible to the CPU, it's just that the
CPU needs to poll the value to detect a change, which is basically what
we're doing for syncobjs attached to events/queries, so we're good.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32973>
2025-01-15 14:07:37 +00:00
Boris Brezillon
6a7bcff1be pan/cs: cs_{break,continue} are not for_each macros
Let's prevent clang-format from adding the semi-colon on a new line when
we use cs_{continue,break}();

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32973>
2025-01-15 14:07:37 +00:00
Boris Brezillon
622187974f pan/cs: Allow undefined value if condition=always in cs_branch_label()
We already do that in the other cs_emit(b, BRANCH, I), so let's fix this
path too.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32973>
2025-01-15 14:07:37 +00:00
Boris Brezillon
e8514fb4c4 pan/cs: Fix the tracepoint register dump loops
The increment was wrong, which ended up generating a lot more stores
than we need.

Fixes: bf05842a8d ("pan/cs: Add an event-based tracing mechanism")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32973>
2025-01-15 14:07:37 +00:00
Erik Faye-Lund
ebf9dae2e9 docs/features: fixup panvk KHR_shader_draw_parameters-support
This was enabled on Bifrost as well.

Fixes: 963e9feb8a ("panvk: enable shaderDrawParameters")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33032>
2025-01-15 14:00:00 +00:00
Karmjit Mahil
49bdd4bdc0 tu: Initialize tu_tiling_config even when tiling isn't possible
Also avoid calculations required for setting up `tu_tiling_config`
if tiling isn't possible.

Fixes valgrind issue in:
dEQP-VK.draw.renderpass.shader_layer.vertex_shader_256

Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32968>
2025-01-15 13:29:35 +00:00
Karmjit Mahil
8652516ac4 tu: Fix leaking of some descriptor sets
Descriptor sets which have `size` of `0`, such as a descriptor set
with just dynamic descriptors, weren't being freed in
`vkDestroyDescriptorPool()` since that relies on keeping track of
descriptor sets in the `entries` list. Keep track of them in the
`entries` list.

Fixes a memory leak in:
dEQP-VK.binding_model.descriptor_copy.compute.uniform_buffer_dynamic_5

Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32968>
2025-01-15 13:29:35 +00:00
Karmjit Mahil
0dd06c74d6 tu: Fix FDM patchpoint memory leak
We can disable FDM in the renderpass based on
`tu_render_pass_disable_fdm()` however a pipeline could have
been bound before starting the renderpass which had
`...RENDERING_FRAGMENT_DENSITY_MAP_ATTACHMENT_BIT_EXT` set, in
which case the shader is compiled with FDM support. We still
need to apply the patchpoints. Previously `patchpoints_ctx` was
created only based on whether FDM was enabled in the renderpass,
which was leading to the patchpoints being allocated with no
context so they were never getting freed. Now setup `patchpoint_ctx`
regardless of FDM being disabled or not.

Fixes memory leaks in some tests from:
dEQP-VK.dynamic_rendering.*_cmd_buff.fragment_density_map.*
e.g.
dEQP-VK.dynamic_rendering.partial_secondary_cmd_buff
  .fragment_density_map.2_views .render.divisible_density_size
  .1_sample.static_subsampled_1_2

Fixes: 05f96dd00f ("tu: Add core FDM patchpoint infrastructure")
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32968>
2025-01-15 13:29:35 +00:00
Karmjit Mahil
cfc09517b6 tu: Fix clear_values leak
When no attachments are used in a renderpass we should ignore the
clear values. It seems to be valid applications to still pass clear
values in such a case.

Fixes memory leaks in some tests from:
dEQP-VK.image.texel_view_compatible.graphic.basic.*d_image.texture_read.*
dEQP-VK.image.texel_view_compatible.graphic.extended.*d_image.texture_read.*

Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32968>
2025-01-15 13:29:35 +00:00
David Rosca
a1af33775e frontends/va: Set csc matrix in PutSurface
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11889
Acked-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32890>
2025-01-15 12:48:53 +00:00
Boris Brezillon
8891c2aeba panfrost: Fix instanced draws when attributes have a non-zero divisor
On Bifrost/Midgard, when an attribute has a non-zero divisors, the
attribute offset is tweaked to take the base_instance into account,
which implies we have to re-emit the attributes if the base instance
value changed.

Let's not bother tracking the last base instance and re-emit
unconditionally in that case, which is still better than what we had
before 3db963a135 ("panfrost: Emit attribs in
panfrost_update_state_3d() on bifrost/midgard") and fixes the regression
introduced by this commit.

Fixes: 3db963a135 ("panfrost: Emit attribs in panfrost_update_state_3d() on bifrost/midgard")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Tested-By: Chris Healy <healych@amazon.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33010>
2025-01-15 12:19:22 +00:00
Icenowy Zheng
1c59793d2d meson: prefer 'python3' to 'python' when finding python3
Although in some cases python 3.x will come without "3" as suffix,
"python3" should still be preferred because sometimes "python" is python
2.x instead.

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Eric Engestrom <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33025>
2025-01-15 11:24:41 +00:00
Lucas Stach
c3662501a7 etnaviv: only emit used PA_SHADER_ATTRIBUTES states
It's quite wasteful on the command stream space to always emit all of the
PA_SHADER_ATTRIBUTES states, as many shaders use far less varyings than
the absolute max number. Limit the emission to actually used attributes.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32991>
2025-01-15 10:52:46 +00:00
Lucas Stach
65076f3ab0 etnaviv: fix flatshading on halti5 GPUs
Halti5 has no interpolation setting which will take into account the
rasterizer flatshade state. Force the interpolation qualifier to flatshade
for the color varyings when API level flatshade is enabled. Only set the
new bit in the shader key on halti5 GPUs to avoid generating unnecessary
shader variants on GPUs where the rasterizer flatshade state is properly
applied to the varyings.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32991>
2025-01-15 10:52:46 +00:00
Lucas Stach
41bd7aa9c8 etnaviv: emit varying interpolation state on halti5
Pull the interpolation qualifiers from NIR and translate them to
HALTI5_SHADER_ATTRIBUTES states.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32991>
2025-01-15 10:52:46 +00:00
Lucas Stach
89b2229c0d etnaviv: memcpy varying setup from stack
Use memcpy to transfer the varying setup from stack arrays into the
compiled shader state. Resulting code is less verbose and the
compiler is smart enough to optimize away the function call anyway.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32991>
2025-01-15 10:52:46 +00:00
Timothy Arceri
6bca0cc3d9 glsl: drop opt_dead_code_local
This does nothing useful anymore as we now convert to nir at compile
time which will handle all this for us.

shader-db radeonsi:

TOTALS FROM AFFECTED SHADERS (63/168079)
  SGPRS: 3248.00 -> 3208.00 (-1.23 %)
  VGPRS: 2224.00 -> 2228.00 (0.18 %)
  Spilled SGPRs: 0.00 -> 0.00 (0.00 %)
  Spilled VGPRs: 0.00 -> 0.00 (0.00 %)
  Private memory VGPRs: 0.00 -> 0.00 (0.00 %)
  Scratch size: 0.00 -> 0.00 (0.00 %) dwords per thread
  Code Size: 138484.00 -> 138068.00 (-0.30 %) bytes
  Max Waves: 877.00 -> 877.00 (0.00 %)
  Outputs: 0.00 -> 0.00 (0.00 %)
  Patch Outputs: 0.00 -> 0.00 (0.00 %)

shader-db Iris (BDW):

total instructions in shared programs: 17805897 -> 17805917 (<.01%)
instructions in affected programs: 1240 -> 1260 (1.61%)
helped: 0
HURT: 8
HURT stats (abs)   min: 1 max: 4 x̄: 2.50 x̃: 2
HURT stats (rel)   min: 0.39% max: 7.14% x̄: 4.26% x̃: 4.06%
95% mean confidence interval for instructions value: 1.61 3.39
95% mean confidence interval for instructions %-change: 2.01% 6.51%
Instructions are HURT.

total cycles in shared programs: 856868505 -> 856876266 (<.01%)
cycles in affected programs: 2879959 -> 2887720 (0.27%)
helped: 79
HURT: 100
helped stats (abs) min: 1 max: 742 x̄: 61.96 x̃: 12
helped stats (rel) min: <.01% max: 41.84% x̄: 1.17% x̃: 0.20%
HURT stats (abs)   min: 1 max: 1231 x̄: 126.56 x̃: 14
HURT stats (rel)   min: <.01% max: 33.98% x̄: 3.32% x̃: 0.30%
95% mean confidence interval for cycles value: 7.37 79.35
95% mean confidence interval for cycles %-change: 0.29% 2.38%
Cycles are HURT.

LOST:   1
GAINED: 4

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33008>
2025-01-15 02:01:09 +00:00
Caio Oliveira
5bd9693578 docs: Update syntax on Performance tips page
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33020>
2025-01-14 15:31:05 -08:00
Caleb Callaway
1b13f59597 docs: Intel GPU performance tips
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33019>
2025-01-14 22:27:21 +00:00
Bo Hu
26ce3b0ba1 remove the mReconstructionMutex in load
During loading of snapshot, there will be a single-threaded
decoder that aquires the same mReconstructionMutex, repeatedly.
Since the mReconstructionMutex is intentionally changed to
be non-recursive, we should not aquire it at the beginning of
load call; otherwise, we will be deadlock the decoder thread.

Reviewed-by: Marcin Radomski <dextero@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33018>
2025-01-14 19:33:13 +00:00
Bo Hu
2f3c3459a8 update decoder.py to clean up un-used ApiCallInfo
It is normal for vk decoder to consume nothing from the stream,
as it could be either gl or render control commands.

Reviewed-by: Marcin Radomski <dextero@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33018>
2025-01-14 19:33:13 +00:00
Serdar Kocdemir
36ca17cc29 gfxstream: add VK_DRIVER_FILES to devenv
Reviewed-by: Marcin Radomski <dextero@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33018>
2025-01-14 19:33:13 +00:00
Jason Macnak
bf3cdf286c Update VkDecoderSnapshot locking
Replace tryLock / unlock with regular scoped lock now that the
"extra handles" have been moved out of VkReconstruction and into
the VkSnapshotApiCallInfo.

Switch to regular std::mutex and std::lock_guard.

Annotate mReconstruction with GUARDED_BY to start to get more
thread safety analysis.

Reviewed-by: Marcin Radomski <dextero@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33018>
2025-01-14 19:33:13 +00:00
Jason Macnak
a07eb2cef0 Pass VkSnapshotApiCallInfo-s through VkDecoderGlobalState
... so that `VkDecoderGlobalState` can append additional information
needed for snapshotting. Specifically, `VkDecoderGlobalState` may create
additional boxed handles that are not visible directly in the API surface.
For example, `vkCreateDevice()` creates boxed handles for the `VkQueue`-s
and `vkCreateDescriptorPool()` creates boxed handles for pre-allocated
`VkDescriptorSets`. These boxed handles are not recoverable from the API
for `vkCreateDevice()` nor `vkCreateDescriptorPool()` directly. This was
previously worked around by just sticking the extra boxed handles in
`VkReconstruction::mExtraHandlesForNextApi` but this is not thread safe.
Instead, let's give `VkDecoderGlobalState` and `VkDecoderSnapshot` exclusive
access to individual `VkSnapshotApiCallInfo` objects.

Reviewed-by: Marcin Radomski <dextero@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33018>
2025-01-14 19:33:13 +00:00
Jason Macnak
3c03bae20c Simplify ApiInfo
Removes the additional saving of opcode and packet len as this is
all available from within the packet itself.

Removes the "trace" position from `VulkanMemReadingStream` as these
seemed to only be used for getting the packet start and packet size
but these are already available.

Reviewed-by: Marcin Radomski <dextero@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33018>
2025-01-14 19:33:13 +00:00
Bo Hu
e6fd2e1613 gfxstream-guest: update offset to correct value
In CoherentMemory::subAllocate function, we should
also update offset to the correct values.

Test: boot with skiavk enabled and the textures
should not be corrupted

Reviewed-by: Marcin Radomski <dextero@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33018>
2025-01-14 19:33:13 +00:00
sergiuferentz
82317d6d24 Use try_unbox in VkDescriptorBufferInfo
* We are currently crashing the emulator when binding to
  DescriptorBindings that have been deleted. This will WARN without
  crashing.
* A side effect of this is that it will enable a wider interaction with
  VulkanBatchUpdateDescriptorSet feature as it will not immediately
  crash if it interacts with something that was removed.
* https://registry.khronos.org/vulkan/specs/latest/man/html/vkUpdateDescriptorSetWithTemplate.html

Reviewed-by: Marcin Radomski <dextero@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33018>
2025-01-14 19:33:13 +00:00
Serdar Kocdemir
9603450ea4 The BumpPool of VkStream is not freeAll'ed
Original change from: kyoungwon.kim@bytedance.com at aosp/3310239
Moving the change into gfxstream and codegen.

The issue was found by pengzejie@bytedance.com.

Note that vkReadStream's BumpPool is effectively `freeAll`'ed by
`clearPool` calls. The same call for vkStream is not being called
while alloc is called here and there.

Reviewed-by: Marcin Radomski <dextero@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33018>
2025-01-14 19:33:13 +00:00
Serdar Kocdemir
456654f6ad Wrap queue related functions on codegen
For multiple queue emulation, we need to change how queue related
functions are working on the host side and do custom unboxing
before submitting the commands to the underlying driver.

Reviewed-by: Marcin Radomski <dextero@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33018>
2025-01-14 19:33:13 +00:00
Serdar Kocdemir
b8f38956a1 Change C style cast on extension structs
As per go/cstyle#Casting for readability analysis.

Reviewed-by: Marcin Radomski <dextero@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33018>
2025-01-14 19:33:13 +00:00
Danylo Piliaiev
2875332df5 tu: Allocate consts for driver params as early as possible
Indirect draw calls upload VS params themselves, if indirect draw call
has zero instances GPU seem to still try to upload consts, however
with zer instances it doesn't apply draw state. Without the draw state
the the HLSQ_VS_CNTL values is stale, so less constants may be specified
than draw call expects. It is found that if CP_DRAW_INDIRECT_MULTI_1_DST_OFF
is less than 0x3f - GPU is happy even if the constlen is less than that.

As a workaround we allocate driver params first and ensure that VS
constlen always has the minimum size which is enough to upload driver
params.

Fixes one of the GPU hangs in "Disney Epic Mickey: Rebrushed" on a750.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32140>
2025-01-14 19:01:50 +00:00
Danylo Piliaiev
3e5d4d50c5 ir3: Use generic const alloc for everything and call it once
With all consts going through generic allocations it's now possible
to call ir3_setup_const_state once, and have lowerings that dynamically
lower things to consts just to update the max consts being used.

The only exception for now are immediates, since they eat up the space
that was left and allocated much later.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32140>
2025-01-14 19:01:50 +00:00
Danylo Piliaiev
cf73f89ba0 tu,ir3: Make push consts be able to start from higher than c0.x offsets
Will be needed in the follow up commit.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32140>
2025-01-14 19:01:50 +00:00
Danylo Piliaiev
9be269ef25 ir3: Use generic consts alloc for driver params
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32140>
2025-01-14 19:01:50 +00:00
Danylo Piliaiev
922ef8e720 ir3: Make allocation of consts more generic and order independent
The order of allocation was backed into ir3_setup_const_state and
some other parts of ir3, which is rather brittle.

And don't assume offsets for consts in other part of code, their order
and offset calculation is not guaranteed.

This also potentially fixes indirect UBO effect on constlen size.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32140>
2025-01-14 19:01:50 +00:00
Samuel Pitoiset
fc56823cf0 radv: change the BASE_HI field for VGT_TF_MEMORY_BASE_HI on GFX12
It's similar but less confusing.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33015>
2025-01-14 17:13:23 +00:00
Alyssa Rosenzweig
401b400de3 nir,asahi,hk: add barrier argument to MESA_DISPATCH_PRECOMP
In the current API, precomp implicitly assumes full barriers both before & after
every dispatch. That's not good for performance. However, dropping the barriers
and requiring user to explicitly call barrier functions before/after would have
bad ergonomics.

So, we add a new parameter to the standard MESA_DISPATCH_PRECOMP signature
representing the barriers required around the dispatch. As usual, the actual
type & semantic is left to drivers to define what makes sense for their
hardware. We just reserve the place for it. (I think most drivers will want
bitflags here, but I don't think the actual flags are worth. If a driver wanted
to use a struct here, that would work too.)

Since the asahi stack doesn't do anything clever with barriers yet, we
mechnically add an AGX_BARRIER_ALL barrier to all precomp users in-tree. We can
optimize that later, this just gets the flag-day change in with no functional
change.

For JM panfrost, this will provide a convenient place to stash both their "job
barrier" bit and their "suppress prefetch" bit (which is really a sort of
barrier / cache flush, if you think about it).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32980>
2025-01-14 16:39:57 +00:00
Alyssa Rosenzweig
4955a68a03 libagx: add agx_barrier enum
trivial for now, but eventually this will describe interesting things :-)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32980>
2025-01-14 16:39:56 +00:00