using a screen method for this is broken since the value can change
before it is flushed. it must be passed along with the methods that use it
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35866>
it's possible for multiple user semaphores to be signaled in one batch,
and these all have the same mechanics as wait semaphores, which means
they unfortunately need their own submit in order to preserve ownership
when resetting the batch state
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35866>
functionally this is the same as other types of timeline semaphores, but
it is not actually the same as other types of timeline semaphores, e.g.,
in vulkan it would be VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_D3D12_FENCE_BIT
whereas other types of timeline semaphores would have different handle types
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35866>
Before doing register allocation, use information available from
the SSA representation to determine register pressure and to
spill registers. This spilling doesn't have to be perfect (the
register allocator is still allowed to spill) but it will be
much faster to do the SSA spilling than RA spilling. In general
this should vastly improve the performance of register allocation.
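As a rough illustration of how SSA live ranges expose register pressure
before RA, here is a minimal sketch for straight-line code. The
`live_range` struct and the `max_pressure`/`needs_spill` helpers are
hypothetical names, not the ones used by this patch, and a real SSA
spiller also has to handle control flow and pick what to spill.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: each SSA value is defined once at instruction
 * `def` and is dead after instruction `last_use`. */
struct live_range {
   unsigned def;
   unsigned last_use;
};

/* Maximum number of values live across any instruction boundary. */
static unsigned
max_pressure(const struct live_range *ranges, size_t count,
             unsigned num_instrs)
{
   unsigned max = 0;

   assert(ranges != NULL || count == 0);

   for (unsigned ip = 0; ip < num_instrs; ip++) {
      unsigned live = 0;

      /* A value is live after `ip` if it is already defined and still
       * has a use later on. */
      for (size_t i = 0; i < count; i++) {
         if (ranges[i].def <= ip && ip < ranges[i].last_use)
            live++;
      }

      if (live > max)
         max = live;
   }

   return max;
}

/* Spilling is needed when the pressure exceeds the register budget. */
static bool
needs_spill(const struct live_range *ranges, size_t count,
            unsigned num_instrs, unsigned num_regs)
{
   return max_pressure(ranges, count, num_instrs) > num_regs;
}
```

Because every SSA value has a single definition, the pressure can be
computed this way without running the register allocator first.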
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34446>
Descriptor set layout lifetime can be shorter than what the
implementation requires. One example is :
* create descriptor set layout
* create graphics pipeline library
* destroy descriptor set layout
* link optimize library in a final pipeline
The last step might need the descriptor set layout information again.
We've so far worked around this by taking a reference on the
descriptor set layout in the pipelines. But we forgot that descriptor
set layouts have pointers to samplers (for immutable & embedded
samplers).
We could take a reference to samplers but that sucks for various
reasons :
- it consumes dynamic state heap space
- it could cause issues with capture-replay placement
So instead we copy the information from the samplers that might be
needed in cases like link optimization. This includes :
- ycbcr conversion state (used for NIR lowering)
- embedded sampler data (to recreate the sampler)
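The copy-instead-of-reference idea can be sketched as follows. All
struct and function names here are hypothetical stand-ins, and the
fields only approximate "ycbcr conversion state" and "embedded sampler
data"; this is not the driver's actual layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sampler object whose lifetime the application controls. */
struct sampler {
   bool has_ycbcr;
   uint32_t ycbcr_state;       /* stand-in for ycbcr conversion state */
   uint32_t embedded_data[4];  /* stand-in for embedded sampler data */
};

/* By-value copy kept in the layout: no pointer back to the sampler. */
struct binding_sampler_info {
   bool has_ycbcr;
   uint32_t ycbcr_state;
   uint32_t embedded_data[4];
};

static void
binding_copy_sampler(struct binding_sampler_info *info,
                     const struct sampler *sampler)
{
   assert(info != NULL && sampler != NULL);

   info->has_ycbcr = sampler->has_ycbcr;
   info->ycbcr_state = sampler->ycbcr_state;
   memcpy(info->embedded_data, sampler->embedded_data,
          sizeof(info->embedded_data));
}
```

After the copy, the layout no longer cares whether the sampler has been
destroyed, which is the property link optimization relies on.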
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35955>
Create a hashing key on all samplers so we can just copy that anywhere
we need it. That key already contains the needed parameters for
embedded samplers, so the sha1 stuff can go away.
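A minimal sketch of such a key, assuming a plain struct that can be
copied and compared bytewise. The field set below is hypothetical; the
real key contents are not spelled out in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key fields, chosen only for illustration. */
struct sampler_key {
   uint32_t mag_filter;
   uint32_t min_filter;
   uint32_t address_mode;
   uint32_t border_color;
};

static void
sampler_key_init(struct sampler_key *key, uint32_t mag_filter,
                 uint32_t min_filter, uint32_t address_mode,
                 uint32_t border_color)
{
   assert(key != NULL);

   /* Zero everything first so memcmp() never sees stale bytes. */
   memset(key, 0, sizeof(*key));
   key->mag_filter = mag_filter;
   key->min_filter = min_filter;
   key->address_mode = address_mode;
   key->border_color = border_color;
}

static bool
sampler_key_equal(const struct sampler_key *a, const struct sampler_key *b)
{
   return memcmp(a, b, sizeof(*a)) == 0;
}
```

Since the key is self-contained, a struct assignment is enough to carry
the sampler parameters wherever they are needed.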
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35955>
The executor build was failing randomly due to a missing dependency on
`idev_intel_dev`. This patch adds the required dependency to the
`meson.build` file to ensure consistent and reliable builds across
different configurations.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35928>
This is tech debt now that the NV proprietary driver is on the sw wsi
path, where rendering to the prime blit dst buffer may never be supported.
Later, when performance optimization is needed for venus on nv, we can
downgrade the sw wsi device workaround to a venus dri config, so that
setups with tiled explicit modifier support can be perf optimal.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35984>
The assert doesn't consider the multiple queue family case, where the
same blit cmd has to be recorded for each family, thus hitting the assert
for the same image and buffer.
Fixes: 5535184539 ("venus: track prime blit dst buffer memory in the wsi image")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35984>
pipe_fence_handle is a refcounted object, so it can't be owned by a
container which might have a different lifetime. It needs a dedicated
heap allocation so it can outlive its container.
Make sure that when we're handing out pipe_fence_handle references, we
add a ref to them before handing them out.
Instead of assuming that a fence_wait call is for the exact fence that we
returned from a given op, mirror what's done on graphics and
opportunistically scan the batches to see what's done, and reclaim
resources for them.
Use d3d12_fence helpers to replace a lot of duplicated code.
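The ownership rule described above (dedicated allocation, plus a ref
taken before a fence is handed out) could look roughly like this. The
`fence` and `batch` types and all helpers are hypothetical and are not
the d3d12 driver's actual code; a plain int is used for the refcount
here, where real code would want an atomic counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical refcounted fence with its own heap allocation. */
struct fence {
   int refcount;
   uint64_t value;
};

static struct fence *
fence_create(uint64_t value)
{
   struct fence *f = calloc(1, sizeof(*f));
   if (!f)
      return NULL;
   f->refcount = 1;
   f->value = value;
   return f;
}

static struct fence *
fence_ref(struct fence *f)
{
   assert(f->refcount > 0);
   f->refcount++;
   return f;
}

/* Returns true when the last reference was dropped and the fence freed. */
static bool
fence_unref(struct fence *f)
{
   assert(f->refcount > 0);
   if (--f->refcount > 0)
      return false;
   free(f);
   return true;
}

/* Hypothetical container: it holds one reference, it does not own the
 * storage of the fence. */
struct batch {
   struct fence *fence;
};

/* Hand out a new reference, so the caller's fence survives a reset. */
static struct fence *
batch_get_fence(struct batch *b)
{
   return fence_ref(b->fence);
}

static void
batch_reset(struct batch *b)
{
   fence_unref(b->fence);
   b->fence = NULL;
}
```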
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35900>
Native sync fences represent point-in-time (fence + value) and can have
CPU wait events. Timeline semaphores represent a full timeline, do not
have a CPU wait event, and can have their value updated dynamically.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35900>
This is quite unlikely to happen, but I guess it might be possible and
it's relatively simple to work around.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35884>
A bo with write usage should wait for the read and write fences. A bo
with read usage should wait for the write fence. Currently, written bos
are passed to the write list and read bos are passed to the read list.
This patch fixes the issue.
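The intended dependency rule can be written down as a small sketch. The
enum, struct and function names are hypothetical and do not correspond
to the winsys or kernel interface.

```c
#include <assert.h>
#include <stdbool.h>

enum bo_usage {
   BO_USAGE_READ = 1 << 0,
   BO_USAGE_WRITE = 1 << 1,
};

/* Which kinds of earlier fences a new access has to wait for. */
struct fence_waits {
   bool read_fences;
   bool write_fences;
};

static struct fence_waits
fences_to_wait_for(unsigned usage)
{
   struct fence_waits w = { false, false };

   assert((usage & ~(unsigned)(BO_USAGE_READ | BO_USAGE_WRITE)) == 0);

   if (usage & BO_USAGE_WRITE) {
      /* A writer must not overtake earlier readers or writers. */
      w.read_fences = true;
      w.write_fences = true;
   } else if (usage & BO_USAGE_READ) {
      /* A reader only has to wait for earlier writers. */
      w.write_fences = true;
   }

   return w;
}
```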
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35963>
This issue was generating unwanted write accesses that
could overwrite previous operations.
Note: This functionality could also be tested with
nir_lower_wrmasks. This problem seems to only affect
the ssbos.
This change was tested on cypress, barts and cayman. Here are the tests fixed:
khr-gl4[3-6]/compute_shader/pipeline-pre-vs: fail -> pass
khr-gl4[5-6]/direct_state_access/queries_functional: fail -> pass
khr-gl4[5-6]/es_31_compatibility/shader_image_load_store/advanced-cast-cs: fail -> pass
khr-gl4[5-6]/es_31_compatibility/shader_image_load_store/advanced-cast-fs: fail -> pass
khr-gl4[5-6]/es_31_compatibility/shader_storage_buffer_object/advanced-switchbuffers-cs: fail -> pass
khr-gl4[5-6]/es_31_compatibility/shader_storage_buffer_object/advanced-switchprograms-cs: fail -> pass
khr-gl4[5-6]/es_31_compatibility/shader_storage_buffer_object/basic-operations-case1-cs: fail -> pass
khr-gl4[3-6]/shader_storage_buffer_object/advanced-switchbuffers-cs: fail -> pass
khr-gl4[3-6]/shader_storage_buffer_object/advanced-switchprograms-cs: fail -> pass
khr-gl4[3-6]/shader_storage_buffer_object/basic-operations-case1-cs: fail -> pass
khr-gl4[4-6]/texture_buffer/texture_buffer_max_size: fail -> pass
khr-gles31/core/compute_shader/pipeline-pre-vs: fail -> pass
khr-gles31/core/shader_image_load_store/advanced-cast-cs: fail -> pass
khr-gles31/core/shader_image_load_store/advanced-cast-fs: fail -> pass
khr-gles31/core/shader_storage_buffer_object/advanced-switchbuffers-cs: fail -> pass
khr-gles31/core/shader_storage_buffer_object/advanced-switchprograms-cs: fail -> pass
khr-gles31/core/shader_storage_buffer_object/basic-operations-case1-cs: fail -> pass
khr-gles31/core/texture_buffer/texture_buffer_max_size: fail -> pass
khr-glesext/texture_buffer/texture_buffer_max_size: fail -> pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35830>
Now that we emit these nops at the beginning of block, we can merge them
with any existing nops.
Totals from 7747 (4.71% of 164575) affected shaders:
Instrs: 10458516 -> 10439473 (-0.18%)
CodeSize: 19276236 -> 19255126 (-0.11%)
NOPs: 2379189 -> 2360146 (-0.80%)
(ss)-stall: 932629 -> 932685 (+0.01%)
(sy)-stall: 3634623 -> 3635354 (+0.02%)
Cat0: 2610461 -> 2591418 (-0.73%)
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35934>
Emitting in the same block as the pred[tfe] caused helper_sched to
sometimes insert unnecessary (eq). For example:
block i:
...
prede
(eq)(rpt6)nop
block i+1:
(eq)nop
Emitting the quirk nops in the next block (i+1 in this case) prevents
this.
Note that the small number of shaders where NOPs regress are cases
where an extra (eq)nop is inserted in a block that doesn't contain any
other nops (but did contain the quirk nop before this change).
Totals from 3814 (2.32% of 164575) affected shaders:
Instrs: 6732543 -> 6732252 (-0.00%); split: -0.01%, +0.00%
CodeSize: 11978286 -> 11978086 (-0.00%); split: -0.00%, +0.00%
NOPs: 1683239 -> 1682948 (-0.02%); split: -0.02%, +0.01%
(ss)-stall: 635237 -> 634077 (-0.18%)
(sy)-stall: 2562027 -> 2533761 (-1.10%); split: -1.10%, +0.00%
Cat0: 1849898 -> 1849607 (-0.02%); split: -0.02%, +0.01%
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35934>
Some `sm8350-hdk` DUTs are currently failing LAVA health checks in the
Collabora farm, reducing available capacity. To mitigate job delays,
temporarily reduce the parallelism of the `a660-vk` job.
Thanks to previous optimizations and further increasing the
tests_per_group setting, there is no loss in test coverage.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35939>
The sm8350-hdk has 8 threads and 12 GB of RAM, which allows increasing
`FDO_CI_CONCURRENT` to 9 to speed up all its jobs.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35939>
Fix Venus crashing when running in KMS mode with a debug build of Mesa.
The previous patch missed adjusting the assert check so that it can
handle WSI/scanout images.
Fixes: 31a8218f5b78 ("venus: wsi workaround for gamescope")
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35958>
the semaphore stage is VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
so the src access barrier must also use this in order to ensure it happens
after the acquire
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35940>
The result of multiplying 32-bit integers is truncated before being
widened to the destination variable's size.
Reported by static analysis.
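A minimal illustration of this class of bug and the usual fix, assuming
a 32-bit `int`. The helper names and the stride/height operands are
hypothetical and are not taken from the patched code.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t
size_truncated(uint32_t stride, uint32_t height)
{
   /* Both operands are 32-bit, so the product wraps modulo 2^32 and
    * only then is converted to uint64_t. */
   return stride * height;
}

static uint64_t
size_widened(uint32_t stride, uint32_t height)
{
   /* Widen one operand first so the multiply is done in 64 bits. */
   return (uint64_t)stride * height;
}

static void
mul_overflow_example(void)
{
   /* 2^16 * 2^16 = 2^32, which does not fit in 32 bits. */
   assert(size_truncated(0x10000u, 0x10000u) == 0);
   assert(size_widened(0x10000u, 0x10000u) == UINT64_C(0x100000000));
}
```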
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35877>
The result of multiplying 32-bit integers is truncated before being
widened to the destination variable's size.
Reported by static analysis.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35877>