Extract VkDeviceQueueShaderCoreControlCreateInfoARM from the queue
create info pNext chain and use shaderCoreCount to limit
max_compute_cores and max_fragment_cores in the panthor group create
ioctl. The core masks remain unchanged, letting the kernel pick which
cores to schedule on.
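
As a rough illustration of the lookup described above, the following sketch walks a pNext-style chain and clamps a core count. All `sketch_*` names and the enum values are hypothetical stand-ins, not the real Vulkan or panvk definitions, and clamping to the hardware core count is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for Vulkan structure types; the real values
 * come from the Vulkan headers. */
enum sketch_stype {
   SKETCH_STYPE_OTHER = 0,
   SKETCH_STYPE_SHADER_CORE_CONTROL = 1,
};

/* Common header shared by every structure in the chain. */
struct sketch_base_in {
   enum sketch_stype sType;
   const struct sketch_base_in *pNext;
};

/* Stand-in for VkDeviceQueueShaderCoreControlCreateInfoARM. */
struct sketch_shader_core_control {
   enum sketch_stype sType;
   const struct sketch_base_in *pNext;
   uint32_t shaderCoreCount;
};

/* Walk the chain and return the first structure of the wanted type. */
static const void *
sketch_find_in_chain(const void *chain, enum sketch_stype wanted)
{
   for (const struct sketch_base_in *s = chain; s != NULL; s = s->pNext) {
      if (s->sType == wanted)
         return s;
   }
   return NULL;
}

/* Return the core limit to pass on: the requested count if the structure
 * is present, never more than the hardware count (assumption). */
static uint32_t
sketch_limit_cores(const void *queue_pnext, uint32_t hw_core_count)
{
   const struct sketch_shader_core_control *ctl =
      sketch_find_in_chain(queue_pnext, SKETCH_STYPE_SHADER_CORE_CONTROL);

   if (ctl == NULL)
      return hw_core_count;

   return ctl->shaderCoreCount < hw_core_count ? ctl->shaderCoreCount
                                               : hw_core_count;
}
```

In this sketch the same limit would be used for both the compute and fragment core maximums, while the core masks are left alone.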
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40063>
Coverity notes that we break out of the loop walking the analysis_ranges
early if n_push_ranges >= max_push_buffers, so n_push_ranges could
already be 3 or 4 (depending on whether we're doing mesh). If we then
need the padding range, we insert another one, which would write past
the end of the array.
I don't think this is actually possible in practice, but we can add an
assert to both keep Coverity happy and detect it if it ever does
happen.
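
A minimal sketch of the kind of guard described above. The structure layout, the array size of 4, and the `sketch_*` names are assumptions for illustration and do not mirror the actual driver code.

```c
#include <assert.h>

/* Hypothetical array capacity, chosen only for this sketch. */
#define SKETCH_MAX_PUSH_RANGES 4

struct sketch_push_range {
   unsigned start;
   unsigned length;
};

struct sketch_push_state {
   struct sketch_push_range ranges[SKETCH_MAX_PUSH_RANGES];
   unsigned n_push_ranges;
};

/* Append the padding range, asserting that a free slot exists first. */
static void
sketch_add_padding_range(struct sketch_push_state *st,
                         unsigned start, unsigned length)
{
   /* States the invariant for static analysis and traps in debug
    * builds if the array is ever already full. */
   assert(st->n_push_ranges < SKETCH_MAX_PUSH_RANGES);

   st->ranges[st->n_push_ranges].start = start;
   st->ranges[st->n_push_ranges].length = length;
   st->n_push_ranges++;
}
```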
CID: 1681478
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40147>
There's no reason to do this only for 1.0.
Foz-DB Navi48:
Totals from 44 (0.04% of 114655) affected shaders:
CodeSize: 111620 -> 111476 (-0.13%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>
This structure is handled instead of ignored, so the warning shouldn't
be printed.
Suppress the warning when this structure is found.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40164>
We might need to DCE users of dead instructions removed by
process_block().
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 9e8ba10447 ("aco/vn: remove dead instructions early")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40091>
Without this libva won't be able to find the driver without
LIBVA_DRIVER_NAME trickery, because the driver has a generic name.
But in the DRI case, even LIBVA_DRIVER_NAME won't do, because the driver
name needs to end with "_drv_video.so", which it doesn't.
So let's instead set up symlinks in the build-dir, like DRIL does.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40136>
We're setting this in the non-DRI codepath, but this was missed when we
started embedding the VA driver into libgallium. This means we were no
longer able to use VA-API from meson devenv, like we could before.
Fixes: 212d57f7e6 ("targets/va: Build va driver into libgallium when building with dri")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40136>
This isn't C++ brw code, it's just a devinfo query.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40143>
This change is useful when the compute shader is called multiple
times with the atomic operations enabled. It fixes some data
coherency issues. This is done by moving
evergreen_emit_atomic_buffer_setup() after r600_flush_emit().
This change is also a partial fix for compute_shader.pipeline-compute-chain.
In this specific case, it makes the memory barrier work.
This change was tested on cayman and barts; it makes these tests
fully deterministic:
khr-gl4[2-6]/shader_atomic_counters/advanced-usage-many-dispatches: fail -> pass
khr-gles31/core/shader_atomic_counters/advanced-usage-many-dispatches: fail -> pass
deqp-gles31/functional/synchronization/inter_call/without_memory_barrier/atomic_counter_dispatch_.*_calls_.*_invocations: fail -> pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40037>
The alpha instruction always wrote to the same rendertarget as the rgb and the
original target was ignored (surprisingly, the HW docs explicitly allow rgb and
alpha to write to different targets). This makes tesseract rendering a bit
better, but there are still some remaining issues.
Fixes: 1c2c4ddbd1 ("r300g: copy the compiler from r300c")
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40128>
This commit adds support for VK_KHR_maintenance4 extension by
implementing the required function.
Makes the following tests pass/be supported:
dEQP-VK.api.info.get_physical_device_properties2.features.maintenance4_features
dEQP-VK.api.info.vulkan1p3_limits_validation.khr_maintenance4
dEQP-VK.api.device_init.create_device_unsupported_features.maintenance4_features
dEQP-VK.memory.requirements.create_info.buffer.regular
dEQP-VK.memory.requirements.create_info.image.regular_tiling_linear
dEQP-VK.memory.requirements.create_info.image.regular_tiling_optimal
dEQP-VK.memory.requirements.create_info.image.transient_tiling_linear
dEQP-VK.memory.requirements.create_info.image.transient_tiling_optimal
Signed-off-by: Leon Perianu <leon.perianu@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39776>
With VK_KHR_maintenance4, the interface matching rules are relaxed to
allow more emitted VS outputs than used FS inputs; unused I/O is typically
discarded during linking, but there are some cases with more complex
types that are currently missed, such as in
dEQP-VK.pipeline.monolithic.interface_matching.vector_length.out_ivec4_in_ivec3_member_of_array_of_structures_in_block_vert_out_frag_in
This change downgrades the assertion to a warning until the linker is
amended to handle these cases.
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39776>
When building with "-Dvideo-codecs=h264dec,h265dec,av1dec", va/encode.c
won't be built, but it's still required because it's used from
picture.c.
Fixes: c4f05bdf60 ("frontends/va: include picture_*.c based on selected codec")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39482>
When none of Vulkan, GL, rusticl and VA are enabled, with_gfx_compute is
false and HAVE_GFX_COMPUTE isn't defined.
This can then be used to disable parts of drivers.
For now it's not really useful, as the resulting build cannot do anything.
Later, a new option will allow disabling the VA features that require
shader support so we can build a minimal VA driver.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39482>
Three files include sid_tables.h, which means we had three copies of
all its static content. This removes ~260kB from libgallium and
libvulkan_radeon.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39482>
On GPUs with ETNA_FEATURE_S8, the hardware supports native 8bpp
stencil buffers. The blob driver samples these as R8I (8-bit integer).
This enables the stencil blit fallback to work with pure S8_UINT
stencil buffers, fixing
dEQP-GLES3.functional.fbo.blit.depth_stencil.stencil_index8_scale
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39936>
Previously, stencil-only blits were silently skipped with "cannot blit
stencil, skipping" because neither the BLT nor RS engines can
selectively copy individual channels from packed depth/stencil formats.
On HALTI5+ GPUs that support stencil texturing (S8X24_UINT), use
util_blitter_stencil_fallback() to perform a shader-based stencil blit.
This clears the destination stencil to zero, then copies each stencil
bit individually using draw calls with per-bit DSA write masks.
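
The per-bit copy can be modelled on the CPU roughly as follows. This is a simplified sketch, not the util_blitter implementation: the function name is hypothetical, and modelling each draw as "discard where the source bit is clear, otherwise write 0xff through the mask" is an assumption about how the passes behave.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CPU model of a stencil copy done one bit at a time. */
static void
sketch_stencil_copy_per_bit(uint8_t *dst, const uint8_t *src, size_t count)
{
   /* Step 1: clear the destination stencil to zero. */
   for (size_t i = 0; i < count; i++)
      dst[i] = 0;

   /* Step 2: one pass per stencil bit, each with its own write mask. */
   for (unsigned bit = 0; bit < 8; bit++) {
      const uint8_t writemask = (uint8_t)(1u << bit);

      for (size_t i = 0; i < count; i++) {
         /* Pixels whose source bit is clear are skipped ("discarded"). */
         if (!(src[i] & writemask))
            continue;

         /* Surviving pixels write 0xff, but the write mask restricts
          * the update to the single bit handled by this pass. */
         dst[i] = (uint8_t)((dst[i] & (uint8_t)~writemask) |
                            (0xffu & writemask));
      }
   }
}
```

After all eight passes the destination holds the same stencil values as the source, at the cost of eight draws per blit.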
Fixes dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_stencil_only
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39936>
After a BLT blit invalidates the destination's tile status,
ETNA_DIRTY_DERIVE_TS was not set, so etna_update_ts_config() would not
run before the next draw. This caused TS_MEM_CONFIG to retain stale
DEPTH_FAST_CLEAR/DEPTH_COMPRESSION bits from the previous draw, even
though the destination depth data was overwritten by the blit.
This fixes depth/stencil blit tests like
dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_basic.
The RS blit path already sets this dirty bit.
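
The stale-state problem can be illustrated with a small dirty-flag model. The names, bit values, and structure below are hypothetical and only mimic the pattern; they are not the etnaviv definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for this sketch only. */
#define SKETCH_DIRTY_DERIVE_TS     (1u << 0)
#define SKETCH_TS_DEPTH_FAST_CLEAR (1u << 0)

struct sketch_ctx {
   uint32_t dirty;         /* pending state re-derivations */
   bool dst_ts_valid;      /* tile status of the depth buffer is valid */
   uint32_t ts_mem_config; /* derived register value used by draws */
};

/* A blit overwrites the destination and invalidates its tile status. */
static void
sketch_blit(struct sketch_ctx *ctx)
{
   ctx->dst_ts_valid = false;
   /* The fix: request that the derived TS state be recomputed. */
   ctx->dirty |= SKETCH_DIRTY_DERIVE_TS;
}

/* A draw re-derives the TS config only when the dirty bit is set. */
static void
sketch_draw(struct sketch_ctx *ctx)
{
   if (ctx->dirty & SKETCH_DIRTY_DERIVE_TS) {
      ctx->ts_mem_config =
         ctx->dst_ts_valid ? SKETCH_TS_DEPTH_FAST_CLEAR : 0;
      ctx->dirty &= ~SKETCH_DIRTY_DERIVE_TS;
   }
}
```

Without the dirty bit set in the blit, the draw in this model would keep the fast-clear bit from the previous draw.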
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39936>