No reason just to do this for 1.0.
Foz-DB Navi48:
Totals from 44 (0.04% of 114655) affected shaders:
CodeSize: 111620 -> 111476 (-0.13%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40138>
This structure is handled instead of ignored, so the warning shouldn't
be printed.
Supress the warning when this structure is found.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40164>
We might need to DCE users of dead instructions removed by
process_block().
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 9e8ba10447 ("aco/vn: remove dead instructions early")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40091>
Without this libva won't be able to find the driver without
LIBVA_DRIVER_NAME trickery, because the driver has a generic name.
But in the DRI case, even LIBVA_DRIVER_NAME won't do, because the driver
name needs to end with "_drv_video.so", which it doesn't.
So let's instead set up symlinks in the build-dir, like DRIL does.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40136>
We're setting this in the non-DRI codepath, but this was missed when we
started embedding the VA driver into libgallium. This means we no longer
were able to use VA-API from meson devenv, like we could before.
Fixes: 212d57f7e6 ("targets/va: Build va driver into libgallium when building with dri")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40136>
this isn't C++ brw code, it's just a devinfo query.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40143>
This change is useful when the compute shader is called multiple
times with the atomic operations enabled. It fixes some data
coherency issues. This is done by moving
evergreen_emit_atomic_buffer_setup() after r600_flush_emit().
This change is also a partial fix for compute_shader.pipeline-compute-chain.
In this specific case, it makes the memory barrier working.
This change was tested on cayman and barts; it makes these tests
fully deterministic:
khr-gl4[2-6]/shader_atomic_counters/advanced-usage-many-dispatches: fail pass
khr-gles31/core/shader_atomic_counters/advanced-usage-many-dispatches: fail pass
deqp-gles31/functional/synchronization/inter_call/without_memory_barrier/atomic_counter_dispatch_.*_calls_.*_invocations: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40037>
The alpha instruction always wrote to the same rendertarget as the rgb and the
original target was ignored (surprisingly the HW docs explicitly allows rgb and
alpha to write to different targets). This makes tesseract rendering a bit
better, but there are still some remaining issues.
Fixes: 1c2c4ddbd1 ("r300g: copy the compiler from r300c")
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40128>
This commit adds support for VK_KHR_maintenance4 extension by
implementing the required function.
Makes the following tests to pass/be supported:
dEQP-VK.api.info.get_physical_device_properties2.features.maintenance4_features
dEQP-VK.api.info.vulkan1p3_limits_validation.khr_maintenance4
dEQP-VK.api.device_init.create_device_unsupported_features.maintenance4_features
dEQP-VK.memory.requirements.create_info.buffer.regular
dEQP-VK.memory.requirements.create_info.image.regular_tiling_linear
dEQP-VK.memory.requirements.create_info.image.regular_tiling_optimal
dEQP-VK.memory.requirements.create_info.image.transient_tiling_linear
dEQP-VK.memory.requirements.create_info.image.transient_tiling_optimal
Signed-off-by: Leon Perianu <leon.perianu@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39776>
With VK_KHR_maintenance4, the interface matching rules are relaxed to
allow emitted vs outputs > used fs inputs; unused I/O is typically
discarded during linking, but there are some cases with more complex
types that are currently missed, such as in
dEQP-VK.pipeline.monolithic.interface_matching.vector_length.out_ivec4_in_ivec3_member_of_array_of_structures_in_block_vert_out_frag_in
This change downgrades the assertion to a warning until the linker is
amended to handle these cases.
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39776>
When building with "-Dvideo-codecs=h264dec,h265dec,av1dec" va/encode.c
won't be built but it's still required because it's used from
picture.c
Fixes: c4f05bdf60 ("frontends/va: include picture_*.c based on selected codec")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39482>
When none of Vulkan, GL, rusticl and VA are enabled, with_gfx_compute is
false and HAVE_GFX_COMPUTE isn't defined.
This can then be used to disable parts of drivers.
For now it's not really useful, as the resulting build cannot do anything.
Later, a new option will allow disabling the VA features that require
shader support so we can build a minimal VA driver.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39482>
3 files include sid_tables.h so it means we had 3 copies of all
its static content. This removes ~260kB from libgallium and
libvulkan_radeon.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39482>
On GPUs with ETNA_FEATURE_S8, the hardware supports native 8bpp
stencil buffers. The blob driver samples these as R8I (8-bit integer).
This enables the stencil blit fallback to work with pure S8_UINT
stencil buffers, fixing
dEQP-GLES3.functional.fbo.blit.depth_stencil.stencil_index8_scale
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39936>
Previously, stencil-only blits were silently skipped with "cannot blit
stencil, skipping" because neither the BLT nor RS engines can
selectively copy individual channels from packed depth/stencil formats.
On HALTI5+ GPUs that support stencil texturing (S8X24_UINT), use
util_blitter_stencil_fallback() to perform a shader-based stencil blit.
This clears the destination stencil to zero, then copies each stencil
bit individually using draw calls with per-bit DSA write masks.
Fixes dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_stencil_only
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39936>
After a BLT blit invalidates the destination's tile status,
ETNA_DIRTY_DERIVE_TS was not set, so etna_update_ts_config() would not
run before the next draw. This caused TS_MEM_CONFIG to retain stale
DEPTH_FAST_CLEAR/DEPTH_COMPRESSION bits from the previous draw, even
though the destination depth data was overwritten by the blit.
This fixes depth/stencil blit tests like
dEQP-GLES3.functional.fbo.blit.depth_stencil.depth24_stencil8_basic.
The RS blit path already sets this dirty bit.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39936>
Adds missing FLUSH_CACHE event, and combines the flushing with existing
barrier emit to avoid duplicating flushes.
Fixes some flakeyness seen with UBWC images on gen8.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40144>
Avoid creating a dummy batch in the non-TC (async-flush) path, if we can
re-use last_fence.
This avoids extra flushes with rusticl, which is not using TC.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40144>
When we have a sequence of batches to flush (ie. some batch, and all
it's dependent batches), attach a fence to the last in sequence. This
helps avoid fd_context_flush() from creating an empty batch when it
needs to return a fence, but has nothing else left to flush.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40144>
It would matter more if we used CP_RUN_OPENCL, as CP_EXEC_CS programs
these regs from the pm4 pkt payload. But might as well at least program
correct values.
This at least makes it easier to compare cmdstream to closed cl driver,
which uses CP_RUN_OPENCL.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40144>
Currently pre-a6xx is missing ir3 support for various things that rusticl
requires, leading to segfaults.
For now, block compute-only contexts on older gens.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40144>