If you allow an unsupported component count in the callback for loads,
nir_opt_load_store_vectorize will align num_components to the next supported
vector size, essentially overfetching.
This changes all callbacks to reject it. AMD will enable it in a later commit.
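The rounding behaviour described above can be sketched as follows. This is a minimal illustration, not the pass's actual code: the helper names are hypothetical, and the set of supported sizes (1 to 5, 8 and 16 components) is an assumption based on NIR's usual vector widths.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: NIR vectors may have 1-5, 8 or 16 components. */
static bool
is_supported_num_components(unsigned n)
{
   return (n >= 1 && n <= 5) || n == 8 || n == 16;
}

/* Hypothetical helper: round an unsupported component count up to the
 * next supported vector size. The extra components are the overfetch. */
static unsigned
round_up_num_components(unsigned n)
{
   assert(n >= 1 && n <= 16);
   while (!is_supported_num_components(n))
      n++;
   return n;
}

/* Hypothetical callback policy: reject any load whose component count
 * would need rounding, so no overfetch can happen. */
static bool
accept_load(unsigned num_components)
{
   return is_supported_num_components(num_components);
}
```

With these assumptions, a 6-component load would be widened to 8 components if it were accepted, which is why a callback that wants to avoid overfetching has to reject it.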
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>
It will be used to allow merging loads with a hole between them.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>
Names used in libconfig's configuration files only allow alphanumerics,
underscores, dashes and asterisks. Freedreno device names, used as names
in fdperf.cfg, can also contain other characters, currently spaces and
plus characters. Not accounting for those makes it impossible to store
fdperf configuration across separate runs.
Once the Freedreno device name is retrieved, it's now sanitized for use
in fdperf.cfg. Unsupported characters are converted to underscores.
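A minimal sketch of that sanitization, assuming the name is a mutable NUL-terminated string. The function name is hypothetical and is not taken from fdperf:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: replace, in place, every character that libconfig
 * does not accept in a setting name (anything other than alphanumerics,
 * '_', '-' and '*') with an underscore. */
static void
sanitize_config_name(char *name)
{
   for (char *p = name; *p != '\0'; p++) {
      unsigned char c = (unsigned char)*p;
      if (!isalnum(c) && c != '_' && c != '-' && c != '*')
         *p = '_';
   }
}
```

For an illustrative input such as "Adreno 7c+ Gen 3", the spaces and the plus sign each become an underscore, giving "Adreno_7c__Gen_3".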
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31577>
Following the merge of Merge Request #31151, we encountered an issue
where the performance jobs were failing silently. Although these
failures did not cause the pipeline to fail, they resulted in warnings
for all merge requests that ran the .*-traces-performance jobs, putting
critical performance data for the [Mesa Performance Driver
dashboard](https://ci-stats-grafana.freedesktop.org/goto/G3xkvykHg?orgId=1)
at risk.
To resolve this issue, this commit updates the LAVA performance jobs to
utilize the Pyutils artifact package, which is now the only required
artifact for the jobs that run the LAVA job submitter.
Fixes: dd5d737e6c ("ci/lava: Use new pyutils container")
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31553>
We have a few different command streams we create at startup. Simplify
the initialization by creating a single sub_cs to allocate all of the
cs's out of and inlining structures where appropriate.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30544>
While the register is constant for all bins in the render pass, it is
not saved and restored with level 1 preemption with skipsaverestore=1 so
it needs to be restored. Follow what the blob does and set it before
each bin.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30544>
Add a USES_GMEM flag to indicate that GMEM is in use, so that preemption can
know it needs to save and restore GMEM contents.
The missing BIN_RENDER_END markers are also added; their purpose is to
clear the USES_GMEM flag once GMEM is no longer in use.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30544>
Use real names for most of a6xx_marker enum, add USES_GMEM, remove
overlapping bitfields.
Note that the actual "real names" start with PM4_RENDER_MODE_ instead of RM6_.
This is a small change to adreno_pm4.xml, with the corresponding
find/replace and updated CI references.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30544>
Clearing VK_FORMAT_R8G8_* with a fast-clear value and certain
dimensions (e.g. 960x540), and having a GMEM renderpass afterwards,
may lead to a GPU fault on A7XX.
The prop driver directly clears UBWC layers for R8G8_UNORM, and
doesn't use UBWC for R8G8_UINT. It uses the generic clear for R8G8 only
for renderpasses, where it doesn't cause issues in Turnip.
Fixes GPU fault in Limbo game running via Zink.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31258>
Quiet a big pile of:
TU: error: ../src/freedreno/vulkan/tu_knl_drm_virtio.cc:1299: could not get connect vdrm: No such file or directory (VK_ERROR_INCOMPATIBLE_DRIVER)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31243>
When processing the last instruction prior to the block terminator,
ir3_shader_folding can append a new instruction prior to the
terminator. In that case `current_instruction->next == new_instruction`
instead of `current_instruction->next == terminator`, which leads
to the assert in `foreach_instr` being hit. Use `foreach_instr_safe`
instead.
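The difference can be illustrated with a generic sketch of "safe" iteration, where the successor is cached before the loop body runs so that inserting after the current node does not disturb the walk. The struct and function names below are hypothetical and use a plain singly linked list rather than Mesa's list macros:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an instruction in a block's list. */
struct instr {
   int id;
   struct instr *next;
};

/* Insert new_instr right after pos. */
static void
insert_after(struct instr *pos, struct instr *new_instr)
{
   assert(pos != NULL && new_instr != NULL);
   new_instr->next = pos->next;
   pos->next = new_instr;
}

/* Visit every instruction before the terminator. While visiting the last
 * one, insert `extra` in front of the terminator. Because `next` was
 * cached before the body ran, the walk continues at the terminator and
 * never observes the changed cur->next. Returns the number of visits. */
static int
visit_safe(struct instr *head, struct instr *terminator, struct instr *extra)
{
   int visited = 0;
   for (struct instr *cur = head, *next = cur->next; cur != terminator;
        cur = next, next = cur ? cur->next : NULL) {
      visited++;
      if (cur->next == terminator)
         insert_after(cur, extra);
   }
   return visited;
}
```

A non-safe loop that re-reads `cur->next` after the body would instead step onto the newly inserted instruction, which is the situation the assert guards against.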
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31478>
The buffer with indirect args wasn't passed to the function which
adds extra event args. Since the function definition depends on the
common code, the definition is moved to a single place.
Fixes: 0a17035b5c ("u_trace: add support for indirect data")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31090>
A7XX supports buffer-to-image copies with a lower alignment requirement
for the pitch and start VA. This makes it unnecessary to loop over every
row and copy each one individually for any previously unaligned images. The
new alignment requirements match Vulkan requirements and should cover
all cases that aren't handled by 3D copies.
This can result in a significant performance improvement, up to 10x or
more in some cases.
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31401>
The existing scan/reduce macros (OPC_SCAN_MACRO/OPC_SCAN_CLUSTERS_MACRO)
hard code the reduction operations in ir3. Adding support for 64b
operations will blow up these already complicated macros. Implement a
simple scan loop in NIR for the few (hopefully rare) cases where the
generic passes cannot lower the reduction to 32b.
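As a rough CPU-side model of what such a scan loop computes, here is an inclusive 64-bit add scan over the lanes of a subgroup. This is only a sketch: the function name is hypothetical, the reduction operation is fixed to integer add for brevity, and the real pass emits the loop as NIR inside the shader rather than running anything like this on the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an inclusive scan: out[i] = in[0] + ... + in[i].
 * Each array element stands for the value held by one subgroup lane. */
static void
inclusive_scan_iadd64(const uint64_t *in, uint64_t *out, unsigned num_lanes)
{
   assert(in != NULL && out != NULL);
   uint64_t acc = 0;
   for (unsigned lane = 0; lane < num_lanes; lane++) {
      acc += in[lane];   /* full 64-bit accumulate, no 32-bit split */
      out[lane] = acc;
   }
}
```

The loop is linear in the number of lanes, which is acceptable if, as the message hopes, these cases are rare.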
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31455>
ir3_nir_lower_64b_intrinsics will blindly set the def bit size to 32 for
unknown intrinsics. Give the generic passes a chance to lower them
first.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31455>
vk_image.h has these guards, and any non-{Linux}/{BSD}
compile would hit this issue.
The alternative is just to remove the OS-specific guards
in vk_image.h, since the modifier is just a 64-bit opaque
number and theoretically can work on any OS, though the
non-Linux spec language is lacking.
Acked-by: Rob Clark <robdclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31453>
Newer versions of the blob don't seem to expose linear features for VK_FORMAT_D32_SFLOAT_S8_UINT,
but they advertise VK_FORMAT_FEATURE_2_STORAGE_IMAGE_ATOMIC_BIT for more formats now.
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31304>
Turnip supports VK_KHR_format_feature_flags2 but has been using a mixture of VK_FORMAT_FEATURE and
VK_FORMAT_FEATURE_2 flags. Always use the new 64-bit flags.
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31304>
Assign the feature bits for depth formats and VK_FORMAT_*_PACK16 earlier.
If we configure their optimalTilingFeatures before we copy those over to linearTilingFeatures,
we don't have to repeat them.
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31304>
Add and use a new helper function, tu_aspects_to_plane, that combines tu6_plane_index and tu6_plane_format.
This allowed for spotting and fixing a copy-paste mistake in tu6_blit_image, in dst_format for D32_S8.
The existing code wouldn't return the right dst_format if you blitted an S8 image to the stencil aspect
of a D32_S8 image, which should be a legal thing to do.
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31304>
Use existing helpers for deciding the VK format to treat our data as for memcpy-style blits.
No need to special case these a second time when it's already done in our helpers.
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31304>
Only call tu6_plane_format for VK_FORMAT_D32_SFLOAT_S8_UINT in tu_image_view_init.
vk_format is always a single plane format here but checking the aspect mask wasn't enough.
It was possible for e.g. R8_UNORM to not have a VK_IMAGE_ASPECT_COLOR_BIT aspect mask but a
PLANE aspect mask in formats like G8_B8_R8_3PLANE_420_UNORM.
This was masked by the default case in tu6_plane_format, which just returned vk_format_to_pipe_format
anyway without checking the plane index.
We need to fix this for when we switch tu6_plane_format to using vk_format_get_plane_format, where we
would otherwise trip an assert.
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31304>
Use ycbcr_info instead of checking the layout or the format directly.
Swap the order of the if statement for clarity.
These should make the code significantly easier to read.
Also document Chia-I's findings on SEPARATE_RECONSTRUCTION_FILTER_BIT.
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31304>
This prevents folding something like this:
add.u hrA, hrB, hrC
mov.u8u32 rD, hrA
When I wrote this I assumed that because the conversion source and ALU
destination were the same register, the types must have the same size.
That's not the case with u8, which is an 8-bit type in a 16-bit
register, so this could've been broken with 8-bit types.
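The kind of check this implies can be sketched as below. This is an assumption about the shape of the fix, not the actual ir3 code: the enum and function names are hypothetical, and the point is only that the decision compares type sizes rather than relying on both operands living in the same half register.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type enum; not ir3's real type definitions. */
enum sketch_type { SKETCH_U8, SKETCH_U16, SKETCH_U32 };

static unsigned
sketch_type_bits(enum sketch_type t)
{
   switch (t) {
   case SKETCH_U8:  return 8;
   case SKETCH_U16: return 16;
   case SKETCH_U32: return 32;
   }
   assert(!"unknown type");
   return 0;
}

/* Only fold the conversion into the ALU instruction when the ALU
 * destination type and the conversion source type have the same size.
 * Sharing a 16-bit register is not sufficient: u8 also lives there. */
static bool
can_fold_conversion(enum sketch_type alu_dst_type,
                    enum sketch_type conv_src_type)
{
   return sketch_type_bits(alu_dst_type) == sketch_type_bits(conv_src_type);
}
```

In the example above, the add writes a 16-bit value while the mov reads it as u8, so under this check the fold is rejected.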
Fixes: f58e1ef7ec ("tu: enable shaderInt8 support")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31399>