We'll need to use this in radv_image for VK_EXT_image_view_min_lod.
Additionally, creates signed/unsigned variants to avoid sign-extending where we don't need to.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13820>
If entry->shaders[i] is NULL, shaders[i] should be also NULL, so the
else condition is a no-op.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13823>
Fix defect reported by Coverity Scan.
Uninitialized scalar variable (UNINIT)
uninit_use_in_call: Using uninitialized value clock_calibration.
Field clock_calibration.reserved is uninitialized when calling
fwrite.
Fixes: 1ee85e8bab ("ac/rgp: add support for clock calibration")
Suggested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13783>
No matter the format, this should return true if the instruction has an
exec operand.
Otherwise, eliminate_useless_exec_writes_in_block() could remove an exec
write in a block if it's successor begins with:
s2: %3737:s[8-9] = p_parallelcopy %0:exec
s2: %0:exec, s1: %3738:scc = s_wqm_b64 %3737:s[8-9]
Totals from 3 (0.00% of 150170) affected shaders (GFX10.3):
CodeSize: 23184 -> 23204 (+0.09%)
Instrs: 4143 -> 4148 (+0.12%)
Latency: 98379 -> 98382 (+0.00%)
Copies: 172 -> 175 (+1.74%)
Branches: 95 -> 97 (+2.11%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: bc13049747 ("aco: Eliminate useless exec writes in jump threading.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5620
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13776>
for now only for texture dest and if there is no rounding mode required.
Totals from 2 (0.00% of 150170) affected shaders: (GFX10.3)
CodeSize: 7980 -> 7948 (-0.40%)
Instrs: 1441 -> 1422 (-1.32%)
Latency: 7703 -> 7626 (-1.00%)
InvThroughput: 2336 -> 2302 (-1.46%)
VClause: 34 -> 36 (+5.88%)
Copies: 57 -> 58 (+1.75%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13592>
If the same cmdbuf is submitted more than once, they were waiting on
the same fence value. Fix this by clearing the value when beginning
a new command buffer.
This might fix spurious GPU hangs, especially on GFX9.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5401
Cc: 21.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13777>
p_is_helper doesn't have any operands, so ACO's value numbering and/or
the pre-RA optimizer could incorrectly recognize two such instructions
as the same.
This patch adds exec as an operand to p_is_helper in order to achieve
correct behavior.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5570
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13577>
From VUID-VkAttachmentReference2-attachment-04755:
"If attachment is not VK_ATTACHMENT_UNUSED, and the format of the
referenced attachment is a depth/stencil format which includes both
depth and stencil aspects, and layout is
VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL or
VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL, the pNext chain must include
a VkAttachmentReferenceStencilLayout structure."
We did not check that there even could be a stencil layout
before fetching it.
Fixes a few tests from:
dEQP-VK.image.depth_stencil_descriptor.depth_read_only_optimal.*
Fixes: 979ea394e5 "vulkan/util: Move
helper functions for depth/stencil images to vk_iamge"
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13740>
SPM is hardware feature that allows us to dump performance counters
at a sampling interval to a buffer. It is used by RGP to report cache
counters.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13704>
Future LLVM header leads to __declspec(__restrict), which is invalid.
Just undefine the restrict macro to keep __declspec(restrict).
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13505>
Vulkan loader wants backslash for paths on Windows. Need to jump through
hoops because Meson does not support backslashes in commands.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13461>
S_008D20_FINISH_DONE is a mask of queues and 1 means "wait on the gfx
queue until the value is not 0" which can never happen when the driver
captures from compute. Instead, use the full mask of possible queues.
Cc: 21.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13694>
The Vulkan spec got clarified recently and it's invalid (hw can support
it though). Fixes new CTS dEQP-VK.api.buffer.invalid_buffer_features.*.
Cc: 21.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13701>
It got introduced by "radv: optimize subpass barrier flushes for
imageless framebuffers" but never used in the final version.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13702>
Assumes cmd_buffer != NULL for this path to eliminate the checks in the generic code.
Benchmarks:
I made a microbenchmark based on Bas' vulkan microbench suite.
https://gitlab.freedesktop.org/bnieuwenhuizen/vulkan_microbench/-/merge_requests/1
In this benchmark, this improves vkDescriptorTemplateUpdate performance consistently by 36%.
-------------------------------------------------------------------
Benchmark Time CPU Iterations
-------------------------------------------------------------------
DescriptorTemplateUpdate 81.2 ns 81.2 ns 8573169
->
-------------------------------------------------------------------
Benchmark Time CPU Iterations
-------------------------------------------------------------------
DescriptorTemplateUpdate 52.9 ns 52.9 ns 13306065
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13342>