Add the missing layout which do not need implemented anything in
mali gpu.
Fixed: dEQP-VK.image.host_image_copy.properties.properties
unifiedImageLayouts feature is supported, but layout
VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL was not included in
VkPhysicalDeviceHostImageCopyProperties::pCopySrcLayouts.
Fixes: 1cd61ee ("panvk: implement VK_EXT_host_image_copy for linear color images")
Signed-off-by: Ryan Zhang <ryan.zhang@nxp.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit 62e7120384)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
Some Max Payne 3 shaders are impacted by this and probably will fix some
issue there. The VK CTS isn't testing this, but it was verified to fix a
real problem by inserting 0 offsets into the instruction and having CTS
tests fail with the old ordering.
Totals from 3 (0.00% of 1163204) affected shaders:
CodeSize: 2496 -> 2736 (+9.62%)
Static cycle count: 732 -> 741 (+1.23%)
Fixes: ad01fbdda0 ("nak: Add a NIR texture lowering pass")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
(cherry picked from commit e09045e26c)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
When recording secondary command buffers with occlusion queries, the
get_vis_results flag could be set for some graphics sub_cmd's job.
Propagate this flag from secondary command buffer graphics sub_cmds to
primary command buffer sub_cmds to ensure occlusion queries in secondary
command buffers being correctly executed.
Fixes: 5c34be4340 ("pvr: Process secondary buffer queries in vkCmdExecuteCommands.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit b8c5e47949)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
There's a dynarray field inside gfx sub_cmd called sub_query_indices,
which will contain pending query indices for gfx sub_cmds inside a
secondary command buffer. It's expected that when finishing such gfx
sub_cmds, the content of query_indices is going to be moved there.
However the `util_dynarray_append_dynarray()` call is called with wrong
parameter order, thus it's copying sub_query_indices to query_indices
and then immediately wiping query_indices, forgetting all query indices
in such case.
Fix the `util_dynarray_append_dynarray()` call to fix occlusion queries
in secondary command buffers.
Fixes: 8c506c4b03 ("pvr: Use util_dynarray_append_dynarray()")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 87f4122e11)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
The last sub_cmd in the command buffer could be a graphics one, and when
ending a graphics sub_cmd, the query_indices array will be checked to
know whether a occlusion query starts during this graphics sub_cmd.
Finalize the query_indices array after ending the last sub_cmd,
otherwise the check for query initiation may have a false negative
result.
Fixes the `dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.
random.seed6` test case.
Fixes: 2b1992a000 ("pvr: Implement vkCmdBeginQuery API.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 36f34a72c1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
src/gallium/drivers/zink/zink_instance.c:34:9: warning: variable 'have_moltenvk_layer' set but not used [-Wunused-but-set-variable]
Fixes: 2b4fcf0a06 ("zink: generate instance creation code with a python script")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
(cherry picked from commit 5982deb48b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
This appears to cause some sort of prefetching which is causing
page faults for linear textures on the following page after the
texture allocation.
This might be okay for tiled, but for now just disable it.
The test crashing this was to allocate an 800x409 linear 2D texture
which gnome-initial-setup was doing.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15277
Cc: mesa-stable
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
(cherry picked from commit 7067b66846)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
The validation of branch instructions happening in branch and thrsw
delay slots has been dead code since it was introduced as the check
was after:
if (inst->type != V3D_QPU_INSTR_TYPE_ALU)
return;
Now last_branch_ip is updated and checks in_branch_delay_slots()
are active.
Fixes in_branch_delay_slots, as for branch there are always 3 delay slots.
As scheduler enforces this restrictions shader-db does not show any
regression.
Assisted-by: Claude Opus 4.6
Fixes: 90269ba353 ("broadcom/vc5: Use THRSW to enable multi-threaded shaders.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit dd6e7c8ef0)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
remove_trivial_phi() mostly does nothing for non-array phis, but it
rewrites sources if their definining instruction are trivial phis.
In the case of trivial phis in the loop continue block (for loops with
divergent non-trivial continues), we might need to keep those if they
write a shared register, because the source of the trivial phi will not be
reachable from the loop header phi.
In this example, the predecessors of the continue block should be block2,
but the physical predecessors are block2 and block3, requiring a phi in
the continue block which will then be lowered by ir3_lower_shared_phis.
loop {
block1:
a = phi 0, b
if (divergent) {
block2:
b = a + 1
continue;
}
block3:
break;
}
Fixes RA validation error when compiling blackmythwukong/5645a84e669a6179
from radv_fossils.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
(cherry picked from commit 4f0fb5784f)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
VkPipelineRenderingCreateInfo is only required in the fragment output
interface lib. For pre-rasterization shaders and fragment shader state
libs, only the view mask is used but it's optional.
If the attachments info isn't marked invalid merging renderpass info
during lib imports wouldn't work because it would assume that the first
lib has attachment info (eg. the pre-rasterization lib).
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15241
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 1950b6c1a7)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
The spec also mentions "output Location decorated color attachments".
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 564b061981 ("hk: Increase maxFragmentCombinedOutputResources to HK_MAX_DESCRIPTORS")
Reviewed-by: Janne Grunau <j@jannau.net>
(cherry picked from commit 59d9bc7bee)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
According to the GL_EXT_multisampled_render_to_texture specification,
copy operations should be allowed when the extension is supported.
Previously, glCopyTexImage* would unconditionally fail with
GL_INVALID_OPERATION when copying from any multisampled framebuffer
(samples > 0), even when using render-to-texture attachments.
Fixes: d7b9da2673 ("mesa/main: fix artifacts with GL_EXT_multisampled_render_to_texture")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Wujian Sun <wujian.sun_1@nxp.com>
(cherry picked from commit 2e340d63d2)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
When round_to_nearest_even is enabled with NEAREST filtering, texture
coordinates near texel boundaries (e.g. 0.9999999404) can be incorrectly
rounded up to the next texel instead of being floor()'d.
According to OpenCL spec section 8.2, for CLK_FILTER_NEAREST:
i = address_mode((int)floor(u))
Backport-to: *
Signed-off-by: Eric Guo <eric.guo@nxp.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit c415134454)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
AMD APUs are hitting this case where they have very small discrete VRAM,
but a lot of staging memory, which can be used additionally.
Fixes: 7487ac2046 ("rusticl/device: support query_memory_info to retrieve available memory")
(cherry picked from commit 58d45725c7)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
Is a single triangle is selected, it can be the case that the next iteration
can't merge any pair with the triangle. In that case, the HW node with a
single triangle will not have the highest hw_node_index, triggering an
assert.
Fixes: c18a7d0 ("radv: Emit compressed primitive nodes on GFX12")
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
(cherry picked from commit db38d1a98c)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
When cross-compiling with MinGW, d3d12_resource_state.cpp fails to
compile with:
d3d12_resource_state.cpp:161:83: error: call to non-'constexpr'
function 'D3D12_RESOURCE_STATES operator|(D3D12_RESOURCE_STATES,
D3D12_RESOURCE_STATES)'
161 | D3D12_RESOURCE_STATE_ALL_SHADER_RESOURCE |
| D3D12_RESOURCE_STATE_COPY_SOURCE | D3D12_RESOURCE_STATE_COPY_DEST;
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/share/mingw-w64/include/minwindef.h:163,
from /usr/share/mingw-w64/include/windef.h:9,
from /usr/share/mingw-w64/include/windows.h:69,
from /usr/share/mingw-w64/include/rpc.h:16,
from /usr/share/mingw-w64/include/unknwn.h:7,
from ../subprojects/DirectX-Headers-1.0/include/wsl/winadapter.h:6,
from ../src/gallium/drivers/d3d12/d3d12_common.h:29,
from ../src/gallium/drivers/d3d12/d3d12_bufmgr.h:31,
from ../src/gallium/drivers/d3d12/d3d12_resource_state.cpp:24:
../subprojects/DirectX-Headers-1.0/include/directx/d3d12.h:3540:1:
note: 'D3D12_RESOURCE_STATES operator|(D3D12_RESOURCE_STATES,
D3D12_RESOURCE_STATES)' declared here
3540 | DEFINE_ENUM_FLAG_OPERATORS( D3D12_RESOURCE_STATES )
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
The DEFINE_ENUM_FLAG_OPERATORS macro in the MinGW winnt.h header
defines operator| for D3D12_RESOURCE_STATES as inline but not
constexpr. (The DirectX-Headers WSL stubs do define it as constexpr,
but when building with MinGW, windows.h is pulled in via winadapter.h
and its non-constexpr definition wins.) Calling a non-constexpr
function to initialize a constexpr variable is ill-formed in C++.
Fix by changing static constexpr to static const, which avoids the
constexpr context while still giving the variable static storage
duration.
Fixes: fe48cd7c5a ("d3d12: Allow state promotion for non-simultaneous access textures")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
(cherry picked from commit 2443f3608a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
Usually connector property IDs are acquired in
wsi_display_get_connector, which is called by wsi_get_connectors, and in
turn by vkGetPhysicalDeviceDisplayProperties2KHR and
vkGetPhysicalDeviceDisplayPlanePropertiesKHR. Except if the drm fd is
not available when these functions are called. Which will be the case if
vkAcquireXlibDisplayEXT is not called first.
So it goes like this. First, the display is created in
vkGetRandROutputDisplayEXT. Then it's used in
vkGetPhysicalDeviceDisplayPlanePropertiesKHR, but since the drm fd is
not available at this point, connector property IDs are not initialized.
Later, this display is used in vkAcquireXlibDisplayEXT, which also
doesn't touch the property IDs. Finally in drm_atomic_commit, the
atomic commit fails with EINVAL, specifically because of the
uninitialized ID of the "CRTC_ID" property. Since it's one of the
properties drm_atomic_commit tries to set.
This commit makes sure that find_connector_properties is called in
vkAcquireXlibDisplayEXT to initialize the property IDs.
Fixes: 513ffea1d3 ("wsi/display: use atomic mode setting")
Signed-off-by: Yuxuan Shui <yshui@codeweavers.com>
(cherry picked from commit 37a1986691)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
The immediate fs_clear_color shader uses IMM[0] but still declares
CONST[0][0]. That can make drivers try to read a fragment constant
buffer even though one is never uploaded on this path. Only declare
CONST[0][0] when the shader actually uses a constant buffer.
Fixes: 2ff9fa8b72 ("gallium/u_blitter: add a new fs_color_clear variant")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 79e3196320)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
Entering a loop with empty exec mask might lead to
not be able to execute the break condition and
lead to infinite loops.
Totals from 81 (0.04% of 202440) affected shaders: (Navi48)
Instrs: 3040566 -> 3040716 (+0.00%)
CodeSize: 17506768 -> 17507188 (+0.00%)
Latency: 16342966 -> 16345166 (+0.01%)
InvThroughput: 3112932 -> 3113286 (+0.01%)
Branches: 82229 -> 82365 (+0.17%)
Cc: mesa-stable
(cherry picked from commit 60b3e5b3f0)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
Add asserts this time that we don't miss any and that the buckets
actually match the enum in bifrost/compiler.h.
Fixes: 82328a5245 ("pan/bi: Generate instruction packer for new IR")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
(cherry picked from commit fd5c6d1223)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
On v12+, IDVS no longer has separate position and varying variants, so
we only need to emit stats for one binary. Attempting to emit stats for
the nonexistent varying shader breaks shader-db.
Fixes: 7819b103fa ("pan/bi: Add support for IDVS2 on Avalon")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit 31ddfe26eb)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
All B-series Rogue cores seem to have USC rounding mode as RTE instead
of RTZ.
Set the has_usc_alu_roundingmode_rne feature flag for them (currently
only BXS-4-64 has it set).
Verified via testing on BXM-4-64 (36.52.104.182) by fixing CTS tests
dEQP-VK.spirv_assembly.instruction.*.float_controls.fp32.input_args.* ,
and via proprietary driver vulkaninfo result on BXE-2-32 (36.29.52.182),
BXE-4-32 (36.50.54.182) and BXM-4-64 (36.56.104.183) (checking
shaderRoundingModeRT?Float32 properties).
Fixes: 1db1038a61 ("pvr: add device info for BXM-4-64 (36.56.104.183)")
Fixes: e60e0c96ba ("pvr: add device info for BXE-2-32 (36.29.52.182)")
Fixes: 2743363a57 ("pvr: add device info for BXM-4-64 (36.52.104.182)")
Fixes: ea28791d40 ("pvr: add device info for BXE-4-32 (36.50.54.182)")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
(cherry picked from commit 9b44def4e9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
If both src and dst are compressed formats, adjusting the extent isn't
necessary because it's required that texel block extent matches. The
previous division was also wrong because it was truncating partial
blocks causing issues in some tests.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 4e00e1c3d0)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
The sc7180-trogdor-lazor-limozeen devices have been dying off over the
past few weeks, so move the last two jobs to sc7180-trogdor-kingoftown
and retire the device type.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
(cherry picked from commit bbed00ac81)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
The sc7180-trogdor-lazor-limozeen devices are having issues, so move the
job to a different device with available capacity.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
(cherry picked from commit 17d38c9668)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
This could override data allocated by the application when shader code
is loaded from binary in vkCreateShaderObjectEXT().
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d39e443ef8 ("anv: add infrastructure for common vk_pipeline")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
(cherry picked from commit 21952ffb07)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
Fix non-functional issue where calls to cs_run_fragment() had swapped
tile_order and enable_tem arguments. Both arguments evaluate to 0.
Hence, no functional change.
Fixes: 53f780ec91 ("panfrost: Remove progress_increment from all CS builders")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
(cherry picked from commit 0d08b197f2)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
The pvr_clear_vdm_state_get_size_in_dw() wrongly think instance count
inputs are needed when doing RTA clear for cores without the
gs_rta_support feature. However, the instance ID is exploited to output
the target layer ID, which isn't supported at all for cores w/o that
feature, so it looks that the condition is inverted. In addition, the
pvr_pack_clear_vdm_state() function seems to have similar logic deciding
whether to emit instance_count, and the logic is opposite to the logic
in pvr_clear_vdm_state_get_size_in_dw() for the part checking the
gs_rta_support feature.
Invert the condition to take instance ID inputs for cores with the
gs_rta_support feature instead of those without this feature.
Fixes: b59eb30e88 ("pvr: Fix cs corruption in pvr_pack_clear_vdm_state()")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
(cherry picked from commit 3db93bbf34)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>