For invalid streams tile cols and rows may be higher than 64.
This would overwrite data after the height_in_sbs array, but since
the maximum amount of bytes overwritten is bound by the maximum
supported decode resolution, this can't overwrite any important
fields and thus won't cause any observable issue.
As this can only happen with invalid streams, it still won't decode
correctly with this fixed.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15290
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41016>
This should only consider valid entries, not loop over the entire array.
In addition the array size was wrong before.
Fixes: 779edc0759 ("frontends/va: Correctly derive HEVC StCurrBefore, StCurrAfter and LtCurr")
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41016>
When r300_pick_vertex_shader switches to a WPOS variant, it only dirtied
rs_block_state, leaving vs_state with a stale code size. This caused
cs_count warnings (offset of -4 for one extra VS instruction) but was
mostly harmless since the emitted packet stream still used the current
shader.
Factor the VS code dirtying out of r300_bind_vs_state into a helper and
call it when selecting a new variant too.
Fixes: 806dcf9db7 ("r300: only output wpos in vertex shaders when needed")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41200>
r300_simple_msaa_resolve used to patch srcsurf->pitch with the resolve
destination's tiling bits before passing the surface to the blitter.
That worked when set_framebuffer_state kept the same pipe_surface
pointer, so r300_get_nonnull_cb returned the patched object.
After the de-pointerization, r300_framebuffer_init creates a fresh
r300_surface from the pipe_surface template, discarding the pitch
modification. The hardware then uses the MSAA source tiling for
R300_RB3D_COLORPITCH0, leading to corruption.
Move the tiling override into r300_emit_fb_state and override the tiling
bits of COLORPITCH from the destination surface at emit time.
Fixes: 2eb45daa9c ("gallium: de-pointerize pipe_surface")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15303
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41092>
Sampling coeffs with trilinear filtering will output 2x sets of data.
Whether bilinear or trilinear filtering is in use can't be determined
without checking state words, so unconditionally reserve 2x to avoid
clobbering output regs.
Fixes: 7df32ba09d ("pco: initial texture/sampler compiler support")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Tested-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41051>
The Vulkan spec says:
"After a call to vkCmdExecuteGeneratedCommandsEXT, command buffer
state will become undefined according to the tokens executed. This
table specifies the relationship between tokens used and state
invalidation."
The application must re-bind the states that are updated using DGC.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41159>
Only if DGC emits an indexed draw without providing the index buffer
as part of the tokens. This avoids emitting useless packets.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41159>
Save protected radv_queue in device->queues_protected so it can be
relased in radv_destroy_device.
Without this, device->queues[] will point to a new queues to record
the queue created according to the queueCreateInfo. When queueCreateInfo
include both protected and unprotected queue info, the device->queues[]
will be created twice and record one queue at each time. So we will lose
either protected queue or unprotected queue which created first.
Signed-off-by: Julia Zhang <Julia.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40619>
Create encrypted fence_bo and eop_bug_bo when the radv_cmd_buffer is
created from a protected pool which is marked with flag
VK_COMMAND_POOL_CREATE_PROTECTED_BIT
Signed-off-by: Julia Zhang <Julia.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40619>
Pass secure flag to radv_update_preamble_cs() if this queue is created
with vk flag VK_QUEUE_PROTECTED_BIT and create encrypted tess/ge ring
BOs and compute_scratch_bo according to this secure flag.
Signed-off-by: Julia Zhang <Julia.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40619>
The index buffer and index buffer size don't affect whether or not we're
actually doing indexed rendering so we should just emit them whenever
they change. Otherwise, if someone sets an index buffer and then does a
non-indexed draw and then an indexed draw, the first draw will clear the
dirty bits without setting the index buffer registers and the second
draw won't know to re-emit them.
Fixes: 5544d39f44 ("panvk: Add a CSF backend for panvk_queue/cmd_buffer")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Marc Alcala Prieto <marc.alcalaprieto@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40997>
Replace the vk_meta_fill_buffer call with direct panlib precomp
dispatches: a KERNEL(32) uint4 bulk path for 16-byte-aligned fills and a
KERNEL(32) uint32 path otherwise, each with a KERNEL(1) scalar tail for
sub-workgroup remainders.
gpu-ratemeter vk.bufbw on Mali-G610 MC4 shows a 1.15-1.18x median
speedup across alignment classes and roughly 5x on fills <= 512 B,
thanks to the removed pipeline bind / descriptor-set setup that
vk_meta_fill_buffer pays per call.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41079>
vkCmdBindVertexBuffers() -> draw with mesh shaders will just segfault.
This sequence doesn't make real sense but it's possible.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41161>
The Vulkan spec says:
"VUID-vkCmdExecuteGeneratedCommandsEXT-None-11062
If a rendering pass is currently active, the view mask must be 0."
So, it's invalid with VK_EXT_device_generated_commands but it's allowed
in DX12, it seems we missed this during the spec review.
Crimson Desert uses this and emulating in vkd3d-proton would be complex,
so let's re-introduce this support only for vkd3d-proton.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41153>
Only triangles are culled, so we can't always disable rasterization here.
Fixes:
dEQP-VK.glsl.builtin_var.frontfacing.add_ubo_load.point_list.front_and_back
dEQP-VK.glsl.builtin_var.frontfacing.add_ubo_load.line_list.front_and_back
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41176>
Fixes dEQP-VK.api.copy_and_blit.core.use_after_copy.*_msaa
These tests set both a varying and gl_Layer to gl_InstanceID. Without proper
type inference for gl_InstanceID, it would end up stored in a float temporary,
then bit-cast back to uint when stored into the gl_Layer, resulting in an
invalid destination when outputting to layer 1.
Reviewed-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41177>
For a src to be killed, not only does its SSA value need to be killed,
it also shouldn't be part of or contain an interval that isn't killed
yet.
Fixes a RA assert in Windrose: "reg pressure calculation was wrong!".
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41154>
Reported by clang tools.
See: https://clangd.llvm.org/guides/include-cleaner
struct ac_cmdbuf had to be moved to ac_cmdbuf_base.h because we can't
include ac_cmdbuf.h->sid.h->amdgfxregs.h in radeon_winsys.h for r300.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41091>
BV_RB and BV_CCU are supported on some devices (knp, but not glymur or
pakala, for ex).. we don't have a way to deal with that yet.
This doesn't yet _expose_ gen8 perfcntrs. That small patch will come
after PERFCNTR_CONFIG ioctl is supported to ensure that everything gen8
and later supports the new kernel based counter collection/reservation
(so that backwards compat of old userspace on new kernel is limited to
a7xx and earlier).
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40522>
With concurrent binning, some counter reads or SEL reg programming needs
to happen explicitly on the BR or BV ring. For the most part if there
is a "BV_FOO" counter group that should be on the BV ring and the
corresponding "FOO" group on the BR ring. There are a few exceptions
like "CP" vs "BV_CP" which have different SEL reg offsets for BR vs BV,
rather than the same offsets that should be accessed via the appropriate
aperture.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40522>
1) only use "ull" for reg64, which avoids some compiler warnings on the
kernel side.
2) use "ull" for booleans as well, if reg64
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40522>
To generate the perfctr tables we need a bit more information than what
is in the .xml, such as which groups of SELECT regs correspond to which
sets of COUNTER regs, the enum type of the countables (ie. possible
SELECT reg values), etc.
It would be awkward to shoehorn this into an xml schema that is based on
describing registers. But json is easy to consume.
Field description:
- chip: variant enum used for generating correct reg offsets
- groups: array of entries for each group of counters/countables:
- name: group name
- num: the number of counters
- reserved: array of counter indices reserved for KMD use
- select_offset: Offset of the first selector reg, used in cases
where same bank of selectors is used for both BR and
BV
- select: the selector reg name
- counter: counter name if <reg64>, otherwise use counter_lo and
counter_hi
- countable_type: name of <enum> that defines selector reg values
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40522>
We could generate the rest of the tables, other than these fields. But
they are all "UINT64, AVERAGE" (for the non-derived counters), so just
drop them.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40522>