If a secondary cmdbuf has been grown and is executed without IB2
(eg. on compute queue or when it's not allowed), the ib size ptr
contains chaining info, which means the IB size was wrong.
This fixes CPU crashes when running gl_vk_meshlet_cadscene.
Fixes: 277b2afd70 ("radv/amdgpu: add support for executing DGC cmdbuf with RADV_DEBUG=noibs")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24891>
(cherry picked from commit 9206aeb077)
Issue: For texture with multiple planes, the planes will point to the
same BO with the total size, so current vcn dt_size is incorrect.
(gdb) p/x *((struct si_resource *)(((struct vl_video_buffer *)out_surf)->resources[0]))
...
buf = 0x5555558daa30,
gpu_address = 0xffff800101000000,
bo_size = 0xa2000,
...
}
(gdb) p/x *((struct si_resource *)(((struct vl_video_buffer *)out_surf)->resources[1]))
...
buf = 0x5555558daa30,
gpu_address = 0xffff800101000000,
bo_size = 0xa2000,
...
}
This is because: in function static struct si_texture *si_texture_create_object(),
if (plane0) {
/* The buffer is shared with the first plane. */
resource->bo_size = plane0->buffer.bo_size;
...
radeon_bo_reference(sscreen->ws, &resource->buf, plane0->buffer.buf);
resource->gpu_address = plane0->buffer.gpu_address;
}
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9728
Cc: mesa-stable
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25013>
(cherry picked from commit 7876a2f685)
ISL makes a bunch of decision on programming (MOCS,
RENDER_SURFACE_STATE values) based on this flag. It's important to set
it if we're going to use an image as storage.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23620>
(cherry picked from commit 34d5db0583)
The length passed to mesa_bytes_to_hex is the one of the input, not output
data.
Fixes: fbe9a7ca3e ("rusticl/mesa: create proper build-id hash for the disk cache")
Signed-off-by: Karol Herbst <git@karolherbst.de>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24967>
(cherry picked from commit 1a20ac7891)
The GL_MIRROR_CLAMP_EXT wrap parameter is never available in GLES.
This fixes the `spec@!opengl 1.1@texwrap 2d proj` piglit test when using a GLES
host.
Signed-off-by: Corentin Noël <corentin.noel@collabora.com>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24935>
(cherry picked from commit 9c39ea796c)
Fix defect reported by Coverity Scan.
Logically dead code (DEADCODE)
dead_error_line: Execution cannot reach this statement: return VK_ERROR_SURFACE_LOS....
Fixes: fb9f697fbb ("vk/wsi/x11: move surface alpha check from get_caps to creation")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24802>
(cherry picked from commit 71466eb863)
Huge thanks to Tapani Pälli for debugging this issue, figuring out
what was going wrong, proposing fixes, and walking me through where
things were going off the rails.
BLORP always disables tessellation and geometry shaders. Our handling
tried to look at ice->shaders.uncompiled[] to determine whether the next
draw needed those shaders. If not, we can leave BLORP's residual state
that disabled those stages in place, and skip looking at it.
Unfortunately, predicting the future is a bit fraught, in part due to
the uncompiled[] and prog[] arrays being slightly out of sync at times.
Consider the following case:
1. Draw with tessellation shaders in place
=> uncompiled[TES] and prog[TES] will both point at valid shaders.
2. Gallium calls pipe->bind_tes_state(NULL).
=> This makes uncompiled[TES] point at NULL, and flags
IRIS_STAGE_DIRTY_UNCOMPILED_TES.
Because iris_update_compiled_shaders() hasn't happened yet,
uncompiled[TES] is NULL but prog[TES] has the stale TES from
the previous draw still.
3. BLORP operations happen
=> BLORP sees uncompiled[TES] == NULL and decides that tessellation
is off for the upcoming draws. So it skips flagging tess state.
4. Gallium calls pipe->bind_tes_state(shader from step #1).
=> uncompiled[TES] points at the original shader.
IRIS_STAGE_DIRTY_UNCOMPILED_TES gets flagged again.
5. Draw again
=> This calls iris_update_compiled_shaders(), which sees that
a TES is bound, and calls iris_update_compiled_tes(). But
because the same shader was bound as before, the program it
comes up with is identical to the one already bound at
ice->shaders.prog[TES]. So, it thinks it doesn't have to
flag any tessellation state dirty because it was already
set up for the last draw.
This random unbind and rebind between draws leads to a situation
where, at step #3, BLORP thinks it can skip flagging tessellation
state (nothing is bound), and at step #5, normal state handling
thinks it can skip flagging tessellation state (nothing changed
since last time). So nobody does, and things break.
This unbind appears to be happening when st_release_variants()
decides it wants to free some shaders. Then a rebind happens to
put back the actual shader for the draw. So, it's not theoretical.
To fix this, we change BLORP to look at ice->shaders.prog[] rather
than uncompiled[]. This is equivalent to thinking about the previous
draw, rather than the next. If the last draw had tessellation off,
then BLORP's disabling was a no-op, and the GPU is still in the same
state as the previous draw. This is more reliable than predicting
the future.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8308
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9678
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24880>
(cherry picked from commit d693027a00)
The msm driver reserves the actual DMABUF size in the memory map, while
TU can request smaller memory chunk to be allocated. This potentially
can lead to a situation when next allocation IOVA will be in the middle
of the address space which is reserved for the DMABUF. Pass the
`real_size' to TU allocator instead, so that kernel and userspace have
the same picture of memory allocations.
Cc: mesa-stable
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24861>
(cherry picked from commit 2fdcc00b01)
With -Dintel-clc=system, the build system will search for an `intel_clc`
binary and use it instead of building `intel_clc` itself.
This allows Intel Vulkan ray tracing support to be built when cross
compiling without terrible hacks (that would otherwise be necessary due
to `intel_clc`'s dependence on SPIRV-LLVM-Translator, libclc, clang, and
LLVM).
(cherry picked from commit 28c1053c07)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25228>
With certain build configuration that value can be a non empty string and
needs to be used.
This will also require distributions to rebuild mesa if and only if
CLANG_RESOURCE_DIR changes between clang rebuilds or updates.
Signed-off-by: Karol Herbst <git@karolherbst.de>
(cherry picked from commit e1c278ae82)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25228>
Make sure that both per-vertex and per-primitive attribute
ring stores are finished before position or primitive export
instructions are executed.
This is necessary because we need to ensure that mesh shader
waves work correctly when they have either vertex-only or
primitive-only waves.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
(cherry picked from commit 93b4f200de)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25157>
Cleanup the code that generates the two channels of the
primitive export instruction, and move storing the built-in
per-primitive outputs out to match how vertex attributes work.
Prepares the mesh shader lowering for a workaround that
affect export instructions.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
(cherry picked from commit 0721784b78)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25157>
This is a HW bug workaround for some (all?) GFX11 chips.
On these chips, rasterization can start before the attribute ring
stores are finished, which can cause issues.
As a workaround, wait for attribute ring stores to finish
before doing the position export.
Mesh shaders will be taken care of in another commit.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
(cherry picked from commit edd51655f0)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25157>
Prepares for a workaround. Makes it possible for this function
to not emit the pos0 export at all so that it can be emitted
by a subsequent call to the function later.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
(cherry picked from commit 9c096e4ace)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25157>
This prepares for a workaround where we won't need to add
the done flag to the last export in this function, because
it will be added in a subsequent call to the same function.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
(cherry picked from commit 838d886d90)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25157>
Up until now, the mesh pipeline assumed it would be always linked to the
fragment shader, and so the calculated MUE map would always be
available.
That is not the case for fast linked pipeline libraries, so the URB
setup needs to account for this. We do this by replicating what's done
for non-mesh pipelines, defining the URB based on the FS inputs, and
always assuming they will be laid out in order of varying number, except
that we also account for per-primitive attributes.
Fixes all GPL using tests under dEQP-VK.mesh_shader.ext.smoke.*
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
(cherry picked from commit 4eddeea7bf)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25188>
The compaction introduced in a252123363 ("intel/compiler/mesh: compactify MUE layout")
is not suitable for the case where graphics pipeline libraries are fast
linked, as the fragment shader won't receive the mue_map to know where
to locate its inputs.
For that case, keep doing what we did before and lay things down in the
order varyings are defined, which is also how it works for the non-mesh
case.
Fixes dEQP-VK.fragment_shading_rate.*fast_linked_library*.ms
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
(cherry picked from commit b200e5765c)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25188>
Move mesh URB allocations together with the other stages.
This fixes a hang that started happening with mesh enabled after
419531c5d9 ("intel/blorp: add a new flag to communicate PSS sync need")
Bspec 45352 says:
L3 Space allocation can only be changed when the GPU pipeline is
completely flushed.
It's likely that the PIPE_CONTROL added in that commit was breaking that
assumption and the URB allocation happening afterwards at the end of the
pipeline emission would then hang. And before that, we were probably
just getting lucky.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
(cherry picked from commit bcde58ea86)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25188>
Both per-primitive and per-vertex space is allocated in MUE in 8 dword
chunks and those 8-dword chunks (granularity of
3DSTATE_SBE_MESH.Per[Primitive|Vertex]URBEntryOutputReadLength)
are passed to fragment shaders as inputs (either non-interpolated
for per-primitive and flat vertex attributes or interpolated
for non-flat vertex attributes).
Some attributes have a special meaning and must be placed in separate
8/16-dword slot called Primitive Header or Vertex Header.
Primitive Header contains 4 such attributes (Cull Primitive,
ViewportIndex, RTAIndex, CPS), leaving 4 dwords (the rest of 8-dword
slot) potentially unused.
Vertex Header is similar - it starts with 3 unused dwords, 1 dword for
Point Size (but if we declare that shader doesn't produce Point Size
then we can reuse it), followed by 4 dwords for Position and optionally
8 dwords for clip distances.
This means we have an interesting optimization problem - we can put
some user attributes into holes in Primitive and Vertex Headers, which
may lead to smaller MUE size and potentially more mesh threads running
in parallel, but we have to be careful to use those holes only when
we need it, otherwise we could force HW to pass too much data to
fragment shader.
Example 1:
Let's assume that Primitive Header is enabled and user defined
12 dwords of per-primitive attributes.
Without packing we would consume 8 + ALIGN(12, 8) = 24 dwords of
MUE space and pass ALIGN(12, 8) = 16 dwords to fragment shader.
With packing, we'll consume 4 + 4 + ALIGN(12 - 4, 8) = 16 dwords of
MUE space and pass ALIGN(4, 8) + ALIGN(12 - 4, 8) = 16 dwords to
fragment shader.
16/16 is better than 24/16, so packing makes sense.
Example 2:
Now let's assume that Primitive Header is enabled and user defined
16 dwords of per-primitive attributes.
Without packing we would consume 8 + ALIGN(16, 8) = 24 dwords of
MUE space and pass ALIGN(16, 16) = 16 dwords to fragment shader.
With packing, we'll consume 4 + 4 + ALIGN(16 - 4, 8) = 24 dwords of
MUE space and pass ALIGN(4, 8) + ALIGN(16 - 4, 8) = 24 dwords to
fragment shader.
24/24 is worse than 24/16, so packing doesn't make sense.
This change doesn't affect vk_meshlet_cadscene in default configuration,
but it speeds it up by up to 25% with "-extraattributes N", where
N is some small value divisible by 2 (by default N == 1) and we
are bound by URB size.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
(cherry picked from commit c1685f08dd)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25188>
Instead of using 4 dwords for each output slot, use only the amount
of memory actually needed by each variable.
There are some complications from this "obvious" idea:
- flat and non-flat variables can't be merged into the same vec4 slot,
because flat inputs mask has vec4 stride
- multi-slot variables can have different layout:
float[N] requires N 1-dword slots, but
i64vec3 requires 1 fully occupied 4-dword slot followed by 2-dword slot
- some output variables occur both in single-channel/component split
and combined variants
- crossing vec4 boundary requires generating more writes, so avoiding them
if possible is beneficial
This patch fixes some issues with arrays in per-vertex and per-primitive data
(func.mesh.ext.outputs.*.indirect_array.q0 in crucible)
and by reduction in single MUE size it allows spawning more threads at
the same time.
Note: this patch doesn't improve vk_meshlet_cadscene performance because
default layout is already optimal enough.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
(cherry picked from commit a252123363)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25188>
EXT extension was added without tests so these functions did
not work properly.
Fixes: 799710be88 ("mesa: Add EXT_texture_mirror_clamp_to_edge to extension table")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24845>
(cherry picked from commit d65fe6eff1)
The vulkan spec says all conversions are correctly rounded, so if the input
is larger than the largest fp16 value, we need to return MAX_FLOAT/inf
instead of cutting off the msbs.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24826>
(cherry picked from commit 6d949e18fd)