I have no idea why I flipped the order of these to checks vs. the C
code when I wrote the NIR helper. We need to deal with NaN first or
else the fmin will smash NaN to MAX_RGB9E5 and it won't get handled as
NaN.
Fixes: 9981709d8f ("nir/format_convert: Add a function to pack RGB9_E5 formats")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28793>
(cherry picked from commit 86aad90e2a)
[Eric]
swap `color` and `clamped`, see https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28793#note_2459681
This fix failure on "dEQP-VK.api.buffer.basic.size_max_uint64".
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 822478ec20 ("panvk: Move the VkBuffer logic to its own source file")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29783>
(cherry picked from commit c0f8465fa8)
Fix a crash when a null handle is passed.
(dEQP-VK.api.null_handle.destroy_command_pool)
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: afbac1af77 ("panvk: Move the VkCommandPool logic to panvk_cmd_pool.{c,h}")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29783>
(cherry picked from commit 716e0e1568)
CTS tests both layered and separate DPB, but radv wasn't handling
layered properly when used with the tier 2 dpb handling.
This adjusts the addresses to use the layer index for tier2.
Fixes dEQP-VK.video.decode.*layered*
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29758>
(cherry picked from commit b888946f7a)
The enums were mixed up. Code was working because they were being
used only for their numerical values.
Fixes: e666872c75 ("intel/compiler: Initial bits for DPAS instruction")
Acked-by: Iván Briano <ivan.briano@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29762>
(cherry picked from commit f982d2bb79)
VARYING_SLOT_PATCH0 is greater than 64 so it is wrong to use it
with BITFIELD64_BIT. Check for VARYING_SLOT_TESS_LEVEL_* properly
when mapping output locations in LDS.
Fixes: c61eb54806
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29696>
(cherry picked from commit 0f0ebd8512)
On newer devices where ZPASS_DONE events have sample count writing
abilities the firmware expects these events to come in begin-end pairs,
essentially corresponding to a typical occlusion query usage. Since this
event is also used in the autotuner we have to avoid event pairs to be
emitted in an interleaved fashion.
Additional renderpass state now tracks whether a given renderpass contains
an occlusion query. If so, autotuner will emit miscellaneous ZPASS_DONE
events in order to form its own begin-end pairs before and after the
renderpass commands.
Occlusion query behavior inside a renderpass doesn't change. But when used
outside of a renderpass, possible autotuner usage requires to again emit
ZPASS_DONE events that end up forming begin-end pairs of these events both
at the start and the end of the query.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: 4e6a1f8852 ("tu/autotune: Use `CP_EVENT_WRITE7::ZPASS_DONE` on A7XX")
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29403>
(cherry picked from commit 5653c52151)
When both an interval and some of its children would be live-in, we used
to add phis for all of them. This could lead to cases where the pressure
after spilling was higher than before.
This happens, for example, when both a split and its parent are live-in.
Before spilling, the split wouldn't add to the pressure because its
parent had already been inserted. After spilling, since we created a phi
for the split, the link with its parent would be lost and it would add
to the pressure.
Fix this by only adding phis for top-level intervals and adding splits
after them.
Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
(cherry picked from commit 6bc7cd6108)
There were a few places that used an instruction pointer to decide where
new instructions should be created. NULL was used to add them at the end
of the block. While fixing a spilling bug, a new option was needed to
add instructions at the beginning of the block. This will be much easier
to implement using cursors.
Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
(cherry picked from commit 18cd803cef)
Whenever instructions need to be created at specific locations, ir3
often passes around an instruction pointer. When set, new instructions
are added before or after it (depending on the context). When NULL, new
instructions are added at the end of the block. This whole scheme is
confusing.
This patch adds ir3_cursor and ir3_builder structs and the associated
helper functions. The API mirrors the one from nir_cursor/nir_builder.
This patch does not refactor existing code to use the new API. This will
happen in future patches.
Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
(cherry picked from commit 1972db36c6)
This value is usually set by ir3_merge_regs. Since we don't need to call
this again after shared RA, we have to copy it manually to the new
liveness struct.
Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
(cherry picked from commit dc04fd8e62)
Similar to how ir3_spill does it. This will make it easier to optimize
this in the future. E.g., we only need to recalculate liveness when any
instruction were added.
Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
(cherry picked from commit 3f3c190649)
ir3_force_merge (through merge_merge_sets) expects instructions to be
indexed. However, the instructions created during spilling would not be
automatically indexed at this point.
Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
(cherry picked from commit 7a5b198a44)
While fixing up merge sets after spilling, we need to index before
calling ir3_merge_regs so it would be a waste to index again.
Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
(cherry picked from commit 018d0ab805)
It might happen that a collect that cannot be coalesced with one of its
sources while spilling can be coalesced with it afterwards. In this
case, we might be able to remove it in remove_src_early during spilling
but not afterwards (because it may have a child interval). If this
happens, we could end up with a register pressure that is higher after
spilling than before. Prevent this by never removing collects early
while spilling.
Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
(cherry picked from commit 1bc3b819e6)
We used to set it MASK(elems) which would break when not all elements
are contiguous (which could happen for tex instructions after dce).
Fixes: 613eaac7b5 ("ir3: Initial support for spilling non-shared registers")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
(cherry picked from commit 70e10babea)
Early clobbers should always add to the register pressure since they
cannot overlap with sources. handle_instr in ir3_spill.c handles this
properly but calc_min_limit_pressure did not.
Fixes: 2ff5826f09 ("ir3/ra: Add IR3_REG_EARLY_CLOBBER")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
(cherry picked from commit af6f82b954)
try_evict_regs might end up calling check_dst_overlap which only works
for dst regs. Make sure this doesn't happen for src regs.
Fixes: 34803d15ab ("ir3/ra: Add proper support for multiple destinations")
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29497>
(cherry picked from commit 023c7351f2)
for drivers that don't support PIPE_CAP_SHAREABLE_SHADERS,
the zombie shader mechanism is used, storing shaders to delete after
the next flush
the zombie mechanism also calls bind_*_state(pipe, NULL) during deletion,
however, which breaks drivers in the following scenario:
* create_all_shaders(pipe_A)
* bind_vs(pipe_A, vs_A)
* bind_fs(pipe_A, fs_A)
* draw(pipe_A)
* makeCurrent(pipe_B)
* delete_vs(pipe_B, vs_B)
* vs_B must only be deleted on pipe_A
* zombie_shader_add(pipe_A, vs_B)
* makeCurrent(pipe_A)
* free_zombie_shaders(pipe_A)
* bind_vs(pipe_A, NULL)
* delete_vs(pipe_A, vs_B)
* draw(pipe_A)
* boom
the problem being that bind_vs(pipe_A, NULL) was called when deleting
vs_B, but it was actually vs_A which was bound
to solve this, just flag the shader state for updating and let st figure it out
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11122
cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29680>
(cherry picked from commit 95828d8901)
This define is used in panvk_physical_device.c as well, so it needs to
be visible there.
Fixes: ac34183ec3 ("panvk: Move the VkPhysicalDevice logic to panvk_physical_device.{c,h}")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29751>
(cherry picked from commit 9336190868)
The commit 973e6f3b implemented compaction of the stream-number space. The
functions `update_vertex_elements(_sw)` began using the post-compacted
stream-numbers/indices when maintaining the `stream_usage_mask` and
when reading from the arrays `vtxstride` and `stream_freq`.
But, the `stream_instancedata_mask`, with which the `stream_usage_mask`
is compared/bitwise-anded, maintains bits for the pre-compacted indices.
Additionally, the information within the arrays is stored using the
pre-compacted indices.
The functions have a disagreement, regarding the type (pre- vs post-
compacted) of indices, with the rest of the relevant source. This change
removes the disagreement by having them use pre-compacted indices when
maintaining the `stream_usage_mask` and when reading from the arrays.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11283
Fixes: 973e6f3b ("gallium: remove start_slot parameter from pipe_context::set_vertex_buffers")
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29704>
(cherry picked from commit 4bf330471b)
If the application binds graphics shaders and that RADV performs an
internal operation with a compute pipeline, no shader objects would be
restored because binding the internal compute pipeline resets the ESO
state.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29678>
(cherry picked from commit 07eb970d67)
The update_stencil_mask function was removed when moving to the common
Vulkan dynamic state handling.
Fixes: 97da0a7734 ("tu: Rewrite to use common Vulkan dynamic state")
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29658>
(cherry picked from commit d9af1633a9)
It's possible if nouveau_ws_bo_destroy() races with
nouveau_ws_bo_from_dma_buf() for the BO to be found in the cache and
referenced between dropping the final reference and actually invoking
GEM_CLOSE. This would result in us having a closed BO somewhere in our
cache.
Fixes: c370260a8f ("nouveau/winsys: Add dma-buf import support")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29737>
(cherry picked from commit faaf33556e)
I've been seeing a bunch of read page faults at the end of the
shader allocation, nvk uses a full page at the end to overallocate
so align with that and see if it goes away.
ahulliet and skeggsb both said 2k was used.
Cc: mesa-stable
Reviewed-by: Arthur Huillet <ahuillet@nvidia.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29722>
(cherry picked from commit f7434d7576)
MOV_INDIRECT picks one lane from the src[0] and moves it to all lanes
in the destination. Even if we split the instruction, src[0] should
remain identical.
Noticed this while trying to use this instruction in SIMD32. All
current use cases are limited to SIMD8 shaders (or SIMD16 on Xe2). Or
maybe in SIMD32 but with a uniform src[0]. That's we think we've never
seen the issue so far.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28036>
(cherry picked from commit 13dc2a28ce)
if mesh and task shaders are bound separately, and if they have different
workgroup sizes, the setting of workgroup size will be broken if
set during shader bind
this must be deferred to draw time to pull the correct values
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29733>
(cherry picked from commit 2bb35bf489)
If we have an array of multi-planar descriptors, buffer_list was
incorrectly incremented and this could have overwritten some BO entries.
In practice, this situation should be very rare because most of the
applications enable the global BO list.
Cc: mesa-stable
Closes: #10559
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28816>
(cherry picked from commit 730ba8322f)
Fix compilation failure when image is embedded in struct when
GL_EXT_shader_image_load_formatted is enabled:
struct GpuPointShadow {
image2D RayTracedShadowMapImage;
};
layout(std140, binding = 2) uniform ShadowsUBO {
GpuPointShadow PointShadows[1];
} shadowsUBO;
Compile log:
error: image not qualified with `writeonly' must have a format layout qualifier
Fixes: 082d180a22 ("mesa, glsl: add support for EXT_shader_image_load_formatted")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29693>
(cherry picked from commit b1d0ecd00d)