This is completely broken because the PS epilog has refcount and
radv_upload_shaders() updates its VA.
This reverts commit 7c34b31db2.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18632>
We had a comment stating that we were using different program ids for render
and binning but this isn't true. We were only assigning ids to the render
stages and then we would create the binning stages and not assign a program id
to them at all, so they would remain with a program id of 0.
This change removes the comment and makes sure we assign the same program
id to the binning and render stages of the pipeline, which makes it a lot
easier to match render and binning shaders when debugging.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18630>
Port from radeonsi.
Works on both GFX11 and GFX10. Although GFX10 can do atomic
GDS add on all threads, now we just disable the NGG streamout
for GFX10, so it's OK.
There's a difference for the GFX11 implementation with radeonsi
that we do all 4 buffer/stream info calc on a single thread.
It's just because this is simple, we need to update GDS on a
single thread anyway, and streamout is not that performance
critical to loss a small amount of instruction. We may change
to a better implementation when using register based streamout.
When streamout enabled, ES threads need to save all vertex
attributes to LDS besides position. This is because we don't
know where in the streamout buffer to export the attributes to
and wheter there are space in the streamout buffer.
Streamout is done in primitives, so we need to check if there
is space and where the current primitive should be written to
by GDS atomic add, then in GS threads do the streamout.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>
Streamout also need barrier after culling, so move the
prim id barrier up to after culling.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17654>
When compiling only the pre-rast stages in a library, the input
assembly state might not be present and the topology would be 0.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18519>
If we have a VS that needs a prolog without using the dynamic state,
that means that it comes from a library, so we can overwrite the
cmdbuf VS input state.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18519>
With GPL it's possible to create VS prologs without this dynamic state,
so it seems better to rename.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18519>
Instead, we should just return VK_SUCCESS. The physical device
won't be initialized and vkEnumeratePhysicalDevices will not
list it as available, which is the expected behavior here.
Also, VK_ERROR_INCOMPATIBLE_DRIVER is not a valid return code
from vkEnumeratePhysicalDevices, so never return that, instead
we return VK_ERROR_INITIALIZATION_FAILED if a valid device was
found but we failed to create the physical device for it.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Tested-By: Ryan Houdek <Sonicadvance1@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18591>
Otherwise, the back-copy on same-layout push descriptor updates would read
from WC memory, which is absurdly slow. Improves performance of
vkoverhead's descriptor_template_12ubo_push from 760k/sec to 2876k/sec.
Improves submit-disabled gfxbench gl_driver2 performance on zink from 79.6
fps to 103.6.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18561>
The spec says "If VkDescriptorSetAllocateInfo::pSetLayouts[i] does not
include a variable-sized descriptor binding, then pDescriptorCounts[i] is
ignored." So, make sure that we ignore it unless there is a
variable-sized binding. And, we can keep it simple just taking the
variable-sized path for variable-sized bindings with the 0 variable_count
value to handle "If descriptorSetCount is zero or this structure is not
included in the pNext chain, then the variable lengths are considered to
be zero."
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18561>
If we override with gnu99 here, we effectively down-grade from C11,
meaning we can no longer assume static_assert support.
Fixes: 45fb815a75 ("util: implement STATIC_ASSERT using c++11 / c11 primitives")
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Suggested-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18611>
As long as the direction is still compatible, if we skip the LRZ use and
updates for this draw, then we can keep using LRZ later in the scene, as
whatever gl_FragDepth will get written by the shader later will still have
to move the depth in the right direction.
Similarly, the no_earlyz flag that contributes to DISABLE_LRZ just wants
to make sure we don't kill fragments before dispatch, not change what Z
eventually lands.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18606>
It will always be set in HW when RB.Z_WRITE_ENABLE is set (since that
implies RB.Z_TEST_ENABLE), but in the case of dynamic Z the flag gets
computed at emit time and not stored to cmd->state.rb_depth_cntl. This
bug effectively disabled LRZ for zink.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18606>
On Asahi needed to pass
dEQP-GLES3.functional.texture.specification.texsubimage2d_depth.depth24_stencil8
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18136>
Align with vkGetBufferMemoryRequirements2 and utilize the cache for
retrieving memory requirements before trying the host call.
Fixes
dEQP-VK.api.invariance.memory_requirements_matching
dEQP-VK.memory.requirements.create_info.buffer.regular
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18603>