Add a vertex count threshold into si_shader_selector to simplify
the draw_vbo code.
The new option is supposed to be used in 00-mesa-defaults.conf and should be
tweaked for best performance unlike the AMD_DEBUG experimental options.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6948>
We should set "Full Surface Depth and Stencil Clear" field of WM_HZ_OP
3DSTATE packet, only when application requires the entire depth surface
to be cleared.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6549>
This is a mostly cosmetic change but there is one subtle functional
issue: If we ever render to a 3D depth image, we are now handling the
base layer and number of layers correctly. I'm not sure rendering to 3D
depth is even allowed but we can theoretically handle it now.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6549>
Now that we're enabling HiZ on multi-layer images, there's no reason why
we can't enable HiZ clears for multi-view. The only reason I can think
of why we didn't before was because no one thought to and the old code
didn't. Enabling this means that an attachment will get HiZ cleared if
and only if att_state->fast_clear.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6549>
So far, the callback to create a resource from a memory object had code
for importing textures only. Modified it to allow importing buffers too.
Fixes the following piglit tests:
- ext_external_objects/vk-buf-exchange
- ext_external_objects/vk-pix-buf-update-errors
- ext_external_objects/vk-vert-buf-update-errors
- ext_external_objects/vk-vert-buf-reuse
v2: Used si_alloc_buffer_struct instead of CALLOC
v3: Fixed indentation issue, removed free in case of unsuccessful
allocation, joined two if conditions together
Signed-off-by: Eleni Maria Stea <estea@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6364>
Lowering IO for VS, TCS, TES and GS still have to be done for LLVM.
No fossils-db change on NAVI10.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6897>
IO are now lowered before the shader info pass is called and the
output usage masks have to be gathered from store_output instead.
This is currently only used by ACO.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6897>
This was completely broken.
Fixes dEQP-VK.glsl.atomic_operations.add_float32_compute_shared.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6936>
After disable SDMA on Arcturus(gfx9), dead lock with aux_context_lock is
detected since si_screen_clear_buffer is called recursively before
release lock.
The call trace is:
si_clear_render_target->si_compute_clear_render_target->
si_launch_grid_internal->si_launch_grid->si_emit_cache_flush->
si_prim_discard_signal_next_compute_ib_start->u_suballocator_alloc->
si_resource_create->si_buffer_create->si_alloc_resource->
si_screen_clear_buffer->simple_mtx_lock->
si_sdma_clear_buffer->si_pipe_clear_buffer->
si_clear_buffer->si_compute_do_clear_or_copy->
si_launch_grid_internal->si_launch_grid->si_emit_cache_flush->
si_prim_discard_signal_next_compute_ib_start->u_suballocator_alloc->
si_resource_create->si_buffer_create->si_alloc_resource->
si_screen_clear_buffer->simple_mtx_lock
Fixes: 07a49bf597 "radeonsi: disable SDMA on gfx9"
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6941>
This will merge loads of UBO components together into vec4 loads. At the
same time, it improves the alignment information on our loads, fixing the
regression from the vec3 loads fix.
shader-db results:
total instructions in shared programs: 12829370 -> 8755851 (-31.75%)
total cat6 in shared programs: 145840 -> 97027 (-33.47%)
Overall results from before the vec3 fix:
total instructions in shared programs: 8019997 -> 8755851 (9.18%)
total cat6 in shared programs: 87683 -> 97027 (10.66%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6612>
It turns out I had missed a case in my enumeration of why everything
currently was vec4-aligned.
Fixes a simple testcase of loading from a vec3[2] array in freedreno with
IR3_SHADER_DEBUG=nouboopt.
Initial shader-db results look devastating:
total instructions in shared programs: 8019997 -> 12829370 (59.97%)
total cat6 in shared programs: 87683 -> 145840 (66.33%)
Hopefully this will recover once we introduce the i/o vectorizer, but that
was blocked on getting the vec3 case fixed.
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6612>
It was passing an encoding of the two that wasn't good for ensuring "Don't
combine loads that would make us straddle a vec4 boundary" for
nir_lower_ubo_vec4.
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6612>
nir_lower_to_explicit_io will give us good alignments if we have the
cast's alignment information known, and it's trivial: Just the offset of
the UBO variable that is at the base of the deref. Otherwise, explicit io
assumes the load is aligned just to the size of a scalar value in it.
The change in freedreno is in the noise.
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6612>
Introduces a #define for the maximum valid align_mul that's used in the
load_store_vectorizer tests (currently, though it will be used more soon).
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6612>
The lowering pass may introduce vector bcsels that we need to scalarize
back out. It's unusual to have UBOs and not get any lowered to push
constants, so the flag was usually set anyway.
Fixes: 2b25240993 ("freedreno/ir3: Replace our custom vec4 UBO intrinsic
with the shared lowering.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6612>
Move legacy API functions to a separate file, and implement them by calling
the new API functions, like tu_CreateRenderPass was already doing.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6920>
When we come in from some other level or from the parent, we need to
ensure that the reach set is in prev_frontier but we also need to
consider the dominance frontier of our level. Otherwise, we may end up
leaving out possible blocks when computing the reach of a level.
Acked-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6750>
There are two issues here:
1. If there are any phi nodes, we'll make complete hash of them. This
isn't likely actually a problem because spirv_to_nir doesn't
generate any actual phi nodes today. However, if we start doing any
other passes before this, we may have a problem.
2. Even without phi nodes, we may still break SSA form. This can
happen if we ever have to stick a block inside a conditional to
satisfy weird CFG constraints. Doing so can cause it to no longer
look like it dominates some of its uses even though, at runtime,
it's guaranteed to be run first.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6750>
This commit adds a number of new validation checks:
1. We now check that every block pointer in the IR points to a block
that actually exists in a block list that's reachable from the
nir_function_impl.
2. We assert that nir_function_impl::body is non-empty
3. We assert that the start block has no predecessors. This is
important because we tend to put run-once code there.
4. We now validate some stuff on the end block.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6750>
In the case where end was a instruction-based cursor, we would mix up
our blocks and end up with block_begin pointing after the second split.
This causes a segfault as the cf_node list walk at the end of the
function never terminates properly. There's also a possibility of
mix-up if begin is an instruction-based cursor which was found by
inspection.
Fixes: fc7f2d2364 "nir/cf: add new control modification API's"
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6866>
We would've had to add yet another level of indentation, or duplicated
finding the if conditions in the next commit. Refactor this function to
use early returns like our other optimizations, so that this isn't an
issue.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6866>
As a side effect, we can delete the whole control flow stack thing.
v2 (Jason Ekstrand):
- Drop the ttn_if helper and just inline it in the two uses
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6866>