Gen5 adds some boolean conversion instructions after nir emits,
but that nir srcs don't line up with them, so reemit the boolean
conversion if we reemit the inot.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 31b5f5a51f ("nir/opt_if: Simplify if's with general conditions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26782>
zink_bo_create can run into a heap-use-after-free when the bo is still
referencing an batch_state from an older destroyed context. In order to
fix this, every context gives back their batch_states to the zink, where
they can be reused from for new contexts.
Cc: mesa-stable
Suggested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26889>
Implementations are supposed to do that per the Vulkan spec.
This fixes the following new VKCTS tests
dEQP-VK.pipeline.*.stencil.no_stencil_att.*
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26847>
This is safe to do in all circumstances due to the age of the hardware.
(we don't have UBOs, just constant registers with automatic OOB checks)
R500 hardware doesn't have standard adress register in fragment shaders
and while we have the loop register which we in theory can use for indirect
access, this is currently not possible to wire through NIR. So anytime
there is an indirect uniform array access in a loop, we end with a if
ladder with size depending on the size of the uniform array. The two worst
behaving apps here are glamor and some GTK shaders, both of which are
sometimes ending over the 512 instructions limit. Flattening the if
ladders helps a LOT, so we can get into the instruction limit in most
cases (all glamor shaders are OK now). So just enable the flattening by
setting all load_ubo_vec4 with ACCESS_CAN_SPECULATE.
Shader-db RV530:
total instructions in shared programs: 128762 -> 128440 (-0.25%)
instructions in affected programs: 540 -> 218 (-59.63%)
helped: 3
HURT: 0
total temps in shared programs: 17543 -> 17550 (0.04%)
temps in affected programs: 11 -> 18 (63.64%)
helped: 0
HURT: 3
total cycles in shared programs: 196984 -> 196657 (-0.17%)
cycles in affected programs: 592 -> 265 (-55.24%)
helped: 3
HURT: 0
LOST: 0
GAINED: 7
No changes for R300/R400 because there we don't have control flow
anyway.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6366
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26877>
The fcsel lowering for R3xx happens already in the main loop, here we
only do it for the fcsel_ge that comes from the frunc.
No change in shader-db
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26816>
Shader-db RV370:
total instructions in shared programs: 82071 -> 82155 (0.10%)
instructions in affected programs: 792 -> 876 (10.61%)
helped: 0
HURT: 12
total temps in shared programs: 12775 -> 12778 (0.02%)
temps in affected programs: 27 -> 30 (11.11%)
helped: 0
HURT: 3
total cycles in shared programs: 128403 -> 128499 (0.07%)
cycles in affected programs: 864 -> 960 (11.11%)
helped: 0
HURT: 12
The same regression for the few GTK shaders that happens with the R500
nir fcsel lowering also happens here due to the
nir_move_vec_src_uses_to_dest.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6126
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26816>
We can't call nir_lower_bool_to_float too early, because some other
passes like nir_opt_peephole_select will blow up, but we can still do
some selected parts to enable some optimiazions at a later point
(like fcsel(a,b,0) into fmul), etc.
No change in shader-db with RV370 or RV530 at this point.
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26816>
Right now this is done in the backend so move it up to NIR. Doing this
in the backend is easier, as at that time we can have a better idea
about when we hit the hardware limits of three different TMP sources,
however moving this to NIR allows for some optimizations. Specifically,
at this time if we decide we actually have to lower we still have the
info if we have plain fcsel for which we can save the comparison and
emit flrp only. During translation to TGSI all of fcsel, fcsel_gt, and
fcsel_ge translate to CMP so at that point the comparison is always needed.
Shader-db RV530:
total instructions in shared programs: 126057 -> 125823 (-0.19%)
instructions in affected programs: 11359 -> 11125 (-2.06%)
helped: 68
HURT: 12
total temps in shared programs: 17043 -> 17023 (-0.12%)
temps in affected programs: 459 -> 439 (-4.36%)
helped: 32
HURT: 12
total cycles in shared programs: 191604 -> 191294 (-0.16%)
cycles in affected programs: 11834 -> 11524 (-2.62%)
helped: 68
HURT: 12
The hurt shaders are some GTK shaders where there is some bad
interaction with nir_move_vec_src_uses_to_dest. This is known and might
be improved later by thweking the pass more.
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26816>
We do ffloor by default for adress register load so no need to do it
explicitly. This needs to happen after int lowering, otherwise we get
ftrunc by default as a bonus. This is mostly for wined3d.
Shader-db RV370:
total instructions in shared programs: 82147 -> 82071 (-0.09%)
instructions in affected programs: 2772 -> 2696 (-2.74%)
helped: 32
HURT: 0
total cycles in shared programs: 128479 -> 128403 (-0.06%)
cycles in affected programs: 2813 -> 2737 (-2.70%)
helped: 32
HURT: 0
Shader-db RV530:
total instructions in shared programs: 126141 -> 126057 (-0.07%)
instructions in affected programs: 3170 -> 3086 (-2.65%)
helped: 36
HURT: 0
total cycles in shared programs: 191688 -> 191604 (-0.04%)
cycles in affected programs: 3222 -> 3138 (-2.61%)
helped: 36
HURT: 0
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26816>
Fail (maxResourceDescriptorBufferRange is less than (((1u << 20) - (1u << 15)) * maxResourceDescriptorSize) at vktBindingDescriptorBufferTests.cpp:5127)
bump this to pass the test now.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25398>
multisample textures don't have mipmaps, so store sample_stride
into mipmap offset 15 and store num_samples into last_level
We can't use mipmap_offset0 as arrays might still store some values
into it.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25398>
This is step one in a size reduction plan, this reduces lp_jit_texture
by making all the fields smaller in the struct and upsizing on the llvm
size. It goes from 280->264.
This isn't sufficient to get conformance, but it's a good step one.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25398>
This just checks the level zero only is set for multisample stuff.
I tried using asserts, but there are paths due to precompilation
that won't get hit, but do get compiled.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25398>
If Mesa is executed under valgrind, fd_bo_init_common() calls
fd_bo_map() internally. For the heap (sub-block) allocator this causes a
segfault in fd_bo_map(), when this function tries to call the offset()
callback.
To prevent this from happening, preallocate fb->map before calling into
fd_bo_init_common(), stop calling VG_BO_ALLOC() if the memory map is
already initialised and disable the VG_BO_FREE call for the heap
allocator.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26277>
If the shader memory has been allocated with the FD_BO_NOMAP and got
later allocated a memory chunk during fd_bo_upload(), this can result in
the valgrind splat when it tries to release the free and/or cache the
BO. To fix this issue, notify valgrind about newly mmaped shader memory.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26277>
Partially revert the commit f4c9e9329c ("ir3/a6xx: Fix immediate
offset stg/ldg path").
There is no need to multiply the immediate offsets by 4. Doing so
results in loading and/or storing the data at wrong locations.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26817>
That enables testing and development of the Venus-based Vulkan HAL
on a wider range of Android systems - flavors of "Cuttlefish" are
of particular practical interest. At this point, only two gralloc
variants are supported: CrOS and IMapper v4. The fallback gralloc
and any gralloc adapter modules relying on it (GBM, QCOM) are out
of scope for Android Vulkan HAL now.
Signed-off-by: VladimirTechMan <VladimirTechMan@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26858>
This removed a bunch of calls from the vec4 code that aren't called anywhere else.
Bring back the bits that were removed.
Fixes glxgears on gen5
Fixes: 81594d0db1 ("intel/compiler: Move earlier scheduler code that is not mode-specific")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26862>
This is ported from amd ac_llvm_helper.cpp, thanks to Marek for the pointer.
This is needed to avoid crashes due to atexit ordering between some piglit
tests and mesa internals.
Cc: mesa-stable
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26883>
These are now covered by nir_opt_loop():
- opt_if_loop_last_continue()
- opt_merge_breaks()
- opt_if_loop_terminator()
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24940>
This new pass aims to simplify loop control-flow by reducing the number
of break and continue statements. It also supersedes nir_opt_trivial_continues().
For this purpose, it implements 3 optimizations:
- opt_loop_terminator(), as previously
- opt_loop_merge_break_continue(), similar to opt_merge_breaks() incl. continues
- opt_loop_last_block(), a generalization of opt_if_loop_last_continue()
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24940>