This also removes the loop so opt_remove_cast_cast() will only optimize
cast(cast(x)) and not cast(cast(cast(x))). However, since nir_opt_deref
walks instructions top-down, there will almost never be a tripple cast
because the parent cast will have opt_remove_cast_cast() run on it.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21252>
Draw states were not disabled after a dynamic renderpass which
spans several command buffers, the next renderpass if started in
the same command buffer wouldn't emit the full draw state,
since TU_CMD_DIRTY_DRAW_STATE was not set by previous renderpass.
The issue could be observed when corrupting all regs at cmdbuf start in:
dEQP-VK.dynamic_rendering.primary_cmd_buff.random.seed7_geometry
Fixes: cb0f414b2a
("tu: Add support for suspending and resuming renderpasses")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21148>
It appears that A6XX_GRAS_SC_CNTL::rotation applies to the binning,
so we should ensure there is no unexpected rotations and apply with
A6XX_GRAS_SC_CNTL during the binning pass.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21148>
From the Haswell PRM Vol. 7, "IEEE Floating Point Mode":
"Single precision (F, Float) denorms are flushed to sign-preserved
zero on input and output of any floating-point mathematical
operation."
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20232>
The rounding mode only needs to be set once, because 16-bit floats or
preserving denorms aren't supported for the platforms where vec4 is
used.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20232>
RGP requires shaders to be uploaded consecutively inside the same
buffer object. Otherwise, either it makes the driver generating
huge traces (ie. in GiB) or it fails to load traces at all. Hopefully,
this will be improved soon when AMDGPU drivers will have GPL support.
To workaround this, the driver relocates graphics shaders in the same
buffer object when a pipeline is created. Then at draw time, it
overwrites SPI_SHADER_PGM_xxx registers to make sure SQTT can match
between emitted and exported shaders. It's a bit suboptimal because
graphics shaders are uploaded twice but it's the best solution I found.
This will allow to implement GPL caching without breaking capturing
shaders with RGP.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21078>
The shaders were uploaded consecutively to fit a RGP constraint but
this was more like a workaround. This upload path doesn't work well for
graphics pipeline library and it was the main blocker for GPL caching.
This commit breaks capturing shaders with RGP if the offset between
shaders is too big. Next commit should fix it by using shaders reloc.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21078>
The Vulkan Specification states about possible return values from
vkAcquireNextImageKHR:
* VK_NOT_READY is returned if timeout is zero and no image was
available.
* VK_TIMEOUT is returned if timeout is greater than zero and less than
UINT64_MAX, and no image beae available within the time allowed.
That is, if info->timeout is larger than zero, the function must return
VK_TIMEOUT instead of VK_NOT_READY if no image became available before
the timeout elapsed.
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21190>
RADV_PERFTEST=gpl increased execution time, so let's try with a 3d
runner.
dEQP-VK.api.device_init.create_instance_device_intentional_alloc_fail.basic
seems reliably fixed now for some reasons.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21214>
The Vulkan CTS version in Mesa CI is so old that a bunch of tests
are broken, but it's expected.
This runs +283939 tests and the overall VKCTS execution time increased
from ~23 minutes to ~26 minutes (+~13%) on my Threadripper 1950X.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21214>
While this change helps with few shaders, the main benefit is
that it allows to unroll loops comming from nine+ttn on vec4
backends. D3D9 REP ... ENDREP type loops are unrolled now already,
LOOP ... ENDLOOP need some nine changes that will come later.
r300 RV530 shader-db:
total instructions in shared programs: 132481 -> 132344 (-0.10%)
instructions in affected programs: 3532 -> 3395 (-3.88%)
helped: 13
HURT: 0
total temps in shared programs: 16961 -> 16957 (-0.02%)
temps in affected programs: 88 -> 84 (-4.55%)
helped: 4
HURT: 0
Reviewed-by: Emma Anholt <emma@anholt.net>
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Partial fix for: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8102
Partial fix for: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7222
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21038>
The graphics pipeline library implementation in RADV has been
improved considerably lately.
There is still a bit of work for caching individual libraries
and optimized (LTO) pipelines but I think overall it seems good
enough to stop reporting it as experimental and suboptimal.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21213>
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21210>
CP_REG_TEST (or any command that reads registers) is slow on a618
(gen1). Since SQE can early return, we don't necessarily need
emit_conditional_ib in fd6_emit_tile.
We still CP_REG_TEST twice for load and store when there is no clear.
Not sure if we can simply drop emit_conditional_ib instead?
glmark2 score goes from 943 to 1067.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21208>
This invalid memory access is a consequence of wrong assumptions,
for instance:
"prog->sh.data is NULL if it's ARB_fragment_program"
This issue is triggered with piglit/fp-formats -auto -fbo:
==9747==ERROR: AddressSanitizer: heap-use-after-free on address 0x007f7c812d90 at pc 0x007f833c09f8 bp 0x007fd7eca750 sp 0x007fd7eca768
READ of size 4 at 0x007f7c812d90 thread T0
#0 0x7f833c09f4 in st_get_sampler_views ../src/mesa/state_tracker/st_atom_texture.c:109
#1 0x7f833c0b48 in update_textures ../src/mesa/state_tracker/st_atom_texture.c:266
#2 0x7f82b2d120 in st_validate_state ../src/mesa/state_tracker/st_util.h:128
#3 0x7f82b2d120 in prepare_draw ../src/mesa/state_tracker/st_draw.c:88
#4 0x7f82b2de64 in st_draw_gallium ../src/mesa/state_tracker/st_draw.c:141
#5 0x7f83105940 in _mesa_draw_arrays ../src/mesa/main/draw.c:1202
#6 0x7f8d5fa5cc in piglit_draw_rect_from_arrays piglit/tests/util/piglit-util-gl.c:711
#7 0x7f8d5fac34 in piglit_draw_rect_custom piglit/tests/util/piglit-util-gl.c:833
#8 0x4019e0 in piglit_display piglit/tests/shaders/fp-formats.c:67
#9 0x7f8d643fc4 in run_test piglit/tests/util/piglit-framework-gl/piglit_fbo_framework.c:52
#10 0x401624 in main piglit/tests/shaders/fp-formats.c:39
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21175>
On gen3+, there are 32 predicate bits instead of 1.
I set out to see why CP_REG_TEST (and others commands that read
registers) is slower on gen1 but could not find anything. Since the
blob seems to use multiple predicate bits, let's keep them documented.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21206>
some games/apps (e.g., DOOM2016) compile+link shaders in one context
and then use them in another, expecting that the compiled shaders
will be reused. vulkan has pipeline (library) objects, which are not
specific to shaders but are in theory representing the shaders being used
thus, pipeline (library) objects need to be reusable for any case where
a shader can be reused
to handle this:
* extract pipeline library cache to a refcounted object
* store these objects on the screen
* make them owned by shaders
separable programs are slightly different since they'll use their own
fastpath, thus making their library caches owned by the programs to avoid
polluting the optimized caches
fixes#8264
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21223>
From the 96c19d23c9 commit message:
Ever since 4246c2869c and 7d85dc4f35 loop unrolling can no
longer depend on inot being eliminated from the loop
terminator condition so we need to be able to handle it.
Support these conditions here too.
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21179>
While making the function public, rename it to
nir_collect_src_uniforms. The old name makes it sound like it's just a
query that doesn't have side effects. That is, however, not the case.
This is step 4 in an attempt to unify a bunch of nir_inline_uniforms.c
and lvp_inline_uniforms.c code.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21179>
Only caller in this file still only passes 1.
This is step 2 in an attempt to unify a bunch of nir_inline_uniforms.c
and lvp_inline_uniforms.c code.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21179>
max_num_bo is currently limited to 1. That will change in the next
commit.
This is step 1 in an attempt to unify a bunch of nir_inline_uniforms.c
and lvp_inline_uniforms.c code.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21179>