RADV_PERFTEST=gpl increased execution time, so let's try with a 3d
runner.
dEQP-VK.api.device_init.create_instance_device_intentional_alloc_fail.basic
seems reliably fixed now for some reason.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21214>
The Vulkan CTS version in Mesa CI is so old that a bunch of tests
are broken, but it's expected.
This runs +283939 tests and the overall VKCTS execution time increased
from ~23 minutes to ~26 minutes (+~13%) on my Threadripper 1950X.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21214>
While this change helps only a few shaders, the main benefit is
that it allows unrolling loops coming from nine+ttn on vec4
backends. D3D9 REP ... ENDREP type loops are already unrolled now;
LOOP ... ENDLOOP loops need some nine changes that will come later.
r300 RV530 shader-db:
total instructions in shared programs: 132481 -> 132344 (-0.10%)
instructions in affected programs: 3532 -> 3395 (-3.88%)
helped: 13
HURT: 0
total temps in shared programs: 16961 -> 16957 (-0.02%)
temps in affected programs: 88 -> 84 (-4.55%)
helped: 4
HURT: 0
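For readers checking the figures above, here is a minimal sketch of how the
percentages follow from the before/after counts. The helper names are
hypothetical and are not part of shader-db.

```c
#include <assert.h>
#include <stdio.h>

/* Relative change from `before` to `after`, in percent.
 * A negative result means the count went down (a "helped" result). */
static double percent_change(long before, long after)
{
    assert(before != 0);
    return 100.0 * (double)(after - before) / (double)before;
}

/* Print one line in the same shape as the report above. */
static void print_change(const char *label, long before, long after)
{
    printf("%s: %ld -> %ld (%.2f%%)\n", label, before, after,
           percent_change(before, after));
}
```

For example, print_change("instructions in affected programs", 3532, 3395)
computes -137 / 3532, which rounds to the -3.88% reported above.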
Reviewed-by: Emma Anholt <emma@anholt.net>
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Partial fix for: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8102
Partial fix for: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7222
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21038>
The graphics pipeline library implementation in RADV has been
improved considerably lately.
There is still a bit of work for caching individual libraries
and optimized (LTO) pipelines but I think overall it seems good
enough to stop reporting it as experimental and suboptimal.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21213>
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21210>
CP_REG_TEST (or any command that reads registers) is slow on a618
(gen1). Since SQE can early return, we don't necessarily need
emit_conditional_ib in fd6_emit_tile.
We still CP_REG_TEST twice for load and store when there is no clear.
Not sure if we can simply drop emit_conditional_ib instead?
glmark2 score goes from 943 to 1067.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21208>
This invalid memory access is a consequence of wrong assumptions,
for instance:
"prog->sh.data is NULL if it's ARB_fragment_program"
This issue is triggered with piglit/fp-formats -auto -fbo:
==9747==ERROR: AddressSanitizer: heap-use-after-free on address 0x007f7c812d90 at pc 0x007f833c09f8 bp 0x007fd7eca750 sp 0x007fd7eca768
READ of size 4 at 0x007f7c812d90 thread T0
#0 0x7f833c09f4 in st_get_sampler_views ../src/mesa/state_tracker/st_atom_texture.c:109
#1 0x7f833c0b48 in update_textures ../src/mesa/state_tracker/st_atom_texture.c:266
#2 0x7f82b2d120 in st_validate_state ../src/mesa/state_tracker/st_util.h:128
#3 0x7f82b2d120 in prepare_draw ../src/mesa/state_tracker/st_draw.c:88
#4 0x7f82b2de64 in st_draw_gallium ../src/mesa/state_tracker/st_draw.c:141
#5 0x7f83105940 in _mesa_draw_arrays ../src/mesa/main/draw.c:1202
#6 0x7f8d5fa5cc in piglit_draw_rect_from_arrays piglit/tests/util/piglit-util-gl.c:711
#7 0x7f8d5fac34 in piglit_draw_rect_custom piglit/tests/util/piglit-util-gl.c:833
#8 0x4019e0 in piglit_display piglit/tests/shaders/fp-formats.c:67
#9 0x7f8d643fc4 in run_test piglit/tests/util/piglit-framework-gl/piglit_fbo_framework.c:52
#10 0x401624 in main piglit/tests/shaders/fp-formats.c:39
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21175>
On gen3+, there are 32 predicate bits instead of 1.
I set out to see why CP_REG_TEST (and other commands that read
registers) is slower on gen1 but could not find anything. Since the
blob seems to use multiple predicate bits, let's keep them documented.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21206>
some games/apps (e.g., DOOM2016) compile+link shaders in one context
and then use them in another, expecting that the compiled shaders
will be reused. vulkan has pipeline (library) objects, which are not
specific to shaders but in theory represent the shaders being used.
thus, pipeline (library) objects need to be reusable in any case where
a shader can be reused
to handle this:
* extract pipeline library cache to a refcounted object
* store these objects on the screen
* make them owned by shaders
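
the ownership model in the list above can be sketched roughly as follows.
this is an illustration only: the struct and function names are hypothetical
and do not match the driver's actual code, and the plain int counter would
need atomic operations in a real multi-context driver.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a pipeline library cache. */
struct lib_cache {
    int refcount;      /* number of owners; NOT atomic in this sketch */
    unsigned num_libs; /* placeholder for the cached pipeline libraries */
};

/* Hypothetical stand-in for a shader that owns a cache reference. */
struct shader {
    struct lib_cache *libs;
};

/* Create a cache holding one reference (e.g. the screen's table entry). */
static struct lib_cache *lib_cache_create(void)
{
    struct lib_cache *c = calloc(1, sizeof(*c));
    if (c)
        c->refcount = 1;
    return c;
}

static void lib_cache_ref(struct lib_cache *c)
{
    assert(c->refcount > 0);
    c->refcount++;
}

/* Drop one reference. Returns 1 if this was the last one and the cache
 * was freed, 0 otherwise. */
static int lib_cache_unref(struct lib_cache *c)
{
    assert(c->refcount > 0);
    if (--c->refcount == 0) {
        free(c);
        return 1;
    }
    return 0;
}

/* A shader takes its own reference, so the cache outlives any one context. */
static void shader_bind_cache(struct shader *s, struct lib_cache *c)
{
    lib_cache_ref(c);
    s->libs = c;
}

/* Destroying a shader releases its reference; returns 1 if the cache died. */
static int shader_destroy(struct shader *s)
{
    int freed = lib_cache_unref(s->libs);
    s->libs = NULL;
    return freed;
}
```

with this shape, a cache is freed only when the last shader referencing it
is destroyed, regardless of which context created or used it.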
separable programs are slightly different since they'll use their own
fastpath, so their library caches are owned by the programs to avoid
polluting the optimized caches
fixes #8264
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21223>
From the 96c19d23c9 commit message:
Ever since 4246c2869c and 7d85dc4f35 loop unrolling can no
longer depend on inot being eliminated from the loop
terminator condition so we need to be able to handle it.
Support these conditions here too.
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21179>
While making the function public, rename it to
nir_collect_src_uniforms. The old name makes it sound like it's just a
query that doesn't have side effects. That is, however, not the case.
This is step 4 in an attempt to unify a bunch of nir_inline_uniforms.c
and lvp_inline_uniforms.c code.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21179>
The only caller in this file still passes 1.
This is step 2 in an attempt to unify a bunch of nir_inline_uniforms.c
and lvp_inline_uniforms.c code.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21179>
max_num_bo is currently limited to 1. That will change in the next
commit.
This is step 1 in an attempt to unify a bunch of nir_inline_uniforms.c
and lvp_inline_uniforms.c code.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21179>
We lost power in a storm, and these ones didn't come back afterwards. I
suspect I need a new PSU. And maybe some surge protection for the future.
:(
I've left the CI code in place for some day when I hopefully swap out the
power supplies.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21205>
By enabling this path, we get a 56% decrease in upload time on a texture
upload microbenchmark. This was measured on an Ice Lake with an iris
driver that tries to use the compressed format fallback path.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19827>
Returns the level of the gl_texture_image with respect to the resource
it's allocated within. For example, it returns 0 for a non-finalized
texture.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19827>
Add a function to create and cache the compute programs that will be
used to transcode ASTC to DXT5.
Note that the error paths in st_create_context_priv may actually lead to
segfaults if hit. I've been able to work around them by 1) moving them
further down and 2) returning early from st_glFlush if st->pipe is NULL.
I don't know if that's the right solution however.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19827>
1. Drop the commented out includes. Shader caching is disabled if those
are found.
2. Replace the active includes with "%s". Later on, we'll construct the
final strings with vasprintf. One downside to doing this is that the
glsl file extensions are no longer true. These files are now
templates.
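
As a rough illustration of step 2, the sketch below expands a "%s" template
into a final shader string. It is an assumption-laden example: the template
text and the helper names are hypothetical, and it uses two vsnprintf passes
as a portable stand-in for vasprintf rather than the code in this series.
Note that any other '%' in a template would have to be written as "%%".

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format into a freshly malloc'ed string; the caller frees it.
 * Returns NULL on formatting or allocation failure. */
static char *format_alloc(const char *fmt, ...)
{
    va_list ap, ap2;
    char *out = NULL;

    va_start(ap, fmt);
    va_copy(ap2, ap);
    int len = vsnprintf(NULL, 0, fmt, ap); /* first pass: measure only */
    va_end(ap);

    if (len >= 0) {
        out = malloc((size_t)len + 1);
        if (out)
            vsnprintf(out, (size_t)len + 1, fmt, ap2); /* second pass: write */
    }
    va_end(ap2);
    return out;
}

/* Count "%s" slots so a caller can check how many strings to pass. */
static int count_slots(const char *tmpl)
{
    int n = 0;
    for (const char *p = strstr(tmpl, "%s"); p; p = strstr(p + 2, "%s"))
        n++;
    return n;
}

/* Hypothetical template: "%s" marks where an #include used to be. */
static const char example_template[] =
    "#version 310 es\n"
    "%s\n"
    "void main() {}\n";
```

Calling format_alloc(example_template, header_text) then yields the complete
source with header_text spliced in where the include used to be.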
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19827>
These compute shaders are from the MIT-licensed GPU compressor, Betsy.
I have included copyright headers, inlined the __sharedOnlyBarrier macro
definition from the "UavCrossPlatform_piece_all.glsl" header when
applicable, and made the following changes to support GLES:
* Conditionally disable the const keyword in the BC3 shaders
* Make the params uniform in the BC4 shader uint2
* Avoid implicit data type conversions in the BC3 shaders
* Use constructors for array initialization in the BC1 shader
* Add precision qualifiers to the BC3 shaders
* Output to an rgba16ui image for the BC1 and BC4 shaders
* Set the version of the BC3 shaders to 310 es
Ref: https://github.com/darksylinc/betsy/tree/cc723dcae9
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19827>
We're going to use resource_copy_region to copy from a resource that has
been written to with imageStore. Make it clear that this is safe.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19827>
The old code would disallow linear targets as well, which would confuse
things when reimporting dma-bufs.
Fixes: 32728dc66e ("crocus: introduce main resource configuration helper.")
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21209>
The size in src[2] is in bytes and needs to cover any possible data
accessed in src[0] by the indirection. That way the register
allocation is aware of what cannot be spilled for the instruction to
execute on valid data.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 70ace2bbcd ("intel/compiler: Implement Task Output and Mesh Input")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21188>
apps/games using separate shader objects end up passing the separable
shaders to the link_shader hook individually, which is still not ideal for
zink's usage since the more optimal path is to have all the shaders and create
a RAST+FS GPL stage that can run all the inter-stage io handlers
it IS technically possible to handle this for simple VS+FS pipelines using
GPL, but it's kinda gross. such shaders now use descriptor buffer
to create their own pipelines/layouts/descriptors async, and then a "separable"
variant of the gfx program can be created by fast-linking these together
the "separable" gfx program can't handle shader variants, but it can do basic
pipeline caching for PSO state changes, which makes it flexible enough to sorta
kinda maybe handle the most basic cases of separate shader objects
descriptor buffer is used because having to create and manage a separate architecture
for sets/pools/templates is too nightmarish even for me
this is, at best, a partial solution, but it's the best the vulkan api can
currently do
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21197>