It is much easier to just add a simple late algebraic pass than actually
trying to teach the backend to recognize all the different patterns.
RV530 shader-db:
total instructions in shared programs: 129643 -> 129468 (-0.13%)
instructions in affected programs: 17665 -> 17490 (-0.99%)
helped: 176
HURT: 39
total presub in shared programs: 4912 -> 5411 (10.16%)
presub in affected programs: 1651 -> 2150 (30.22%)
helped: 0
HURT: 287
total temps in shared programs: 16904 -> 16918 (0.08%)
temps in affected programs: 812 -> 826 (1.72%)
helped: 25
HURT: 37
total cycles in shared programs: 194771 -> 194675 (-0.05%)
cycles in affected programs: 28096 -> 28000 (-0.34%)
helped: 146
HURT: 41
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9364
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24830>
Huge thanks to Tapani Pälli for debugging this issue, figuring out
what was going wrong, proposing fixes, and walking me through where
things were going off the rails.
BLORP always disables tessellation and geometry shaders. Our handling
tried to look at ice->shaders.uncompiled[] to determine whether the next
draw needed those shaders. If not, we can leave BLORP's residual state
that disabled those stages in place, and skip looking at it.
Unfortunately, predicting the future is a bit fraught, in part due to
the uncompiled[] and prog[] arrays being slightly out of sync at times.
Consider the following case:
1. Draw with tessellation shaders in place
=> uncompiled[TES] and prog[TES] will both point at valid shaders.
2. Gallium calls pipe->bind_tes_state(NULL).
=> This makes uncompiled[TES] point at NULL, and flags
IRIS_STAGE_DIRTY_UNCOMPILED_TES.
Because iris_update_compiled_shaders() hasn't happened yet,
uncompiled[TES] is NULL but prog[TES] has the stale TES from
the previous draw still.
3. BLORP operations happen
=> BLORP sees uncompiled[TES] == NULL and decides that tessellation
is off for the upcoming draws. So it skips flagging tess state.
4. Gallium calls pipe->bind_tes_state(shader from step #1).
=> uncompiled[TES] points at the original shader.
IRIS_STAGE_DIRTY_UNCOMPILED_TES gets flagged again.
5. Draw again
=> This calls iris_update_compiled_shaders(), which sees that
a TES is bound, and calls iris_update_compiled_tes(). But
because the same shader was bound as before, the program it
comes up with is identical to the one already bound at
ice->shaders.prog[TES]. So, it thinks it doesn't have to
flag any tessellation state dirty because it was already
set up for the last draw.
This random unbind and rebind between draws leads to a situation
where, at step #3, BLORP thinks it can skip flagging tessellation
state (nothing is bound), and at step #5, normal state handling
thinks it can skip flagging tessellation state (nothing changed
since last time). So nobody does, and things break.
This unbind appears to be happening when st_release_variants()
decides it wants to free some shaders. Then a rebind happens to
put back the actual shader for the draw. So, it's not theoretical.
To fix this, we change BLORP to look at ice->shaders.prog[] rather
than uncompiled[]. This is equivalent to thinking about the previous
draw, rather than the next. If the last draw had tessellation off,
then BLORP's disabling was a no-op, and the GPU is still in the same
state as the previous draw. This is more reliable than predicting
the future.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8308
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9678
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24880>
../../src/gallium/drivers/r600/sfn/sfn_nir.cpp:458:59: error: ‘ATOMIC_COUNTER_SIZE’ was not declared in this scope
../../src/gallium/drivers/r600/sfn/sfn_shader.cpp:609:53: error: ‘ATOMIC_COUNTER_SIZE’ was not declared in this scope
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24824>
The stride for each vertex buffer should come from the corresponding
vertex element structure.
Fixes piglit/glretrace regressions running on svga vgpu9 device.
Fixes: 7672545223 ("gallium: move vertex stride to CSO")
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24841>
For merging shader states, we'll need to lower sysvals separately for each
shader but assign uniforms together for the final merged shader. The easiest way
to do that is to decouple the lowering of sysvals to driver uniform reads, from
the assignment of driver uniform reads to actual uniform registers.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24847>
By uploading textures ahead-of-time, we can upload uniforms ahead-of-time too.
This will also allow some overhead shaving optimizations, I guess.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24847>
In Vulkan, UBOs are lowered by nir_lower_explicit_io, and the ubo_base_agx
sysval is unused (since it doesn't handle descriptor sets). That makes the UBO
lowering GL-only and hence belongs with the GL driver rather than the compiler.
This lets us delete the ubo_base_agx sysval.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24847>
Doing a descriptor crawl with binding tables requires a real binding table in
the shader, which won't work for VK or merged shader stages in GL. Instead,
let's lower anything that needs a crawl to bindless in the driver, so the
compiler code doesn't need to know anything about descriptor binding models.
That gets rid of the texture_base_agx sysval, which is problematic when there
are multiple descriptor sets worth of textures.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24847>
This is the counterpart of get_oq_index for non-occlusion hardware queries.
These are not tracked with occlusion queries, since occlusion query allocations
are limited, and they are not based on indexing but rather general
batch-allocated space.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24847>
We do all the math in pixels and only multiply by the sample count at the end,
meaning the layer stride needs to be in terms of pixels (not samples) for
correct addressing of multisample array images in our texture lowering. This is
particularly used for lowering the multisample array stores we get from eMRT
with multisampled layered framebuffers.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24847>
shader keys represent pipeline states which trigger variants, but not
all shaders are affected by certain states
this adds some sanitizing for the optimal path to ignore shader variants
which won't have any effect for the currently bound shaders, thus reducing
the number of pipelines compiled (both unoptimized and optimized)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24831>