Use gpr_map to determine how many cycles each dependency of the
s_delay_alu needs. This information helps the pass avoid further
s_delay_alu instructions.
fossil-db (gfx1100):
Totals from 13097 (9.73% of 134574) affected shaders:
Instrs: 30711894 -> 30702692 (-0.03%)
CodeSize: 153462500 -> 153425692 (-0.02%)
Latency: 372758612 -> 372741922 (-0.00%)
InvThroughput: 50164111 -> 50160717 (-0.01%); split: -0.01%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20512>
The LLVM backend does this when lowering ordered_xfb_counter_add_amd. I
guess there is some missing dependency checking or something.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19345>
The first operand of this new pseudo-instruction is a 64-bit SGPR for
the continue PC, followed by a variable list of fixed VGPRS for the
color exports which are the PS epilog inputs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17485>
According to LLVM, image_bvh64_intersect_ray does not write results in
order with other VMEM instructions.
fossil-db (navi21):
Totals from 7 (0.00% of 162293) affected shaders:
Instrs: 39978 -> 39985 (+0.02%)
CodeSize: 219356 -> 219384 (+0.01%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
If we don't reset ctx.vm_cnt/gpr_map/etc, this will spam a lot of
s_waitcnt instructions.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17207>
By default, std::stack uses std::deque to allocate its elements, which has
poor cache efficiency. std::vector makes appending elements more expensive
(due to potential reallocations), but in the changed contexts the element
count should always be low anyway.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11925>
clang generates a warning if there's no explicit break or fall-through
annotation. The latter would be kind of silly in this case, and not
robust against any future changes turning the fall-through invalid.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>
Instead of assuming WGP mode on GFX10+ in different places, add a member
to Program that can be used instead.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761>
When there are no param exports in an NGG (or legacy VS) shader,
the NO_PC_EXPORT=1 is set, which means PS waves can launch before
the current stage finishes.
If the current stage has any stores, we need to make sure to wait for
those before we allow PS waves to start, so that PS can read what
these instructions stored.
Fossil DB results on Navi 10:
Totals from 45 (0.03% of 136420) affected shaders:
CodeSize: 87224 -> 87404 (+0.21%)
Instrs: 16750 -> 16795 (+0.27%)
Cycles: 69580 -> 69760 (+0.26%)
VMEM: 8022 -> 8167 (+1.81%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7868>
Use auto to avoid mistyping the constness of the pair key, which
triggers implicit conversions rather than compilation errors.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7346>
Iterate over maps by reference to avoid copies.
Replace find/insert with insert to avoid double search.
Use range-based for loop, avoiding copies by reference. Delete comment.
Erase by iterator instead of key to avoid repeat search.
Iterators unneeded to modify unwaited_instrs. Use range-based for loop.
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7285>
Sounds useful for debugging missing wait-states and for improving
detection of the faulty instruction in case of memory violations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6386>
If one VMEM instruction uses a sampler and the other doesn't, we can't do
this optimization.
Totals from 47 (0.04% of 127638) affected shaders:
CodeSize: 271744 -> 271656 (-0.03%); split: -0.04%, +0.01%
Instrs: 52783 -> 52761 (-0.04%); split: -0.05%, +0.01%
Cycles: 5547040 -> 5546952 (-0.00%); split: -0.00%, +0.00%
VMEM: 10022 -> 9887 (-1.35%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4949>
This isn't perfect (for example, changes might not be too meaningful when
comparing shaders with different control flow) but it should be useful for
evaluating scheduler changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2965>
Clear the workgroup size for all supported shader stages.
Also, unify the workgroup size calculation accross various places.
As a result, insert_waitcnt can use the proper workgroup size
which means that some waits can be dropped from tessellation
shaders. Also, in cases where the previous calculation was wrong,
we now insert s_barrier instructions.
Totals from affected shaders (GFX10):
Code Size: 340116 -> 338484 (-0.48 %) bytes
Fixes: a8d15ab6da
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>