When an exec write isn't used but writes other registers
besides exec, and also reads exec (such as s_and_saveexec),
we would mistakenly delete the previous instruction that
writes the exec value that this instruction uses.
No Fossil DB changes on Rembrandt (GFX10.3).
Fixes: 0211e66f65
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9036
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23576>
(cherry picked from commit 67a0f2532f)
Previously, we could safely read/write unused lanes of VMEM/DS
destination VGPRs without waiting for the load to finish. That doesn't
seem to be the case on GFX11.
fossil-db (gfx1100):
Totals from 6698 (4.94% of 135636) affected shaders:
Instrs: 11184274 -> 11199420 (+0.14%); split: -0.00%, +0.14%
CodeSize: 57578344 -> 57638928 (+0.11%); split: -0.00%, +0.11%
Latency: 198348808 -> 198382472 (+0.02%); split: -0.00%, +0.02%
InvThroughput: 24376324 -> 24378439 (+0.01%); split: -0.00%, +0.01%
VClause: 192420 -> 192559 (+0.07%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8722
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8239
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22965>
(cherry picked from commit 88f6d7f4bd)
Not sure if this is possible, but we should avoid it anyway.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22714>
(cherry picked from commit d0caa50dcd)
mantissa needs to be at the lower part for shift left.
This fixes large integer value conversion.
Cc: mesa-stable
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22570>
(cherry picked from commit 6a39d35df0)
To simplify both llvm and aco backend and remove unnecessary
workaround code where prim count is known to be not zero.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22381>
aco_compiler_options::key is a leftover from when aco used
the radv_pipeline_key struct, but aco_compiler_options::key was
never actually used as a cache key.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21935>
This information was wrong in some places, let's fix it now.
GFX6:
The GPU has 64KB LDS, but only 32KB is usable by a workgroup.
NGG:
There was some misinformation about NGG only being able to
address 32 KB LDS, it turns out this is actually not true
and it can address the full 64K.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21935>
Don't recompile entire ACO when something changes in NIR.
Instead, only use some headers which are actually needed,
include these in ACO files instead of relying on nir.h to
include them.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22241>
We don't want to rely on any NIR structures in ACO, because
we would like to avoid the need to include nir.h in aco_ir.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22241>
When using multiple binaries, we don't know the required number of VGPRs beforehand,
which means we either have to over-allocate VGPRs or avoid shared VGPRs.
As bpermute is the only instructions needing shared VGPRs, we decide for the latter.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22267>
This just makes it possible to use the dominator
tree information during phi lowering.
No Fossil DB changes on GFX11.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21493>
The branch instruction is no longer conditional when the targets are the
same, so the operand is not necessary and can be removed.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21493>
Verifying that the branch instruction reads exec is not actually
necessary because the pattern that we look for already implies that.
This prepares for the next commit which will remove the exec operand
from branches that have the same target. These branches will no
longer read exec, but they should still get the same optimization.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21493>
Don't eliminate an instruction that writes registers other than exec and scc.
It is possible that this is eg. an s_and_saveexec and the saved value is
used by a later branch.
Fixes: bc13049747
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21493>
16bit int mad/fma/minmax combining can work with opsel set.
All other optimizations should already check if the instruction uses sdwa,
because we don't check this when applying the label initially.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069>