Warning:
aco_register_allocation.cpp(383): warning C4819: The file contains a character that cannot be represented in the current code page (0). Save the file in Unicode format to prevent data loss
This warning was treated as error with compiling with msvc
u8 is belongs to c11 standard so it's safe to use it
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18682>
We can't detect color attachment without exports when compiling a PS
epilog, so we can't compact MRTs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18514>
This was used to distinguish definitions fixed before and during RA, but
it seems it isn't used anymore.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18547>
fossils-db (NAVI21):
Totals from 158 (0.12% of 134913) affected shaders:
CodeSize: 569456 -> 568824 (-0.11%)
Only Control seems affected.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18615>
If there are holes between color outputs (e.g. a shader exports MRT1,
but not MRT0), we can remove the holes by moving higher MRTs lower. The
hardware will remap the MRTs to their correct locations if we remove
holes in SPI_SHADER_COL_FORMAT but not CB_SHADER_MASK. This is good for
performance because the hardware will allocate less space for color
MRTs.
This also allows to remove even more unused color exports because we no
longer need to force previous targets to be non-zero. Only SotTR seems
affected from our fossils db.
fossils-db (NAVI21):
Totals from 859 (0.64% of 134913) affected shaders:
VGPRs: 24328 -> 24216 (-0.46%)
CodeSize: 1433276 -> 1422576 (-0.75%)
Instrs: 255275 -> 253728 (-0.61%)
Latency: 1666836 -> 1661544 (-0.32%)
InvThroughput: 346038 -> 343406 (-0.76%)
Copies: 16520 -> 16506 (-0.08%)
PreSGPRs: 25934 -> 25920 (-0.05%)
PreVGPRs: 19903 -> 19662 (-1.21%)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5786>
This moves all fixed operands at once, so they don't interfere with one
another.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17493>
I don't think it makes sense for this to be anything but get_reg_bounds(),
and this change makes this function usuable with a mix of SGPRs and VGPRs.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17493>
If def_reg is empty, then def_reg.lo() may be lower than bounds.lo() if
we're moving VGPRs and info.bounds will be invalid.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17493>
If there's a lod parameter it matter if the image is 3d or 2d because
the hw reads either the fourth or third component as lod. So detect
3d images and place the lod at the third component otherwise.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18114>
Note that, from 22.4.1. Vertex Input Extraction of Vulkan spec:
The input variable in the shader must be declared as a 64-bit data type if
and only if format is a 64-bit data type.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17894>
Size is in bytes, not bits.
Fixes plenty of crashes in CI, like
dEQP-VK.synchronization.op.single_queue.event.write_image_fragment_read_image_tess_eval.image_128_r32_uint.
Fixes: 46f6e2ddbb ("aco: Implement storage image A16.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18266>
It isn't safe to modify the exec mask before the discard block, and the
definition interferes with GFX11 NOP insertion.
Just use s[0:1] instead, since we won't be using it.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18125>
When SCC is clobbered between s_cmp and its operand's writer,
the current optimization that eliminates s_cmp won't kick in.
However, when s_cmp is the only user of its operand temporary,
it is possible to "pull down" the instruction that wrote the operand.
Fossil DB stats on Navi 21:
Totals from 63302 (46.92% of 134906) affected shaders:
CodeSize: 176689272 -> 176418332 (-0.15%)
Instrs: 33552237 -> 33484502 (-0.20%)
Latency: 205847485 -> 205816205 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 34321285 -> 34319908 (-0.00%); split: -0.00%, +0.00%
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16266>
This was simply left out by accident when I wrote this.
Fossil DB stats on Navi 21:
Totals from 70165 (52.01% of 134906) affected shaders:
CodeSize: 246375656 -> 245814396 (-0.23%)
Instrs: 46519773 -> 46379458 (-0.30%)
Latency: 385159303 -> 385089261 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 66490172 -> 66487867 (-0.00%); split: -0.00%, +0.00%
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16266>
Also delete them when they are already dead in process_instruction().
No Fossil DB changes.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16266>
get_ssa_temp's and NIR's bit size can differ for scalar sources.
This causes broken packing of the MIMG operands with A16/G16.
Fixes: f5f73db846 ("aco: Support 16bit sources for texture ops.")
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18008>
A situation where it doesn't match is probably not possible, so this
probably doesn't fix anything.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
If this is true, then the only instruction the loops visit is
p_logical_end and the loops are no-ops.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
We shouldn't remove a p_cbranch_nz branch in this situation.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: b731be2e96 ("aco: Remove branch instruction when exec is constant non-zero.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
The isConstant() check isn't useful. If it's a constant, then the
physReg() check will fail.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: baab6f18c9 ("aco: Optimize branching sequence during SSA elimination.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
We would assemble an instruction writing vcc instead.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 5ffc73896f ("aco/assembler: Fix v_cmpx with SDWA.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
The old name is no longer accurate.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>