Commit graph

2098 commits

Author SHA1 Message Date
Daniel Schürmann
98e3c446d8 aco/optimizer: disallow can_eliminate_and_exec() with s_not
Totals from 295 (0.22% of 134913) affected shaders: (GFX10.3)
CodeSize: 1016564 -> 1016896 (+0.03%); split: -0.05%, +0.09%
Instrs: 187659 -> 187724 (+0.03%); split: -0.08%, +0.11%
Latency: 2839516 -> 2839541 (+0.00%); split: -0.01%, +0.01%
Copies: 12301 -> 12305 (+0.03%); split: -0.01%, +0.04%
PreSGPRs: 10266 -> 10268 (+0.02%)

Closes: #7024
Cc: mesa-stable
Tested-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18722>
2022-09-21 15:33:43 +00:00
Georg Lehmann
94b5225c9b aco: Use v_fmaak/v_fmamk if two operands are the same literal.
Foz-DB Navi21:
Totals from 5744 (4.26% of 134913) affected shaders:
VGPRs: 237128 -> 237056 (-0.03%); split: -0.04%, +0.01%
CodeSize: 16654484 -> 16620668 (-0.20%); split: -0.23%, +0.03%
MaxWaves: 152838 -> 152846 (+0.01%)
Instrs: 3063214 -> 3058572 (-0.15%); split: -0.17%, +0.02%
Latency: 23935195 -> 23934827 (-0.00%); split: -0.03%, +0.03%
InvThroughput: 5478562 -> 5478160 (-0.01%); split: -0.01%, +0.01%
VClause: 60432 -> 60435 (+0.00%); split: -0.02%, +0.03%
SClause: 121032 -> 120896 (-0.11%); split: -0.20%, +0.09%
Copies: 147865 -> 143144 (-3.19%); split: -3.59%, +0.40%
PreSGPRs: 195722 -> 195661 (-0.03%); split: -0.06%, +0.03%
PreVGPRs: 182849 -> 182787 (-0.03%)

Foz-DB Vega10:
Totals from 5290 (3.92% of 135041) affected shaders:
SGPRs: 357952 -> 359616 (+0.46%); split: -0.11%, +0.57%
VGPRs: 204048 -> 203928 (-0.06%); split: -0.08%, +0.02%
CodeSize: 14043176 -> 14003100 (-0.29%); split: -0.29%, +0.00%
MaxWaves: 39401 -> 39398 (-0.01%); split: +0.01%, -0.02%
Instrs: 2636739 -> 2631246 (-0.21%); split: -0.21%, +0.00%
Latency: 25264088 -> 25256482 (-0.03%); split: -0.05%, +0.02%
InvThroughput: 12039643 -> 12039346 (-0.00%); split: -0.00%, +0.00%
VClause: 55603 -> 55584 (-0.03%); split: -0.04%, +0.00%
SClause: 101577 -> 101342 (-0.23%); split: -0.30%, +0.07%
Copies: 213344 -> 207929 (-2.54%); split: -2.58%, +0.05%
Branches: 34053 -> 34054 (+0.00%)
PreSGPRs: 172405 -> 172260 (-0.08%)

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18645>
2022-09-21 12:16:47 +00:00
Yonggang Luo
209a89e51d aco: Convert to use u8 literal for Unicode character to fixes msvc warning
Warning:
aco_register_allocation.cpp(383): warning C4819: The file contains a character that cannot be represented in the current code page (0). Save the file in Unicode format to prevent data loss

This warning was treated as error with compiling with msvc
u8 is belongs to c11 standard so it's safe to use it

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18682>
2022-09-20 18:40:50 +00:00
Samuel Pitoiset
19eec024d2 radv,aco: do not compact MRTs if the pipeline uses a PS epilog
We can't detect color attachment without exports when compiling a PS
epilog, so we can't compact MRTs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18514>
2022-09-20 13:12:49 +00:00
Rhys Perry
6df5ff7f19 aco: DCE ra_ctx::defs_done
This was used to distinguish definitions fixed before and during RA, but
it seems it isn't used anymore.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18547>
2022-09-20 12:24:03 +00:00
Samuel Pitoiset
704ef1fd3b radv,aco: lower barycentric_at_sample in NIR
fossils-db (NAVI21):
Totals from 158 (0.12% of 134913) affected shaders:
CodeSize: 569456 -> 568824 (-0.11%)

Only Control seems affected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18615>
2022-09-20 09:52:37 +00:00
Rhys Perry
7d26fafacf radv: fix dynamic RT stack size with VGPR spilling
VGPR spilling might cause VGPRs to be spilled at scratch offset 0, so we
can't use that.

fossil-db (Sienna Cichlid, Q2RTX and Control):
Totals from 4 (0.26% of 1524) affected shaders:
Instrs: 8734 -> 8737 (+0.03%)
CodeSize: 48492 -> 48504 (+0.02%)
Latency: 384375 -> 384369 (-0.00%)
InvThroughput: 256250 -> 256246 (-0.00%)
Copies: 1312 -> 1313 (+0.08%)
Branches: 256 -> 258 (+0.78%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18541>
2022-09-20 01:39:20 +00:00
Samuel Pitoiset
9a6aa3e23a aco: prevent a division by zero when patch control points is dynamic
tess_input_vertices is zero if the state is dynamic.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18344>
2022-09-13 08:24:14 +00:00
Georg Lehmann
28a69b72d8 aco: Use plain VOPC for vcmpx when possible.
Foz-DB Navi21:
Totals from 66947 (49.62% of 134913) affected shaders:
CodeSize: 210383024 -> 210033376 (-0.17%)

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18417>
2022-09-07 16:08:36 +00:00
Samuel Pitoiset
8fcb4aa0eb radv: compact MRTs to save PS export memory space
If there are holes between color outputs (e.g. a shader exports MRT1,
but not MRT0), we can remove the holes by moving higher MRTs lower. The
hardware will remap the MRTs to their correct locations if we remove
holes in SPI_SHADER_COL_FORMAT but not CB_SHADER_MASK. This is good for
performance because the hardware will allocate less space for color
MRTs.

This also allows to remove even more unused color exports because we no
longer need to force previous targets to be non-zero. Only SotTR seems
affected from our fossils db.

fossils-db (NAVI21):
Totals from 859 (0.64% of 134913) affected shaders:
VGPRs: 24328 -> 24216 (-0.46%)
CodeSize: 1433276 -> 1422576 (-0.75%)
Instrs: 255275 -> 253728 (-0.61%)
Latency: 1666836 -> 1661544 (-0.32%)
InvThroughput: 346038 -> 343406 (-0.76%)
Copies: 16520 -> 16506 (-0.08%)
PreSGPRs: 25934 -> 25920 (-0.05%)
PreVGPRs: 19903 -> 19662 (-1.21%)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5786>
2022-09-07 08:17:20 +00:00
Samuel Pitoiset
df997cf47d aco: remove unused isel_context::tcs_num_patches
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18278>
2022-09-01 17:02:16 +00:00
Rhys Perry
061b8bfd29 aco/ra: rework fixed operands
This moves all fixed operands at once, so they don't interfere with one
another.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17493>
2022-09-01 11:22:46 +00:00
Rhys Perry
ec867ef0e7 aco/ra: remove bounds parameter from get_regs_for_copies()
I don't think it makes sense for this to be anything but get_reg_bounds(),
and this change makes this function usuable with a mix of SGPRs and VGPRs.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17493>
2022-09-01 11:22:46 +00:00
Rhys Perry
efcbccaf0e aco/ra: handle empty def_reg interval in get_regs_for_copies
If def_reg is empty, then def_reg.lo() may be lower than bounds.lo() if
we're moving VGPRs and info.bounds will be invalid.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17493>
2022-09-01 11:22:46 +00:00
Timur Kristóf
16c14663e5 aco: Fix p_init_scratch for task shaders.
Fixes: d2d94b62f2
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18339>
2022-09-01 09:10:47 +00:00
Georg Lehmann
a82b9d6001 aco: Fix image instructions with lod when 2d_view_of_3d is enabled on GFX9.
If there's a lod parameter it matter if the image is 3d or 2d because
the hw reads either the fourth or third component as lod. So detect
3d images and place the lod at the third component otherwise.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18114>
2022-08-31 17:44:38 +00:00
Rhys Perry
96df4499ac radv,aco: implement 64-bit vertex inputs
Note that, from 22.4.1. Vertex Input Extraction of Vulkan spec:
The input variable in the shader must be declared as a 64-bit data type if
and only if format is a 64-bit data type.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17894>
2022-08-30 19:02:11 +00:00
Rhys Perry
831257bdce radv,aco: use pipe_format for dynamic vertex input state
Also prepare for 64-bit and R8G8B8/R16G16B16 with the addition of
radv_vs_input_state::nontrivial_formats.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5021
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17894>
2022-08-30 19:02:11 +00:00
Rhys Perry
c06a5a5ebd radv,aco: use pipe_format for static vertex input state
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17894>
2022-08-30 19:02:11 +00:00
Rhys Perry
6a2ada93b4 ac: add ac_vtx_format_info
This will be used by RADV and ACO.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17894>
2022-08-30 19:02:11 +00:00
Daniel Schürmann
f676326a1a aco/live_var_analysis: implement faster merging of live_out sets for some cases
If we know that logical and linear predecessors are the same,
we don't need to check for the register type of each variable.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18105>
2022-08-30 16:03:26 +00:00
Daniel Schürmann
3d6ea4f666 aco: use std::vector::reserve() more often
This removes the majority of vector re-allocations.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18105>
2022-08-30 16:03:26 +00:00
Rhys Perry
aa4db00c57 aco: remove dead code for querying image size/samples/levels
ac_nir_lower_resinfo() now lowers these.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17991>
2022-08-30 07:37:08 +00:00
Rhys Perry
290df95870 aco: add SCC clobber in build_cube_select
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17991>
2022-08-30 07:37:08 +00:00
Rhys Perry
636450c274 aco: allow direct_fetch=true for vec4 VS input loads
This seems to be a (mostly harmless) mistake from 369b8cffea.

fossil-db (navi21):
Totals from 15 (0.01% of 135636) affected shaders:
Instrs: 1992 -> 1999 (+0.35%)
Latency: 13557 -> 13567 (+0.07%); split: -0.24%, +0.31%
InvThroughput: 4059 -> 4065 (+0.15%); split: -0.20%, +0.34%
Copies: 186 -> 193 (+3.76%)

fossil-db (polaris10):
Totals from 5 (0.00% of 135610) affected shaders:

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18225>
2022-08-26 15:28:55 +00:00
Rhys Perry
030d6f873e aco: don't expand vec3 VS input load to vec4 on GFX6
Removes the (small) possibility of invalid memory access.

fossil-db (pitcairn):
Totals from 35456 (26.15% of 135610) affected shaders:
MaxWaves: 259508 -> 260642 (+0.44%); split: +0.44%, -0.01%
Instrs: 7915383 -> 7965774 (+0.64%); split: -0.09%, +0.72%
CodeSize: 37163748 -> 37524804 (+0.97%); split: -0.04%, +1.01%
SGPRs: 1515128 -> 1513576 (-0.10%); split: -0.27%, +0.17%
VGPRs: 1218376 -> 1211160 (-0.59%); split: -0.71%, +0.12%
SpillSGPRs: 1152 -> 1144 (-0.69%)
Latency: 83777626 -> 83867137 (+0.11%); split: -0.61%, +0.72%
InvThroughput: 25722445 -> 25727745 (+0.02%); split: -0.23%, +0.25%
VClause: 232058 -> 230464 (-0.69%); split: -2.53%, +1.84%
SClause: 322579 -> 322108 (-0.15%); split: -0.76%, +0.61%
Copies: 547032 -> 547954 (+0.17%); split: -1.83%, +2.00%
Branches: 72538 -> 72542 (+0.01%)
PreVGPRs: 898453 -> 897584 (-0.10%); split: -0.13%, +0.03%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18225>
2022-08-26 15:28:55 +00:00
Rhys Perry
3260844448 aco: fix 16-bit VS inputs
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 3fba5bb9cc ("aco: implement 16-bit vertex fetches with tbuffer_load_format_d16_*")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18225>
2022-08-26 15:28:55 +00:00
Samuel Pitoiset
ee5b9bcc57 radv: stop duplicating radv_vs_output_info
Only the last vertex stage needs to access this.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18210>
2022-08-26 14:07:09 +00:00
Samuel Pitoiset
8cd1683944 aco: fix wrong size for 1D images and A16 on GFX9
Size is in bytes, not bits.

Fixes plenty of crashes in CI, like
dEQP-VK.synchronization.op.single_queue.event.write_image_fragment_read_image_tess_eval.image_128_r32_uint.

Fixes: 46f6e2ddbb ("aco: Implement storage image A16.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18266>
2022-08-26 13:30:46 +00:00
Rhys Perry
fb13ed6ff0 aco: fix long-jump version of discard early exit
It isn't safe to modify the exec mask before the discard block, and the
definition interferes with GFX11 NOP insertion.

Just use s[0:1] instead, since we won't be using it.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18125>
2022-08-25 16:10:53 +00:00
Georg Lehmann
9151048957 aco: Combine 16bit undef and constants instead of using s_pack.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18106>
2022-08-24 17:04:03 +00:00
Georg Lehmann
46f6e2ddbb aco: Implement storage image A16.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18106>
2022-08-24 17:04:03 +00:00
Georg Lehmann
8eac45b274 nir: Add nir_ssa_scalar_is_undef.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18183>
2022-08-24 15:22:40 +00:00
Yonggang Luo
2af3b6756a amd/compiler: Fixes warning [-Wunused-variable] in test_optimizer_postRA.cpp
Warning message:
../src/amd/compiler/tests/test_optimizer_postRA.cpp:137:13: warning: unused variable 'reg_s1' [-Wunused-variable]

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18205>
2022-08-23 14:14:52 +00:00
Yonggang Luo
4a607c2df4 amd/compiler: Fixes warning [-Wunused-variable] in test_to_hw_instr.cpp
Warning message:
../src/amd/compiler/tests/test_to_hw_instr.cpp:793:12: warning: unused variable 'reg_s1' [-Wunused-variable]

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18205>
2022-08-23 14:14:52 +00:00
Eric Engestrom
013b022924 aco: drop unused variable
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18193>
2022-08-22 23:05:20 +00:00
Yonggang Luo
7f64137d93 aco: Use unreachable instead assert(false)
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18194>
2022-08-22 22:44:59 +00:00
Timur Kristóf
1762e6b540 aco: Improve SCC nocompare optimization when SCC is clobbered.
When SCC is clobbered between s_cmp and its operand's writer,
the current optimization that eliminates s_cmp won't kick in.

However, when s_cmp is the only user of its operand temporary,
it is possible to "pull down" the instruction that wrote the operand.

Fossil DB stats on Navi 21:

Totals from 63302 (46.92% of 134906) affected shaders:
CodeSize: 176689272 -> 176418332 (-0.15%)
Instrs: 33552237 -> 33484502 (-0.20%)
Latency: 205847485 -> 205816205 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 34321285 -> 34319908 (-0.00%); split: -0.00%, +0.00%

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16266>
2022-08-20 15:27:40 +00:00
Timur Kristóf
e69de0f81d aco: Support s_cselect_b64 in SCC no-compare optimization.
This was simply left out by accident when I wrote this.

Fossil DB stats on Navi 21:

Totals from 70165 (52.01% of 134906) affected shaders:
CodeSize: 246375656 -> 245814396 (-0.23%)
Instrs: 46519773 -> 46379458 (-0.30%)
Latency: 385159303 -> 385089261 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 66490172 -> 66487867 (-0.00%); split: -0.00%, +0.00%

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16266>
2022-08-20 15:27:40 +00:00
Timur Kristóf
b0ef7c7c82 aco/optimizer_postRA: Don't try to optimize dead instructions.
Also delete them when they are already dead in process_instruction().

No Fossil DB changes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16266>
2022-08-20 15:27:40 +00:00
Georg Lehmann
2dd641119f aco: Force tex operand to have the correct sub dword size before packing.
get_ssa_temp's and NIR's bit size can differ for scalar sources.
This causes broken packing of the MIMG operands with A16/G16.

Fixes: f5f73db846 ("aco: Support 16bit sources for texture ops.")
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18008>
2022-08-18 14:42:28 +02:00
Rhys Perry
f7d02a9b5e aco: test for one and_savexec opcode in try_optimize_branching_sequence
A situation where it doesn't match is probably not possible, so this
probably doesn't fix anything.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
2022-08-16 17:31:33 +00:00
Rhys Perry
2587e75ee1 aco: improve vcc check for instructions between exec_val and exec_copy
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
2022-08-16 17:31:33 +00:00
Rhys Perry
faf8038254 aco: remove val_and_copy_adjacent
If this is true, then the only instruction the loops visit is
p_logical_end and the loops are no-ops.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
2022-08-16 17:31:33 +00:00
Rhys Perry
cfb306b4e4 aco: test branch opcode if removing it in try_optimize_branching_sequence
We shouldn't remove a p_cbranch_nz branch in this situation.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: b731be2e96 ("aco: Remove branch instruction when exec is constant non-zero.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
2022-08-16 17:31:33 +00:00
Rhys Perry
21cb002b60 aco: fix re-write of uses of exec_val's lo/hi half
The isConstant() check isn't useful. If it's a constant, then the
physReg() check will fail.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: baab6f18c9 ("aco: Optimize branching sequence during SSA elimination.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
2022-08-16 17:31:33 +00:00
Rhys Perry
dd105f7c1e aco: fix assembly of vopc_sdwa writing exec
We would assemble an instruction writing vcc instead.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 5ffc73896f ("aco/assembler: Fix v_cmpx with SDWA.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
2022-08-16 17:31:33 +00:00
Rhys Perry
f60cb8d0af aco: rename is_cmp to is_fp_cmp
The old name is no longer accurate.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
2022-08-16 17:31:33 +00:00
Rhys Perry
e493aab3c3 aco: fix consecutive exec writes when finding exec_copy instruction
This can happen with transitions to exact:
 s2: %0:exec = p_parallelcopy %622:s[0-1]
 s2: %625:s[0-1],  s1: %624:scc,  s2: %0:exec = s_and_saveexec_b64 %141:vcc, %0:exec
 s2: %626:s[12-13] = p_cbranch_z %0:exec BB2, BB1

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 410eff4d2f ("aco: Fix optimizing branching sequence with s_and_saveexec.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077>
2022-08-16 17:31:33 +00:00
Georg Lehmann
7b9d3ebe42 aco: Use v_cmpx pre GFX10.
Foz-DB Vega10:
Totals from 29508 (21.85% of 135041) affected shaders:
CodeSize: 184345656 -> 184345820 (+0.00%)
Instrs: 35906154 -> 35906195 (+0.00%)
Latency: 581696114 -> 581530021 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 245625572 -> 245561351 (-0.03%); split: -0.03%, +0.00%
Copies: 3134925 -> 3278672 (+4.59%)

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18049>
2022-08-15 13:25:38 +00:00