Commit graph

2060 commits

Author SHA1 Message Date
Samuel Pitoiset
2784bfe93f aco: do not abort if the FS doesn't export anything but has an epilog
The main fragment shader can only export MRTZ (if present) and the
epilog will export colors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17485>
2022-07-18 18:40:02 +00:00
Samuel Pitoiset
a6dff6caa1 aco: emit p_jump_to_epilog if the main fragment shader has an epilog
MRTZ is still exported from the main shader.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17485>
2022-07-18 18:40:02 +00:00
Samuel Pitoiset
8bdcc20815 aco: add new pseudo instruction p_jump_to_epilog
The first operand of this new pseudo-instruction is a 64-bit SGPR for
the continue PC, followed by a variable list of fixed VGPRS for the
color exports which are the PS epilog inputs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17485>
2022-07-18 18:40:02 +00:00
Samuel Pitoiset
0db7a0b6e8 radv,aco: introduce {radv,aco}_ps_epilog_key
To pass the necessary pipeline information for compiling PS epilogs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17485>
2022-07-18 18:40:02 +00:00
Samuel Pitoiset
eee098486a radv,aco: track if a fragment shader needs an epilog
This is currently disabled but it will be used for testing first,
and then for graphics pipeline libraries.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17485>
2022-07-18 18:40:02 +00:00
Konstantin Seurer
82283de717 radv: Use a global address for sbt_base
Required for indirect(2) ray tracing to work.
Fixes the following tests:

dEQP-VK.ray_tracing_pipeline.trace_rays_indirect2.indirect_*
dEQP-VK.ray_tracing_pipeline.trace_rays_cmds_maintenance_1.indirect2_*

Fixes: 16585664 ("radv: vkCmdTraceRaysIndirect2KHR")
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17316>
2022-07-14 16:35:31 +00:00
Konstantin Seurer
69daa3f762 radv: Use a global address for ray_launch_size
Required for indirect(2) ray tracing to work.

Fixes: b30f96dd ("radv,aco: Use ray_launch_size_addr")
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17316>
2022-07-14 16:35:31 +00:00
Daniel Schürmann
66d46a23fb aco: fix packed 16bit fneg/fsat optimization
Make sure that the Operand is '1.0.xx'.

Fixes: b03be30e07 ('aco: optimize packed fneg')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17395>
2022-07-11 10:10:00 +00:00
Rhys Perry
98a65eafb7 aco: use scratch_* for VGPR spill/reload on GFX9+
fossil-db (navi21):
Totals from 12 (0.01% of 162293) affected shaders:
Instrs: 122808 -> 122782 (-0.02%); split: -0.11%, +0.09%
CodeSize: 711248 -> 710788 (-0.06%); split: -0.16%, +0.10%
SpillSGPRs: 928 -> 831 (-10.45%)
SpillVGPRs: 1626 -> 1624 (-0.12%)
Latency: 4960285 -> 4932547 (-0.56%)
InvThroughput: 2574083 -> 2559953 (-0.55%)
VClause: 3404 -> 3402 (-0.06%)
Copies: 36992 -> 37181 (+0.51%); split: -0.05%, +0.56%
Branches: 3582 -> 3585 (+0.08%)
PreVGPRs: 3055 -> 3057 (+0.07%)

fossil-db (vega10):
Totals from 12 (0.01% of 161355) affected shaders:
Instrs: 124817 -> 124383 (-0.35%); split: -0.46%, +0.12%
CodeSize: 705116 -> 703664 (-0.21%); split: -0.44%, +0.23%
SpillSGPRs: 1012 -> 898 (-11.26%)
SpillVGPRs: 1632 -> 1624 (-0.49%)
Scratch: 201728 -> 200704 (-0.51%)
Latency: 6160115 -> 6266025 (+1.72%); split: -0.34%, +2.06%
InvThroughput: 6440203 -> 6544595 (+1.62%); split: -0.35%, +1.97%
VClause: 3409 -> 3423 (+0.41%)
Copies: 37929 -> 37748 (-0.48%); split: -1.16%, +0.69%
Branches: 3851 -> 3855 (+0.10%); split: -0.13%, +0.23%
PreVGPRs: 3053 -> 3055 (+0.07%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
0e783d687a aco: use scratch_* for scratch load/store on GFX9+
fossil-db (navi21):
Totals from 52 (0.03% of 162293) affected shaders:
Instrs: 83190 -> 82145 (-1.26%)
CodeSize: 454892 -> 447260 (-1.68%); split: -1.68%, +0.00%
VGPRs: 4768 -> 4672 (-2.01%)
Latency: 1490887 -> 1487170 (-0.25%); split: -0.68%, +0.43%
InvThroughput: 935500 -> 933060 (-0.26%); split: -0.72%, +0.46%
VClause: 2715 -> 2632 (-3.06%); split: -4.53%, +1.47%
SClause: 1902 -> 1883 (-1.00%)
Copies: 8839 -> 8496 (-3.88%)
PreSGPRs: 2012 -> 1807 (-10.19%)
PreVGPRs: 3282 -> 3192 (-2.74%)

fossil-db (vega10):
Totals from 41 (0.03% of 161355) affected shaders:
Instrs: 35772 -> 35699 (-0.20%)
CodeSize: 187040 -> 186584 (-0.24%)
VGPRs: 4044 -> 4072 (+0.69%)
Latency: 243088 -> 242379 (-0.29%)
InvThroughput: 180301 -> 179783 (-0.29%)
VClause: 1204 -> 1216 (+1.00%)
SClause: 653 -> 637 (-2.45%)
Copies: 3736 -> 3704 (-0.86%); split: -0.88%, +0.03%
PreSGPRs: 1331 -> 1207 (-9.32%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
d2d94b62f2 aco: initialize scratch base registers on GFX9-GFX10.3
fossil-db (navi21):
Totals from 1142 (0.70% of 162293) affected shaders:
Instrs: 271636 -> 271974 (+0.12%)
CodeSize: 1532020 -> 1533792 (+0.12%)
Latency: 7484066 -> 7485698 (+0.02%)
InvThroughput: 4048824 -> 4049579 (+0.02%)
SClause: 4171 -> 4212 (+0.98%)
PreSGPRs: 11203 -> 12276 (+9.58%)

fossil-db (vega10):
Totals from 3327 (2.06% of 161355) affected shaders:
Instrs: 257413 -> 257601 (+0.07%)
CodeSize: 1424244 -> 1425372 (+0.08%)
Latency: 8598402 -> 8600466 (+0.02%)
InvThroughput: 7906335 -> 7908234 (+0.02%)
SClause: 4932 -> 4973 (+0.83%)
PreSGPRs: 22010 -> 25405 (+15.42%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
97e9e42e0d aco: treat flat-like as vmem in some scheduling heuristics
fossil-db (navi21):
Totals from 12 (0.01% of 162293) affected shaders:
Instrs: 48754 -> 48762 (+0.02%)
CodeSize: 267092 -> 267124 (+0.01%)
Latency: 1293798 -> 1292303 (-0.12%); split: -0.12%, +0.00%
InvThroughput: 854599 -> 853578 (-0.12%)
VClause: 1623 -> 1619 (-0.25%)
SClause: 1187 -> 1188 (+0.08%); split: -0.08%, +0.17%

fossil-db (vega10):
Totals from 1 (0.00% of 161355) affected shaders:
Latency: 18720 -> 18848 (+0.68%)
InvThroughput: 5775 -> 5776 (+0.02%)
SClause: 12 -> 11 (-8.33%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
29953d6048 aco: include scratch/global in VMEM WAW optimization
fossil-db (navi21):
Totals from 2 (0.00% of 162293) affected shaders:
Instrs: 4788 -> 4785 (-0.06%)
CodeSize: 25884 -> 25872 (-0.05%)
Latency: 255008 -> 252950 (-0.81%)
InvThroughput: 170005 -> 168633 (-0.81%)
VClause: 206 -> 205 (-0.49%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
c66206cbed aco: avoid WAW hazard with BVH MIMG and other VMEM
According to LLVM, image_bvh64_intersect_ray does not write results in
order with other VMEM instructions.

fossil-db (navi21):
Totals from 7 (0.00% of 162293) affected shaders:
Instrs: 39978 -> 39985 (+0.02%)
CodeSize: 219356 -> 219384 (+0.01%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
7d34044908 aco: refactor VGPR spill/reload lowering
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
6642f2fd74 aco: handle subtractions in parse_base_offset
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
52934f6cdb aco: combine additions and constants into scratch load/store
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
931a456db1 aco: improve support for scratch_* instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
cbeb25ce91 aco: make FLAT_instruction::offset signed
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
5898afba53 aco: include flat-like in vmem clause statistics
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
08ed6ebc55 aco: make flat access latency match mtbuf/mubuf/mimg
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Georg Lehmann
4f5e25ea8d aco/assembler: Fix s_bitreplicate_b64_b32 on GFX9.
This seems to be a relic from before aco added per generation opcodes.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17405>
2022-07-08 10:09:19 +00:00
Georg Lehmann
68db0a079b aco: Fix swapping sources in SOPC -> SOPK optimization.
Fixes: 2d6b0a4177 ("aco/optimizer: Optimize SOPC with literal to SOPK.")

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17407>
2022-07-08 09:43:51 +00:00
Samuel Pitoiset
cf46397aec aco: fix load_barycentric_at_sample without MSAA
It's legal to use this instruction in a fragment shader, even if the
graphics pipeline doesn't use MSAA.

Fixes
dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.non_multisample_buffer.sample_n_*.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17398>
2022-07-08 07:28:33 +00:00
Rhys Perry
48578713b7 radv,aco,ac/llvm: use nir_op_f{sin,cos}_amd
This lets NIR optimize the multiplication, particularly sin/cos(a * #b).

fossil-db (Sienna Cichlid):
Totals from 12306 (7.58% of 162293) affected shaders:
MaxWaves: 224814 -> 224834 (+0.01%)
Instrs: 17365273 -> 17338758 (-0.15%); split: -0.16%, +0.00%
CodeSize: 93478488 -> 93354912 (-0.13%); split: -0.14%, +0.01%
VGPRs: 752080 -> 752072 (-0.00%); split: -0.00%, +0.00%
SpillSGPRs: 8440 -> 8410 (-0.36%)
Latency: 200402154 -> 200279405 (-0.06%); split: -0.06%, +0.00%
InvThroughput: 37588077 -> 37545545 (-0.11%); split: -0.11%, +0.00%
VClause: 293863 -> 293874 (+0.00%); split: -0.03%, +0.03%
SClause: 619539 -> 619064 (-0.08%); split: -0.09%, +0.01%
Copies: 1151591 -> 1151641 (+0.00%); split: -0.04%, +0.05%
Branches: 506434 -> 506437 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 877609 -> 877517 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 711938 -> 711940 (+0.00%); split: -0.00%, +0.00%

fossil-db (LLVM, Sienna Cichlid):
Totals from 4377 (3.59% of 121873) affected shaders:
SGPRs: 358960 -> 359176 (+0.06%); split: -0.18%, +0.25%
VGPRs: 319832 -> 319720 (-0.04%); split: -0.18%, +0.15%
SpillSGPRs: 46983 -> 47007 (+0.05%); split: -0.99%, +1.04%
CodeSize: 30872812 -> 30764512 (-0.35%); split: -0.39%, +0.04%
MaxWaves: 73814 -> 73904 (+0.12%); split: +0.25%, -0.13%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10587>
2022-07-07 22:18:08 +00:00
Georg Lehmann
2d6b0a4177 aco/optimizer: Optimize SOPC with literal to SOPK.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15999>
2022-07-06 09:54:54 +00:00
Georg Lehmann
52f8167b25 aco/optimizer: Convert s_add_u32 with literals to s_add_i32 if carry is not used.
To allow further optimizations to s_addk_i32.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15999>
2022-07-06 09:54:54 +00:00
Georg Lehmann
e06773281b aco/ra: Optimize some SOP2 instructions with literal to SOPK.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15999>
2022-07-06 09:54:54 +00:00
Georg Lehmann
efdb323ad2 aco/ir: Pad SOP2 and SOPC to the same size as SOPK.
Being able to directly cast instructions simplifies optimizations.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15999>
2022-07-06 09:54:54 +00:00
Georg Lehmann
87b4f3daa1 aco/ra: Move mac encoding optimization to its own function.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15999>
2022-07-06 09:54:54 +00:00
Georg Lehmann
c9490436b6 aco/ra: Static assert that changing instruction type to VOP2 is valid.
It's not obvious that this is correct.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15999>
2022-07-06 09:54:54 +00:00
Rhys Perry
d55c4180d5 aco/tests: add vop3p constant combine tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
84b404d34d aco: don't use 32-bit fp inline constants for fp16 vop3p literals
If we're applying the literal 0x3f800000 to a fp16 vop3p instruction, we
shouldn't use the 1.0 inline constant, because the hardware will use the
16-bit 1.0: 0x00003c00.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
994f9b5a39 aco: try sign-extending or shifting constants in propagate_constants_vop3p
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
33befb58b0 aco: fix redirect combine in propagate_constants_vop3p() with negatives
This previously didn't correctly consider negative integers when bits=16
(which sign-extend) and would have combined 0xfffe0000.xy as -2.yx. Now it
combines 0xfffeffff.xy as that instead. It was also skipped when bits=32.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
fc39c3a0b1 aco: don't use opsel to fold constants into dot accumulation sources
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
ae74474509 aco: fix propagate_constants_vop3p with integer vop3p and 16-bit constants
This would have created a 1.0.xx operand from 0x3c00.xx or 0x3c003c00.xy
for vop3p instructions which have 32-bit operands.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
9739c07d9e aco: fix single-alignbyte do_pack_2x16() path with fp inline constants
We were using a 16-bit inline constant with a 32-bit instruction and the
test would have created
"v1: %_:v[0] = v_alignbyte_b32 0.5, %_:v[1][16:32], 2" instead.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
5d8f5615d0 aco: ignore precise flag when optimizing integer clamps
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
61eb632775 aco: include _e64 variants of 16-bit min/max in minmax optimizations
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
f2a346eb40 aco: don't accept med3 opcodes in get_minmax_info()
I don't think the presence of med3 here breaks anything, but it shouldn't
be here anyway.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
f937c5be7c aco: add and use constantValue16()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Dave Airlie
68642e2c26 aco: drop radv_shader.h include
This shouldn't be used anymore

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16445>
2022-07-01 01:34:19 +00:00
Dave Airlie
9fe2b6b748 aco/radv: provide a vs prolog callback from aco to radv.
Avoid building the radv binary in aco, just callback with the
necessary info.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16445>
2022-07-01 01:34:19 +00:00
Dave Airlie
2dce77c239 aco/radv: provide a callback from aco shader building to build binary
This moves the radv specific code into radv, and calls back from
aco into radv.

This should allow easier radeonsi integration later.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16445>
2022-07-01 01:34:19 +00:00
Dave Airlie
e5ec50b3c7 aco: refactor the radv binary builder out of the core aco fn.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16445>
2022-07-01 01:34:19 +00:00
Rhys Perry
84f04fd080 aco/ra: update register file when updating phi definition
update_renames() fills in the wrong temp id.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 302cb5c900 ("aco/ra: remove some redundant code")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17295>
2022-06-30 19:30:55 +00:00
Daniel Schürmann
2e895f8b04 radv: vectorize nir_op_fabs
Totals from 4 (0.00% of 134913) affected shaders: (GFX10.3)
CodeSize: 37868 -> 36576 (-3.41%)
Instrs: 5332 -> 5169 (-3.06%)
Latency: 24452 -> 24174 (-1.14%)
InvThroughput: 9784 -> 9462 (-3.29%)
VClause: 54 -> 50 (-7.41%)
Copies: 520 -> 519 (-0.19%)
PreVGPRs: 266 -> 264 (-0.75%)

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15176>
2022-06-27 15:07:27 +00:00
Daniel Schürmann
c298ab0d23 aco: correctly validate v_fma_mixhi_f16 register assignment
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15176>
2022-06-27 15:07:27 +00:00
Rhys Perry
33641b2a26 aco: cleanup force-waitcnt output
If we don't reset ctx.vm_cnt/gpr_map/etc, this will spam a lot of
s_waitcnt instructions.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17207>
2022-06-24 12:44:55 +00:00