Marek Olšák
116a05c721
ac: move ac_exp_param.h to ac_nir.h
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14266 >
2022-01-05 12:46:31 +00:00
Timur Kristóf
bc94c2718a
aco: Emit VRS rate when it's per-primitive.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14193 >
2022-01-04 17:46:02 +00:00
Samuel Pitoiset
2bf25e6f6e
radv,aco: keep track of the prolog disassembly if necessary
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13376 >
2022-01-04 07:50:07 +00:00
Samuel Pitoiset
e836174077
aco: do not print prologs disassembly if no disassembler
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13376 >
2022-01-04 07:50:07 +00:00
Samuel Pitoiset
3ef736c94e
aco: fix a dynamic-stack-buffer-overflow when printing instructions
...
Detected by ASAN.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14329 >
2022-01-04 08:04:50 +01:00
Tatsuyuki Ishi
31d839aacc
aco: lower masked swizzle to DPP8
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13971 >
2021-12-31 20:56:39 +00:00
Tatsuyuki Ishi
da0412e55b
aco: support DPP8
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13971 >
2021-12-31 20:56:39 +00:00
Daniel Schürmann
16a527deef
aco: don't split VOP3P definitions
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576 >
2021-12-31 14:52:14 +00:00
Daniel Schürmann
7e02787a54
aco: use p_create_vector(v2b,v2b) in get_alu_src_vop3p()
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576 >
2021-12-31 14:52:14 +00:00
Daniel Schürmann
e56d8b0b2e
aco: use explicit zero-padding for 64bit image loads in expand_vector()
...
Previously, this only worked because of regClass mismatches
in the allocated vector.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576 >
2021-12-31 14:52:14 +00:00
Daniel Schürmann
91f17e1c73
aco/optimizer: apply extract from subdword p_split_vector
...
Totals from 1345 (1.00% of 134572) affected shaders: (GFX10.3)
VGPRs: 76752 -> 76744 (-0.01%); split: -0.02%, +0.01%
SpillSGPRs: 1459 -> 1460 (+0.07%)
SpillVGPRs: 1776 -> 1784 (+0.45%); split: -0.39%, +0.84%
CodeSize: 13310964 -> 13309420 (-0.01%); split: -0.06%, +0.05%
Scratch: 178176 -> 179200 (+0.57%)
Instrs: 2516874 -> 2516860 (-0.00%); split: -0.05%, +0.05%
Latency: 23228506 -> 23230338 (+0.01%); split: -0.14%, +0.15%
InvThroughput: 6002384 -> 6000158 (-0.04%); split: -0.24%, +0.21%
VClause: 41115 -> 41117 (+0.00%); split: -0.28%, +0.29%
SClause: 104639 -> 104664 (+0.02%); split: -0.07%, +0.09%
Copies: 185121 -> 184862 (-0.14%); split: -0.69%, +0.55%
Branches: 100740 -> 100735 (-0.00%); split: -0.01%, +0.00%
PreVGPRs: 70119 -> 69968 (-0.22%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576 >
2021-12-31 14:52:14 +00:00
Daniel Schürmann
fb622775b5
aco/optimizer: optimize extract(extract())
...
Totals from 53 (0.04% of 134572) affected shaders: (GFX10.3)
SpillVGPRs: 1780 -> 1776 (-0.22%); split: -0.34%, +0.11%
CodeSize: 968352 -> 963196 (-0.53%); split: -0.55%, +0.02%
Scratch: 180224 -> 178176 (-1.14%)
Instrs: 169800 -> 169158 (-0.38%); split: -0.39%, +0.01%
Latency: 6186064 -> 6141408 (-0.72%); split: -1.16%, +0.44%
InvThroughput: 2605044 -> 2582967 (-0.85%); split: -1.37%, +0.52%
VClause: 4851 -> 4866 (+0.31%); split: -0.16%, +0.47%
SClause: 1744 -> 1746 (+0.11%)
Copies: 42874 -> 42325 (-1.28%); split: -1.40%, +0.12%
Branches: 5762 -> 5765 (+0.05%); split: -0.02%, +0.07%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576 >
2021-12-31 14:52:14 +00:00
Daniel Schürmann
5ad9c20d4a
aco/optimizer: apply extract from p_extract_vector
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576 >
2021-12-31 14:52:14 +00:00
Daniel Schürmann
11712729eb
aco/optimizer: keep instr_mod_labels after applying extract
...
This allows clamp to be used on SDWA and VOP3 opsel instructions.
Totals from 32 (0.02% of 134572) affected shaders: (GFX10.3)
SpillVGPRs: 1783 -> 1780 (-0.17%); split: -0.50%, +0.34%
CodeSize: 881480 -> 881496 (+0.00%); split: -0.15%, +0.15%
Instrs: 154400 -> 154388 (-0.01%); split: -0.13%, +0.12%
Latency: 5021791 -> 5033485 (+0.23%); split: -0.67%, +0.90%
InvThroughput: 2486454 -> 2492312 (+0.24%); split: -0.67%, +0.91%
VClause: 4763 -> 4755 (-0.17%); split: -0.52%, +0.36%
Copies: 42866 -> 42965 (+0.23%); split: -0.25%, +0.48%
Branches: 5640 -> 5639 (-0.02%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576 >
2021-12-31 14:52:14 +00:00
Daniel Schürmann
1502c22e2c
aco: don't allow SDWA on VOP3P instructions
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576 >
2021-12-31 14:52:14 +00:00
Timur Kristóf
8d238f5581
aco: Export per-primitive mesh shader output attributes.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580 >
2021-12-31 13:05:09 +00:00
Timur Kristóf
fc1424f1d8
aco: Use the correct outinfo for mesh shaders.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580 >
2021-12-31 13:05:09 +00:00
Timur Kristóf
92556d6067
aco: Add 1D workgroup_id support for mesh shaders.
...
I'll add support for 3D workgroup_id later, but NV_mesh_shader only
supports 1D workgroups.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580 >
2021-12-31 13:05:09 +00:00
Timur Kristóf
7759323b75
aco: Update README about NGG and mesh shaders.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580 >
2021-12-31 13:05:09 +00:00
Timur Kristóf
6766e6a985
aco: Add Mesh and Task shader stages.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580 >
2021-12-31 13:05:09 +00:00
Timur Kristóf
b293299776
aco/optimizer_postRA: Fix applying VCC to branches.
...
Fixes: a93092d0ed
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14281 >
2021-12-21 22:53:23 +00:00
Timur Kristóf
ce4daa259c
aco/optimizer_postRA: Fix combining DPP into VALU.
...
Fixes: 4ac47ad1cd
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14281 >
2021-12-21 22:53:23 +00:00
Daniel Schürmann
d36a43598c
aco/ra: fix get_reg_for_operand() in case of stride mismatches
...
We have to clear the register file from the previous operand
as otherwise, there might be no space left.
Totals from 5 (0.00% of 134572) affected shaders: (GFX10.3)
CodeSize: 21144 -> 21000 (-0.68%); split: -0.72%, +0.04%
Instrs: 3738 -> 3720 (-0.48%); split: -0.51%, +0.03%
Latency: 517229 -> 516319 (-0.18%); split: -0.18%, +0.00%
InvThroughput: 49068 -> 48902 (-0.34%); split: -0.38%, +0.04%
Copies: 501 -> 483 (-3.59%); split: -3.79%, +0.20%
Cc: mesa-stable
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14279 >
2021-12-21 17:15:45 +00:00
Daniel Schürmann
30a7199e37
aco/optimizer: propagate and fold inline constants on VOP3P instructions
...
This patch aims to propagate and fold constants on VOP3P instructions
by using operand selection (opsel) and the fneg modifier.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13688 >
2021-12-21 13:23:36 +01:00
Daniel Schürmann
62bcfcd0a8
aco: change fneg for VOP3P to use fmul with +1.0
...
This will be useful to be able to also apply
fneg_lo and fneg_hi.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13688 >
2021-12-21 13:23:36 +01:00
Daniel Schürmann
193bd740ab
aco/optimizer: fix fneg modifier propagation on VOP3P
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13688 >
2021-12-21 13:23:36 +01:00
Rhys Perry
fa4e08112e
aco: remove SMEM constant/addition combining out of the loop
...
There's no reason for this optimization to be in this loop.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13755 >
2021-12-17 22:14:36 +00:00
Rhys Perry
dd18925f86
aco: skip &-4 before SMEM
...
The hardware ignores the low 2 bits. I'm not sure if they are ignored
before or after the address is calculated, but this optimization should be
cautious enough.
fossil-db (Sienna Cichlid):
Totals from 259 (0.19% of 134572) affected shaders:
SpillSGPRs: 1381 -> 1382 (+0.07%)
SpillVGPRs: 1783 -> 1782 (-0.06%); split: -0.67%, +0.62%
CodeSize: 1598612 -> 1596084 (-0.16%); split: -0.30%, +0.14%
Scratch: 180224 -> 179200 (-0.57%); split: -1.14%, +0.57%
Instrs: 284885 -> 284268 (-0.22%); split: -0.34%, +0.12%
Latency: 6585634 -> 6603388 (+0.27%); split: -0.48%, +0.75%
InvThroughput: 2638983 -> 2648474 (+0.36%); split: -0.58%, +0.94%
VClause: 6797 -> 6820 (+0.34%); split: -0.15%, +0.49%
SClause: 6569 -> 6574 (+0.08%); split: -1.11%, +1.19%
Copies: 50561 -> 50586 (+0.05%); split: -0.61%, +0.66%
Branches: 10058 -> 10062 (+0.04%); split: -0.01%, +0.05%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13755 >
2021-12-17 22:14:36 +00:00
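Note on "aco: skip &-4 before SMEM": if the hardware ignores the low 2 bits of the address, an explicit `x & -4` feeding an SMEM load is redundant and the unmasked value can be used directly. The following is a minimal sketch of such a peephole on a toy IR. The `Value` struct and `strip_align_mask` are hypothetical names for illustration and are not ACO's actual data structures or code.

```cpp
#include <cstdint>

// Toy IR node (hypothetical, not ACO's representation).
struct Value {
   enum Kind { Temp, AndConst } kind;
   uint32_t constant; // only meaningful for AndConst
   const Value* src;  // only meaningful for AndConst
};

// -4 as a 32-bit mask: clears the low 2 bits.
constexpr uint32_t kAlignMask = 0xfffffffcu;

// If the address is `src & -4`, return `src`; otherwise return the address
// unchanged. Assumes the consumer (the SMEM load) ignores the low 2 bits.
const Value* strip_align_mask(const Value* addr)
{
   if (addr->kind == Value::AndConst && addr->constant == kAlignMask && addr->src)
      return addr->src;
   return addr;
}
```

Any other mask, such as `x & 0xff`, changes more than the low 2 bits and is left in place by this sketch.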
Rhys Perry
cf5fc4b973
aco: disallow SMEM offsets that are not multiples of 4
...
These can't be encoded on GFX6/7, and combining these additions causes
CTS failures on GFX10.3.
I think the low 2 bits are ignored before the addition, not after, so
load(a + 3, 0) becomes load(a, 3), which is the same as load(a, 0).
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13755 >
2021-12-17 22:14:36 +00:00
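Note on "aco: disallow SMEM offsets that are not multiples of 4": the reasoning in the message can be checked with a small model. The sketch below assumes, as the commit author guesses, that the low 2 bits of both the base and the offset are dropped before the addition. `smem_address_model` and `can_fold_smem_offset` are hypothetical names, and the model is an assumption, not documented hardware behaviour.

```cpp
#include <cstdint>

// Assumed model: low 2 bits of base and offset are ignored before adding.
constexpr uint64_t smem_address_model(uint64_t base, uint32_t offset)
{
   return (base & ~uint64_t{3}) + (offset & ~uint32_t{3});
}

// Under this model, only offsets that are multiples of 4 can be folded
// out of the base address without changing the loaded address.
constexpr bool can_fold_smem_offset(uint32_t offset)
{
   return (offset & 3u) == 0;
}

// load(a + 3, 0) and load(a, 3) differ for a = 0x101 ...
static_assert(smem_address_model(0x101 + 3, 0) == 0x104);
static_assert(smem_address_model(0x101, 3) == 0x100);
// ... while load(a, 3) equals load(a, 0).
static_assert(smem_address_model(0x101, 3) == smem_address_model(0x101, 0));
```

With an unaligned base such as 0x101, folding the `+ 3` into the offset field moves the load from 0x104 to 0x100, which is the kind of mismatch the commit avoids.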
Rhys Perry
94603786c5
aco: fix check_vop3_operands() for f16vec2 ffma fneg combine
...
For v_pk_fma_f16, we should consider all three operands, not just the
first two.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 15a375b4c8 ("radv,aco: don't lower some ffma instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14229 >
2021-12-17 11:16:12 +00:00
Samuel Pitoiset
5ce4017a2b
radv,aco: do not disable anisotropy filtering for non-mipmap images
...
This fixes
dEQP-VK.texture.filtering_anisotropy.single_level.anisotropy_*.mag_linear_min_linear.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14171 >
2021-12-16 07:20:50 +00:00
Rhys Perry
165ca5088b
radv,aco: implement nir_op_ffma
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805 >
2021-12-13 11:22:33 +00:00
Rhys Perry
c5f02a1cd3
aco: swap multiplication operands if needed to create v_fmac_f32/etc
...
For v_pk_fma_f32 and v_fma_f32 from nir_op_ffma, we don't try to put
scalars in the first operand.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805 >
2021-12-13 11:22:33 +00:00
Rhys Perry
f4f5d577fc
aco: swap operands if necessary to create v_madak/v_fmaak
...
Also rewrite the check_literal logic to be more straightforward.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805 >
2021-12-13 11:22:33 +00:00
Rhys Perry
2665320c78
aco: create v_fmamk_f32/v_fmaak_f32 from nir_op_ffma
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805 >
2021-12-13 11:22:33 +00:00
Rhys Perry
a487747ebd
aco: use more predictable tiebreaker when forming MADs
...
fossil-db (GFX10.3):
Totals from 84981 (58.10% of 146267) affected shaders:
VGPRs: 3829896 -> 3820480 (-0.25%); split: -0.33%, +0.08%
CodeSize: 270860472 -> 270850132 (-0.00%); split: -0.08%, +0.08%
MaxWaves: 2035822 -> 2042516 (+0.33%); split: +0.39%, -0.06%
Instrs: 51285526 -> 51308869 (+0.05%); split: -0.03%, +0.08%
Latency: 931503706 -> 932556231 (+0.11%); split: -0.19%, +0.30%
InvThroughput: 217084232 -> 217070849 (-0.01%); split: -0.12%, +0.11%
fossil-db (GFX10):
Totals from 85520 (58.47% of 146267) affected shaders:
VGPRs: 3729132 -> 3725344 (-0.10%); split: -0.21%, +0.10%
CodeSize: 272796500 -> 272783084 (-0.00%); split: -0.09%, +0.08%
MaxWaves: 2246410 -> 2249012 (+0.12%); split: +0.17%, -0.05%
Instrs: 51643962 -> 51664865 (+0.04%); split: -0.04%, +0.08%
Latency: 932331949 -> 933274979 (+0.10%); split: -0.19%, +0.29%
InvThroughput: 214187040 -> 214130994 (-0.03%); split: -0.13%, +0.11%
fossil-db (GFX9):
Totals from 84619 (57.80% of 146401) affected shaders:
SGPRs: 5366240 -> 5366944 (+0.01%); split: -0.09%, +0.10%
VGPRs: 3765608 -> 3764972 (-0.02%); split: -0.23%, +0.22%
CodeSize: 263634732 -> 263616320 (-0.01%); split: -0.08%, +0.08%
MaxWaves: 546617 -> 547091 (+0.09%); split: +0.18%, -0.09%
Instrs: 51426195 -> 51458334 (+0.06%); split: -0.03%, +0.10%
Latency: 1164445660 -> 1161923480 (-0.22%); split: -0.46%, +0.24%
InvThroughput: 542964697 -> 542329595 (-0.12%); split: -0.26%, +0.14%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805 >
2021-12-13 11:22:33 +00:00
Rhys Perry
c7fa15b381
aco: improve clrx disassembly
...
- remove uninteresting lines of output
- remove binary offset before instructions, for easier diffing
- replace generated labels with block numbers
- add encoded instructions at the end of lines, like LLVM disassembly
- print constant data instead of trying to disassemble it
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14042 >
2021-12-10 23:46:30 +00:00
Rhys Perry
786d434397
aco: don't create unnecessary addition in indirect get_sampler_desc()
...
I don't think this has any effect on GFX9+ because the addition is
combined into the load.
fossil-db (polaris10):
Totals from 12595 (9.29% of 135627) affected shaders:
SGPRs: 1054348 -> 1054860 (+0.05%); split: -0.02%, +0.07%
VGPRs: 667240 -> 667320 (+0.01%); split: -0.01%, +0.02%
CodeSize: 82761508 -> 82512816 (-0.30%); split: -0.30%, +0.00%
MaxWaves: 62182 -> 62181 (-0.00%)
Instrs: 16072934 -> 16010764 (-0.39%); split: -0.39%, +0.00%
Latency: 582819635 -> 582287964 (-0.09%); split: -0.13%, +0.04%
InvThroughput: 276460536 -> 276417613 (-0.02%); split: -0.06%, +0.05%
VClause: 261656 -> 261654 (-0.00%); split: -0.01%, +0.01%
SClause: 680952 -> 680854 (-0.01%); split: -0.05%, +0.04%
Copies: 1727202 -> 1727742 (+0.03%); split: -0.12%, +0.15%
Branches: 547050 -> 547033 (-0.00%); split: -0.01%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14043 >
2021-12-09 17:58:54 +00:00
Timur Kristóf
77db4e27b1
aco: Clean up and fix quad group instructions with WQM.
...
According to the Vulkan spec chapter 9.25 Helper Invocations,
quad group operations have to be executed by helper invocations.
This commit cleans up the code for quad group instructions by
unifying the code path of quad broadcast with the others, and then
calling emit_wqm just once at the end.
Fixes: 93c8ebfa78
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5570
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13929 >
2021-12-09 17:36:51 +00:00
Timur Kristóf
c3eebc860a
aco: Use util_widen_mask.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14005 >
2021-12-03 18:29:13 +00:00
Rhys Perry
6afba80534
aco: don't create DPP instructions with SGPR operands
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 2e6834d4f6 ("aco: combine DPP into VALU before RA")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13976 >
2021-11-30 20:11:48 +00:00
Rhys Perry
65a78b2252
aco: properly update use counts if an extract is still used
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13909 >
2021-11-29 18:52:12 +00:00
Samuel Pitoiset
add883bf9b
aco: fix right shift of exponent 32 detected by UBSAN
...
src/amd/compiler/aco_optimizer.cpp:1316:17: runtime error: shift
exponent 32 is too large for 32-bit type 'unsigned int'
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13951 >
2021-11-25 16:15:30 +00:00
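Note on "aco: fix right shift of exponent 32 detected by UBSAN": in C++, shifting a 32-bit value by 32 or more is undefined behaviour, which is what the sanitizer reports. A minimal sketch of a guarded shift follows. `safe_shr32` is a hypothetical helper and not necessarily how aco_optimizer.cpp was fixed.

```cpp
#include <cstdint>

// Hypothetical helper: logical right shift that is well defined for any
// shift amount and returns 0 once every bit has been shifted out.
constexpr uint32_t safe_shr32(uint32_t value, unsigned amount)
{
   // `value >> amount` is undefined for amount >= 32 on a 32-bit type,
   // so that case is handled explicitly.
   return amount >= 32 ? 0u : value >> amount;
}
```

An alternative is to widen the operand to `uint64_t` before shifting, which is defined for amounts up to 63.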
Rhys Perry
cc2894345f
aco/spill: use spills_entry instead of spills_exit to kill linear VGPRs
...
If a predecessor has only spilled constants (no temporaries), spills_exit
will be empty.
fossil-db (Sienna Cichlid):
Totals from 2 (0.00% of 128647) affected shaders:
Latency: 139106 -> 139104 (-0.00%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5633
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13821 >
2021-11-22 19:46:22 +00:00
Timur Kristóf
5aa39253cb
nir: Rename nir_get_io_vertex_index_src and include per-primitive I/O.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13466 >
2021-11-16 07:46:55 +00:00
Rhys Perry
d89461208b
aco: consider pseudo-instructions reading exec in needs_exec_mask()
...
No matter the format, this should return true if the instruction has an
exec operand.
Otherwise, eliminate_useless_exec_writes_in_block() could remove an exec
write in a block if its successor begins with:
s2: %3737:s[8-9] = p_parallelcopy %0:exec
s2: %0:exec, s1: %3738:scc = s_wqm_b64 %3737:s[8-9]
Totals from 3 (0.00% of 150170) affected shaders (GFX10.3):
CodeSize: 23184 -> 23204 (+0.09%)
Instrs: 4143 -> 4148 (+0.12%)
Latency: 98379 -> 98382 (+0.00%)
Copies: 172 -> 175 (+1.74%)
Branches: 95 -> 97 (+2.11%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: bc13049747 ("aco: Eliminate useless exec writes in jump threading.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5620
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13776 >
2021-11-15 18:58:37 +00:00
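Note on "aco: consider pseudo-instructions reading exec in needs_exec_mask()": the rule described is that an explicit exec operand makes an instruction depend on exec regardless of its format. The sketch below illustrates that rule on a toy instruction type. All names are hypothetical, and the fallback that VALU instructions implicitly depend on exec is a simplification, not the full logic of the real function.

```cpp
#include <algorithm>
#include <vector>

// Toy types for illustration only (not ACO's classes).
enum class Format { Pseudo, SALU, VALU };
struct Operand { bool is_exec; };
struct Instruction {
   Format format;
   std::vector<Operand> operands;
};

bool toy_needs_exec_mask(const Instruction& instr)
{
   // Any explicit exec operand counts, whatever the instruction format.
   const bool reads_exec =
      std::any_of(instr.operands.begin(), instr.operands.end(),
                  [](const Operand& op) { return op.is_exec; });
   if (reads_exec)
      return true;
   // Simplified fallback: vector ALU instructions are predicated by exec.
   return instr.format == Format::VALU;
}
```

In this model a `p_parallelcopy` that copies exec is a `Format::Pseudo` instruction with an exec operand, so the exec write before it is not considered useless.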
Daniel Schürmann
ab21183b5d
aco: implement D16 texture loads
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13592 >
2021-11-15 18:28:20 +00:00
Daniel Schürmann
626aa7b648
aco: workaround GFX9 hardware bug for D16 image instructions
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13592 >
2021-11-15 18:28:20 +00:00
Daniel Schürmann
8f1483cd5c
aco: add more D16 load/store instructions to RA and validator
...
This enables correct handling for
buffer_load/store_format_d16_x and
D16 Image instructions.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13592 >
2021-11-15 18:28:20 +00:00
Timur Kristóf
d80c7f3406
aco: Fix how p_is_helper interacts with optimizations.
...
p_is_helper doesn't have any operands, so ACO's value numbering and/or
the pre-RA optimizer could incorrectly recognize two such instructions
as the same.
This patch adds exec as an operand to p_is_helper in order to achieve
correct behavior.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5570
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13577 >
2021-11-13 16:32:02 +01:00
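Note on "aco: Fix how p_is_helper interacts with optimizations.": value numbering typically treats two instructions with the same opcode and the same operands as equivalent. The toy table below shows why an instruction with no operands is always merged, and how adding an operand that differs between the two sites keeps them apart. The class and the integer operand ids are hypothetical; how ACO actually represents exec in its value numbering is not shown here.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy value-numbering table keyed on (opcode, operand ids).
class ValueNumbering {
public:
   // Returns the result id of an earlier equivalent instruction if one
   // exists, otherwise registers this instruction and returns result_id.
   uint32_t lookup_or_insert(const std::string& opcode,
                             const std::vector<uint32_t>& operand_ids,
                             uint32_t result_id)
   {
      return table_.try_emplace(Key{opcode, operand_ids}, result_id).first->second;
   }

private:
   using Key = std::pair<std::string, std::vector<uint32_t>>;
   std::map<Key, uint32_t> table_;
};
```

With an empty operand list, a second `p_is_helper` maps to the first one's result. If each instance instead carries an id for the exec value it observes (assumed here to change whenever exec is rewritten), the two keys differ and both instructions are kept.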