Rhys Perry
7e30f99b0a
aco: don't combine fneg/fabs of different bit-size
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810 >
2022-02-03 16:02:04 +00:00
Rhys Perry
27f1f5537d
aco/tests: implement sub-dword program inputs
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810 >
2022-02-03 16:02:04 +00:00
Rhys Perry
e86b88f85b
aco/tests: add a bunch more building helpers
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810 >
2022-02-03 16:02:04 +00:00
Rhys Perry
ba44634e4d
aco: fix v_mac_legacy_f32
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: f68797ead7 ("aco: create v_mac_legacy_f32/v_fmac_legacy_f32")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5952
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14820 >
2022-02-01 15:08:04 +00:00
Rhys Perry
16e0c312fa
aco: preserve pass_flags during format conversions
...
This helps the "vopc() & exec" optimization.
fossil-db (Sienna Cichlid):
Totals from 1638 (1.21% of 134913) affected shaders:
CodeSize: 3331804 -> 3327520 (-0.13%); split: -0.19%, +0.06%
Instrs: 611807 -> 610096 (-0.28%)
Latency: 5579326 -> 5574874 (-0.08%)
InvThroughput: 936782 -> 936731 (-0.01%); split: -0.01%, +0.00%
Copies: 43324 -> 43302 (-0.05%); split: -0.06%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14773 >
2022-01-31 13:45:01 +00:00
Rhys Perry
1804c21fb5
aco: optimize abs(mul(a, b))
...
fossil-db (Sienna Cichlid):
Totals from 18 (0.01% of 134913) affected shaders:
CodeSize: 173924 -> 173852 (-0.04%)
Instrs: 33864 -> 33846 (-0.05%)
Latency: 122233 -> 122211 (-0.02%)
InvThroughput: 22482 -> 22462 (-0.09%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14773 >
2022-01-31 13:45:01 +00:00
Rhys Perry
452975f257
aco: fix neg(abs(mul(a, b))) if the mul is not VOP3
...
Previously, is_abs was just ignored if mul_instr->isVOP3()==false.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 93c8ebfa78 ("aco: Initial commit of independent AMD compiler")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14773 >
2022-01-31 13:45:01 +00:00
Rhys Perry
f68797ead7
aco: create v_mac_legacy_f32/v_fmac_legacy_f32
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436 >
2022-01-20 22:54:42 +00:00
Rhys Perry
43e32ad074
aco: consider legacy multiplications in optimizer
...
Optimize omod, -(a*b), b2f(a)*b, a*1, a*0 and create MAD/FMA.
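The a*0 case is worth illustrating, because it is only a safe fold under
legacy multiplication rules, where zero times anything (including
infinity or NaN) is zero. The following is a minimal sketch of that
semantic difference, not ACO code: fmul_ieee and fmul_legacy are
hypothetical helper names, and the sign of the resulting zero is
simplified to +0.0.

```python
import math


def fmul_ieee(a: float, b: float) -> float:
    """Ordinary IEEE multiplication: 0 * inf and 0 * nan give NaN."""
    return a * b


def fmul_legacy(a: float, b: float) -> float:
    """Legacy multiplication: a zero operand always yields zero.

    Simplification for illustration: always returns +0.0 in that case.
    """
    if a == 0.0 or b == 0.0:
        return 0.0
    return a * b


# With IEEE rules, a*0 cannot be folded to 0 without knowing that a is finite.
ieee_result = fmul_ieee(math.inf, 0.0)
# With legacy rules, a*0 is 0 for every a, so the fold is always valid.
legacy_result = fmul_legacy(math.inf, 0.0)
```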
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436 >
2022-01-20 22:54:42 +00:00
Rhys Perry
e7f91b194a
radv,aco,ac/llvm: implement fmulz and ffmaz
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436 >
2022-01-20 22:54:42 +00:00
Dave Airlie
d54c07b4c4
mesa/*: use an internal enum for tessellation primitive types.
...
To avoid dragging gl.h into places it has no business being,
define the tessellation primitive mode as an internal enum.
This has a lot of fallout all over the place.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605 >
2022-01-19 21:54:58 +00:00
Rhys Perry
b59764a9fc
aco: use p_extract for SGPR nir_op_unpack_half_2x16_split_y
...
fossil-db (Sienna Cichlid):
Totals from 7264 (5.40% of 134627) affected shaders:
VGPRs: 548152 -> 548184 (+0.01%)
SpillSGPRs: 7615 -> 7607 (-0.11%)
CodeSize: 71025152 -> 70993036 (-0.05%); split: -0.05%, +0.00%
Instrs: 13386799 -> 13298580 (-0.66%); split: -0.66%, +0.00%
Latency: 177464497 -> 177091693 (-0.21%); split: -0.21%, +0.00%
InvThroughput: 32185148 -> 32151873 (-0.10%); split: -0.10%, +0.00%
VClause: 233167 -> 233166 (-0.00%); split: -0.00%, +0.00%
SClause: 468423 -> 468426 (+0.00%); split: -0.00%, +0.01%
Copies: 950727 -> 942753 (-0.84%); split: -0.85%, +0.02%
Branches: 455919 -> 455901 (-0.00%); split: -0.01%, +0.00%
fossil-db (Vega):
Totals from 7264 (5.39% of 134762) affected shaders:
SGPRs: 738800 -> 738816 (+0.00%)
VGPRs: 550264 -> 550344 (+0.01%)
SpillSGPRs: 11149 -> 11157 (+0.07%)
CodeSize: 67487104 -> 67466772 (-0.03%); split: -0.03%, +0.00%
Instrs: 13142106 -> 13061767 (-0.61%)
Latency: 209278575 -> 208438854 (-0.40%); split: -0.40%, +0.00%
InvThroughput: 81486405 -> 81265773 (-0.27%); split: -0.27%, +0.00%
VClause: 222293 -> 222291 (-0.00%); split: -0.00%, +0.00%
SClause: 447783 -> 447800 (+0.00%); split: -0.00%, +0.01%
Copies: 1243760 -> 1238842 (-0.40%); split: -0.43%, +0.03%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14592 >
2022-01-19 12:41:48 +00:00
Samuel Pitoiset
e6173ed1d2
radv: allow to disable anisotropic filtering for single level image with drirc
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14471 >
2022-01-13 16:17:48 +00:00
Daniel Schürmann
8a78706643
nir: refactor nir_opt_move
...
This patch is a rewrite of nir_opt_move.
Unlike the previous version, each instruction is checked to see whether
it can be moved downwards and is then inserted before the first user
of the definition. The advantage is that fewer insert operations are
performed, the original order is kept if two movable instructions have
the same first user, and instructions without a user in the same block
are moved towards the end.
v2: Only return true if an instruction really changed the position.
Don't handle discards; this will be handled by another MR.
v3: fix self-referring phis and update according to nir_can_move_instr().
v4: use nir_can_move_instr() and nir_instr_ssa_def()
v5: deduplicate some code
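The sinking strategy described in this message can be sketched on a toy
instruction list. This is an illustration under stated assumptions, not
the NIR implementation: Instr, END and sink_movable are hypothetical
names, a block is a plain Python list, and "movable" stands in for
whatever nir_can_move_instr() would decide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str              # SSA value defined by this instruction
    srcs: tuple = ()       # names of the SSA values it reads
    movable: bool = False  # stand-in for a "can be moved" check


END = "<end of block>"  # pseudo-user for definitions unused in this block


def sink_movable(block):
    """Walk the block bottom-up and sink each movable instruction to just
    before its first user. Returns (new_block, progress)."""
    result = list(block)
    first_user_of = {}
    progress = False
    for instr in reversed(block):
        if not instr.movable:
            continue
        pos = result.index(instr)
        target, user = len(result), END
        for i in range(pos + 1, len(result)):
            if instr.name in result[i].srcs:
                target, user = i, result[i].name
                break
        first_user_of[instr.name] = user
        # Step back over instructions already sitting in front of the same
        # user, so that the original relative order is kept.
        while target - 1 > pos and first_user_of.get(result[target - 1].name) == user:
            target -= 1
        if target == pos + 1:
            continue  # already in place: no insert, no progress
        result.insert(target, instr)
        del result[pos]
        progress = True
    return result, progress


block = [
    Instr("d", movable=True),            # no user in this block
    Instr("a", movable=True),
    Instr("b", movable=True),
    Instr("x"),                          # unrelated, not movable
    Instr("c", srcs=("a", "b")),         # first user of both a and b
]
sunk, changed = sink_movable(block)
order = [i.name for i in sunk]
again, changed_again = sink_movable(sunk)
```

Running the pass a second time on its own output reports no progress,
which mirrors the v2 note about only returning true when an instruction
really changed position.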
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3657 >
2022-01-12 13:41:54 +00:00
Daniel Schürmann
4e2b624c10
aco: validate VOP3P opsel correctly
...
Before RA, subdword operands must use .xx
After RA, opsel can either be .xx or .yy
Cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14472 >
2022-01-11 11:41:12 +00:00
Rhys Perry
60c711833f
aco: remove pack_half_2x16(a, 0) optimization
...
The optimization makes the compiler less predictable, and removing it
should only have a very small effect on performance.
fossil-db (Vega):
Totals from 2410 (1.79% of 134756) affected shaders:
CodeSize: 6911568 -> 6942840 (+0.45%)
Fixes Horizon Zero Dawn artifacts.
If a shader has:
a = pack_half_2x16(a, 0) //rtne
store(pack_half_2x16(0, b) | a) //rtne
a = unpack_2x16(a).x
It will become:
store(pack_half_2x16(a, b)) //rtz
a = unpack_2x16(pack_half_2x16(a, 0)).x //rtne
So a later shader with "unpack_2x16(load()).x" will use "a" rounded to
zero, while the previous shader will use "a" rounded to the nearest even.
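The mismatch between the two rounding modes can be reproduced outside
the compiler. The following is a minimal sketch, not ACO code: the
helper names are hypothetical, round-to-nearest-even is taken from
Python's struct half-float packing, and the round-toward-zero helper
only handles finite values in the normal half-precision range.

```python
import struct


def f32_to_f16_rtne(x: float) -> int:
    """Half-precision bit pattern of x, rounded to nearest even."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def f32_to_f16_rtz(x: float) -> int:
    """Half-precision bit pattern of x, rounded toward zero.

    Assumption for illustration: x is finite and lands in the normal
    half-precision range, so truncating the mantissa is enough.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = (bits >> 16) & 0x8000
    exp = ((bits >> 23) & 0xFF) - 127 + 15
    assert 0 < exp < 31, "outside the range this sketch handles"
    mant = (bits >> 13) & 0x3FF  # dropping the low 13 bits truncates
    return sign | (exp << 10) | mant


# 1 + 0.75 half-precision ulps: exactly representable as a 32-bit float.
a = 1.0 + 0.75 * 2.0 ** -10

stored = f32_to_f16_rtz(a)        # what the combined store writes
kept_local = f32_to_f16_rtne(a)   # what the same shader keeps using
```

Here the stored value and the locally used value differ by one
half-precision ulp, which is the kind of inconsistency between shaders
that this message describes.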
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 2f125908b3 ("radv,aco: lower_pack_half_2x16")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14475 >
2022-01-10 22:19:29 +00:00
Marek Olšák
116a05c721
ac: move ac_exp_param.h to ac_nir.h
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14266 >
2022-01-05 12:46:31 +00:00
Timur Kristóf
bc94c2718a
aco: Emit VRS rate when it's per-primitive.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14193 >
2022-01-04 17:46:02 +00:00
Samuel Pitoiset
2bf25e6f6e
radv,aco: keep track of the prolog disassembly if necessary
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13376 >
2022-01-04 07:50:07 +00:00
Samuel Pitoiset
e836174077
aco: do not print prologs disassembly if no disassembler
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13376 >
2022-01-04 07:50:07 +00:00
Samuel Pitoiset
3ef736c94e
aco: fix a dynamic-stack-buffer-overflow when printing instructions
...
Detected by ASAN.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14329 >
2022-01-04 08:04:50 +01:00
Tatsuyuki Ishi
31d839aacc
aco: lower masked swizzle to DPP8
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13971 >
2021-12-31 20:56:39 +00:00
Tatsuyuki Ishi
da0412e55b
aco: support DPP8
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13971 >
2021-12-31 20:56:39 +00:00
Daniel Schürmann
16a527deef
aco: don't split VOP3P definitions
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576 >
2021-12-31 14:52:14 +00:00
Daniel Schürmann
7e02787a54
aco: use p_create_vector(v2b,v2b) in get_alu_src_vop3p()
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576 >
2021-12-31 14:52:14 +00:00
Daniel Schürmann
e56d8b0b2e
aco: use explicit zero-padding for 64bit image loads in expand_vector()
...
Previously, this only worked because of regClass mismatches
in the allocated vector.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576 >
2021-12-31 14:52:14 +00:00
Daniel Schürmann
91f17e1c73
aco/optimizer: apply extract from subdword p_split_vector
...
Totals from 1345 (1.00% of 134572) affected shaders: (GFX10.3)
VGPRs: 76752 -> 76744 (-0.01%); split: -0.02%, +0.01%
SpillSGPRs: 1459 -> 1460 (+0.07%)
SpillVGPRs: 1776 -> 1784 (+0.45%); split: -0.39%, +0.84%
CodeSize: 13310964 -> 13309420 (-0.01%); split: -0.06%, +0.05%
Scratch: 178176 -> 179200 (+0.57%)
Instrs: 2516874 -> 2516860 (-0.00%); split: -0.05%, +0.05%
Latency: 23228506 -> 23230338 (+0.01%); split: -0.14%, +0.15%
InvThroughput: 6002384 -> 6000158 (-0.04%); split: -0.24%, +0.21%
VClause: 41115 -> 41117 (+0.00%); split: -0.28%, +0.29%
SClause: 104639 -> 104664 (+0.02%); split: -0.07%, +0.09%
Copies: 185121 -> 184862 (-0.14%); split: -0.69%, +0.55%
Branches: 100740 -> 100735 (-0.00%); split: -0.01%, +0.00%
PreVGPRs: 70119 -> 69968 (-0.22%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576 >
2021-12-31 14:52:14 +00:00
Daniel Schürmann
fb622775b5
aco/optimizer: optimize extract(extract())
...
Totals from 53 (0.04% of 134572) affected shaders: (GFX10.3)
SpillVGPRs: 1780 -> 1776 (-0.22%); split: -0.34%, +0.11%
CodeSize: 968352 -> 963196 (-0.53%); split: -0.55%, +0.02%
Scratch: 180224 -> 178176 (-1.14%)
Instrs: 169800 -> 169158 (-0.38%); split: -0.39%, +0.01%
Latency: 6186064 -> 6141408 (-0.72%); split: -1.16%, +0.44%
InvThroughput: 2605044 -> 2582967 (-0.85%); split: -1.37%, +0.52%
VClause: 4851 -> 4866 (+0.31%); split: -0.16%, +0.47%
SClause: 1744 -> 1746 (+0.11%)
Copies: 42874 -> 42325 (-1.28%); split: -1.40%, +0.12%
Branches: 5762 -> 5765 (+0.05%); split: -0.02%, +0.07%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576 >
2021-12-31 14:52:14 +00:00
Daniel Schürmann
5ad9c20d4a
aco/optimizer: apply extract from p_extract_vector
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576 >
2021-12-31 14:52:14 +00:00
Daniel Schürmann
11712729eb
aco/optimizer: keep instr_mod_labels after applying extract
...
This allows using clamp on SDWA and VOP3 opsel instructions.
Totals from 32 (0.02% of 134572) affected shaders: (GFX10.3)
SpillVGPRs: 1783 -> 1780 (-0.17%); split: -0.50%, +0.34%
CodeSize: 881480 -> 881496 (+0.00%); split: -0.15%, +0.15%
Instrs: 154400 -> 154388 (-0.01%); split: -0.13%, +0.12%
Latency: 5021791 -> 5033485 (+0.23%); split: -0.67%, +0.90%
InvThroughput: 2486454 -> 2492312 (+0.24%); split: -0.67%, +0.91%
VClause: 4763 -> 4755 (-0.17%); split: -0.52%, +0.36%
Copies: 42866 -> 42965 (+0.23%); split: -0.25%, +0.48%
Branches: 5640 -> 5639 (-0.02%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576 >
2021-12-31 14:52:14 +00:00
Daniel Schürmann
1502c22e2c
aco: don't allow SDWA on VOP3P instructions
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576 >
2021-12-31 14:52:14 +00:00
Timur Kristóf
8d238f5581
aco: Export per-primitive mesh shader output attributes.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580 >
2021-12-31 13:05:09 +00:00
Timur Kristóf
fc1424f1d8
aco: Use the correct outinfo for mesh shaders.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580 >
2021-12-31 13:05:09 +00:00
Timur Kristóf
92556d6067
aco: Add 1D workgroup_id support for mesh shaders.
...
I'll add support for 3D workgroup_id later, but NV_mesh_shader only
supports 1D workgroups.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580 >
2021-12-31 13:05:09 +00:00
Timur Kristóf
7759323b75
aco: Update README about NGG and mesh shaders.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580 >
2021-12-31 13:05:09 +00:00
Timur Kristóf
6766e6a985
aco: Add Mesh and Task shader stages.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580 >
2021-12-31 13:05:09 +00:00
Timur Kristóf
b293299776
aco/optimizer_postRA: Fix applying VCC to branches.
...
Fixes: a93092d0ed
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14281 >
2021-12-21 22:53:23 +00:00
Timur Kristóf
ce4daa259c
aco/optimizer_postRA: Fix combining DPP into VALU.
...
Fixes: 4ac47ad1cd
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14281 >
2021-12-21 22:53:23 +00:00
Daniel Schürmann
d36a43598c
aco/ra: fix get_reg_for_operand() in case of stride mismatches
...
We have to clear the previous operand from the register file,
as otherwise there might be no space left.
Totals from 5 (0.00% of 134572) affected shaders: (GFX10.3)
CodeSize: 21144 -> 21000 (-0.68%); split: -0.72%, +0.04%
Instrs: 3738 -> 3720 (-0.48%); split: -0.51%, +0.03%
Latency: 517229 -> 516319 (-0.18%); split: -0.18%, +0.00%
InvThroughput: 49068 -> 48902 (-0.34%); split: -0.38%, +0.04%
Copies: 501 -> 483 (-3.59%); split: -3.79%, +0.20%
Cc: mesa-stable
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14279 >
2021-12-21 17:15:45 +00:00
Daniel Schürmann
30a7199e37
aco/optimizer: propagate and fold inline constants on VOP3P instructions
...
This patch aims to propagate and fold constants on VOP3P instructions
by using opsel and the fneg modifier.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13688 >
2021-12-21 13:23:36 +01:00
Daniel Schürmann
62bcfcd0a8
aco: change fneg for VOP3P to use fmul with +1.0
...
This will be useful to be able to also apply
fneg_lo and fneg_hi.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13688 >
2021-12-21 13:23:36 +01:00
Daniel Schürmann
193bd740ab
aco/optimizer: fix fneg modifier propagation on VOP3P
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13688 >
2021-12-21 13:23:36 +01:00
Rhys Perry
fa4e08112e
aco: remove SMEM constant/addition combining out of the loop
...
There's no reason for this optimization to be in this loop.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13755 >
2021-12-17 22:14:36 +00:00
Rhys Perry
dd18925f86
aco: skip &-4 before SMEM
...
The hardware ignores the low 2 bits. I'm not sure if they are ignored
before or after the address is calculated, but this optimization should be
cautious enough.
fossil-db (Sienna Cichlid):
Totals from 259 (0.19% of 134572) affected shaders:
SpillSGPRs: 1381 -> 1382 (+0.07%)
SpillVGPRs: 1783 -> 1782 (-0.06%); split: -0.67%, +0.62%
CodeSize: 1598612 -> 1596084 (-0.16%); split: -0.30%, +0.14%
Scratch: 180224 -> 179200 (-0.57%); split: -1.14%, +0.57%
Instrs: 284885 -> 284268 (-0.22%); split: -0.34%, +0.12%
Latency: 6585634 -> 6603388 (+0.27%); split: -0.48%, +0.75%
InvThroughput: 2638983 -> 2648474 (+0.36%); split: -0.58%, +0.94%
VClause: 6797 -> 6820 (+0.34%); split: -0.15%, +0.49%
SClause: 6569 -> 6574 (+0.08%); split: -1.11%, +1.19%
Copies: 50561 -> 50586 (+0.05%); split: -0.61%, +0.66%
Branches: 10058 -> 10062 (+0.04%); split: -0.01%, +0.05%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13755 >
2021-12-17 22:14:36 +00:00
Rhys Perry
cf5fc4b973
aco: disallow SMEM offsets that are not multiples of 4
...
These can't be encoded on GFX6/7, and combining these additions causes
CTS failures on GFX10.3.
I think the low 2 bits are ignored before the addition, not after, so
load(a + 3, 0) becomes load(a, 3), which is the same as load(a, 0).
No fossil-db changes.
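The two possible hardware behaviours discussed in this message can be
compared with a little arithmetic. This is a sketch of the hypothesis
only, not a description of confirmed hardware behaviour: both function
names are hypothetical, and "mask before" models the guess that the low
2 bits of each input are dropped before the addition.

```python
def address_mask_after(base: int, offset: int) -> int:
    """Model: add first, then ignore the low 2 bits of the sum."""
    return (base + offset) & ~3


def address_mask_before(base: int, offset: int) -> int:
    """Model: ignore the low 2 bits of each input, then add."""
    return (base & ~3) + (offset & ~3)


a = 1  # a base address whose low 2 bits are not zero

# Original form: load(a + 3, 0). The addition happens in the shader.
uncombined = address_mask_before(a + 3, 0)

# After combining the addition into the offset: load(a, 3).
combined = address_mask_before(a, 3)

# Under the "mask after" model the combined form would have been fine.
combined_if_masked_after = address_mask_after(a, 3)
```

Under the "mask before" model the combined load reads from a different
address than the original one, which is consistent with disallowing
offsets that are not multiples of 4.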
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13755 >
2021-12-17 22:14:36 +00:00
Rhys Perry
94603786c5
aco: fix check_vop3_operands() for f16vec2 ffma fneg combine
...
For v_pk_fma_f16, we should consider all three operands, not just the
first two.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 15a375b4c8 ("radv,aco: don't lower some ffma instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14229 >
2021-12-17 11:16:12 +00:00
Samuel Pitoiset
5ce4017a2b
radv,aco: do not disable anisotropy filtering for non-mipmap images
...
This fixes
dEQP-VK.texture.filtering_anisotropy.single_level.anisotropy_*.mag_linear_min_linear.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14171 >
2021-12-16 07:20:50 +00:00
Rhys Perry
165ca5088b
radv,aco: implement nir_op_ffma
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805 >
2021-12-13 11:22:33 +00:00
Rhys Perry
c5f02a1cd3
aco: swap multiplication operands if needed to create v_fmac_f32/etc
...
For v_pk_fma_f32 and v_fma_f32 from nir_op_ffma, we don't try to put
scalars in the first operand.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805 >
2021-12-13 11:22:33 +00:00
Rhys Perry
f4f5d577fc
aco: swap operands if necessary to create v_madak/v_fmaak
...
Also rewrite the check_literal logic to be more straightforward.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805 >
2021-12-13 11:22:33 +00:00