Georg Lehmann
d0eebb0e8b
aco: access neg/abs as int in usesModifiers
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21766 >
2023-03-09 14:15:14 +00:00
Georg Lehmann
828aff2a2d
aco: use array indexing for opsel/opsel_lo/opsel_hi
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21766 >
2023-03-09 14:15:13 +00:00
Georg Lehmann
a47c3f84fb
aco: use integer access for neg_lo/neg_hi
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21766 >
2023-03-09 14:15:13 +00:00
Georg Lehmann
60cd3ba39f
aco: copy abs/neg with assignment
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21766 >
2023-03-09 14:15:13 +00:00
Georg Lehmann
0614c2e8bd
aco: don't reallocate fma{mk,ak,_mix} instruction
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21762 >
2023-03-08 18:42:21 +00:00
Georg Lehmann
a4873071e6
aco/optimizer: don't reallocate instruction when converting to VOP3
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21762 >
2023-03-08 18:42:21 +00:00
Daniel Schürmann
41ae2d0725
radv/rt: use terminate() when returning from raygen shaders
...
Q2RTX stats:
Totals from 7 (0.01% of 134913) affected shaders:
CodeSize: 204712 -> 204744 (+0.02%); split: -0.06%, +0.07%
Instrs: 37526 -> 37522 (-0.01%); split: -0.07%, +0.06%
Latency: 950563 -> 956024 (+0.57%)
InvThroughput: 187915 -> 188977 (+0.57%)
Copies: 4829 -> 4763 (-1.37%)
Branches: 1570 -> 1583 (+0.83%)
PreSGPRs: 407 -> 400 (-1.72%)
PreVGPRs: 614 -> 617 (+0.49%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21736 >
2023-03-08 16:59:41 +00:00
Daniel Schürmann
cd1e5b1858
aco: fix NIR infinite loops
...
The previous solution breaks potential loop header phis.
Move the dummy-break to the bottom of the loop.
Fixes: dEQP-VK.reconvergence.subgroup_uniform_control_flow_ballot.*
Fixes: a9c4a31d8d ('aco: handle NIR loops without breaks')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21736 >
2023-03-08 16:59:41 +00:00
Timur Kristóf
87de5b2b9e
aco: Don't include headers from radv.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21696 >
2023-03-08 04:39:18 +00:00
Timur Kristóf
a0141c6308
aco, radv: Don't use radv_shader_args in aco.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21696 >
2023-03-08 04:39:18 +00:00
Timur Kristóf
e9793331db
aco, radv: Move PS epilog and VS prolog args to their info structs.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21696 >
2023-03-08 04:39:18 +00:00
Timur Kristóf
84a2cea596
aco, radv: Rename aco_*_key to aco_*_info.
...
The naming of aco_*_key didn't make sense because they
were never actually used as cache keys, only radv_*_key
are used as cache keys.
Rename the aco structs to aco_*_info instead.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21696 >
2023-03-08 04:39:18 +00:00
Qiang Yu
91e68db0e1
aco, radv: Move is_trap_handler_shader to aco info.
...
v2 by Timur Kristóf:
- Rebase this patch on latest main.
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21696 >
2023-03-08 04:39:18 +00:00
Qiang Yu
978220c99a
aco, radv: Add load_grid_size_from_user_sgpr to aco options.
...
v2 by Timur Kristóf:
- Rebase this patch.
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21696 >
2023-03-08 04:39:18 +00:00
Timur Kristóf
3058ab6090
aco: Generalize vs_inputs to args_pending_vmem.
...
Handle arguments that need a waitcnt without relying on
RADV specific VS input information.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21696 >
2023-03-08 04:39:18 +00:00
Georg Lehmann
57557e8815
aco/assembler/gfx11: simplify 16bit VOP12C promotion to VOP3
...
With the shared struct for modifiers, this can be a lot cleaner now.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21761 >
2023-03-07 22:38:39 +00:00
Marek Olšák
f7076d129d
amd: add nir_intrinsic_xfb_counter_sub_amd and fix overflowed streamout offsets
...
Fixes: 5ec79f9899 - ac/nir/ngg: nogs support streamout
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21584 >
2023-03-07 22:08:47 +00:00
Georg Lehmann
de4805f25f
aco: use bitfield array helpers for valu modifiers
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21023 >
2023-03-07 11:53:23 +00:00
Georg Lehmann
e7559da757
aco: add bitfield array helper classes
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21023 >
2023-03-07 11:53:23 +00:00
Georg Lehmann
097a97cc42
aco: remove VOP[123C]P? structs
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21023 >
2023-03-07 11:53:23 +00:00
Georg Lehmann
08542318e7
aco/optimizer: simplify using VALU instruction
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21023 >
2023-03-07 11:53:23 +00:00
Georg Lehmann
4591703e79
aco/print_ir: simplify using VALU instruction
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21023 >
2023-03-07 11:53:23 +00:00
Georg Lehmann
17ff2e8c52
aco: validate VALU modifiers
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21023 >
2023-03-07 11:53:23 +00:00
Georg Lehmann
fc193ab4db
aco/ra: set opsel_hi to zero when converting to VOP2
...
Otherwise the new modifier validation will fail.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21023 >
2023-03-07 11:53:23 +00:00
Georg Lehmann
366cf4efaa
aco/ir: rework IR to have one common valu instruction struct
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21023 >
2023-03-07 11:53:23 +00:00
Georg Lehmann
77afe7d960
aco: treat VINTERP_INREG as VALU
...
It's just v_fma with fixed DPP8 and builtin s_waitcnt_expcnt, so it can mostly
be handled as a pure VALU instruction.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21023 >
2023-03-07 11:53:23 +00:00
Rhys Perry
8aff7152a0
aco: make IDSet sparse
...
Improves compilation time of huge shaders.
A ray tracing pipeline of Hellblade: Senua's Sacrifice compiles in about
half the time with this patch.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8179
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21022 >
2023-03-03 17:45:14 +00:00
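The idea behind a sparse ID set can be sketched as follows. This is only an illustration under stated assumptions, not the ACO implementation: the name SparseIdSet, the 64-bit word size and the use of std::map are all hypothetical. The point it shows is that storage grows with the number of populated words rather than with the largest ID, which is what matters for huge shaders with many temporaries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical sketch of a sparse set of IDs. Only 64-bit words that
// contain at least one member are stored, keyed by (id / 64).
struct SparseIdSet {
   std::map<uint32_t, uint64_t> words;

   // Returns true if the id was newly added.
   bool insert(uint32_t id)
   {
      uint64_t& word = words[id / 64];
      const uint64_t bit = uint64_t(1) << (id % 64);
      const bool added = (word & bit) == 0;
      word |= bit;
      return added;
   }

   bool contains(uint32_t id) const
   {
      auto it = words.find(id / 64);
      return it != words.end() && ((it->second >> (id % 64)) & 1) != 0;
   }

   // Number of words actually stored, independent of the largest id.
   std::size_t allocated_words() const { return words.size(); }
};
```

With two members that are far apart (for example 3 and 1000000), this sketch stores two words, where a dense bitset would need storage proportional to the largest ID.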
Rhys Perry
736d6643bb
aco/tests: add tests for v_fma_f32 with 2 fp16 literals
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21633 >
2023-03-03 14:20:55 +00:00
Marek Olšák
6aee999131
aco: implement nir_op_unpack_32_4x8
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19399 >
2023-03-03 03:27:40 +00:00
Georg Lehmann
ede0630f9e
aco: use v_fma_mix_f32 for v_fma_f32 with 2 fp16 representable, different literals
...
We can pack two fp16 literals into one 32bit literal and use opsel to select
the correct value. Note that LLVM currently disassembles these instructions
incorrectly.
Foz-DB Navi21:
Totals from 13365 (9.91% of 134913) affected shaders:
VGPRs: 840880 -> 840016 (-0.10%); split: -0.11%, +0.01%
SpillSGPRs: 724 -> 722 (-0.28%)
CodeSize: 82439364 -> 82451336 (+0.01%); split: -0.06%, +0.08%
MaxWaves: 244858 -> 244980 (+0.05%)
Instrs: 15265976 -> 15247201 (-0.12%); split: -0.13%, +0.01%
Latency: 223316180 -> 223272495 (-0.02%); split: -0.03%, +0.02%
InvThroughput: 41981375 -> 41969917 (-0.03%); split: -0.04%, +0.01%
VClause: 266775 -> 266558 (-0.08%); split: -0.14%, +0.06%
SClause: 646602 -> 645996 (-0.09%); split: -0.16%, +0.07%
Copies: 794703 -> 776075 (-2.34%); split: -2.46%, +0.12%
Branches: 296317 -> 296316 (-0.00%)
PreSGPRs: 658796 -> 656479 (-0.35%); split: -0.35%, +0.00%
PreVGPRs: 744014 -> 743679 (-0.05%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20587 >
2023-03-02 10:59:05 +00:00
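The packing described in the commit above can be sketched as follows. This is a simplified model, not ACO code: the function names are hypothetical, the fp16 values are given directly as bit patterns, and the way opsel is actually encoded for v_fma_mix_f32 is not modeled. It only shows that two different 16-bit literals fit into one 32-bit literal and that a one-bit selector recovers either of them.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: pack two fp16 bit patterns into one 32-bit literal,
// with 'lo' in bits 0..15 and 'hi' in bits 16..31.
constexpr uint32_t pack_fp16_literals(uint16_t lo, uint16_t hi)
{
   return uint32_t(lo) | (uint32_t(hi) << 16);
}

// Hypothetical model of an opsel-style selector: false reads the low half,
// true reads the high half of the packed literal.
constexpr uint16_t select_half(uint32_t literal, bool opsel)
{
   return opsel ? uint16_t(literal >> 16) : uint16_t(literal & 0xffffu);
}

// fp16 bit patterns: 1.0 is 0x3C00 and 2.0 is 0x4000.
static_assert(pack_fp16_literals(0x3C00, 0x4000) == 0x40003C00u,
              "both literals share one 32-bit word");
```

Two operands can then reference the same 32-bit literal and differ only in the selector bit, which is why a single literal slot is enough for two distinct constants.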
Georg Lehmann
ed349951cb
aco: mark mad definition as precise if the mul/add were precise
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20587 >
2023-03-02 10:59:05 +00:00
Qiang Yu
4b3a22fcd4
aco: only ls and ps use store output now
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21435 >
2023-02-28 07:19:29 +00:00
Georg Lehmann
e1eabab6fe
aco/optimizer_postRA: assume all registers are untrackable in loop headers
...
Register writes from the pre-header might not be correct for any but
the first loop iteration because they can be clobbered inside the loop.
Foz-DB Navi21:
Totals from 18 (0.01% of 134913) affected shaders:
CodeSize: 251384 -> 251508 (+0.05%)
Instrs: 47644 -> 47664 (+0.04%)
Latency: 801801 -> 801852 (+0.01%)
InvThroughput: 177579 -> 177593 (+0.01%)
Copies: 4752 -> 4771 (+0.40%)
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8376
Fixes: d3b0f78110 ("aco/optimizer_postRA: Initialize loop header with preheader information")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21540 >
2023-02-28 04:27:05 +00:00
Rhys Perry
75d9a4a6ce
aco: always update orig_names in get_reg_phi()
...
No idea why this wasn't done if pc.first was a renamed temporary.
Fixes navi10 RA validation error with
dEQP-VK.binding_model.descriptor_buffer.multiple.graphics_geom_buffers1_sets3_imm_samplers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8349
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21501 >
2023-02-27 15:10:22 +00:00
Georg Lehmann
1c5c2f77c3
aco: use and swizzle mask in dpp quad perm
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21412 >
2023-02-27 11:09:42 +00:00
Georg Lehmann
8fabde3be4
aco/gfx11: use dpp_row_xmask and dpp_row_share
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21412 >
2023-02-27 11:09:42 +00:00
Georg Lehmann
b7cd0eb439
aco: use v_permlane(x)16_b32 for masked swizzle
...
Should be cheaper than ds_swizzle.
Totals from 8 (0.01% of 134913) affected shaders:
CodeSize: 16316 -> 16388 (+0.44%)
Instrs: 3088 -> 3086 (-0.06%)
Latency: 49558 -> 49508 (-0.10%)
InvThroughput: 9180 -> 9198 (+0.20%)
Copies: 376 -> 384 (+2.13%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21412 >
2023-02-27 11:09:42 +00:00
Rhys Perry
94abccf3ce
aco: fix pathological case in LdsDirectVALUHazard
...
Similar to bfd4ac4581.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 296b4d95a3 ("aco/gfx11: workaround LdsDirectVALUHazard")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21423 >
2023-02-22 20:46:12 +00:00
Georg Lehmann
ee47cc8256
amd,nir: remove byte_permute_amd intrinsic
...
It's unused and if we ever want to use it again we should make it an alu
opcode instead.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21445 >
2023-02-22 20:13:52 +00:00
Rhys Perry
ab3184c0a2
aco: don't apply modifiers through DPP to unsupported instructions
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21201 >
2023-02-21 14:59:38 +00:00
Georg Lehmann
3bd5b583f9
aco: combine a ^ ~b and ~(a ^ b) to v_xnor_b32
...
Foz-DB Navi21:
Totals from 13 (0.01% of 134913) affected shaders:
CodeSize: 225432 -> 225180 (-0.11%)
Instrs: 41973 -> 41908 (-0.15%)
Latency: 297464 -> 297326 (-0.05%)
InvThroughput: 82536 -> 82467 (-0.08%)
Copies: 2452 -> 2440 (-0.49%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21410 >
2023-02-21 13:35:31 +00:00
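The two patterns named in the commit subject above are the same bitwise function, which is what makes the combine valid. A minimal check, assuming v_xnor_b32 computes the bitwise NOT of XOR on 32-bit operands (the function names are hypothetical and only model the arithmetic):

```cpp
#include <cassert>
#include <cstdint>

// Assumed reference semantics of v_xnor_b32: ~(a ^ b).
constexpr uint32_t xnor_b32(uint32_t a, uint32_t b)
{
   return ~(a ^ b);
}

// First pattern from the commit subject: a ^ ~b.
constexpr uint32_t pattern_not_operand(uint32_t a, uint32_t b)
{
   return a ^ ~b;
}

// Inverting one XOR operand flips every result bit, so both forms agree.
static_assert(pattern_not_operand(0x0f0f0f0fu, 0x12345678u) ==
                 xnor_b32(0x0f0f0f0fu, 0x12345678u),
              "a ^ ~b equals ~(a ^ b)");
```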
Daniel Schürmann
2bb369dd8d
nir: add assertions that loops don't have a Continue Construct
...
Hoping that I didn't miss any, this *should* add assertions
to all functions and passes which explicitly handle 'nir_loop'.
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962 >
2023-02-21 10:41:11 +00:00
Timur Kristóf
2c40215ab9
aco/optimizer: Change v_cmp with subgroup invocation to constant.
...
When a shader has a comparison with the subgroup invocation id,
we can use a constant instead, saving a VALU instruction.
When the constant can't be represented as a 64-bit literal,
use the s_bfm_b64 instruction to generate it instead, which
is still a win.
Fossil DB stats on GFX11:
Totals from 300 (0.22% of 134913) affected shaders:
CodeSize: 2223052 -> 2214336 (-0.39%); split: -0.43%, +0.04%
Instrs: 430216 -> 429882 (-0.08%); split: -0.14%, +0.06%
Latency: 5881180 -> 5878181 (-0.05%); split: -0.05%, +0.00%
InvThroughput: 731846 -> 729293 (-0.35%)
Copies: 31662 -> 31847 (+0.58%); split: -0.03%, +0.61%
Branches: 8241 -> 8100 (-1.71%)
PreVGPRs: 15788 -> 15786 (-0.01%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20843 >
2023-02-18 21:16:58 +01:00
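The reasoning in the commit above can be illustrated with a small model. Comparing the subgroup invocation id against a constant gives a lane mask that is known at compile time, and a contiguous run of set bits is what a bitfield-mask instruction produces. The sketch below assumes a 64-lane wave and assumes s_bfm_b64 computes ((1 << size) - 1) << offset with both inputs taken modulo 64. The function names are hypothetical and this is not the optimizer code.

```cpp
#include <cassert>
#include <cstdint>

// Assumed model of s_bfm_b64: 'size' consecutive set bits starting at 'offset'.
constexpr uint64_t bfm_b64(uint32_t size, uint32_t offset)
{
   return ((uint64_t(1) << (size & 63)) - 1) << (offset & 63);
}

// Mask that "lane_id < n" would produce, computed lane by lane.
constexpr uint64_t mask_lane_lt(uint32_t n)
{
   uint64_t mask = 0;
   for (uint32_t lane = 0; lane < 64; lane++)
      if (lane < n)
         mask |= uint64_t(1) << lane;
   return mask;
}

// Mask that "lane_id >= n" would produce, computed lane by lane.
constexpr uint64_t mask_lane_ge(uint32_t n)
{
   uint64_t mask = 0;
   for (uint32_t lane = 0; lane < 64; lane++)
      if (lane >= n)
         mask |= uint64_t(1) << lane;
   return mask;
}
```

In this model, for n between 1 and 63, "lane_id < n" matches bfm_b64(n, 0) and "lane_id >= n" matches bfm_b64(64 - n, n), so one scalar instruction can stand in for the per-lane comparison.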
Daniel Schürmann
b338d59047
radv: unconditionally enable scratch for RT shaders
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21159 >
2023-02-16 19:37:25 +00:00
Timur Kristóf
084d10a702
aco: Remove MTBUF zero operand.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21363 >
2023-02-16 17:16:34 +00:00
Timur Kristóf
afdacf4dcc
aco: Don't set scalar offset on buffer load instructions when it's zero.
...
This helps generate slightly more optimal instructions.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21363 >
2023-02-16 17:16:34 +00:00
Timur Kristóf
4621ffdec1
aco: Get rid of redundant load_vmem_mubuf function.
...
Call emit_load directly from visit_load_buffer instead.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21358 >
2023-02-16 15:29:37 +00:00
Timur Kristóf
881c52ba19
ac: Port ACO's get_fetch_format to ac_get_safe_fetch_size.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21358 >
2023-02-16 15:29:36 +00:00
Georg Lehmann
4fbcd046ce
aco: Don't use vcmpx with DPP.
...
V_CMPX+DPP returns 0 with reads from disabled lanes, unlike V_CMP+DPP (RDNA3 ISA doc, 7.7)
Fixes: baab6f18c9 ("aco: Optimize branching sequence during SSA elimination.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20537 >
2023-02-14 19:15:17 +00:00
Georg Lehmann
281a505ef0
aco: new 16bit VOP3 opcodes can use opsel
...
No Foz-DB changes on gfx11.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20705 >
2023-02-14 16:14:55 +00:00