Commit graph

3472 commits

Author SHA1 Message Date
Daniel Schürmann
28ab7f0168 aco/jump_threading: remove branch sequence optimization
This optimization gets applied during postRA optimization, now.

No fossil changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32330>
2024-12-12 08:11:22 +00:00
Daniel Schürmann
fcd94a8ca7 aco: move try_optimize_branching_sequence() to postRA optimizations
Totals from 196 (0.25% of 79206) affected shaders: (Navi31)

Instrs: 534343 -> 534438 (+0.02%); split: -0.00%, +0.02%
CodeSize: 2774852 -> 2775420 (+0.02%); split: -0.00%, +0.02%
Latency: 7103512 -> 7103021 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 959477 -> 959447 (-0.00%)
Copies: 42646 -> 42648 (+0.00%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32330>
2024-12-12 08:11:21 +00:00
Daniel Schürmann
95d44c7ce0 aco/optimizer_postRA: set branch()->never_taken if exec is constant non-zero
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32330>
2024-12-12 08:11:21 +00:00
Daniel Schürmann
d67932f69e aco/print_ir: don't print disconnected empty blocks
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32330>
2024-12-12 08:11:21 +00:00
Daniel Schürmann
22881712c8 aco/assembler: Don't emit target basic block index when chaining branches
This could erroneously cause an assertion to fail if the
target block index was larger than UINT16_MAX.

Fixes: cab5639a09 ('aco/assembler: chain branches instead of emitting long jumps')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32599>
2024-12-11 23:28:55 +00:00
Georg Lehmann
65506e635b aco/ra: don't write to scc/ttmp with s_fmac
Fixes: 4bd229ac50 ("aco/gfx11.5: select SOP2 float instructions")

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32545>
2024-12-11 12:51:18 +00:00
Georg Lehmann
0b9e2a5427 aco/ra: disallow s_cmpk with scc operand
Fixes: 2d6b0a4177 ("aco/optimizer: Optimize SOPC with literal to SOPK.")

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32545>
2024-12-11 12:51:18 +00:00
Georg Lehmann
fe0c72caec aco/ra: don't write to exec/ttmp with mulk/addk/cmovk
ttmp sgprs are readonly outside of trap handlers, so the instructions were
probably skipped. RA should also never create additional exec writes.

Fixes: e06773281b ("aco/ra: Optimize some SOP2 instructions with literal to SOPK.")

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32545>
2024-12-11 12:51:18 +00:00
Georg Lehmann
576a2e798c aco/gfx12: don't assume memory operations complete in order
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32569>
2024-12-11 12:22:59 +00:00
Samuel Pitoiset
553eb1a3fd radv: fix alpha-to-coverage with alpha-to-one when MRTZ is also exported
On AMD hardware, it's possible to export a separate alpha channel for
applying alpha-to-one after alpha-to-coverage and not before.

On GFX11+, it's already mostly supported but alpha needs to be exported
to MRTZ.a and one to MRT0.a. The hw always uses alpha for
alpha-to-coverage from MRTZ.a.

On older generations, the driver needs the same separate alpha export
but it also needs to configure the hardware with COVERAGE_TO_MASK_ENABLE
which selects alpha from MRTZ.a.

This should fix alpha-to-coverage with alpha-to-one when either
depth, stencil or samplemask are exported but it still needs a slightly
different solution without MRTZ. I will fix that later.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32523>
2024-12-11 10:50:31 +00:00
Samuel Pitoiset
70047e6bd6 aco: export alpha to MRTZ.a and one to MRT0.a for alpha-to-one on GFX11
For FS epilogs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32523>
2024-12-11 10:50:31 +00:00
Georg Lehmann
4a977ea24f aco/gfx11+: use v_and_b32 to extract local id 0
Foz-DB Navi31:
Totals from 2561 (3.23% of 79206) affected shaders:
CodeSize: 10399004 -> 10389120 (-0.10%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32532>
2024-12-10 11:58:21 +00:00
Daniel Schürmann
b64fff7731 aco: remove definition from Pseudo branch instructions
They are not needed anymore.

Totals from 7019 (8.84% of 79395) affected shaders: (Navi31)

Instrs: 14805400 -> 14824196 (+0.13%); split: -0.00%, +0.13%
CodeSize: 78079972 -> 78132932 (+0.07%); split: -0.01%, +0.08%
SpillSGPRs: 4485 -> 4515 (+0.67%); split: -0.76%, +1.43%
Latency: 165862000 -> 165836134 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 30061764 -> 30057781 (-0.01%); split: -0.01%, +0.00%
SClause: 392323 -> 392286 (-0.01%); split: -0.01%, +0.00%
Copies: 1012262 -> 1012234 (-0.00%); split: -0.04%, +0.04%
Branches: 365910 -> 365909 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 360167 -> 355363 (-1.33%)
VALU: 8837197 -> 8837276 (+0.00%); split: -0.00%, +0.00%
SALU: 1402593 -> 1402621 (+0.00%); split: -0.03%, +0.03%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32037>
2024-12-06 14:34:03 +00:00
Daniel Schürmann
7e4687fd04 aco: remove definition from SOPP branch instructions
Totals from 17942 (22.60% of 79395) affected shaders: (Navi31)

Instrs: 20334063 -> 20312676 (-0.11%); split: -0.11%, +0.00%
CodeSize: 108458732 -> 108377540 (-0.07%); split: -0.08%, +0.00%
Latency: 180510540 -> 180479666 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 28079325 -> 28077938 (-0.00%); split: -0.01%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32037>
2024-12-06 14:34:03 +00:00
Daniel Schürmann
cab5639a09 aco/assembler: chain branches instead of emitting long jumps
As regular branch instructions cannot jump further than
32768 dwords, previously we used long jumps as fallback
solution. The disadvantage of that is that an extra SGPR
pair must be provided in order to temporarily store the PC.

This patch changes that to chained branch instructions by
inserting an artificial extra block into the code to be
targeted by the original branch. This block contains a
single branch instruction jumping to the original target.
Before the block, if necessary, we insert a <branch 1>
instruction for the existing code in order to jump over
the newly inserted block.

Only a few RT shaders are affected.

Totals from 29 (0.04% of 79395) affected shaders: (Navi31)

CodeSize: 17281176 -> 17276332 (-0.03%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32037>
2024-12-06 14:34:03 +00:00
Daniel Schürmann
c3d777d8ac aco/assembler: change ctx.loop_header to uint32_t instead of Block*
We are about to add new blocks during assembly which makes
pointers into a vector unreliable.
Also, only set it if the loop has no back-edge.

Totals from 126 (0.16% of 79206) affected shaders: (Navi31)

CodeSize: 1486152 -> 1488152 (+0.13%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32037>
2024-12-06 14:34:03 +00:00
Daniel Schürmann
592f3fd994 aco/assembler: Actually insert s_inst_prefetch instructions when aligning blocks for loops
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32037>
2024-12-06 14:34:03 +00:00
Daniel Schürmann
b92afdbd28 aco/assembler: constify assembly functions
Ensure that instruction formats and special operands
are not manipulated during assembly.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32037>
2024-12-06 14:34:03 +00:00
Daniel Schürmann
3a02bbd916 aco/print_asm: allow for empty blocks with arbitrary offsets
We will add empty blocks at the end of the shader,
in order to store some branch offset information.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32037>
2024-12-06 14:34:03 +00:00
Rhys Perry
ab26b99c2c aco: don't CSE p_shader_cycles_hi_lo_hi
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: fae2a85d57 ("aco/gfx12: implement subgroup shader clock")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12243
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32500>
2024-12-06 14:06:05 +00:00
Georg Lehmann
b2464e3609 aco/gfx12+: do not use v_pack_b32_f16 to pack untyped data
GFX12 removed IEEE_MODE, and made its signalling NaN quieting the default.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12251
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32502>
2024-12-06 12:33:05 +00:00
Georg Lehmann
7425e71ae0 aco/gfx12: disable vinterp ddx/ddy optimization
This only seems to work on gfx11 and gfx11.5, and it's only faster on gfx11.5.

We could continue to use vinterp, with constants copied to vgprs, but
whether that's beneficial depends on the shader.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

Fixes: bee487df48 ("aco/gfx11.5+: use vinterp for fddx/fddy")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12250
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32495>
2024-12-06 12:01:39 +00:00
Rhys Perry
fd19ff0b9e aco: force linear for event_vmem_sample and event_vmem_bvh
I don't know if this issue affects GFX12, but workaround it anyway to be
safe.

fossil-db (gfx1200):
Totals from 3463 (4.36% of 79395) affected shaders:
Instrs: 9794280 -> 9833253 (+0.40%); split: -0.00%, +0.40%
CodeSize: 52306040 -> 52457988 (+0.29%); split: -0.01%, +0.30%
Latency: 90549385 -> 93617517 (+3.39%); split: -0.00%, +3.39%
InvThroughput: 13189030 -> 13602942 (+3.14%); split: -0.00%, +3.14%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32373>
2024-12-02 10:13:39 +00:00
Rhys Perry
86c63b29bc aco/gfx12: insert wait between VMEM WaW
https://github.com/llvm/llvm-project/pull/105549

fossil-db (gfx1200):
Totals from 1783 (2.25% of 79395) affected shaders:
Instrs: 7398391 -> 7404566 (+0.08%); split: -0.00%, +0.08%
CodeSize: 38862456 -> 38886364 (+0.06%); split: -0.00%, +0.06%
Latency: 83191513 -> 84211504 (+1.23%); split: -0.00%, +1.23%
InvThroughput: 15185936 -> 15345744 (+1.05%); split: -0.01%, +1.06%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32373>
2024-12-02 10:13:39 +00:00
Timur Kristóf
e2b8c4a9ac radv, aco: Consolidate num_interp + num_prim_interp into num_inputs.
num_inputs contains the total number of FS inputs.

Note that this also fixes a bug where some calculations in RADV
and ACO were missing the per-primitive attributes from the LDS
usage of PS.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32220>
2024-11-28 18:14:57 +00:00
Rhys Perry
4c3809e7fc aco: use small_vec in RegCounterMap
This seems to be a little faster.

insert_NOPs (navi31):
Difference at 95.0% confidence
	-11.484 +/- 6.13377
	-1.62767% +/- 0.860593%
	(Student's t, pooled s = 5.71913)

insert_NOPs (gfx1200):
Difference at 95.0% confidence
	-35.6745 +/- 4.97972
	-8.1236% +/- 1.10453%
	(Student's t, pooled s = 4.6431)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32374>
2024-11-28 17:07:34 +00:00
Rhys Perry
7a500c8b22 aco: make small_vec copyable
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32374>
2024-11-28 17:07:34 +00:00
Georg Lehmann
fd669fa69d aco/optimizer: label fcanonicalize like a copy if there is nothing to flush
Allows copy propagation into non alu instructions like phis.

Foz-DB Navi21:
Totals from 138 (0.17% of 79395) affected shaders:
Instrs: 308135 -> 307792 (-0.11%); split: -0.12%, +0.01%
CodeSize: 1567924 -> 1566484 (-0.09%); split: -0.10%, +0.01%
VGPRs: 9696 -> 9720 (+0.25%)
Latency: 1162719 -> 1161663 (-0.09%); split: -0.10%, +0.00%
InvThroughput: 256944 -> 256590 (-0.14%); split: -0.15%, +0.01%
VClause: 5631 -> 5626 (-0.09%); split: -0.14%, +0.05%
Copies: 29962 -> 30028 (+0.22%); split: -0.10%, +0.32%
Branches: 8241 -> 8237 (-0.05%)
PreVGPRs: 7800 -> 7797 (-0.04%)
VALU: 216243 -> 215898 (-0.16%); split: -0.17%, +0.01%
SALU: 30768 -> 30767 (-0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32355>
2024-11-27 15:14:31 +00:00
Rhys Perry
63b0692eac aco: don't use uniform continues if exec might be empty
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31143>
2024-11-25 10:32:59 +00:00
Rhys Perry
aa0ede751d aco/tests: add tests for empty exec masks
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31143>
2024-11-25 10:32:59 +00:00
Rhys Perry
f35e229fae aco: skip code if exec is empty
This is safer and potentially faster.

fossil-db (navi21):
Totals from 690 (0.87% of 79395) affected shaders:
Instrs: 4534778 -> 4535916 (+0.03%)
CodeSize: 25268516 -> 25272080 (+0.01%); split: -0.00%, +0.01%
Latency: 48482721 -> 48513907 (+0.06%); split: -0.00%, +0.07%
InvThroughput: 13213965 -> 13217828 (+0.03%); split: -0.00%, +0.03%
Copies: 432307 -> 432295 (-0.00%); split: -0.05%, +0.04%
Branches: 187305 -> 188249 (+0.50%)
VALU: 2904490 -> 2904508 (+0.00%); split: -0.00%, +0.00%
SALU: 674962 -> 675133 (+0.03%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31143>
2024-11-25 10:32:59 +00:00
Rhys Perry
f00c3a14c0 aco: require WQM after demote in control flow
fossil-db (navi21):
Totals from 424 (0.53% of 79395) affected shaders:
Instrs: 404496 -> 404752 (+0.06%); split: -0.07%, +0.13%
CodeSize: 2150608 -> 2151616 (+0.05%); split: -0.05%, +0.09%
Latency: 9124298 -> 9115957 (-0.09%); split: -0.12%, +0.03%
InvThroughput: 1883570 -> 1883468 (-0.01%); split: -0.01%, +0.00%
VClause: 6832 -> 6830 (-0.03%)
SClause: 13801 -> 13778 (-0.17%); split: -0.17%, +0.01%
Copies: 26758 -> 26673 (-0.32%); split: -0.44%, +0.12%
Branches: 9819 -> 9567 (-2.57%)
PreSGPRs: 17902 -> 17934 (+0.18%)
SALU: 45407 -> 45906 (+1.10%); split: -0.01%, +1.11%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31143>
2024-11-25 10:32:59 +00:00
Rhys Perry
8a175b02bc aco: use repair pass for LCSSA workaround
This makes instruction selection simpler and fixes potential issues with
allocated_vec or the optimizer moving SGPR uses out of the loop.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31143>
2024-11-25 10:32:59 +00:00
Rhys Perry
5de990f5a9 aco: add SSA repair pass
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31143>
2024-11-25 10:32:58 +00:00
Daniel Schürmann
bb87832ce0 aco/insert_NOPs: add early exit to handle_valu_partial_forwarding_hazard_instr
No need to continue if there was already a hazard found in
a different control flow path.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32191>
2024-11-22 08:46:32 +00:00
Daniel Schürmann
07df37ba01 aco/insert_NOPs: use RegCounterMap as replacement for the CounterMap implementation
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32191>
2024-11-22 08:46:32 +00:00
Daniel Schürmann
fb5e5adfb3 aco/insert_NOPs: implement vector-based RegCounterMap as replacement for VGPRCounterMap
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32191>
2024-11-22 08:46:32 +00:00
Georg Lehmann
f3926c9d4e aco/isel: use undef Operands for p_create_vector created from nir vecs
Foz-DB Navi31:
Totals from 27464 (34.59% of 79395) affected shaders:
Instrs: 9595601 -> 9535260 (-0.63%); split: -0.63%, +0.00%
CodeSize: 47900112 -> 47658648 (-0.50%); split: -0.50%, +0.00%
Latency: 43928471 -> 43918448 (-0.02%); split: -0.05%, +0.02%
InvThroughput: 4940105 -> 4903447 (-0.74%); split: -0.75%, +0.01%
Copies: 667294 -> 604603 (-9.39%); split: -9.39%, +0.00%
VALU: 5282264 -> 5219604 (-1.19%); split: -1.19%, +0.00%
VOPD: 342 -> 311 (-9.06%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32249>
2024-11-21 14:09:52 +00:00
Daniel Schürmann
1ff9a0fe80 aco: remove Pseudo_instruction::tmp_in_scc
This information is redundant, now.

No fossil-changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32217>
2024-11-20 11:04:32 +00:00
Daniel Schürmann
a1a4a6061c aco/ra: explicitly assign scratch SGPR for linear phis
We are about to remove the branch definitions which previously
served this purpose. Also remove Block::scc_live_out.
Some changes due to round-robin RA.

Totals from 939 (1.18% of 79395) affected shaders: (Navi31)

Instrs: 5038786 -> 5038611 (-0.00%); split: -0.01%, +0.00%
CodeSize: 26153412 -> 26152904 (-0.00%); split: -0.00%, +0.00%
Latency: 41649989 -> 41650120 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 6447508 -> 6447536 (+0.00%); split: -0.00%, +0.00%
SClause: 131319 -> 131276 (-0.03%); split: -0.03%, +0.00%
Copies: 359362 -> 359256 (-0.03%); split: -0.05%, +0.02%
SALU: 639275 -> 639169 (-0.02%); split: -0.03%, +0.01%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32217>
2024-11-20 11:04:32 +00:00
Daniel Schürmann
17da551133 aco/ra: use bitset for sgpr_operands_alias_defs
We cannot rely on SGPR Temps being fully aligned to 64 SGPRs.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32217>
2024-11-20 11:04:32 +00:00
Daniel Schürmann
a04e096339 aco/ra: set Pseudo_instruction::scratch_sgpr to SCC if it doesn't need to be preserved
Also ensure that 'needs_scratch_reg' is always true if SCC might be overwritten.
Few changes, because some p_split_vector get SCC as scratch reg assigned,
and thus, can inhibit some postRA optimizations.

Totals from 3 (0.00% of 79395) affected shaders: (Navi31)
Instrs: 10501 -> 10500 (-0.01%); split: -0.02%, +0.01%
CodeSize: 51580 -> 51520 (-0.12%); split: -0.12%, +0.01%
Latency: 84166 -> 84174 (+0.01%)
InvThroughput: 13109 -> 13111 (+0.02%)
SALU: 859 -> 860 (+0.12%)

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32217>
2024-11-20 11:04:32 +00:00
Samuel Pitoiset
2abbd361e2 radv,aco: dump LDS from the trap handler
Can be useful for debugging.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32182>
2024-11-20 07:35:47 +00:00
Timur Kristóf
d94e65574f aco: Separate options for printing IR and recording disassembly.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32079>
2024-11-19 16:15:28 +01:00
Marek Olšák
b258a9aa4e aco: remove unused TCS fields from aco_shader_info
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32171>
2024-11-16 21:58:26 -05:00
Georg Lehmann
3e037ac2a9 aco/gfx8: use ds_swizzle_b32 rotate mode
Despite only being mentioned in the ISA docs since vega, rotate (and fft)
swizzle mode seem to exist since gfx8.

https://github.com/llvm/llvm-project/issues/28975#issuecomment-980964939

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31348>
2024-11-14 15:34:48 +00:00
Samuel Pitoiset
b4b5f9eeb0 radv,aco: dump VGPRS from the trap handler shader
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32090>
2024-11-13 15:27:54 +00:00
Samuel Pitoiset
132b7a85c7 aco: drop the second M0 operand for s_set_gpr_idx_on
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32090>
2024-11-13 15:27:54 +00:00
Samuel Pitoiset
c712555a9f aco: save/restore VGPRS on GFX8 in the trap handler shader
This will be needed for dumping VGPRs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32090>
2024-11-13 15:27:54 +00:00
Samuel Pitoiset
a77af57e83 aco: use all invocations from the current wave in the trap handler
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32090>
2024-11-13 15:27:54 +00:00