Daniel Schürmann
a6c38f706d
aco/ssa_elimination: perform jump threading after parallelcopy insertion
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31888 >
2024-10-30 09:23:54 +00:00
Samuel Pitoiset
4459a1d210
radv: resize the SPM bo when it's too small
...
This used to abort (see the previous commit) when the hardware wasn't
able to sample all SPM counters because the BO was too small. The SPM
BO can now be resized like the SQTT BO.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31883 >
2024-10-29 18:33:17 +00:00
Samuel Pitoiset
e14511f77d
ac/spm: do not abort when the SPM BO is too small
...
It needs to be resized instead, like the SQTT BO.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31883 >
2024-10-29 18:33:17 +00:00
Marek Olšák
4f096b994d
ac/nir,radeonsi: use load_cull_line_viewport_xy_scale_and_offset_amd
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865 >
2024-10-29 16:47:44 +00:00
Marek Olšák
0f39d44f1b
ac/nir,radeonsi: use load_cull_small_line_precision_amd
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865 >
2024-10-29 16:47:44 +00:00
Marek Olšák
10c6f87adb
ac/nir,radeonsi: use load_cull_small_lines_enabled_amd
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865 >
2024-10-29 16:47:44 +00:00
Marek Olšák
ee452129c6
nir: add cull_triangles_, cull_lines_ prefixes to viewport_xy_scale_and_offset
...
for radeonsi
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865 >
2024-10-29 16:47:44 +00:00
Marek Olšák
2227f5be9d
nir: rename load_cull_small_primitive_precision -> triangle, add line_precision
...
for radeonsi
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865 >
2024-10-29 16:47:44 +00:00
Marek Olšák
0914e0d02f
nir: rename load_cull_small_primitives -> triangles, add load_cull_small_lines
...
for radeonsi
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31865 >
2024-10-29 16:47:44 +00:00
Lu Yao
0442a6c292
ac/radeonsi: compute htile for tile mode RADEON_SURF_MODE_1D on GFX6-8
...
Computing 'htile_size/meta_size' is allowed for RADEON_SURF_MODE_1D when
RADEON_SURF_TC_COMPATIBLE_HTILE isn't set.
Skipping this computation causes performance degradation in some scenarios.
Fixes: d4d9ec55c5 ("radeonsi: implement TC-compatible HTILE")
Signed-off-by: Lu Yao <yaolu@kylinos.cn>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31617 >
2024-10-29 16:23:51 +00:00
Georg Lehmann
938f5ec7ce
radv: use nir_opt_fragdepth
...
Cyberpunk 2077 writes unmodified depth.
Foz-DB Navi21:
Totals from 28 (0.04% of 79395) affected shaders:
Instrs: 6484 -> 6448 (-0.56%)
CodeSize: 36016 -> 35784 (-0.64%)
Latency: 58517 -> 58400 (-0.20%)
InvThroughput: 7719 -> 7717 (-0.03%)
Branches: 129 -> 119 (-7.75%)
PreVGPRs: 394 -> 372 (-5.58%)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31874 >
2024-10-29 15:15:24 +00:00
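The percentages in the Foz-DB totals above match the relative change from the "before" value to the "after" value. The following is a minimal sketch of that arithmetic; `rel_change_percent` is a hypothetical helper name used for illustration only and is not part of Mesa or its reporting scripts.

```c
#include <assert.h>
#include <stdint.h>

/* Relative change from `before` to `after`, in percent.
 * Hypothetical helper for illustration; not a Mesa function. */
static double rel_change_percent(uint64_t before, uint64_t after)
{
   assert(before != 0); /* the ratio is undefined for a zero baseline */
   return ((double)after - (double)before) / (double)before * 100.0;
}
```

For example, `rel_change_percent(6484, 6448)` is roughly -0.555, which the report rounds to -0.56% for Instrs, and `rel_change_percent(129, 119)` is roughly -7.75 for Branches.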
Georg Lehmann
695d2414cd
nir,radv: optimize shared atomic offsets
...
Foz-DB Navi21:
Totals from 87 (0.11% of 79395) affected shaders:
Instrs: 140877 -> 140873 (-0.00%)
CodeSize: 747760 -> 747164 (-0.08%); split: -0.09%, +0.01%
Latency: 4528171 -> 4528162 (-0.00%)
InvThroughput: 826358 -> 826349 (-0.00%)
Copies: 10888 -> 10884 (-0.04%)
VALU: 84634 -> 84630 (-0.00%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31080 >
2024-10-29 09:31:08 +00:00
Georg Lehmann
a2baff4810
ac/llvm: handle shared atomic base offset
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31080 >
2024-10-29 09:31:08 +00:00
Samuel Pitoiset
e83f91f206
radv: regroup and emit all raster related states in the same function
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31787 >
2024-10-29 07:25:34 +00:00
Samuel Pitoiset
62f51becbb
radv: track more redundant raster related registers
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31787 >
2024-10-29 07:25:34 +00:00
Konstantin Seurer
0963a0a2b4
radv: Move ac_addrlib to the physical device
...
There is nothing amdgpu specific here so this does not need to be
abstracted away. max_alignment also is not used in winsys code.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31643 >
2024-10-28 20:06:38 +00:00
Semenov Herman (Семенов Герман)
1764f70ba8
radv: fix memleaks in radv_init_shader_upload_queue()
...
Co-authored-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31608 >
2024-10-28 17:11:41 +00:00
Samuel Pitoiset
8300378bf3
radv: advertise VK_EXT_device_generated_commands on GFX8+
...
GFX6-7 can't really support it and it's not worth the effort anyways.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31383 >
2024-10-28 16:27:35 +00:00
Samuel Pitoiset
9f8684359f
radv: implement VK_EXT_device_generated_commands
...
The major differences compared to the NV extensions are:
- support for the sequence index as push constants
- support for draw with count tokens (note that DrawID is zero for
normal draws)
- support for raytracing
- support for IES (only compute is supported for now)
- improved preprocessing support with the state command buffer param
The NV DGC extensions were only enabled for vkd3d-proton, which will
maintain both paths for a while, so they can be replaced by the EXT.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31383 >
2024-10-28 16:27:35 +00:00
Semenov Herman (Семенов Герман)
637a4b849a
radv: fix memleaks in radv_sqtt_reloc_graphics_shaders()
...
Co-authored-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31607 >
2024-10-28 15:48:05 +01:00
Samuel Pitoiset
f7652de1f1
Revert "ac/surface: add RADEON_SURF_VIEW_3D_AS_2D_ARRAY for GFX9+"
...
This reverts commit dc5ef90547.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31869 >
2024-10-28 12:47:38 +00:00
Samuel Pitoiset
0ae880c08c
Revert "radv: implement 2D views of 3D images using 2D_ARRAY descriptors on GFX9+"
...
Using view3dAs2dArray changes the tiling and it's slower (-7.5% in
Silent Hill 2 Remake) than using 3D tiling. The previous implementation
was the best one regarding performance (it's also what RadeonSI does).
Sadly it seems that sampler2DViewOf3D can't really be supported without
that but nobody really needs it apparently.
Also view3dAs2dArray is incompatible with 2D views of sparse 3D images
because sparse 3D images require 3D tiling.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11997
This reverts commit f5805bcb8e.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31869 >
2024-10-28 12:47:38 +00:00
Samuel Pitoiset
742a1097a9
Revert "radv: advertise sampler2DViewOf3D"
...
This feature has never been exposed in stable releases, so I think it's
fine to disable it.
This reverts commit 493d5910a3.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31869 >
2024-10-28 12:47:38 +00:00
Samuel Pitoiset
b3a06daa72
radv: simplify determining if dual-source blending is enabled
...
If blending is disabled or the color write mask is 0, dual-source
blending would be ignored, and this can be simplified a bit.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31681 >
2024-10-28 12:04:59 +00:00
Daniel Schürmann
10958d04d5
aco: Respect addressable SGPR limit in VS prologs
...
On Tonga, the effective SGPR limit is 96, including VCC.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31859 >
2024-10-28 11:29:06 +00:00
Samuel Pitoiset
dc5efa892f
radv: remove useless check about gl_Position as PS inputs for NGGC
...
gl_Position isn't part of the PS inputs read mask.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31830 >
2024-10-28 11:03:47 +00:00
Samuel Pitoiset
8e4d1965bd
radv: fix considering NGG culling for depth-only rendering
...
When the FS is unknown, which can happen with fast-link GPL or unlinked
ESO, rely on the number of VS/TES outputs, which should be a good
approximation of the number of PS inputs.
This fixes a (huge?) performance regression from May 2023 because
for depth-only rendering, the FS is NULL and NGG culling wasn't
considered at all.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31830 >
2024-10-28 11:03:47 +00:00
Samuel Pitoiset
72871d8330
radv: set missing FMASK surface counters for MSAA MRTs
...
This was removed a few years ago by mistake, but it's important for
performance. This is mostly for addrlib to determine tile_swizzle, which
is used to make memory access faster with multiple render targets.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31797 >
2024-10-28 08:21:12 +01:00
Samuel Pitoiset
aa19bf3d93
amd/descriptors: set fmask_tile_swizzle for TC-compat CMASK images on GFX8
...
This is required.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31797 >
2024-10-28 08:21:12 +01:00
Georg Lehmann
d01c1ba939
aco: move exec copy out of waterfall loops
...
Foz-DB Navi21:
Totals from 348 (0.44% of 79395) affected shaders:
CodeSize: 17944800 -> 17946268 (+0.01%); split: -0.02%, +0.03%
Latency: 29775973 -> 29774369 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 10233380 -> 10232801 (-0.01%); split: -0.01%, +0.00%
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19070 >
2024-10-25 16:47:32 +00:00
Georg Lehmann
6c73a8a7f2
aco: optimize conditional divergent breaks at the end of loops
...
Removes one branch and one s_mov.
Foz-DB Navi21:
Totals from 1483 (1.87% of 79395) affected shaders:
Instrs: 6424114 -> 6373084 (-0.79%)
CodeSize: 35309320 -> 35091084 (-0.62%); split: -0.63%, +0.01%
Latency: 87950935 -> 88030841 (+0.09%); split: -0.03%, +0.12%
InvThroughput: 24784756 -> 24799536 (+0.06%); split: -0.02%, +0.08%
Copies: 588743 -> 561805 (-4.58%)
Branches: 242521 -> 215578 (-11.11%)
SALU: 877856 -> 850918 (-3.07%)
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19070 >
2024-10-25 16:47:32 +00:00
Georg Lehmann
075c5818cb
aco/ssa_elimination: don't assume exec writes can be removed based on block kind
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19070 >
2024-10-25 16:47:32 +00:00
Georg Lehmann
61ab33c883
aco/ssa_elimination: add instr_accesses helper
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19070 >
2024-10-25 16:47:32 +00:00
Pierre-Eric Pelloux-Prayer
60f7b2fc9f
radeonsi/ci: mark *.tessellation_shader_tessellation.max_in_out_attributes as fixed
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31684 >
2024-10-25 13:36:54 +00:00
Samuel Pitoiset
38d7492391
ci: uprev VKCTS to 1.3.10.0
...
This tag contains tests for DGC EXT.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31789 >
2024-10-25 14:03:37 +02:00
Joshua Ashton
c66fd95d92
radv: Fix sample locations at 0 for X/Y
...
We cannot set the {X,Y}MAX_RIGHT_EXCLUSION bits
if we have a sample location at a pixel boundary.
CTS does not seem to be catching this.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Co-authored-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31839 >
2024-10-25 11:24:12 +00:00
Joshua Ashton
130a423118
radv: Enable variableSampleLocations
...
This should come for free now that we are dynamic
rendering based.
This passes CTS on RX 7900XTX.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31839 >
2024-10-25 11:24:12 +00:00
Eric Engestrom
03f056ea71
ci: skip slow tests on all non-"full" jobs
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31828 >
2024-10-25 08:26:31 +00:00
Eric Engestrom
bedb2f8a86
ci: rename "merge-skips" to "slow-skips" as they're about to be used outside of merge pipelines
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31828 >
2024-10-25 08:26:31 +00:00
Samuel Pitoiset
927a17f30a
amd: do not emit PA_SU_PRIM_FILTER_CNTL in the common GFX preamble
...
RADV needs to adjust this register for user sample locations because
it seems possible to have a sample on the -8 coordinate.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31815 >
2024-10-25 07:41:22 +00:00
Samuel Pitoiset
3d172d08b0
radv: do not emit PA_SC_CONSERVATIVE_RASTERIZATION_CNTL in the preamble on GFX12
...
It's already emitted as part of the cmdbuf.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31815 >
2024-10-25 07:41:22 +00:00
Samuel Pitoiset
56cffd4b9b
radv: simplify determining if a graphics pipeline uses NGG culling
...
has_ngg_culling can only be TRUE if the last VGT shader also uses NGG.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31829 >
2024-10-25 07:10:28 +00:00
Samuel Pitoiset
62efebfd70
radv: fix emitting NGG culling state for ESO
...
It's possible to enable NGG culling with ESO if shaders are linked, or
if the VS doesn't need a prolog or if TES is used. This wasn't
supposed to be enabled but I think it worked just by luck because the
user SGPR value was probably zero and NGGC was disabled at draw time.
Found by inspection.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31829 >
2024-10-25 07:10:27 +00:00
Samuel Pitoiset
982af1a2bc
radv: capture shader statistics when RGP is enabled
...
This is useful in order to correlate shader hashes between RGP and
Fossilize. This is because Fossilize needs to pass the capture
statistics flag for getting shader hashes and the pipeline key won't
match otherwise.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31820 >
2024-10-25 06:29:02 +00:00
Georg Lehmann
b79950fc1f
aco: remove heuristic that restricts VOP2/C with 2 sgprs
...
Looking at the stats, the slightly increased code size isn't a problem
compared to the benefits. This also only affects gfx10+, and those generations
aren't throughput limited by 64-bit instructions like early GCN.
Foz-DB Navi21:
Totals from 12377 (15.59% of 79395) affected shaders:
MaxWaves: 269323 -> 269857 (+0.20%); split: +0.23%, -0.03%
Instrs: 16505304 -> 16472552 (-0.20%); split: -0.21%, +0.01%
CodeSize: 89815804 -> 90130344 (+0.35%); split: -0.02%, +0.37%
VGPRs: 661160 -> 658640 (-0.38%); split: -0.40%, +0.02%
SpillSGPRs: 3032 -> 3049 (+0.56%)
SpillVGPRs: 826 -> 796 (-3.63%)
Latency: 145800231 -> 145818568 (+0.01%); split: -0.14%, +0.15%
InvThroughput: 39026010 -> 38892467 (-0.34%); split: -0.36%, +0.02%
VClause: 325693 -> 325992 (+0.09%); split: -0.12%, +0.21%
SClause: 497938 -> 497208 (-0.15%); split: -0.23%, +0.08%
Copies: 1239036 -> 1204045 (-2.82%); split: -2.90%, +0.07%
Branches: 462952 -> 462934 (-0.00%); split: -0.01%, +0.00%
PreSGPRs: 586066 -> 587558 (+0.25%)
PreVGPRs: 550024 -> 547736 (-0.42%)
VALU: 11147608 -> 11114528 (-0.30%); split: -0.31%, +0.01%
SALU: 2105546 -> 2105131 (-0.02%); split: -0.03%, +0.01%
VMEM: 575983 -> 575923 (-0.01%)
Foz-DB Navi31:
Totals from 11544 (14.54% of 79395) affected shaders:
MaxWaves: 319612 -> 319804 (+0.06%)
Instrs: 17563158 -> 17527341 (-0.20%); split: -0.22%, +0.02%
CodeSize: 92366832 -> 92626280 (+0.28%); split: -0.03%, +0.31%
VGPRs: 667620 -> 665484 (-0.32%); split: -0.33%, +0.01%
SpillSGPRs: 3418 -> 3434 (+0.47%)
SpillVGPRs: 896 -> 858 (-4.24%)
Scratch: 4738048 -> 4736512 (-0.03%)
Latency: 141366653 -> 141399756 (+0.02%); split: -0.10%, +0.12%
InvThroughput: 26213994 -> 26165751 (-0.18%); split: -0.21%, +0.03%
VClause: 307956 -> 308124 (+0.05%); split: -0.12%, +0.18%
SClause: 477816 -> 477326 (-0.10%); split: -0.18%, +0.08%
Copies: 1161148 -> 1129386 (-2.74%); split: -2.81%, +0.08%
Branches: 411509 -> 411506 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 531354 -> 535027 (+0.69%)
PreVGPRs: 525201 -> 521861 (-0.64%)
VALU: 10360363 -> 10330274 (-0.29%); split: -0.30%, +0.01%
SALU: 1778044 -> 1777585 (-0.03%); split: -0.04%, +0.01%
VMEM: 551379 -> 551303 (-0.01%)
VOPD: 3539 -> 3471 (-1.92%); split: +0.14%, -2.06%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31804 >
2024-10-24 17:44:13 +00:00
Georg Lehmann
54fa55a3f7
radv: don't use v_mqsad_u32_u8 on gfx7
...
According to tests on Hawaii, v_mqsad_u32_u8 always uses saturating accumulation
while v_msad_u8 truncates. GFX8+ can control this with the VOP3 clamp bit;
older hardware doesn't support that.
We want truncation for the NIR opcode.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12062
Fixes: c3c138b10f ("radv: optimize msad_4x8 to mqsad_4x8")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31809 >
2024-10-24 17:20:56 +00:00
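The difference between truncating and saturating accumulation described in the v_mqsad_u32_u8 commit above can be sketched in scalar C. This is a simplified model under stated assumptions, not the hardware or NIR implementation: the function names are hypothetical, and the operand roles (a reference word whose zero bytes are skipped, a source word, and a 32-bit accumulator) are assumed for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Sum of |ref_byte - src_byte| over the four bytes of each word,
 * skipping bytes where the reference byte is 0 (the "masked" part).
 * Operand roles are an assumption made for this sketch. */
static uint32_t masked_sad_4x8(uint32_t ref, uint32_t src)
{
   uint32_t sum = 0;
   for (unsigned i = 0; i < 4; i++) {
      uint8_t r = (uint8_t)(ref >> (i * 8));
      uint8_t s = (uint8_t)(src >> (i * 8));
      if (r != 0)
         sum += r > s ? (uint32_t)(r - s) : (uint32_t)(s - r);
   }
   assert(sum <= 4u * 255u); /* at most four byte differences */
   return sum;
}

/* Truncating accumulation: the 32-bit add wraps around on overflow. */
static uint32_t msad_truncate(uint32_t ref, uint32_t src, uint32_t acc)
{
   return acc + masked_sad_4x8(ref, src);
}

/* Saturating accumulation: the result is clamped to UINT32_MAX. */
static uint32_t msad_saturate(uint32_t ref, uint32_t src, uint32_t acc)
{
   uint32_t sad = masked_sad_4x8(ref, src);
   return sad > UINT32_MAX - acc ? UINT32_MAX : acc + sad;
}
```

The two variants only differ when the accumulator overflows: with `ref = 0x000000FF`, `src = 0` and `acc = 0xFFFFFFFF`, the truncating version wraps to 254 while the saturating version stays at `UINT32_MAX`.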
Rhys Perry
4579586c66
aco/tests: add tests for VALUReadSGPRHazard
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30478 >
2024-10-24 16:08:08 +00:00
Rhys Perry
47e0f468cf
aco: workaround VALUReadSGPRHazard
...
fossil-db (gfx1200):
Totals from 65112 (82.01% of 79395) affected shaders:
Instrs: 41732906 -> 42987198 (+3.01%); split: -0.00%, +3.01%
CodeSize: 222451964 -> 226942644 (+2.02%); split: -0.01%, +2.03%
Latency: 290411063 -> 290944688 (+0.18%); split: -0.00%, +0.18%
InvThroughput: 45854913 -> 45910275 (+0.12%); split: -0.00%, +0.12%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30478 >
2024-10-24 16:08:07 +00:00
Rhys Perry
9ab0c4b047
aco: minor CounterMap::operator== fix
...
I don't think this matters for how we use CounterMap::operator==.
The BITSET_TEST() was unnecessary because of the BITSET_EQUAL above.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30478 >
2024-10-24 16:08:07 +00:00
Rhys Perry
f5b871f825
aco: split CounterMap off from VGPRCounterMap
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30478 >
2024-10-24 16:08:07 +00:00