Commit graph

11021 commits

Author SHA1 Message Date
Rhys Perry
97e9e42e0d aco: treat flat-like as vmem in some scheduling heuristics
fossil-db (navi21):
Totals from 12 (0.01% of 162293) affected shaders:
Instrs: 48754 -> 48762 (+0.02%)
CodeSize: 267092 -> 267124 (+0.01%)
Latency: 1293798 -> 1292303 (-0.12%); split: -0.12%, +0.00%
InvThroughput: 854599 -> 853578 (-0.12%)
VClause: 1623 -> 1619 (-0.25%)
SClause: 1187 -> 1188 (+0.08%); split: -0.08%, +0.17%

fossil-db (vega10):
Totals from 1 (0.00% of 161355) affected shaders:
Latency: 18720 -> 18848 (+0.68%)
InvThroughput: 5775 -> 5776 (+0.02%)
SClause: 12 -> 11 (-8.33%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
29953d6048 aco: include scratch/global in VMEM WAW optimization
fossil-db (navi21):
Totals from 2 (0.00% of 162293) affected shaders:
Instrs: 4788 -> 4785 (-0.06%)
CodeSize: 25884 -> 25872 (-0.05%)
Latency: 255008 -> 252950 (-0.81%)
InvThroughput: 170005 -> 168633 (-0.81%)
VClause: 206 -> 205 (-0.49%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
c66206cbed aco: avoid WAW hazard with BVH MIMG and other VMEM
According to LLVM, image_bvh64_intersect_ray does not write results in
order with other VMEM instructions.

fossil-db (navi21):
Totals from 7 (0.00% of 162293) affected shaders:
Instrs: 39978 -> 39985 (+0.02%)
CodeSize: 219356 -> 219384 (+0.01%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
7d34044908 aco: refactor VGPR spill/reload lowering
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
6642f2fd74 aco: handle subtractions in parse_base_offset
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
52934f6cdb aco: combine additions and constants into scratch load/store
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
931a456db1 aco: improve support for scratch_* instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
cbeb25ce91 aco: make FLAT_instruction::offset signed
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
5898afba53 aco: include flat-like in vmem clause statistics
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Rhys Perry
08ed6ebc55 aco: make flat access latency match mtbuf/mubuf/mimg
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079>
2022-07-08 14:49:03 +00:00
Samuel Pitoiset
6517a2b926 radv: fix dumping VS prologs assembly
This got removed by mistake and broke
RADV_DEBUG=shaders,nocache,prologs.

Fixes: 9fe2b6b748 ("aco/radv: provide a vs prolog callback from aco to radv.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17413>
2022-07-08 10:58:33 +00:00
Tatsuyuki Ishi
768cd5715d radv: Fix vkCmdCopyQueryResults -> vkCmdResetPool hazard.
The Vulkan specification states:

> Query commands, for the same query and submitted to the same queue,
> execute in their entirety in submission order, relative to each other. In
> effect there is an implicit execution dependency from each such query
> command to all query commands previously submitted to the same queue.

Fixes dEQP-VK.query_pool.statistics_query.reset_after_copy.*

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17400>
2022-07-08 10:35:11 +00:00
Georg Lehmann
4f5e25ea8d aco/assembler: Fix s_bitreplicate_b64_b32 on GFX9.
This seems to be a relic from before aco added per generation opcodes.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17405>
2022-07-08 10:09:19 +00:00
Georg Lehmann
68db0a079b aco: Fix swapping sources in SOPC -> SOPK optimization.
Fixes: 2d6b0a4177 ("aco/optimizer: Optimize SOPC with literal to SOPK.")

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17407>
2022-07-08 09:43:51 +00:00
Pierre-Eric Pelloux-Prayer
326c042491 ac/llvm: use LLVMBuildLoad2 in visit_load
Only FS can have f16 outputs, so always use f32 here.

Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17361>
2022-07-08 08:41:25 +00:00
Pierre-Eric Pelloux-Prayer
dc8d82516b ac/llvm: handle opaque pointers in visit_store_output
Outputs are always f32 or f16.

Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17361>
2022-07-08 08:41:25 +00:00
Pierre-Eric Pelloux-Prayer
196c4ebe1a ac: add per output is_16bit flag to ac_shader_abi
Outputs are always f32 except for FS that may use unpacked f16.
Store this information here to make it available to later processing.

Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17361>
2022-07-08 08:41:25 +00:00
Pierre-Eric Pelloux-Prayer
940734630d ac: use LLVMContextSetOpaquePointers if available
Disabling opaque pointers in LLVM doesn't fix all the issues but
it makes pointers non-opaque by default (eg LLVMPointerType()
returns a typed pointer).

Reviewed-by: Mihai Preda <mhpreda@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17361>
2022-07-08 08:41:25 +00:00
Samuel Pitoiset
cf46397aec aco: fix load_barycentric_at_sample without MSAA
It's legal to use this instruction in a fragment shader, even if the
graphics pipeline doesn't use MSAA.

Fixes
dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.non_multisample_buffer.sample_n_*.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17398>
2022-07-08 07:28:33 +00:00
Rhys Perry
48578713b7 radv,aco,ac/llvm: use nir_op_f{sin,cos}_amd
This lets NIR optimize the multiplication, particularly sin/cos(a * #b).

fossil-db (Sienna Cichlid):
Totals from 12306 (7.58% of 162293) affected shaders:
MaxWaves: 224814 -> 224834 (+0.01%)
Instrs: 17365273 -> 17338758 (-0.15%); split: -0.16%, +0.00%
CodeSize: 93478488 -> 93354912 (-0.13%); split: -0.14%, +0.01%
VGPRs: 752080 -> 752072 (-0.00%); split: -0.00%, +0.00%
SpillSGPRs: 8440 -> 8410 (-0.36%)
Latency: 200402154 -> 200279405 (-0.06%); split: -0.06%, +0.00%
InvThroughput: 37588077 -> 37545545 (-0.11%); split: -0.11%, +0.00%
VClause: 293863 -> 293874 (+0.00%); split: -0.03%, +0.03%
SClause: 619539 -> 619064 (-0.08%); split: -0.09%, +0.01%
Copies: 1151591 -> 1151641 (+0.00%); split: -0.04%, +0.05%
Branches: 506434 -> 506437 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 877609 -> 877517 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 711938 -> 711940 (+0.00%); split: -0.00%, +0.00%

fossil-db (LLVM, Sienna Cichlid):
Totals from 4377 (3.59% of 121873) affected shaders:
SGPRs: 358960 -> 359176 (+0.06%); split: -0.18%, +0.25%
VGPRs: 319832 -> 319720 (-0.04%); split: -0.18%, +0.15%
SpillSGPRs: 46983 -> 47007 (+0.05%); split: -0.99%, +1.04%
CodeSize: 30872812 -> 30764512 (-0.35%); split: -0.39%, +0.04%
MaxWaves: 73814 -> 73904 (+0.12%); split: +0.25%, -0.13%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10587>
2022-07-07 22:18:08 +00:00
Tatsuyuki Ishi
2848e2f28e radv/ci: Move sample_texture.*_compressed_format_* to faillist for gfx<=9
This turned out to be not a CTS bug but rather hardware bug around the
cache handle BCn textures.

It requires significant tracking to detect such cases, and it's likely
not worth a workaround since reading a texture as both compressed and
uncompressed in succession shall not be a realistic use case.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6689
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17345>
2022-07-07 13:19:56 +00:00
Hans-Kristian Arntzen
9dbfc21ab9 radv: Implement VK_EXT_shader_module_identifier.
Passes dEQP-VK.pipeline.*.shader_module_identifier.*

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17332>
2022-07-06 16:27:21 +00:00
Georg Lehmann
2d6b0a4177 aco/optimizer: Optimize SOPC with literal to SOPK.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15999>
2022-07-06 09:54:54 +00:00
Georg Lehmann
52f8167b25 aco/optimizer: Convert s_add_u32 with literals to s_add_i32 if carry is not used.
To allow further optimizations to s_addk_i32.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15999>
2022-07-06 09:54:54 +00:00
Georg Lehmann
e06773281b aco/ra: Optimize some SOP2 instructions with literal to SOPK.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15999>
2022-07-06 09:54:54 +00:00
Georg Lehmann
efdb323ad2 aco/ir: Pad SOP2 and SOPC to the same size as SOPK.
Being able to directly cast instructions simplifies optimizations.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15999>
2022-07-06 09:54:54 +00:00
Georg Lehmann
87b4f3daa1 aco/ra: Move mac encoding optimization to its own function.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15999>
2022-07-06 09:54:54 +00:00
Georg Lehmann
c9490436b6 aco/ra: Static assert that changing instruction type to VOP2 is valid.
It's not obvious that this is correct.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15999>
2022-07-06 09:54:54 +00:00
Samuel Pitoiset
599b587220 radv/ci: update list of failures against CTS 1.3.3.0
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17370>
2022-07-06 10:26:20 +02:00
Emma Anholt
00ad29dd23 ci: Uprev deqp to 1.3.3.0.
New tests, dEQP line rasterization test fix that lets Intel pass.

Clears out bogus xfails from 1.3.2.0 uprev on a630, which I suspect were
"we lost the device twice on a full run once, and those fails got pasted
in without checking if it happened a full run again" (since we haven't
seen them in other full run attempts).

Also clears out the a630 vk asan xfails (essentially all tests run) by
turning off leak detection which was just catching leaks in vkcts.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17304>
2022-07-05 17:02:33 +00:00
Rhys Perry
d55c4180d5 aco/tests: add vop3p constant combine tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
84b404d34d aco: don't use 32-bit fp inline constants for fp16 vop3p literals
If we're applying the literal 0x3f800000 to a fp16 vop3p instruction, we
shouldn't use the 1.0 inline constant, because the hardware will use the
16-bit 1.0: 0x00003c00.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
994f9b5a39 aco: try sign-extending or shifting constants in propagate_constants_vop3p
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
33befb58b0 aco: fix redirect combine in propagate_constants_vop3p() with negatives
This previously didn't correctly consider negative integers when bits=16
(which sign-extend) and would have combined 0xfffe0000.xy as -2.yx. Now it
combines 0xfffeffff.xy as that instead. It was also skipped when bits=32.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
fc39c3a0b1 aco: don't use opsel to fold constants into dot accumulation sources
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
ae74474509 aco: fix propagate_constants_vop3p with integer vop3p and 16-bit constants
This would have created a 1.0.xx operand from 0x3c00.xx or 0x3c003c00.xy
for vop3p instructions which have 32-bit operands.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
9739c07d9e aco: fix single-alignbyte do_pack_2x16() path with fp inline constants
We were using a 16-bit inline constant with a 32-bit instruction and the
test would have created
"v1: %_:v[0] = v_alignbyte_b32 0.5, %_:v[1][16:32], 2" instead.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
5d8f5615d0 aco: ignore precise flag when optimizing integer clamps
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
61eb632775 aco: include _e64 variants of 16-bit min/max in minmax optimizations
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
f2a346eb40 aco: don't accept med3 opcodes in get_minmax_info()
I don't think the presence of med3 here breaks anything, but it shouldn't
be here anyway.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Rhys Perry
f937c5be7c aco: add and use constantValue16()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16296>
2022-07-05 16:39:56 +00:00
Samuel Pitoiset
184ae84a0a radv: always enable VK_EXT_debug_utils
Instead of enabling it conditionally for SQTT. Other Vulkan drivers
always expose it as well.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6772
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17355>
2022-07-05 12:20:13 +02:00
Samuel Pitoiset
053312ab87 radv: disable DCC for Melty Blood Actress Again Current Code
A D3D9 game that uses feedback loops again.

See https://github.com/ValveSoftware/Proton/issues/271

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17356>
2022-07-04 16:33:14 +02:00
Samuel Pitoiset
b11158cc8b radv: remove old workaround for HTILE layers with F1 2021
Turns out this was likely a vkd3d-proton issue because it can no
longer be reproduced since it switched to dynamic rendering by default.
AMDGPU-PRO was also affected by the same issue at that time.

According to Hans-Kristian, some bugs related to that have also been
fixed at the same time.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17296>
2022-07-04 11:51:36 +00:00
Samuel Pitoiset
25d5ef0450 radv: do not abort if SPM isn't supported for the current GPU
In a mixed GFX9/GFX10 setup, this would crash for the GFX9 logical
device. Just print a message intead of aborting.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17292>
2022-07-04 13:08:32 +02:00
Samuel Pitoiset
06a48e599e radv: use LOAD_CONTEXT_REG to load the opaque buffer size on GFX10+
For unknown reasons, COPY_DATA can hang on GFX10+ while it doesn't
hang on GFX9. Adding PFP_SYNC_ME before/after the COPY_DATA doesn't
fix the hang either.

Using a LOAD_CONTEXT_REG_INDEX packet shouldn't be needed unless the
driver supports preemption (shadow memory) which RADV doesn't support.

I don't have a real explanation but PFP_SYNC_ME+LOAD_CONTEXT_REG_INDEX
fixes a GPU hang with Space Engineers (game uses a bunch of consecutive
calls to vkCmdDrawIndirectByteCountEXT without anything in-between).

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5838
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17290>
2022-07-04 08:42:53 +02:00
Samuel Pitoiset
3e90eb4463 radv/ci: add CI lists for LLVM on NAVI21
Copied and adjusted from the ACO lists.

v2: Martin Roukala
 - add an extra test in the list of timeouts
 - add an extra test in the list of flakes
 - remove a fail

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16841>
2022-07-02 13:06:49 +03:00
Martin Roukala (né Peres)
c0fbc31737 radv/ci: test the llvm backend on navi21
The LLVM backend is not officially supported by the RADV developers,
but it has been useful early during bring-up, or later when users are
experiencing what looks like a compiler bug. It is thus beneficial to
keep it working.

However, maintaining the vkcts expectations for every platform requires
more work and machine time than what we would like to commit to. This
is why we agreed that we would only keep LLVM tested on the latest
family of Radeon GPUs.

Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16841>
2022-07-02 12:35:51 +03:00
Timur Kristóf
ff13fc381d radv: Use NIR optimization to move discards to the top.
Fossil stats on Sienna Cichlid:

Totals from 1988 (1.55% of 128653) affected shaders:
VGPRs: 68096 -> 67928 (-0.25%); split: -0.61%, +0.36%
CodeSize: 5391936 -> 5391312 (-0.01%); split: -0.11%, +0.10%
MaxWaves: 53020 -> 52946 (-0.14%); split: +0.05%, -0.19%
Instrs: 992413 -> 992509 (+0.01%); split: -0.10%, +0.11%
Latency: 8643141 -> 8789295 (+1.69%); split: -0.31%, +2.00%
InvThroughput: 1680195 -> 1680605 (+0.02%); split: -0.04%, +0.07%
SClause: 50886 -> 51318 (+0.85%); split: -0.73%, +1.57%
Copies: 57017 -> 56741 (-0.48%); split: -1.28%, +0.80%
PreSGPRs: 66766 -> 67048 (+0.42%); split: -0.24%, +0.66%
PreVGPRs: 56832 -> 56935 (+0.18%); split: -0.44%, +0.62%

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13037>
2022-07-01 18:55:53 +00:00
Timur Kristóf
5d2a243dde radv: Add CULL_PRIMITIVE to special output mask.
It isn't compiled to an output param, so can be safely ignored
from the param assignment.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17244>
2022-07-01 18:09:07 +00:00