Commit graph

8373 commits

Author SHA1 Message Date
Timur Kristóf
b293299776 aco/optimizer_postRA: Fix applying VCC to branches.
Fixes: a93092d0ed
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14281>
2021-12-21 22:53:23 +00:00
Timur Kristóf
ce4daa259c aco/optimizer_postRA: Fix combining DPP into VALU.
Fixes: 4ac47ad1cd
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14281>
2021-12-21 22:53:23 +00:00
Daniel Schürmann
d36a43598c aco/ra: fix get_reg_for_operand() in case of stride mismatches
We have to clear the register file from the previous operand
as otherwise, there might be no space left.

Totals from 5 (0.00% of 134572) affected shaders: (GFX10.3)
CodeSize: 21144 -> 21000 (-0.68%); split: -0.72%, +0.04%
Instrs: 3738 -> 3720 (-0.48%); split: -0.51%, +0.03%
Latency: 517229 -> 516319 (-0.18%); split: -0.18%, +0.00%
InvThroughput: 49068 -> 48902 (-0.34%); split: -0.38%, +0.04%
Copies: 501 -> 483 (-3.59%); split: -3.79%, +0.20%

Cc: mesa-stable
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14279>
2021-12-21 17:15:45 +00:00
Daniel Schürmann
30a7199e37 aco/optimizer: propagate and fold inline constants on VOP3P instructions
This patch aims to propagate and fold constants on VOP3P instructions
by using omod selection and the fneg modifier.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13688>
2021-12-21 13:23:36 +01:00
Daniel Schürmann
62bcfcd0a8 aco: change fneg for VOP3P to use fmul with +1.0
This will be useful to be able to also apply
fneg_lo and fneg_hi.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13688>
2021-12-21 13:23:36 +01:00
Daniel Schürmann
193bd740ab aco/optimizer: fix fneg modifier propagation on VOP3P
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13688>
2021-12-21 13:23:36 +01:00
Rhys Perry
fa4e08112e aco: remove SMEM constant/addition combining out of the loop
There's no reason for this optimization to be in this loop.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13755>
2021-12-17 22:14:36 +00:00
Rhys Perry
dd18925f86 aco: skip &-4 before SMEM
The hardware ignores the low 2 bits. I'm not sure if they are ignored
before or after the address is calculated, but this optimization should be
cautious enough.

fossil-db (Sienna Cichlid):
Totals from 259 (0.19% of 134572) affected shaders:
SpillSGPRs: 1381 -> 1382 (+0.07%)
SpillVGPRs: 1783 -> 1782 (-0.06%); split: -0.67%, +0.62%
CodeSize: 1598612 -> 1596084 (-0.16%); split: -0.30%, +0.14%
Scratch: 180224 -> 179200 (-0.57%); split: -1.14%, +0.57%
Instrs: 284885 -> 284268 (-0.22%); split: -0.34%, +0.12%
Latency: 6585634 -> 6603388 (+0.27%); split: -0.48%, +0.75%
InvThroughput: 2638983 -> 2648474 (+0.36%); split: -0.58%, +0.94%
VClause: 6797 -> 6820 (+0.34%); split: -0.15%, +0.49%
SClause: 6569 -> 6574 (+0.08%); split: -1.11%, +1.19%
Copies: 50561 -> 50586 (+0.05%); split: -0.61%, +0.66%
Branches: 10058 -> 10062 (+0.04%); split: -0.01%, +0.05%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13755>
2021-12-17 22:14:36 +00:00
Rhys Perry
cf5fc4b973 aco: disallow SMEM offsets that are not multiples of 4
These can't be encoded on GFX6/7, and combining these additions causes
CTS failures on GFX10.3.

I think the low 2 MSBs are ignored before the addition, not after, so
load(a + 3, 0) becomes load(a, 3), which is the same as load(a, 0).

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13755>
2021-12-17 22:14:36 +00:00
Bas Nieuwenhuizen
860532c5a1 radv: Add safety check for RGP traces on VanGogh.
To avoid accidental hangs.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5260
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13730>
2021-12-17 21:25:01 +00:00
Samuel Pitoiset
ac8afe3f44 radv: fix dynamic rendering global scissor
Make sure to clamp the global scissor to the render area.

This fixes dEQP-VK.draw.dynamic_rendering.*oversized.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14170>
2021-12-17 16:51:00 +00:00
Rhys Perry
94603786c5 aco: fix check_vop3_operands() for f16vec2 ffma fneg combine
For v_pk_fma_f16, we should consider all three operands, not the first
two.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 15a375b4c8 ("radv,aco: don't lower some ffma instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14229>
2021-12-17 11:16:12 +00:00
Bas Nieuwenhuizen
ed7c48f94a radv/amdgpu: Only wait on queue_syncobj when needed.
If signalled on the same queue it is totally useless, so only wait
if we have a syncobj that is explicitly being waited on, which can
be from potentially another queue/ctx. (Ideally we'd check but there
is no way to do so currently. Might revisit when we integrate the
common sync framework)

Fixes: 7675d066ca ("radv/amdgpu: Add support for submitting 0 commandbuffers.")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14214>
2021-12-17 08:54:08 +00:00
Samuel Pitoiset
5ce4017a2b radv,aco: do not disable anisotropy filtering for non-mipmap images
This fixes
dEQP-VK.texture.filtering_anisotropy.single_level.anisotropy_*.mag_linear_min_linear.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14171>
2021-12-16 07:20:50 +00:00
Samuel Pitoiset
8a327722d5 ac/nir: add an option to disable anisotropic filtering for single level images
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14171>
2021-12-16 07:20:50 +00:00
Samuel Pitoiset
cca8803e6b radv/winsys: update sparse mappings with OP_REPLACE instead of OP_MAP/OP_UNMAP
When the BO is NULL, AMDGPU will reset the PTE VA range to the initial
state. Otherwise, it will first unmap all existing VA that overlap the
requested range and then map. This seems better than using MAP/UNMAP.

This reduces stuttering in Forza Horizon 5.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14116>
2021-12-15 08:17:29 +01:00
Samuel Pitoiset
a6ca0a8c52 radv/winsys: stop using reference counting for virtual BOs
This shouldn't be necessary because applications have to manage
resources and memory themselves.

This also prevented memory to be freed if an application doesn't unbind
a sparse memory object and free it, which is legal as long as the
resource isn't used afterwards.

This was introduced to unmap the sparse mappings when destroying
a virtual BO, but now that the driver uses OP_CLEAR it's no longer
needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14116>
2021-12-15 08:17:26 +01:00
Samuel Pitoiset
a931d5a4a4 radv/winsys: clear the PRT VA range when destroying a virtual BO
Instead of unmapping every range.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14116>
2021-12-15 08:17:24 +01:00
Samuel Pitoiset
782474782b radv/winsys: remove useless has_sparse_vm_mappings checks
Sparse is only exposed on GFX8+, so this is always TRUE.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14116>
2021-12-15 08:17:22 +01:00
Rhys Perry
451e6c1b32 radv: have the null winsys set more fields
I copied stuff from ac_gpu_info.c until there were no Sienna Cichild or
Polaris10 fossil-db changes between real hardware and RADV_FORCE_FAMILY.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14126>
2021-12-14 12:11:50 +00:00
Bas Nieuwenhuizen
9e8fa8168b radv: Expose the ETC2 emulation.
As needed on Android (as it is required) and by driconf flag otherwise.
The non-Android case would be on the host side for an Android VM.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14071>
2021-12-14 11:30:48 +00:00
Bas Nieuwenhuizen
a8078bab91 radv: Deal with border colors with emulated ETC2.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14071>
2021-12-14 11:30:48 +00:00
Bas Nieuwenhuizen
1153db23f5 radv: Add ETC2 decode shader.
To make sure that apps actually get something when the HW doesn't
support ETC2. To do that we decompress after every copy operation.

Includes a quite complicated decode shader. It is not bit-to-bit
equivalent to AMD APUs that support ETC2, but close enough to
pass CTS. Likely missing bits are related to the R11 and R11G11
formats where we decode to 16 bits but likely do the extension
differently.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14071>
2021-12-14 11:30:48 +00:00
Bas Nieuwenhuizen
42a71be793 radv: Add extra plane for decoding ETC images with emulation.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14071>
2021-12-14 11:30:48 +00:00
Bas Nieuwenhuizen
c55ebdb76d radv: Use the correct base format for reintepretation.
Going to hit it when emulating ETC2 through another plane.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14071>
2021-12-14 11:30:48 +00:00
Bas Nieuwenhuizen
7c5fe66f8a radv: Set up ETC2 emulation wiring.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14071>
2021-12-14 11:30:48 +00:00
Rhys Perry
15a375b4c8 radv,aco: don't lower some ffma instructions
GFX10.3 has no v_mad_f32 and we can't recombine exact ffma into a
v_fma_f32 if they're split. GFX9+ only has v_fma_f16 and no generation has
a 64-bit MAD.

fossil-db (GFX10.3):
Totals from 84040 (57.46% of 146267) affected shaders:
VGPRs: 3717256 -> 3688064 (-0.79%); split: -0.87%, +0.08%
SpillSGPRs: 10419 -> 10403 (-0.15%)
CodeSize: 263064884 -> 262442820 (-0.24%); split: -0.31%, +0.07%
MaxWaves: 2036908 -> 2038374 (+0.07%); split: +0.10%, -0.03%
Instrs: 49849448 -> 49572182 (-0.56%); split: -0.60%, +0.04%
Latency: 908130602 -> 907764246 (-0.04%); split: -0.18%, +0.14%
InvThroughput: 207051300 -> 206762704 (-0.14%); split: -0.24%, +0.10%

fossil-db (GFX10):
Totals from 2 (0.00% of 146267) affected shaders:
Latency: 8123 -> 8107 (-0.20%)

fossil-db (GFX9):
Totals from 2 (0.00% of 146401) affected shaders:
(no statistics affected)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805>
2021-12-13 11:22:33 +00:00
Rhys Perry
165ca5088b radv,aco: implement nir_op_ffma
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805>
2021-12-13 11:22:33 +00:00
Rhys Perry
c5f02a1cd3 aco: swap multiplication operands if needed to create v_fmac_f32/etc
For v_pk_fma_f32 and v_fma_f32 from nir_op_ffma, we don't try to put
scalars in the first operand.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805>
2021-12-13 11:22:33 +00:00
Rhys Perry
f4f5d577fc aco: swap operands if necessary to create v_madak/v_fmaak
Also rewrite the check_literal logic to be more straightforward.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805>
2021-12-13 11:22:33 +00:00
Rhys Perry
2665320c78 aco: create v_fmamk_f32/v_fmaak_f32 from nir_op_ffma
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805>
2021-12-13 11:22:33 +00:00
Rhys Perry
a487747ebd aco: use more predictable tiebreaker when forming MADs
fossil-db (GFX10.3):
Totals from 84981 (58.10% of 146267) affected shaders:
VGPRs: 3829896 -> 3820480 (-0.25%); split: -0.33%, +0.08%
CodeSize: 270860472 -> 270850132 (-0.00%); split: -0.08%, +0.08%
MaxWaves: 2035822 -> 2042516 (+0.33%); split: +0.39%, -0.06%
Instrs: 51285526 -> 51308869 (+0.05%); split: -0.03%, +0.08%
Latency: 931503706 -> 932556231 (+0.11%); split: -0.19%, +0.30%
InvThroughput: 217084232 -> 217070849 (-0.01%); split: -0.12%, +0.11%

fossil-db (GFX10):
Totals from 85520 (58.47% of 146267) affected shaders:
VGPRs: 3729132 -> 3725344 (-0.10%); split: -0.21%, +0.10%
CodeSize: 272796500 -> 272783084 (-0.00%); split: -0.09%, +0.08%
MaxWaves: 2246410 -> 2249012 (+0.12%); split: +0.17%, -0.05%
Instrs: 51643962 -> 51664865 (+0.04%); split: -0.04%, +0.08%
Latency: 932331949 -> 933274979 (+0.10%); split: -0.19%, +0.29%
InvThroughput: 214187040 -> 214130994 (-0.03%); split: -0.13%, +0.11%

fossil-db (GFX9):
Totals from 84619 (57.80% of 146401) affected shaders:
SGPRs: 5366240 -> 5366944 (+0.01%); split: -0.09%, +0.10%
VGPRs: 3765608 -> 3764972 (-0.02%); split: -0.23%, +0.22%
CodeSize: 263634732 -> 263616320 (-0.01%); split: -0.08%, +0.08%
MaxWaves: 546617 -> 547091 (+0.09%); split: +0.18%, -0.09%
Instrs: 51426195 -> 51458334 (+0.06%); split: -0.03%, +0.10%
Latency: 1164445660 -> 1161923480 (-0.22%); split: -0.46%, +0.24%
InvThroughput: 542964697 -> 542329595 (-0.12%); split: -0.26%, +0.14%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805>
2021-12-13 11:22:33 +00:00
Samuel Pitoiset
9a388beda7 radv: ignore dynamic inheritance if the render pass isn't NULL
From the Vulkan spec:

    "If the pNext chain of VkCommandBufferInheritanceInfo includes a
     VkCommandBufferInheritanceRenderingInfoKHR structure, then that
     structure controls parameters of dynamic render pass instances
     that the VkCommandBuffer can be executed within. If
     VkCommandBufferInheritanceInfo::renderPass is not VK_NULL_HANDLE,
     or VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT is not
     specified in VkCommandBufferBeginInfo::flags, parameters of this
     structure are ignored."

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14109>
2021-12-13 10:48:44 +00:00
Samuel Pitoiset
841949e50b radv: fix dynamic rendering inheritance if the subpass index isn't 0
The driver will always create only one subpass in the render pass
for inheritance but the subpass index isn't always zero.

This fixes dEQP-VK.multiview.dynamic_rendering.secondary_cmd_buffer*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14109>
2021-12-13 10:48:44 +00:00
Samuel Pitoiset
43022ecc3a radv: enable lower_lod_zero_width
This fixes dEQP-VK.glsl.texture_functions.query.texturequerylod.*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14147>
2021-12-13 10:00:07 +00:00
Rhys Perry
c7fa15b381 aco: improve clrx disassembly
- remove uninteresting lines of output
- remove binary offset before instructions, for easier diffing
- replace generated labels with block numbers
- add encoded instructions at the end of lines, like LLVM dissaembly
- print constant data instead of trying to disassemble it

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14042>
2021-12-10 23:46:30 +00:00
Samuel Pitoiset
e2ad92eb22 radv: do not perform depth/stencil resolves for suspended render pass
From the Vulkan spec:

    "Store and resolve operations are only performed at the end of a
     render pass instance that does not specify the
     VK_RENDERING_SUSPENDING_BIT_KHR flag."

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14112>
2021-12-10 22:03:00 +00:00
Samuel Pitoiset
c27348a3f7 Revert "radv: Add bufferDeviceAddressMultiDevice support."
This was a workaround for fixing a crash with Baldur's Gate 3 at start
but the game fixed it since.

This reverts commit 1fe375e7cf.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14134>
2021-12-10 21:38:00 +00:00
Rhys Perry
786d434397 aco: don't create unnecessary addition in indirect get_sampler_desc()
I don't think this has any effect on GFX9+ because the addition is
combined into the load.

fossil-db (polaris10):
Totals from 12595 (9.29% of 135627) affected shaders:
SGPRs: 1054348 -> 1054860 (+0.05%); split: -0.02%, +0.07%
VGPRs: 667240 -> 667320 (+0.01%); split: -0.01%, +0.02%
CodeSize: 82761508 -> 82512816 (-0.30%); split: -0.30%, +0.00%
MaxWaves: 62182 -> 62181 (-0.00%)
Instrs: 16072934 -> 16010764 (-0.39%); split: -0.39%, +0.00%
Latency: 582819635 -> 582287964 (-0.09%); split: -0.13%, +0.04%
InvThroughput: 276460536 -> 276417613 (-0.02%); split: -0.06%, +0.05%
VClause: 261656 -> 261654 (-0.00%); split: -0.01%, +0.01%
SClause: 680952 -> 680854 (-0.01%); split: -0.05%, +0.04%
Copies: 1727202 -> 1727742 (+0.03%); split: -0.12%, +0.15%
Branches: 547050 -> 547033 (-0.00%); split: -0.01%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14043>
2021-12-09 17:58:54 +00:00
Timur Kristóf
77db4e27b1 aco: Clean up and fix quad group instructions with WQM.
According to the Vulkan spec chapter 9.25 Helper Invocations,
quad group operations have to be executed by helper invocations.

This commit cleans up the code for quad group instructions by
unifying the code path of quad broadcast with the others, and then
calling emit_wqm just once at the end.

Fixes: 93c8ebfa78
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5570
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13929>
2021-12-09 17:36:51 +00:00
Dave Airlie
d051854cca treewide: drop mtypes/macros includes from main
These aren't required in lots of places, so remove them.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14127>
2021-12-08 22:14:45 +00:00
Bas Nieuwenhuizen
7f34b70474 radv: Use the winsys 0 cmdbuffer submission support.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14097>
2021-12-08 20:21:05 +00:00
Bas Nieuwenhuizen
7675d066ca radv/amdgpu: Add support for submitting 0 commandbuffers.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14097>
2021-12-08 20:21:05 +00:00
Bas Nieuwenhuizen
6eb0821315 radv/winsys: Add queue family param to submit.
Extracting it from the first CS will not go over well once we try
submitting 0 of them.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14097>
2021-12-08 20:21:05 +00:00
Bas Nieuwenhuizen
c03d258046 radv/amdgpu: Add a syncobj per queue.
For merging our own dependencies in without submitting.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14097>
2021-12-08 20:21:05 +00:00
Rhys Perry
420170fabc radv: initialize workgroup_size in radv_meta_init_shader
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14087>
2021-12-08 11:07:40 +00:00
Rhys Perry
85161fb8ac radv: clone shader in radv_shader_compile_to_nir
This way, radv_shader_compile_to_nir doesn't alter the NIR.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14087>
2021-12-08 11:07:40 +00:00
Rhys Perry
2020a1799b radv: include RT shaders in RADV_DEBUG=shaders,shaderstats
Instead of using module->nir or nir->info->name to determine if it's a
meta shader, use nir->info->internal.

This also has an effect of disabling printing of meta shaders with
NIR_DEBUG=print.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14087>
2021-12-08 11:07:40 +00:00
Rhys Perry
d74498e617 radv: add radv_meta_init_shader
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14087>
2021-12-08 11:07:40 +00:00
Samuel Pitoiset
31efef7b3a radv: mark GFX10.3 (aka RDNA2) as conformant products with CTS 1.2.7.1
https://www.khronos.org/conformance/adopters/conformant-products#submission_599

Also bump the CTS version for GFX8, GFX9 and GFX10 because they also
pass 1.2.7.1.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14076>
2021-12-08 09:27:26 +00:00