Rhys Perry
cf62c2b2ac
radv: call nir_shader_gather_info again
...
pipeline-db (Navi, ACO):
Totals from affected shaders:
SGPRS: 11840 -> 11840 (0.00 %)
VGPRS: 19012 -> 19124 (0.59 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 3696 -> 3696 (0.00 %) dwords per thread
Code Size: 998680 -> 921388 (-7.74 %) bytes
LDS: 19646 -> 19646 (0.00 %) blocks
Max Waves: 3398 -> 3401 (0.09 %)
pipeline-db (Navi, LLVM):
Totals from affected shaders:
SGPRS: 17016 -> 17128 (0.66 %)
VGPRS: 19564 -> 14876 (-23.96 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 3872 -> 3872 (0.00 %) dwords per thread
Code Size: 820416 -> 743576 (-9.37 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 3367 -> 3534 (4.96 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4193 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4193 >
2020-03-19 15:37:07 +00:00
Samuel Pitoiset
56de6f698e
radv: remove wrong assert that checks compute subgroup size
...
Ooops. For some reasons, I have been confused with Wave32 on GFX10,
but it's still possible to require a specific subgroup size if
only Wave64 is supported.
Fixes: 672d106199 ("radv/gfx10: fix required subgroup size with VK_EXT_subgroup_size_control")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4227 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4227 >
2020-03-18 21:31:47 +00:00
Samuel Pitoiset
94e37859a9
radv: fix random depth range unrestricted failures due to a cache issue
...
The shader module name is used to compute the pipeline key. The
driver used to load the wrong pipelines because the shader names
were similar.
This should fix random failures of
dEQP-VK.pipeline.depth_range_unrestricted.*
Fixes: f11ea22666 ("radv: fix a performance regression with graphics depth/stencil clears")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216 >
2020-03-18 11:36:24 +00:00
Bas Nieuwenhuizen
8e4e2cedcf
amd/llvm: Fix divergent descriptor regressions with radeonsi.
...
piglit/bin/arb_bindless_texture-limit -auto -fbo:
Needed to deal with non-NULL dynamic_index without deref in tex instructions.
piglit/bin/shader_runner tests/spec/arb_bindless_texture/execution/images/multiple-resident-images-reading.shader_test -auto:
Need to deal with non-deref images in enter_waterfall_imae.
Fixes: b83c9aca4a "amd/llvm: Fix divergent descriptor indexing. (v3)"
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191 >
2020-03-17 22:53:16 +01:00
Marek Olšák
8dc5e174c7
ac: don't set old denormals flags with LLVM >= 11
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196 >
2020-03-17 20:47:48 +00:00
Marek Olšák
63a5051ea6
ac: set new LLVM denormal flags
...
See: https://reviews.llvm.org/D71358
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196 >
2020-03-17 20:47:48 +00:00
Marek Olšák
56cc10bd27
ac: unify denorm setting enforcement
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196 >
2020-03-17 20:47:48 +00:00
Samuel Pitoiset
c923de68dd
radv/gfx10: fix required ballot size with VK_EXT_subgroup_size_control
...
If compute shaders require a specific subgroup size (ie. Wave32),
we have to use the correct ballot size.
Fixes dEQP-VK.subgroups.ballot_other.compute.*_requiredsubgroupSize.
Fixes: fb07fd4e6c ("radv: implement VK_EXT_subgroup_size_control")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215 >
2020-03-17 12:45:01 +00:00
Samuel Pitoiset
672d106199
radv/gfx10: fix required subgroup size with VK_EXT_subgroup_size_control
...
If compute shaders require a specific subgroup size (ie. Wave32),
we have to return the correct one.
Fixes dEQP-VK.subgroups.size_control.compute.required_subgroup_size_*.
Fixes: fb07fd4e6c ("radv: implement VK_EXT_subgroup_size_control")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215 >
2020-03-17 12:45:01 +00:00
Samuel Pitoiset
46e8ba1344
radv: only inject implicit subpass dependencies if necessary
...
The Vulkan 1.2.134 spec update clarified when implicit subpass
dependencies should be injected by the driver. They only make
sense if automatic layout transitions are performed.
This should fix a performance regression with RPCS3 (although
they added a workaround for RADV since the regression has been found).
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2502
Fixes: e60de08547 ("radv: handle missing implicit subpass dependencies")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4210 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4210 >
2020-03-17 13:24:36 +01:00
Rhys Perry
2d14a8f237
aco: fix operand order for LS VGPR init bug workaround
...
Fixes: a952bf3946 ('aco: Fix LS VGPR init bug on affected hardware.')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4201 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4201 >
2020-03-16 19:34:32 +00:00
Rhys Perry
ded7a8bb46
aco: fix instruction encoding for LS VGPR init bug workaround
...
Fixes: a952bf3946 ('aco: Fix LS VGPR init bug on affected hardware.')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4201 >
2020-03-16 19:34:32 +00:00
Rhys Perry
ee9e0d1eca
aco: set late kill for v_interp_p1_f32 for some APUs
...
Apparently needed for Stoney Ridge, Kabini and Mullins APUs.
gfx702 also has 16-bank LDS and https://llvm.org/docs/AMDGPUUsage.html
lists some dGPUs under there. Those GPUs seem to be Hawaii actually
(gfx701) and we don't seem to have gotten any interpolation related bugs
reported with them so far.
The late kill flag was tested by running pipeline-db with
ACO_DEBUG=validatera while setting late kill for SMEM buffer loads,
emit_vop2_instruction() and texture instructions. I also tested with
just setting the flag for v_interp_p1_f32.
As far as I know, the only other thing we have to consider for 16-bank LDS
is something to do with 16-bit interpolation. We don't do that yet.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3914 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3914 >
2020-03-16 16:09:02 +00:00
Rhys Perry
1872759f55
aco: add a late kill flag
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3914 >
2020-03-16 16:09:02 +00:00
Rhys Perry
c51348bd9b
aco: move some register demand helpers into aco_live_var_analysis.cpp
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3914 >
2020-03-16 16:09:02 +00:00
Samuel Pitoiset
e1b08b55ff
radv/sqtt: handle thread trace capture in sqtt_QueuePresentKHR()
...
To avoid wasting CPU cycles when thread trace is not enabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4180 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4180 >
2020-03-16 15:42:04 +00:00
Rhys Perry
625d8705f0
aco: don't stop scheduling at exports
...
This allows us to move v_cvt_pkrtz_f16_f32 instructions upwards, improving
schedules and (somewhat unintentionally) moving the exports slightly
closer together.
Totals from affected shaders:
SGPRS: 1030224 -> 1030248 (0.00 %)
VGPRS: 794080 -> 794392 (0.04 %)
Spilled SGPRs: 127117 -> 127117 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 89028152 -> 89032312 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 65252 -> 65219 (-0.05 %)
SMEM score: 843808.00 -> 843918.00 (0.01 %)
VMEM score: 5331687.00 -> 5397802.00 (1.24 %)
SMEM clauses: 567659 -> 567655 (-0.00 %)
VMEM clauses: 290715 -> 290716 (0.00 %)
Instructions: 17143219 -> 17144259 (0.01 %)
Cycles: 1098442808 -> 1098446968 (0.00 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776 >
2020-03-13 14:04:50 +00:00
Rhys Perry
6b4c31f814
aco: allow barriers to be skipped during scheduling
...
Much better scheduling apparently in 160 shaders
Totals from affected shaders:
SGPRS: 6272 -> 6344 (1.15 %)
VGPRS: 4832 -> 4844 (0.25 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 467192 -> 467428 (0.05 %) bytes
LDS: 459 -> 459 (0.00 %) blocks
Max Waves: 1407 -> 1409 (0.14 %)
SMEM score: 9309.00 -> 11216.00 (20.49 %)
VMEM score: 26679.00 -> 33652.00 (26.14 %)
SMEM clauses: 1817 -> 1776 (-2.26 %)
VMEM clauses: 2286 -> 2288 (0.09 %)
Instructions: 86537 -> 86596 (0.07 %)
Cycles: 676260 -> 676568 (0.05 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776 >
2020-03-13 14:04:50 +00:00
Rhys Perry
928ac97875
aco: add helpers for ensuring correct ordering while scheduling
...
Pipeline-db changes in 721 shaders.
Totals from affected shaders:
SGPRS: 42336 -> 42656 (0.76 %)
VGPRS: 38368 -> 38636 (0.70 %)
Spilled SGPRs: 11967 -> 11967 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 5268088 -> 5269840 (0.03 %) bytes
LDS: 1069 -> 1069 (0.00 %) blocks
Max Waves: 4473 -> 4447 (-0.58 %)
SMEM score: 41155.00 -> 41826.00 (1.63 %)
VMEM score: 146339.00 -> 147471.00 (0.77 %)
SMEM clauses: 24434 -> 24535 (0.41 %)
VMEM clauses: 16637 -> 16592 (-0.27 %)
Instructions: 996037 -> 996388 (0.04 %)
Cycles: 76476112 -> 75281416 (-1.56 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776 >
2020-03-13 14:04:50 +00:00
Rhys Perry
2cd760847a
aco: add helpers for moving instructions for scheduling
...
No pipeline-db changes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776 >
2020-03-13 14:04:50 +00:00
Samuel Pitoiset
2d295ab3f3
radv: add llvm_compiler_shader() helper
...
To match aco_compile_shader().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163 >
2020-03-13 10:22:13 +00:00
Samuel Pitoiset
4d991c2de4
radv: remove unnecessary LLVM includes
...
They are already included from src/amd/llvm.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163 >
2020-03-13 10:22:13 +00:00
Samuel Pitoiset
5ea32a6201
radv: remove radv_shader_variant::aco_used
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163 >
2020-03-13 10:22:13 +00:00
Samuel Pitoiset
3fea948177
radv: cleanup occurences of use_aco everywhere
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163 >
2020-03-13 10:22:13 +00:00
Samuel Pitoiset
a46e9f4d9a
radv: use ac_gpu_info::use_late_alloc
...
Based on PAL and RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144 >
2020-03-12 18:17:47 +00:00
Samuel Pitoiset
741dd9e32b
radv: rewrite late alloc computation
...
Based on PAL and RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144 >
2020-03-12 18:17:47 +00:00
Samuel Pitoiset
74e7b442f2
radv: tune primitive binning for small chips
...
Based on PAL and RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144 >
2020-03-12 18:17:47 +00:00
Samuel Pitoiset
22d3e047e5
radv: use better tessellation tunables on GFX9+
...
Based on PAL and RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144 >
2020-03-12 18:17:47 +00:00
Samuel Pitoiset
6d27022ce1
radv/gfx10: cache metadata in L2 on small chips
...
Based on PAL and RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144 >
2020-03-12 18:17:47 +00:00
Marek Olšák
84f97a21a6
ac: disable late alloc on small gfx10 chips
...
same as PAL.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4143 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4143 >
2020-03-12 17:27:23 +00:00
Marek Olšák
7ba5e94c50
ac: add radeon_info::use_late_alloc to control LATE_ALLOC globally
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4143 >
2020-03-12 17:27:23 +00:00
Samuel Pitoiset
e6e97ea92e
radv/sqtt: describe layout transitions with user markers
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4138 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4138 >
2020-03-12 17:04:55 +00:00
Samuel Pitoiset
b229302b96
radv/sqtt: describe begin/end subpass barriers with user markers
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4138 >
2020-03-12 17:04:55 +00:00
Bas Nieuwenhuizen
b83c9aca4a
amd/llvm: Fix divergent descriptor indexing. (v3)
...
There are multiple LLVM passes that very much move the
intrinsic using the descriptor outside of the loop, defeating
the entire point of creating the loop.
Defeat the optimizer by splitting the break into a separate
if-statement and putting an optimization barrier on the bool
in between.
v2: Move from a callback based system to begin/end loop.
This does not make it significantly less intrusive but
is a bit nicer with all the extra struct and callback
stubs.
v3: Deal with non-divergent values in divergent path.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2160
Fixes: 028ce52739 "radv: Add non-uniform indexing lowering."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4109 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4109 >
2020-03-12 16:12:02 +00:00
Timur Kristóf
cfa299eadb
radv: Enable subgroup shuffle on GFX10 when ACO is used.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4159 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4159 >
2020-03-12 13:34:41 +00:00
Timur Kristóf
967eb23261
radv: Enable lowering dynamic quad broadcasts.
...
This will lower dynamic quad broadcasts into something that both
LLVM and ACO can understand. On hardware which supports shuffles,
they are lowered to shuffle, on older hardware (GFX6-7) they will
get lowered to constant quad broadcasts.
Fixes dEQP-VK.subgroups.quad.*.subgroupquadbroadcast_nonconst_*
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4147 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4147 >
2020-03-12 13:16:07 +00:00
Rhys Perry
85d05b3fd7
aco: fix uninitialized data error in waitcnt pass
...
Shouldn't create any incorrect waitcnts but may create suboptimial
waitcnts in rare cases.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4133 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4133 >
2020-03-12 11:46:56 +00:00
Samuel Pitoiset
cc320ef9af
ac/llvm: add missing optimization barrier for 64-bit readlanes
...
Otherwise, LLVM optimizes it but it's actually incorrect.
Fixes: 0f45d4dc2b ("ac: add ac_build_readlane without optimization barrier")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3585 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3585 >
2020-03-12 08:46:42 +01:00
Samuel Pitoiset
f0178f516f
radv: fix 32-bits build (again)
...
Fixes: dcfc08f5b8 ("radv/sqtt: describe begin/end command buffers with user markers")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4044 >
2020-03-11 19:30:13 +00:00
Timur Kristóf
61f2e8d9bb
aco: Don't store TCS outputs to LDS when we're sure that none are read.
...
This allows us not to write an output to LDS, even if it has
an indirect offset.
No pipeline DB changes.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964 >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964 >
2020-03-11 08:34:11 +00:00
Timur Kristóf
9b36d8c23a
aco: Only write TCS outputs to LDS when they are read by the TCS.
...
Note that tess factors are always read at the end of the shader,
so those are still always saved to LDS.
Totals from affected shaders:
VGPRS: 25244 -> 25164 (-0.32 %)
Code Size: 1768268 -> 1690804 (-4.38 %) bytes
Max Waves: 4947 -> 4953 (0.12 %)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964 >
2020-03-11 08:34:11 +00:00
Timur Kristóf
4dcca26945
aco: Store tess factors in VMEM only at the end of the shader.
...
This optimizes out several superfluous stores of the tess factors,
especially if the shader wrote those outputs multiple times.
Pipeline DB changes on GFX10:
Totals from affected shaders:
SGPRS: 30384 -> 29536 (-2.79 %)
Code Size: 2260720 -> 2214484 (-2.05 %) bytes
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964 >
2020-03-11 08:34:11 +00:00
Timur Kristóf
8c3ab49c6b
aco: Don't generate an if when the first part of a merged HS or GS is empty.
...
In some cases (eg. in a few tessellation CTS tests) the VS part of
a merged HS is completely empty. Let's not generate a divergent if
in these cases. (LLVM also doesn't do it.)
No pipeline DB changes, only affects the CTS.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964 >
2020-03-11 08:34:11 +00:00
Timur Kristóf
b969501398
radv: Enable ACO on all stages.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964 >
2020-03-11 08:34:11 +00:00
Timur Kristóf
cec6a856e5
aco: Enable running TES as ES, including merged TES+GS.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964 >
2020-03-11 08:34:11 +00:00
Timur Kristóf
4fe5eadfae
radv: Enable ACO for TES when there is no GS.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964 >
2020-03-11 08:34:11 +00:00
Timur Kristóf
926bdfae7d
aco: Implement loading TES inputs.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964 >
2020-03-11 08:34:11 +00:00
Timur Kristóf
ec56a7093c
aco: Enable streamout when TES runs on the HW VS stage.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964 >
2020-03-11 08:34:11 +00:00
Timur Kristóf
6047e51430
aco: Store TES outputs when TES runs on the HW VS stage.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964 >
2020-03-11 08:34:11 +00:00
Timur Kristóf
1d9d1cbce9
aco: Use TES output info when TES runs on the VS stage.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964 >
2020-03-11 08:34:11 +00:00