Commit graph

4419 commits

Author SHA1 Message Date
Bas Nieuwenhuizen
17741a0a05 radv: Only use the gfx mipmap level offset/pitch for linear textures.
The tiled-case is non-sensical for non-base mips, but Vulkan requires
that this function handles it but at the same time does not require
returning anything useful. So we can basically return anything.

Correct tiled pitch and offset are still required for our own WSI and
in the future getting the layouts of images with DRM format modifiers.
Both don't have to deal with images with more than 1 level though.

Fixes: 824bd0830e "radv: return the correct pitch for linear mipmaps on GFX10"
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2301
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2304
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2020-01-04 13:04:40 +01:00
Bas Nieuwenhuizen
f0ed67b770 Revert "amd/common: Always initialize gfx9 mipmap offset/pitch."
This reverts commit 973181c06c.

Requested by Marek.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2020-01-04 13:04:40 +01:00
Samuel Pitoiset
7b70502a5d radv: implement VK_AMD_mixed_attachment_samples
With VK_AMD_mixed_attachment_samples, the number of depth/stencil
samples isn't always equal to the number of color samples. Adjust
the number of Z samples when it's different but make sure to have
a consistent sample count if there are no depth/stencil attachments.

Also adjust the number of samples used for fragment shaders which is
the number of color samples if mixed attachment samples are used.

Only enabled on GFX8+ because it's untested on previous chips.

All dEQP-VK.pipeline.multisample.mixed_attachment_samples.* now pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3018>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3018>
2020-01-03 12:31:53 +00:00
Samuel Pitoiset
7bbf497b68 radv: record number of color/depth samples for each subpass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3018>
2020-01-03 12:31:53 +00:00
Bas Nieuwenhuizen
973181c06c amd/common: Always initialize gfx9 mipmap offset/pitch.
The WSI expects pitch to be meaningful even for tiled
textures.

(It is used for the pitch in modesetting and X11)

Fixes: 824bd0830e "radv: return the correct pitch for linear mipmaps on GFX10"
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2301
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2304
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3245>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3245>
2020-01-02 11:25:51 +00:00
Timur Kristóf
11e62a9734 aco: Fix uniform i2i64.
Fixes 240 failing test cases in dEQP-VK.spirv_assembly which
were failing due to a bad s_ashr_i32 instruction. This commit
fixes the instruction format along with the definitions of the
instruction.

Fixes: 11f43caaec
Cc: 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-31 14:22:31 +01:00
Samuel Pitoiset
824bd0830e radv: return the correct pitch for linear mipmaps on GFX10
On GFX9, the pitch of a level is always the pitch of the entire image
but not on GFX10.

This fixes graphics glithes with Halo - The Master Chief Collection.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2188
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-30 14:17:45 +01:00
Bas Nieuwenhuizen
88f567b5ce amd/common: Handle alignment of 96-bit formats.
addrlib doesn't quite do it right, so do it ourselves.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2162
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-30 00:02:46 +01:00
Eric Engestrom
51569e525a amd: fix empty-body issues
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 8d43e2b2de ("meson: add -Werror=empty-body to disallow `if(x);`")
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-12-27 22:09:00 +00:00
Mauro Rossi
563bd61fee android: radv: build radv_shader_args.c
Updates radv Makefile.sources and fixes the following building error:

external/mesa/src/amd/vulkan/radv_shader.c:1122:
error: undefined reference to 'radv_declare_shader_args'

Fixes: 3b14336 ("ac/nir, radv, radeonsi: Switch to using ac_shader_args")
Fixes: 66c703b ("radv: Move argument declaration out of nir_to_llvm")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-12-27 09:20:54 +01:00
Mauro Rossi
962b70c259 android: radeonsi,ac: fix building error due to ac changes
Updates amd Makefile.sources and fixes the following building errors:

external/mesa/src/gallium/drivers/radeonsi/si_compute_prim_discard.c:338: error: undefined reference to 'ac_add_arg'
external/mesa/src/gallium/drivers/radeonsi/si_compute_prim_discard.c:340: error: undefined reference to 'ac_add_arg'
external/mesa/src/gallium/drivers/radeonsi/si_compute_prim_discard.c:341: error: undefined reference to 'ac_add_arg'
external/mesa/src/gallium/drivers/radeonsi/si_compute_prim_discard.c:342: error: undefined reference to 'ac_add_arg'

Fixes: 9885af3 ("ac: Add a shared interface between radv, radeonsi, LLVM and ACO")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-12-27 09:20:49 +01:00
Mauro Rossi
ad1c65e322 android: radv: fix vk_format_table.c generated source build
RADV Android build rules are now getting the wrong vk_format.h
from src/vulkan/util include, the simplest way to fix is to add
src/amd/vulkan include prior to src/vulkan/util include

Fixes the following building errors:

out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_radv_common_intermediates/vk_format_table.c:39:4:
 error: use of undeclared identifier 'VK_FORMAT_LAYOUT_PLAIN'
...
out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_radv_common_intermediates/vk_format_table.c:131:8:
error: use of undeclared identifier 'VK_FORMAT_TYPE_UNSIGNED'; did you mean 'UTIL_FORMAT_TYPE_UNSIGNED'?
      {VK_FORMAT_TYPE_UNSIGNED, true, false, false, 4, 0},      /* x = a */
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.

Fixes: 3a28281 ("util: Add a mapping from VkFormat to PIPE_FORMAT.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-12-27 09:20:44 +01:00
Bas Nieuwenhuizen
a435f002c4 radv: Expose all sample counts for integer formats as well.
Things work the same between float and integer.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2261
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-12-26 10:48:29 +01:00
Rhys Perry
afe1a8ff5b aco: fix vgpr alloc granule with wave32
We still need to increase the number of physical vgprs

Totals from affected shaders:
SGPRS: 671976 -> 675288 (0.49 %)
VGPRS: 550112 -> 562596 (2.27 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 27621660 -> 27606532 (-0.05 %) bytes
Max Waves: 81083 -> 87833 (8.32 %)
Instructions: 5391560 -> 5389031 (-0.05 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-21 12:38:42 +01:00
Rhys Perry
01ccd7839c aco: improve jump threading with wave32
Totals from affected shaders:
SGPRS: 748746 -> 748746 (0.00 %)
VGPRS: 636984 -> 636984 (0.00 %)
Spilled SGPRs: 387 -> 387 (0.00 %)
Spilled VGPRs: 15 -> 15 (0.00 %)
Code Size: 61138824 -> 60928620 (-0.34 %) bytes
Max Waves: 48602 -> 48602 (0.00 %)
Instructions: 11967660 -> 11915084 (-0.44 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-21 12:38:42 +01:00
Rhys Perry
6ff92f3d68 aco/wave32: fix comparison optimizations
Previously, they weren't done in wave32.

Totals from affected shaders:
SGPRS: 507726 -> 508006 (0.06 %)
VGPRS: 450340 -> 450268 (-0.02 %)
Spilled SGPRs: 298 -> 298 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 39689708 -> 39384488 (-0.77 %) bytes
Max Waves: 39631 -> 39636 (0.01 %)
Instructions: 7865919 -> 7793650 (-0.92 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-12-21 12:38:42 +01:00
Karol Herbst
b35e583c17 aco: use NIR_MAX_VEC_COMPONENTS instead of 4
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-12-21 11:00:16 +00:00
Marek Olšák
7d65614422 ac/surface: fix an assertion failure on gfx9 in CMASK computation
addrlib only allows the 2D resource type with CMASK.

Fixes: 69ea473eeb "amd/addrlib: update to the latest version"

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3187>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3187>
2019-12-20 22:57:08 +00:00
Samuel Pitoiset
02dd1fb859 radv: rely on pipeline layout when creating push descriptors with template
descriptorSetLayout should be ignored for push descriptors. While
we are it, also ignore pipelineBindPoint.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2210
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3180>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3180>
2019-12-20 13:41:29 +01:00
Samuel Pitoiset
2eef9e050f radv: ignore pColorBlendState if rasterization is disabled
Or if the subpass has no color attachments.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>
2019-12-20 08:21:02 +01:00
Samuel Pitoiset
021c7b5309 radv: tidy up radv_pipeline_init_blend_state()
This is needed for the next commit because pColorBlendState can
actually be NULL but some fields might have to be initialized
(eg. alpha to coverage with no color attachments).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>
2019-12-20 08:20:58 +01:00
Samuel Pitoiset
ebc7a77869 radv: ignore pDepthStencilState if rasterization is disabled
Or if the subpass has no depth stencil attachment.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>
2019-12-20 08:20:55 +01:00
Samuel Pitoiset
ce67e41535 radv: ignore pTessellationState if the pipeline doesn't use tess
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>
2019-12-20 08:20:52 +01:00
Samuel Pitoiset
7735f314b7 radv: ignore pMultisampleState if rasterization is disabled
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>
2019-12-20 08:20:49 +01:00
Samuel Pitoiset
589bfcbde3 radv: init a default multisample state for the resolve FS path
pMultisampleState must be a valid pointer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>
2019-12-20 08:20:44 +01:00
Samuel Pitoiset
13b4e9adcf ac: declare an enum for the OOB select field on GFX10
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147>
2019-12-19 15:15:32 +01:00
Samuel Pitoiset
f3cccd05d9 radv/gfx10: fix the out-of-bounds check for vertex descriptors
When stride is 0, it should check against the offset not the index.

This fixes black character models with Beat Saber and missing snow
with Dragon Quest.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2233
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1975
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147>
2019-12-19 15:15:30 +01:00
Bas Nieuwenhuizen
a9a3108be7 radv: Limit workgroup size to 1024.
Fixes a hang with geekbench.

The existence of RX 580 and NAVI10 results shows that the generations
before and after this do not have the issue. (They show up on the
website). So this is likely a GFX9 only issue.

This is not something weird like LDS size since none of the shaders
seem to use LDS.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3145>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3145>
2019-12-18 20:41:18 +00:00
Samuel Pitoiset
d399f4f414 radv/gfx10: fix ngg_get_ordered_id
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3133>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3133>
2019-12-17 12:34:18 +00:00
Marek Olšák
69ea473eeb amd/addrlib: update to the latest version
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-12-16 17:04:57 -05:00
Marek Olšák
f90cbd18ff ac: fix the return value in cull_bbox when bbox culling is disabled
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Marek Olšák
e5e3ffa6b9 ac: fix ac_get_i1_sgpr_mask for Wave32
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
2019-12-16 20:06:07 +00:00
Bas Nieuwenhuizen
b53856aca3 amd/common: Always use addrlib for HTILE tc-compat.
Even without depth+stencil addrlib can (correctly!) decide to
disable tc compatible HTILE.

One example is 8x sampling with 32-bit depth on Stoney. The row size
on Stoney is 1024, while the tile size is 2048, which results in
tile splits which are not supported with tc-compat.

On Stoney, this fixes
dEQP-VK.glsl.builtin_var.fragdepth.*_list_d32_sfloat_multisample_8

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
2019-12-14 20:39:29 +00:00
Bas Nieuwenhuizen
e197fb1c2f amd/common: Fix tcCompatible degradation on Stoney.
addrlib sometimes returns smaller sizes for tcCompat as it does
not seem to take into account the depth+stencil matching config
gymnastics with tcCompat.

This fixes
dEQP-VK.pipeline.render_to_image.core.2d_array.huge.height.r8g8b8a8_unorm_d32_sfloat_s8_uint

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
2019-12-14 20:39:29 +00:00
Samuel Pitoiset
b37c91c12e radv: handle unaligned vertex fetches on GFX6/GFX10
The Vulkan spec doesn't have any words for vertex attributes alignment.

Fixes a test failure on GFX6 and a GPU hang on GFX10 with:
dEQP-VK.spirv_assembly.instruction.spirv1p4.entrypoint.tess_con_pc_entry_point

vkpipeline-db results on GFX10:
Totals from affected shaders:
SGPRS: 463772 -> 472972 (1.98 %)
VGPRS: 343208 -> 343752 (0.16 %)
Spilled SGPRs: 323 -> 336 (4.02 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 13806200 -> 14164472 (2.60 %) bytes
Max Waves: 84021 -> 83755 (-0.32 %)

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2161
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-13 09:54:07 +00:00
Samuel Pitoiset
eda1b77cc2 radv: enable SpvCapabilityImageMSArray
The Vulkan spec says that StorageImageMultisample and ImageMSArray
SPIRV-V capabilities must be enabled if the
shaderStorageImageMultisample feature is supported.

This fixes a warning with RenderDoc.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2212
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-12 18:52:08 +01:00
Samuel Pitoiset
a0f1a5fa05 ac/nir: fix out-of-bound access when loading constants from global
Global load/store instructions can't know if the offset is
out-of-bound because they don't use descriptors (no range).

Fix this by clamping the offset for arrays that are indexed
with a non-constant offset that's greater or equal to the array
size.

This fixes VM faults and GPU hangs with Dead Rising 4.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2148
Fixes: 71a6794200 ("ac/nir: Enable nir_opt_large_constants")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-12 10:12:56 +00:00
Bas Nieuwenhuizen
2e44bfc14f radv: Fix RGBX Android<->Vulkan format correspondence.
This is correct per the Vulkan spec format equivalence table.

Fixes: f36b52740a "radv/android: Add android hardware buffer queries."
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-12-11 11:40:13 +01:00
Samuel Pitoiset
e4c8491bdf radv: implement VK_KHR_separate_depth_stencil_layouts
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-10 13:16:17 +01:00
Samuel Pitoiset
48ee62178f radv: initialize HTILE for separate depth/stencil aspects
It either clears the whole HTILE buffer or part of it depending
on the HTILE mask parameter.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-10 13:09:29 +01:00
Samuel Pitoiset
41cebfc9c1 radv: do not init HTILE as compressed state when dst layout allows it
I don't think this makes much differences and a potential clear
following the initialization will overwrite HTILE anyways.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-10 13:09:26 +01:00
Samuel Pitoiset
b603cc8c84 radv: synchronize after performing a separate depth/stencil fast clears
For depth+stencil images, the driver might use an optimized path
if only one aspect is cleared. It either clears the depth or the
stencil part of HTILE. Because the two separate aspects might use
the same HTILE memory we have to synchronize.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-10 13:09:22 +01:00
Samuel Pitoiset
008fe909ca radv: fix possibly wrong PA_SC_AA_CONFIG value for conservative rast
PA_SC_AA_CONFIG might be updated when conversative rasterization is
enabled. Because the driver only re-emits the multisample state if
the number of samples is different, that register value might not
be updated correctly.

Found by inspection, doesn't fix anything known.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-10 11:04:43 +01:00
Samuel Pitoiset
4f659224c8 radv: move emission of two PA_SC_* registers to the pipeline CS
They don't have to be updated dynamically.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-10 11:04:40 +01:00
Samuel Pitoiset
86dfe92bd0 radv: do not use VK_TRUE/VK_FALSE
For consistency.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-09 09:21:26 +01:00
Daniel Schürmann
8259c97b2d aco: propagate temporaries into expanded vectors
Gives a very slight decrease in code size:
Totals from affected shaders:
Code Size: 1708488 -> 1702768 (-0.33 %) bytes
Max Waves: 2858 -> 2855 (-0.10 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
df3e674fb3 aco: improve readfirstlane after uniform ssbo loads on GFX7
pipeline-db changes for GFX7:

80310 shaders in 40472 tests
Totals:
SGPRS: 3655900 -> 3643916 (-0.33 %)
VGPRS: 2678324 -> 2686324 (0.30 %)
Spilled SGPRs: 1730 -> 1634 (-5.55 %)
Spilled VGPRs: 14 -> 21 (50.00 %)
Scratch size: 15540 -> 15536 (-0.03 %) dwords per thread
Code Size: 136106120 -> 135457616 (-0.48 %) bytes
LDS: 1259 -> 1259 (0.00 %) blocks
Max Waves: 601014 -> 600206 (-0.13 %)

Totals from affected shaders:
SGPRS: 307832 -> 295848 (-3.89 %)
VGPRS: 267864 -> 275864 (2.99 %)
Spilled SGPRs: 770 -> 674 (-12.47 %)
Spilled VGPRs: 14 -> 21 (50.00 %)
Scratch size: 16 -> 12 (-25.00 %) dwords per thread
Code Size: 22007488 -> 21358984 (-2.95 %) bytes
LDS: 65 -> 65 (0.00 %) blocks
Max Waves: 28668 -> 27860 (-2.82 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
0837471463 aco: use soffset for MUBUF instructions on SI/CI
pipeline-db changes for GFX7:

80310 shaders in 40472 tests
Totals:
SGPRS: 3655300 -> 3655900 (0.02 %)
VGPRS: 2677732 -> 2678324 (0.02 %)
Spilled SGPRs: 1730 -> 1730 (0.00 %)
Spilled VGPRs: 14 -> 14 (0.00 %)
Scratch size: 15540 -> 15540 (0.00 %) dwords per thread
Code Size: 136488364 -> 136106120 (-0.28 %) bytes
LDS: 1259 -> 1259 (0.00 %) blocks
Max Waves: 601039 -> 601014 (-0.00 %)

Totals from affected shaders:
SGPRS: 316312 -> 316912 (0.19 %)
VGPRS: 273844 -> 274436 (0.22 %)
Spilled SGPRs: 770 -> 770 (0.00 %)
Spilled VGPRs: 14 -> 14 (0.00 %)
Scratch size: 16 -> 16 (0.00 %) dwords per thread
Code Size: 22724904 -> 22342660 (-1.68 %) bytes
LDS: 114 -> 114 (0.00 %) blocks
Max Waves: 30861 -> 30836 (-0.08 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
7b38d95b32 radv: Enable ACO on GFX7 (Sea Islands)
This patch also disables AMD_shader_ballot on GFX7 by default if ACO is used.
Note that shader_ballot works correctly, but performance seems inferior.
To enable shader_ballot use RADV_PERFTEST=shader_ballot.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00
Daniel Schürmann
28c95cc402 aco: return to loop_active mask at continue_or_break blocks
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-12-07 11:23:11 +01:00