This is needed for the next commit because pColorBlendState can
actually be NULL but some fields might have to be initialized
(eg. alpha to coverage with no color attachments).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>
Fixes a hang with geekbench.
The existence of RX 580 and NAVI10 results shows that the generations
before and after this do not have the issue. (They show up on the
website). So this is likely a GFX9 only issue.
This is not something weird like LDS size since none of the shaders
seem to use LDS.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3145>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3145>
The Vulkan spec doesn't have any words for vertex attributes alignment.
Fixes a test failure on GFX6 and a GPU hang on GFX10 with:
dEQP-VK.spirv_assembly.instruction.spirv1p4.entrypoint.tess_con_pc_entry_point
vkpipeline-db results on GFX10:
Totals from affected shaders:
SGPRS: 463772 -> 472972 (1.98 %)
VGPRS: 343208 -> 343752 (0.16 %)
Spilled SGPRs: 323 -> 336 (4.02 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 13806200 -> 14164472 (2.60 %) bytes
Max Waves: 84021 -> 83755 (-0.32 %)
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2161
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The Vulkan spec says that StorageImageMultisample and ImageMSArray
SPIRV-V capabilities must be enabled if the
shaderStorageImageMultisample feature is supported.
This fixes a warning with RenderDoc.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2212
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This is correct per the Vulkan spec format equivalence table.
Fixes: f36b52740a "radv/android: Add android hardware buffer queries."
Reviewed-by: Eric Anholt <eric@anholt.net>
It either clears the whole HTILE buffer or part of it depending
on the HTILE mask parameter.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
I don't think this makes much differences and a potential clear
following the initialization will overwrite HTILE anyways.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
For depth+stencil images, the driver might use an optimized path
if only one aspect is cleared. It either clears the depth or the
stencil part of HTILE. Because the two separate aspects might use
the same HTILE memory we have to synchronize.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
PA_SC_AA_CONFIG might be updated when conversative rasterization is
enabled. Because the driver only re-emits the multisample state if
the number of samples is different, that register value might not
be updated correctly.
Found by inspection, doesn't fix anything known.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
They don't have to be updated dynamically.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This patch also disables AMD_shader_ballot on GFX7 by default if ACO is used.
Note that shader_ballot works correctly, but performance seems inferior.
To enable shader_ballot use RADV_PERFTEST=shader_ballot.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
My fix wasn't totally correct as pointed out by Marek.
Ported from RadeonSI.
Fixes: deafe4cc58 ("radv/gfx10: fix primitive indices orientation for NGG GS")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The number of loaded channels should always be > 0 now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 13ab63bb62 ('radv: Implement VK_EXT_buffer_device_address.')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This shouldn't introduce any functional changes for RadeonSI
when NIR is enabled because these operations are already lowered.
pipeline-db (NAVI10/LLVM):
SGPRS: 9043 -> 9051 (0.09 %)
VGPRS: 7272 -> 7292 (0.28 %)
Code Size: 638892 -> 621628 (-2.70 %) bytes
LDS: 1333 -> 1331 (-0.15 %) blocks
Max Waves: 1614 -> 1608 (-0.37 %)
Found this while glancing at some F12019 shaders.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Was totally broken ...
Removed two if(point) {} because point is always non-NULL and we
were counting on that already for counting, since we NULL our
references to semaphores without active point earlier.
Fixes: 4aa75bb3bd "radv: Add wait-before-submit support for timelines."
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2137
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
They were out of sync. Besides syncing, lets ensure they never diverge
again.
Fixes: 8d2654a419 "radv: Support VK_EXT_inline_uniform_block."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
When a fragment shader includes an input variable decorated with
SampleId or SamplePosition, sample shading should be enabled
because minSampleShadingFactor is expected to be 1.0.
Cc: 19.2, 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
No functional changes as the driver still uses the depth+stencil
pipeline.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This refactoring will help for creating more decompress pipelines.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
No functional changes, but it will be used to decompress
separate depth/stencil aspects.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
No functional changes because the aspect mask is still not used
during image transitions but it will be needed for the separate
depth/stencil aspects logic.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Otherwise, nir_intrinsic_align() will assert when called on the intrinsics
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
The former was always true and hence dead code. We will want to
explicitly declare the ring offset register with ACO, but we also want
to declare the scratch offset too, and we can't try to disable it since
ACO also supports spilling and the determination of whether spilling has
to happen occurs well after setting up registers. So replace
supports_spill with something that will actually be used for ACO.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
We'll duplicate this in a header file in the next commit, and then
remove the original enum. Just rename it temporarily so that things
keep building.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>