Make sure we read the updated data from the gpu in cases where WAIT_BIT
is not set.
Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Some users of this function (e.g. GS inputs) currently only work with
constant offsets. We got lucky since all the tests used an array index
of 0, so the non-constant part was always 0. But we still need to handle
this.
This doesn't fix any CTS test, but was noticed while debugging one.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Only gfx9 and older use it to get InstanceID in VGPR1.
Ported from RadeonSI.
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Now that LLVM 9 will be released soon, we will only support
LLVM 8, 9 and master (10).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
A bunch of remaining issues including some that affect users.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111248
Fixes: ee21bd7440 "radv/gfx10: implement NGG support (VS only)"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
This fixes a regression introduced with scan&reduce operations
on GFX10. Note that some subgroups CTS still fail on GFX10 but
I assume it's a different issue.
This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive*.
Fixes: 227c29a80d "amd/common/gfx10: implement scan & reduce operations"
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This debug option allows vkGet[Instance/Device]ProcAddr() to succeed
even if the extension associated with the requested entrypoint was not
enabled.
This has come in handy in a few instances when debugging VR
applications, so I thought it would be good to have a cleaned up version
upstreamed.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This better matches all the other atomic intrinsics such as those for
SSBOs and shared variables where the sign is part of the intrinsic
opcode. Both generators (GLSL and SPIR-V) know the sign from the type
of the image variable or handle. In SPIR-V, signed min/max are separate
opcodes from unsigned.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Trivial extension that matches PAL.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This gives a nice boost, +20% at this time on my Vega 56. Shader
ballot should be enabled by default at some point but it reduces
performance a bit (-6%) with Wolfeinstein II. Enable it only for
Youngblood at the moment, like what we did for Talos in the past.
As a bonus point, it gets rid of some minor artifacts that only
happens when ballot is disabled for some reasons.
Cc: 19.2 <mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Shader ballot will be enabled by default for Wolfenstein
Youngblood. This follows what we did for sisched.
Cc: 19.2 <mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Otherwise hangs are possible. This register was already set for
GS and NGG.
Fixes: 5eaed7ecfc "radv/gfx10: enable support for NAVI10, NAVI12 and NAVI14"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Should take the max of the 2.
Fixes: ea337c8b7e "radv/gfx10: fix VS input VGPRs with the legacy path"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
The script doesn't handle them correctly and D16_UNORM_S8_UINT
isn't supported by the hardware, mark it as invalid.
This fixes warning when generating gfx10_format_table.h.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111393
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Took the freedom to enable dfsm even though I don't have benchmark
results yet, but it seems Raven-like.
Rest is from radeonsi.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
By adding one more helper to ac_llvm_build, we can also easily keep
vector stores together.
Fixes the
tests/spec/glsl-1.30/execution/fs-large-local-array-vec4.shader_test
piglit test.
Fixes: 74470baebb ("ac/nir: Lower large indirect variables to scratch")
Reviewed-by: Marek Olšák <marek.olsak@amd.com
Quite useless without DCC for LAYOUT_GENERAL.
Fixes: b4dad3afaa Revert "radv: Do not decompress on LAYOUT_GENERAL."
Acked-by: Dave Airlie <airlied@redhat.com>
Causes issues with a bunch of games with DXVK.
Fixes: 50add1b33a "radv: Do not decompress on LAYOUT_GENERAL."
Acked-by: Dave Airlie <airlied@redhat.com>
v2: add to series
v3: update Makefile.sources
v4: don't remove a comment and break statement
v4: use nir_can_move_instr
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This allows enabling the shader info keeping on a per shader basis.
Also disables the cache on a per shader basis.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Test that found this: dEQP-VK.geometry.layered.1d_array.secondary_cmd_buffer
Fixes: 49e6c2fb78 "radv: Store color/depth surface info in attachment info instead of framebuffer."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Can result in different shaders.
Fixes: 8a86908e9a "radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Instead of having the three values everywhere. This is also more
future proof if we want the driver to make those decisions eventually.
Reviewed-by: Dave Airlie <airlied@redhat.com>