It's still disabled by default because transform feedback randomly
hangs and it seems like it's related to GDS (cf. RadeonSI).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Otherwise the next streamout operation will overwrite GDS. This
can be improved by tracking if there is a streamout operation in
flight. Currently the driver unconditionally flushes but that
doesn't matter much as NGG streamout is disabled by default.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
NGG streamout uses GDS and we have to make sure that another
process isn't going to overwrite GDS while our shaders are busy.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Otherwise the wave IDs are probably 0 and it hangs. NGG_WAVE_ID_EN
generates wave IDs for GDS OA.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
For streamout we have to the number of streamout outputs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
It's used to determined the max emit per buffer.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
It should account for the number of streamout outputs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Streamout outputs are stored in the ESGS ring.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
It needs more space for multiple streams.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This allocates two BOs for GFX10 NGG streamout.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This internal option is turned off by default because NGG streamout
still hangs. It seems like it's related to GDS as RadeonSI.
That option will be turned on once all issues are resolved.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This option strictly allocate the minImageCount given by the
application at swapchain creation.
This works around application that do not deal with the fact that the
implementation allocates more images than the minimum specified.
v2: Add values in default drirc (Bas)
v3: specify engine name/version (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111522
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Vulkan applications can register with the following structure :
typedef struct VkApplicationInfo {
VkStructureType sType;
const void* pNext;
const char* pApplicationName;
uint32_t applicationVersion;
const char* pEngineName;
uint32_t engineVersion;
uint32_t apiVersion;
} VkApplicationInfo;
This enables the Vulkan implementations to apply workarounds based off
matching this description.
Here we add a new parameter for matching the driconfig options with
the following :
<device driver="anv">
<application engine_name_match="MyOwnEngine.*" engine_versions="10:12,40:42">
<option name="blaaah" value="true" />
</application>
</device>
v2: switch engine name match to use regexps
v3: Verify that the regexec returns REG_NOMATCH for match failure (Eric)
v4: Add missing bit that went to the following commit (Eric)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
We'll use this later for a new driconfig matching parameter.
v2: Avoid leak in device creation error case (Bas)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
streamout_buffers is assigned after that function, so the previous
fix was completely wrong. This probably fix something when streamout
buffers and push constants are used/inlined in the same shader.
Fixes: 378e2d2414 ("radv: fix computing number of user SGPRs for streamout buffers")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This fixes some interactions when NGG GS is enabled. It fixes:
- dEQP-VK.clipping.user_defined.clip_cull_distance_dynamic_index.*geom*
- dEQP-VK.tessellation.geometry_interaction.passthrough.*
For some reasons, using the computed ESGS ring size randomly hangs
with CTS. For now, just use the maximum LDS size for ESGS.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This shouldn't be in NIR->LLVM because ACO also needs the shader
info. This will also help for computing some NGG values that are
necessary for declaring LDS symbols.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Only the pipeline layout and the shader keys are needed.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 3844ed8d44 ("radv: Add adaptive_sync driconfig option and enable it by default.")
Fixes: e260493f2a ("radeonsi: Enable adaptive_sync by default for radeon")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This doesn't fix anything known but it's correct now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
It shouldn't matter much because output varyings should have been
compacted during NIR shader linking but it mirrors what the driver
does when emitting NGG GS vertex parameters.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
If the fragment shader needs the layer index, we have to allocate
one more dword in the NGG GS storage. Found by inspection. This
doesn't fix anything known.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Having two different structs is useless.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The codegen handles it and it adds the correct casts. This fixes
a bunch of LLVM validation errors when enabling Wave32 for compute.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Set of opcodes doesn't have enough flexibility in certain cases. E.g.
Utgard PP has vector conditional select operation, but condition is always
scalar. Lowering all the vector selects to scalar increases instruction
number, so we need a way to filter only those ops that can't be handled
in hardware.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Without this, invariant qualifiers don't do anything. Together with a
fix to the game, this fixes flickering in No Man's Sky.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
We usually use these counts as a simple way to figure out if a change
reduces the number of instructions or shrinks an instruction. However,
since .rodata sections aren't executed, we shouldn't be counting their
size for this analysis. Make the linker return the total executable
size, and use it to report the more useful size in both drivers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This avoids multiple copies for nothing and it's more elegant.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
RADV no longer uses specific LLVM options compared to the common code.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>