This lets us pre-compile a prolog and avoid a hash table lookup during
command buffer recording, most of the time.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11717>
This doesn't actually use the functionality or implement prolog
compilation yet.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11717>
Way faster than the previous one, especially with a large number of
shaders.
This doesn't have much of an effect right now, but the previous allocator
was expensive compared to the cost of compiling vertex shader prologs.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11717>
This will be useful for compiling vertex shader prologs, where we
basically use ACO as an assembler.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11717>
In the following cases:
- lower 16 bits are extracted and the shift amount is 16 or more
- lower 8 bits are extracted and the shift amount is 24 or more
the undesireable upper bits are already shifted out, and therefore
there is no need to add SDWA to the v_lshlrev instruction.
Fossil DB stats on Sienna Cichlid with NGGC on:
Totals from 58239 (45.27% of 128647) affected shaders:
CodeSize: 153498624 -> 153265616 (-0.15%); split: -0.15%, +0.00%
Instrs: 29636304 -> 29578064 (-0.20%); split: -0.20%, +0.00%
Latency: 136931496 -> 136876379 (-0.04%); split: -0.04%, +0.00%
InvThroughput: 21134367 -> 21078861 (-0.26%); split: -0.26%, +0.00%
Copies: 2777550 -> 2777548 (-0.00%); split: -0.00%, +0.00%
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13121>
When the determinant that we use for calculating triangle area
is NaN, it's not possible to decide the facing of the triangle.
This can happen when a coordinate of one of the triangle's vertices
is INFINITY. It's better to just accept these triangles in the shader
and let the PA deal with them.
Let's do the same for +/- Infinity too.
Though we haven't seen this yet, it may be troublesome as well.
Fixes: 651a3da1b5Closes: #5470
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13299>
Traces with clock = 0 are totally useless due to RGP getting very
confused.
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13301>
We don't know some information regarding DCC image stores and therefore fast clears until we know the surface info.
We should work towards eliminating this, but the cases where this will hit on GFX10_3 is basically 0.
Finally fixes a perf regression in Doom Eternal.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13283>
To avoid confusion with radv_pipeline_key::has_multiview_view_index.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13243>
The view index is always lowered to map the layer ID for fragment
shaders. This was never reached.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13243>
This was never used because the layer ID isn't a system value.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13243>
This will allow us to reduce the size of radv_shader_info which is
stored in the cache entry.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12992>
This will allow us to avoid postprocessing binaries when they are
loaded from the shaders cache.
LLVM binaries already contain the shader config as part of the ELF,
so it's duplicated and increase the cache entry by 48 bytes. Though,
I don't think that should matter for LLVM.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12992>
These also have a higher compressed block size.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13056>
Some of our modifiers only support upto a certain range, expose this in ImageFormatProperties.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13056>
Currently, we aren't checking if the modifier supports the extent of the image.
DCN only works with !64B && 128B on extents < 4K.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13056>
From the VK_KHR_maintenance4 spec:
"Allow the application to destroy their VkPipelineLayout object
immediately after it was used to create another object. It is no
longer necessary to keep its handle valid while the created object
is in use."
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13193>
AHARDWAREBUFFER_USAGE_CAMERA_MASK enum is defined later and gets
included in the stub headers.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13255>
This commit enables NGG culling on all GFX10.3 GPUs by default.
A new debug flag environment variable RADV_DEBUG=nonggc is added to
disable this feature on GPUs where it is enabled by default.
The previous perf test flag RADV_PERFTEST=nggc will not be needed on
GFX10.3 anymore but it can still be used to enable the feature on
GPUs where it isn't on by default.
Totals from 58239 (45.27% of 128647) affected shaders:
VGPRs: 1989752 -> 2049408 (+3.00%); split: -3.21%, +6.21%
SpillSGPRs: 675 -> 883 (+30.81%); split: -78.07%, +108.89%
CodeSize: 72205968 -> 153572764 (+112.69%)
LDS: 0 -> 227125248 (+inf%)
MaxWaves: 1614598 -> 1646934 (+2.00%); split: +3.08%, -1.08%
Instrs: 14202239 -> 29654042 (+108.80%)
Latency: 87986508 -> 136960419 (+55.66%); split: -0.23%, +55.89%
InvThroughput: 14444832 -> 21141875 (+46.36%); split: -0.01%, +46.37%
VClause: 340794 -> 493067 (+44.68%); split: -1.33%, +46.01%
SClause: 520983 -> 738636 (+41.78%); split: -0.25%, +42.03%
Copies: 775639 -> 2787382 (+259.37%)
Branches: 296911 -> 1225431 (+312.73%)
PreSGPRs: 1316896 -> 2057270 (+56.22%); split: -0.14%, +56.36%
PreVGPRs: 1473558 -> 1658432 (+12.55%); split: -1.44%, +13.99%
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13086>