The number of IBs per submit isn't infinite, it depends on the IP type
(ie. some initial setup needed for a submit) and the packet size.
It can be calculated according to the kernel source code as:
(ring->max_dw - emit_frame_size) / emit_ib_size
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22354>
Skipping the continue preamble can allow other processes to mess
up some registers set by the current process.
Originally, we could omit generating the continue preamble when
no shader rings were used, because the register initialization
happened at the beginning of every main cmdbuf. However, this
isn't the case anymore.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22354>
The base SGPR and the number of SGPRs can be equal but it was incorrect
because one VS can have draw_id and one can have base_instance. Fix
this by invalidating the vertex user SGPRs unconditionally.
Though they should also be invalidated after executing secondaries,
otherwise nothing is invalidated if the same pipeline is bind to the
primary again.
This fixes dEQP-VK.dynamic_rendering.primary_cmd_buff.random.seed*.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21652>
Replaces the RADV pipeline cache with an implementation
based on the common vk_pipeline_cache.
We use a dual-layer approach with two types of cache entries.
1. radv_shader:
- serialized as radv_shader_binary
- uses SHA1 of the binary as key
2. radv_pipeline_cache_object:
- contains pointers to associated radv_shaders
- serialized as list of SHA1
- uses the pipeline hash as key
In combination with single-file disk-cache, this reduces the cache size by ~60%.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22030>
This function takes a radv_shader_binary and writes it to the
disk cache before creating and returning a radv_shader cache entry.
The key of the cache entry is the full SHA1 hash of the binary.
This way, we will be able to deduplicate identical shaders.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22030>
It was only read by RRA which can infer it from the parenbt internal
node.
Change in average build time (Control):
84.69471 ms -> 84.25319 ms
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22400>
is_gs_copy_shader and is_trap_handler_shader were cleared in
radv_init_shader_args. This restores the original behaviour.
Fixes: 67635bb ("radv: zero-initialize radv_shader_args right before declaring them")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22382>
Avoid recompiling some RADV files when something changes in NIR.
Also clean up a few other includes.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22241>
Don't recompile entire ACO when something changes in NIR.
Instead, only use some headers which are actually needed,
include these in ACO files instead of relying on nir.h to
include them.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22241>
We don't want to rely on any NIR structures in ACO, because
we would like to avoid the need to include nir.h in aco_ir.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22241>
This case is a new addition in GFX11, and according to PAL, when no
depth/stencil attachment is bound, it must be set to the number of coverage
samples (the number of SampleMask bits - which is MSAA_EXPOSED_SAMPLES):
4640888b57/src/core/hw/gfxip/gfx9/gfx9UniversalCmdBuffer.cpp (L6978)
Without this change, the maximum of depth/stencil and color sample counts
is used, and if there are no depth/stencil or color attachments (target-
independent rasterization), the Depth Block assumes 1 coverage sample, and
thus Primitive Ordered Pixel Shading doesn't work correctly (and fails 4xAA
fragment shader interlock CTS tests), and occlusion queries don't count the
correct number of samples (according to the "Sample Counting" section of
the Vulkan specification, "the occlusion query sample counter increments by
one for each sample with a coverage value of 1...")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22375>
Implement check_status function so the driver can check if the GPU has
been reset by the application, and thus if it's still available.
AMDGPU_CTX_QUERY ioctls work by asking amdgpu if this context was the
cause of a previous GPU reset.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22253>
You won't get your money back!
It's been a very long time but everything should be working great now.
This replaces RADV_PERFTEST=gpl by RADV_DEBUG=nogpl to disable the
extension for debugging purposes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22362>
Previously, the workaround only covered compile-time zero, but
this is insufficient and can cause GPU hangs in RadeonSI when
NGG culling is enabled.
Fix this by handling runtime zero in the workaround.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22370>
When the driver gets a cache hit for the binary, we still have to
retain shaders because we can't know if the LTO pipeline will be a
cache hit as well.
Though, serializing the NIR is too costly and most of the libraries
took more than 10ms to be created, which isn't acceptable. To fix this,
keep track of the shaders stage info for libs with the RETAIN flag.
This might be replaced by NIR caching later if it's worth a try.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22327>
For stream!=0, this align_mul=4 is not true. Not observe any
problem yet, just for correctness.
Fixes: 60ac5dda82 ("ac: Add NIR lowering for NGG GS.")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22304>
If vertex does not complete a primitive, it should not set the odd
flag which miss lead liveness check when culling is enabled.
For example, if odd flag is set regardless of complete flag, when
culling is enabled, 3 vertices of a triangle's init prim flag:
[0x00 0x04 0x01]
then after culling, this triangle has been culled, their prim flag:
[0x00 0x04 0x00]
the second vertex is miss treat as live because its odd flag (code
check prim_flag!=0 for liveness).
Fixes: 1bdeb961bd ("ac/nir/ngg: add gs culling")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8725
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22304>