this had a number of issues:
* pipe_loader_get_driinfo_xml() frees driver_driconf immediately,
except the driOptionCache struct string pointers are all just copied
in merge_driconf instead of having the strings copied, which means any
subsequent access of driver_driconf strings is invalid access
* pipe_loader_drm_get_driconf_by_name() is a disaster that only happened
to work because the dlopen here is the same lib that gets opened elsewhere
by mesa and not closed. if the lib here is actually closed, then all
the statically allocated strings become invalid, which means they need to
be manually copied
cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30494>
Like AHB, we don't know the layout for an image backed by gralloc
swapchain memory until bind when gralloc information is passed by the
platform.
Signed-off-by: Juston Li <justonli@google.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29850>
Refactor out shared code for the u_gralloc tiling query so it can also
be used by ahw and later anb resolves.
Signed-off-by: Juston Li <justonli@google.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29850>
We increased VS_EXPORT_COUNT to 8 for streamout in gfx10_shader_ngg,
but we forgot to increase the attribute ring stride, causing all waves
except the first one to get corrupted VS outputs.
Fixes: f703dfd1bb - radeonsi: add gfx12
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30503>
This fixes random GPU hangs on gfx12 due to incoherent indirect buffer data,
causing random indirect vertex and instance counts, which timeouts if
the random numbers are large.
Fixes: a8abbbb172 - radeonsi: remove r600_pipe_common.h
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30503>
There was a case where if we don't sync, we wouldn't set TC_L2_dirty either,
which could cause problems later.
Fixes: f703dfd1bb - radeonsi: add gfx12
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30503>
I guess incorrect packet interrupts have been enabled, so this started hanging.
radeon_set_context_reg_seq shouldn't be used with gfx12_set_context_reg.
Fixes: f703dfd1bb - radeonsi: add gfx12
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30503>
EXT_post_depth_coverage was already wired up but the tests were
failing. Through experimentation I found that running them in
combination with SET_HYBRID_ANTI_ALIAS_CONTROL would cause the
tests to fail.
This patch simply skips SET_HYBRID_ANTI_ALIAS_CONTROL when post
depth coverage is in use
Test results for *post_depth_cover*:
Test run totals:
Passed: 21/104 (20.2%)
Failed: 0/104 (0.0%)
Not supported: 83/104 (79.8%)
Warnings: 0/104 (0.0%)
Waived: 0/104 (0.0%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29194>
The pixld.covmask instruction returns the coverage mask for the entire
pixel being shaded, not the set of samples covered by the current FS
invocation as required by GL/Vulkan. In order to get the GL/Vulkan
behavior, we have to mask off samples not covered by the current FS
invocation.
Previously, we handled this by masking by 1 << gl_SampleID. This
required us to force full sample shading whenever gl_SampleMaskIn[] was
used. Otherwise, we didn't know what to mask. Instead, this commit
switches us to using an array in CB0 which has a sample mask for each
sample, representing the set of samples in that sample's pass. Masking
by this allows us to get the full range of variability provided by
NVIDIA's multi-pass MSAA hardware. It also allows us to eliminate the
workaround that forced per-sample shading for gl_SampleMaskIn[] because
we can adjust the masks from the API side as needed.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29194>
Dota 2 and Counter-Strike 2 really want to be able to allocate memory
for both VkImages and VkBuffers from the same memory type. Xe2's
special compression-only memory type does not support buffers, which
makes these games crash. Disable CCS on these games as a workaround.
This is a temporary workaround as we're still working towards a
long-term solution (either by fixing the engine or finding a way
better expose our memory types).
Backport-to: 24.2
Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11520
Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11521
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30481>
These memory types are useless when CCS is disabled, don't leave them
there so they don't confuse applications.
Backport-to: 24.2
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30481>
Fixes the following test, as far as you enable Vulkan 1.3 (if not it
is skipped):
dEQP-VK.api.info.vulkan1p3_limits_validation.max_inline_uniform_total_size
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29476>
Xe KMD will now report the different topology mask types based on the
type of the EU of running platform.
With this we don't need to divide the EU count by 2 in intel_perf.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30127>
Before resorting to scheduling instructions based on max_delay only,
postsched would prioritize instructions that have no hard delay (i.e.,
delays for which nops should be inserted) but might still have soft
delays (for which ss/sy needs to be inserted). Removing this has a
slight negative effect on nops but improves sstall/systall. This seems
to improve actual render pass time.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30437>
max_delay is a measure for the maximum delay from a node to the end of a
block. However, the current calculation would include the delay of the
node itself (which is the maximum delay from the node's sources). At the
point a node is scheduled, this source delay might already be much
smaller (or even zero) because its sources have hopefully been scheduled
much earlier. This means the max_delay value would be an overestimate of
the actual delay from the node to the block's end.
This commit fixes this and makes sure max_delay is always the maximum
delay from a node to the end of the block from the point where the node
is scheduled.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30437>
Since v_interp_p2_f32 with constant operands only happens on GFX11.5, this
should actually be fine in all cases where this is currently possible
(GFX11.5+ allows DPP with scalar src1). However, it does fail validation
because we haven't updated that yet.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: bee487df48 ("aco/gfx11.5+: use vinterp for fddx/fddy")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30477>