This is a step to emitting VTX and TEX block start instructions in the
sfn assembler instead of relying on the old backend asm code.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37096>
To simplify the scheduling we need the method to be callable for all
instruction types.
v2: remove now useless override in ExportInstr
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37096>
The hardware actually compares a pair of 64-bit values, rather than
comparing a single value against zero like we previously assumed.
This wasn't an issue in most cases before because if the buffer is
zero-initialized the previous code happens to work. If we get a
buffer with garbage in it though we would run into issues.
Fixes: 80eac1337d ("nvk: Always copy conditional rendering value before compare")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13821
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37153>
These buffers get recycled, so we can't rely on their contents.
Unfortunately, conditional rendering was relying on these buffers being
all zero, implicitly, because we didn't fully understand the hardware.
By supporting this debug flag here, we add a way to check for code that
accidentally assumes zero-initialization.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37153>
nvk_cmd_pool_free_gart_mem_list frees this buffer, so we need to clear
the pointers to it in order to avoid a use after free.
Fixes: 07c70c77de ("nvk: add cond render upload buffer.")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37153>
My first attempt to fix it ended up stomping classes to zero because it
happened too eary. Now it happens after we have the whole method parsed
but before we go searching for strings.
Fixes: 8e93a763a3 ("nouveau/push: Handle more recent versions of 6F")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37180>
If the PlatformDisplay is initialized to EGL_DEFAULt_DEVICE (i.e. 0)
acquire a connection to the display to query the thread savety.
v2: rework after getting a better understanding of what is ging on.
v3: check whether XOpenDisplay was successfull (Yonggang Luo)
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13740
Fixes: ecbe35d878 ("egl,glx: allow OpenGL with old libx11, but disable glthread if it's unsafe")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37006>
The previous implementation was using temporal clamping variable within geometry shader.
It caused the framebuffer_layer_id input to be ignored, so gl_Layer would end up with a value of 0.
The fix removes the use of the temp variable.
Tested using CTS 4.6.6.0 test cases:
./glcts --deqp-case=dEQP-GL45-ES31.functional.geometry_shading.layered.fragment_layer_cubemap
./glcts --deqp-case=dEQP-GL45-ES31.functional.geometry_shading.layered.fragment_layer_3d
./glcts --deqp-case=dEQP-GL45-ES31.functional.geometry_shading.layered.fragment_layer_2d_array
./glcts --deqp-case=dEQP-GL45-ES31.functional.geometry_shading.layered.fragment_layer_2d_multisample_array
These tests fail before and pass after the change.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37069>
Partial results should be computed for all types of queries.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36916>
ir3_ra allocates registers in a round-robin fashion to avoid false
dependencies. In order to do this, it keeps track of a "file start"
register for each register file and will search starting from there for
available registers.
This file start is initialized at the beginning of RA of kept across
blocks, including across the preamble. This means that a change that
only affects the preamble may cause changes in how registers are
allocated in the main shader. This may result in more or less copies,
and more or less false dependencies which changes the behavior of
postsched.
Changes in the preamble affecting the main shader makes it more
difficult to analyze shader-db results, as I often find myself chasing
down a regression that is just caused by RA/postsched "bad luck" in a
main shader that didn't actually change. Prevent this by resetting the
file start at the beginning of the main shader.
Totals:
Instrs: 364710030 -> 364631384 (-0.02%); split: -0.19%, +0.17%
CodeSize: 926766046 -> 926671488 (-0.01%); split: -0.10%, +0.09%
NOPs: 47703035 -> 47653319 (-0.10%); split: -1.05%, +0.94%
MOVs: 17072354 -> 17075112 (+0.02%); split: -1.28%, +1.29%
COVs: 4098062 -> 4096784 (-0.03%); split: -0.04%, +0.01%
Full: 15164359 -> 15112404 (-0.34%); split: -0.34%, +0.00%
(ss): 7818796 -> 7819147 (+0.00%); split: -1.10%, +1.11%
(sy): 3985674 -> 3983435 (-0.06%); split: -0.72%, +0.67%
(ss)-stall: 26535279 -> 26525929 (-0.04%); split: -1.36%, +1.32%
(sy)-stall: 111983489 -> 111716382 (-0.24%); split: -1.26%, +1.02%
Last helper: 116734916 -> 116595531 (-0.12%); split: -0.62%, +0.50%
Cat0: 53338794 -> 53289450 (-0.09%); split: -0.94%, +0.85%
Cat1: 22352349 -> 22328303 (-0.11%); split: -1.28%, +1.17%
Cat2: 155348173 -> 155348012 (-0.00%); split: -0.00%, +0.00%
Cat7: 9314194 -> 9309099 (-0.05%); split: -0.88%, +0.82%
Totals from 224302 (16.59% of 1352016) affected shaders:
Instrs: 148838101 -> 148759455 (-0.05%); split: -0.47%, +0.42%
CodeSize: 404838970 -> 404744412 (-0.02%); split: -0.22%, +0.20%
NOPs: 26261983 -> 26212267 (-0.19%); split: -1.90%, +1.71%
MOVs: 8372715 -> 8375473 (+0.03%); split: -2.60%, +2.63%
COVs: 2061488 -> 2060210 (-0.06%); split: -0.09%, +0.02%
Full: 3420300 -> 3368345 (-1.52%); split: -1.52%, +0.00%
(ss): 3848423 -> 3848774 (+0.01%); split: -2.24%, +2.25%
(sy): 2021040 -> 2018801 (-0.11%); split: -1.43%, +1.32%
(ss)-stall: 13554064 -> 13544714 (-0.07%); split: -2.65%, +2.59%
(sy)-stall: 59778475 -> 59511368 (-0.45%); split: -2.36%, +1.91%
Last helper: 52847662 -> 52708277 (-0.26%); split: -1.38%, +1.12%
Cat0: 29270336 -> 29220992 (-0.17%); split: -1.72%, +1.55%
Cat1: 10820261 -> 10796215 (-0.22%); split: -2.63%, +2.41%
Cat2: 57289060 -> 57288899 (-0.00%); split: -0.00%, +0.00%
Cat7: 5686726 -> 5681631 (-0.09%); split: -1.43%, +1.34%
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37003>
Something like this already exists in a few drivers, move it to common
code. This specific version was pulled from honeykrisp, which is the
only one that handles META_RECT_LIST_MESA.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37038>
Per spec: If the maintenance6 feature is enabled, this command must
attempt to perform all of the memory binding operations described by
pBindInfos, and must not early exit on the first failure.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37099>
The vulkan spec says that we should ignore memoryOffset when
VkBindImageMemorySwapchainInfoKHR is present. wsi common assumes that we
bind the wsi image at offset 0, so set the offset to 0. This change
aligns with common wsi, and also obeys dedicated alloc requirement.
Fixes: f887116c49 ("turnip: adopt wsi_common_get_memory")
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37099>
Fixes: 5f757bb95c ("nir: Make the load_store_vectorizer provide align_mul + align_offset.")
This is found when I am trying to narrow bit_size and num_components to uint8_t
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37042>
Implement HSD 16028171704/14025112257:
LSC state cache livelock:- Once state cache entries are full,
subsequent walker dispatches with two threads per thread group maybe
gets stuck infinitely because of state cache live lock.
One thread continuously stuck in loop doing UGM fence + evict and UGM
read is waiting on UGM read to have certain value. while other thread
supposed to update the value that first thread is waiting for. But
since entries are full in state cache, there is second thread never
make progress.
Closes: #12352
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37128>
Implement HSD 16028171704/14025112257:
LSC state cache livelock:- Once state cache entries are full,
subsequent walker dispatches with two threads per thread group maybe
gets stuck infinitely because of state cache live lock.
One thread continuously stuck in loop doing UGM fence + evict and UGM
read is waiting on UGM read to have certain value. while other thread
supposed to update the value that first thread is waiting for. But
since entries are full in state cache, there is second thread never
make progress.
Closes: #12352
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37128>
Implement HSD 16028171704/14025112257:
LSC state cache livelock:- Once state cache entries are full,
subsequent walker dispatches with two threads per thread group maybe
gets stuck infinitely because of state cache live lock.
One thread continuously stuck in loop doing UGM fence + evict and UGM
read is waiting on UGM read to have certain value. while other thread
supposed to update the value that first thread is waiting for. But
since entries are full in state cache, there is second thread never
make progress.
Closes: #12352
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37128>
When spilling registers on Valhall we are careful to leave the TLS
pointer aligned on 16 byte boundaries (so as to avoid accesses
crossing those boundaries). However, within the spill code we don't
need to have 16 byte alignment for spills of 32 or 64 bit values.
In the common case where most spills are 32 bits, we can save nearly
75% of the memory used by just aligning to 32 bit boundaries.
Reviewed-by: Aksel Hjerpbakk <aksel.hjerpbakk@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36676>
The hardware support for tg4 is unclear from RE and feature databases.
Enable this extension on halti5 GPUs as a conservative starting point.
Support for 128 bit formats is missing, so it's gated behind ETNA_MESA_DEBUG=deqp.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36622>
Implement offset lowering by converting pixel offsets to normalized
coordinate space and adjusting coordinates accordingly.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36622>
This instruction is used to implement textureGather.
Blob generates such tex_gather's for dEQP-GLES31.functional.texture.gather.*
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36622>