Here we switch to using a sorted arrays for the binary search which
significantly speeds up the lookups. For a shader with ~8000
uniforms its up to 10x faster and the godot-tps-gles3-high.trace
in issue #13894 returns to its original runtime length before we
switched to using range remap in e052254066
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37754>
This creates an array of the linked list elements to be used for
faster binary searches in the following patch, and frees the old
linked list.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37754>
This will allow us to use the linked list for validation of inserts
during linking then switch to using an array for fast binary searches.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37754>
If the ptr value is NULL and we match an existing entry during an
insert call we now just return the existing value. This allows us
to drop an extra lookup, this will become important as the
following patches change `util_range_remap()` lookups to use a
sorted array that is not created until after we have added all
entries to our linked list.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37754>
This works around a long-standing synchronization issue consequence of
the HALT instruction used to implement FS discard not being considered
a control flow instruction by the back-end -- The fact that it doesn't
cause the CFG pass to introduce an edge in the graph means that the
software scoreboard pass is completely blind to the effect of discard
jumps on control flow, so it doesn't introduce the required
annotations to avoid data hazards when the discard path of the CFG is
taken. Note that because of the very limited set of instructions that
can follow the HALT target in a fragment shader this was very unlikely
to lead to issues in practice, but starting on xe3 it appears to have
become far more likely due to the use of SENDG, since SENDG requires
the scalar register to be set prior to the submission of the render
target write payloads, which can easily lead to a WaR hazard if there
was another SENDG before the HALT jump that wasn't done reading out
its payload from the GRF.
In an ideal world this would be avoided by having HALT be a normal
control flow instruction represented as an edge in the control flow
graph -- But unfortunately that would prevent the optimizations we
currently do that take advantage of the ability of reordering code
past the HALT instruction, so it would have a pretty large performance
cost. Instead this simply adds a SYNC.ALLWR instruction after the
HALT target to guarantee that all pending SEND messages have finished
execution -- That may also seem costly, however its cost in practice
appears to be minimal since at the point of the program when the
target HALT is executed there is almost nothing left to do other than
send out the render target write payloads, so any pending operations
had to be waited on at roughly this point of the program regardless.
There appear to be no statistically significant regressions in Traci
on neither BMG nor PTL. Fixes hangs observed on Dying Light 2 and
Cyberpunk on PTL.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13896
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13965
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14092
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37674>
Instead of a bunch of switches which have to match, this introduces a
table which we can use to map bidirectionally from GOBType to
(GOBKindVersion, SectorLayout).
Backport-to: 25.2
Reviewed-by: James Jones <jajones@nvidia.com>
Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37824>
Recursion in Rust can be annoyingly expensive. The recursive DFS
implementation we have now is probably not terrible but these things
bloat in debug builds more than you'd expect. This replaceas it with a
non-recursive implementation which uses a Vec instead.
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Lorenzo Rossi <git@rossilorenzo.dev>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37536>
There are four depth first searches in cfg.rs. This adds a DFS
abstraction which we can later optimize.
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Lorenzo Rossi <git@rossilorenzo.dev>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37536>
The proprietary driver seems to always do this before a graphics
semaphore release. This is presumably weaker than a WFI because it's
never emitted for pipeline barriers.
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37857>
tc_reserve_set_vertex_elements_and_buffers_call slots data are only valid
after the call to tc_set_vertex_elements_for_call.
If a batch flush occurs between these 2 calls, random memory will be read
leading to crashes.
The only user of tc_reserve_set_vertex_elements_and_buffers_call being
st_update_array_templ, we can determine that only 2 tc_buffer_unmap calls
can be inserted, so we reserve slots for them.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37763>
Call it before tc_add_set_vertex_elements_and_buffers_call to
make sure that its use from st_setup_current -> st_add_releasebuf
won't insert calls into the batch.
Fixes: 1638d486 ("gallium/u_threaded,st/mesa: add a merged set_vertex_elements_and_buffers call")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37763>
_mesa_glthread_release_upload_buffer will call _mesa_reference_buffer_object
which will then end up in tc_resource_release.
Doing so from glthread is bad because we now have 2 threads concurrently
accessing TC batch: glthread and the unmarshalling/tc thread.
Fix this by queuing the release call through an "Internal" glthread function.
Fixes: b3133e250e ("gallium: add pipe_context::resource_release to eliminate buffer refcounting")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37763>
In release builds, the assertion checking whether spilling failed isn't
evaluated, which can result in shader compilation continuing despite this.
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37872>
New Vulkan CTS 1.4.4 started requiring glx.pc pkg-config file. Provide
one if GLVND is not used in order to let VK CTS and other programs find
Mesa GLX implementation.
Cc: mesa-stable
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37834>
Some XML files are missing when trying to compile within Android
build system (using ninja-to-soong to convert meson to blueprint).
Sorted list extracted with:
`find src/mesa/glapi/glapi/gen -name *.xml | sed
"s|src/mesa/glapi/\(.*\)| \'\1\',|" | sort`
Ref #14072
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37864>
Monado calls vkGetRandROutputDisplayEXT, which calls wsi_display_get_output, which then calls
wsi_display_alloc_connector with fd = wsi->fd, which at this point is still -1 as the display is not leased yet.
Signed-off-by: Torge Matthies <openglfreak@googlemail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36325>
By dynamic setting num_channel we can share more code in
lower_lsc_varying_pull_constant_logical_send().
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37853>
WSL version 1 has a known limitation that the kernel does not support
the memfd_create() syscall, and it always returns -1 (see
https://github.com/microsoft/WSL/issues/3542). This results in
lavapipe/llvmpipe failing to create the anonymous file it needs for
allocations from the pipe_screen object, which in turn results in
results in failed application calls to Vulkan APIs (in the case I
observed, vkMapMemory returned an invalid (0xffffffff) pointer along
with VK_SUCCESS, leading to the application segfaulting).
This issue can be reproduced by simply running `vkcube` (or, presumably
any other simple Vulkan application) inside WSL 1, e.g.:
xvfb-run -s '-screen 0 1024x768x24' vkcube --c 300
This patch addresses the issue with several changes:
1. llvmpipe_create_screen() now checks for errors from the
os_create_anonymous_file function and errors out early, making it
easier to track down issues similar to this. Previously, the invalid
fd of -1 would be stored in the pipe_screen struct and the problems
would only appear later.
2. os_create_anonymous_file() now attempts to handle the case where
memfd_create() fails, by falling back to creating a file in a
temporary directory. This fallback is the same as is already done for
builds that don't have memfd_create available at build time - note
that the difference here is that this is a _runtime_ fallback, as is
needed for the case of WSL 1.
3. The fallback logic previously relied on the XDG_RUNTIME_DIR
environment variable, which might not be set when running inside WSL
because there is unlikely to be a desktop environment configured.
This patch adds a fallback of creating a new directory inside /tmp in
this case.
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37679>
VK_KHR_device_group has some interactions with other extensions that requires
some additional bits and pieces to be supported. One such interaction is with
VK_KHR_swapchain.
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37726>