This parallelizes the min/max index search and avoids running that logic
per layer.
This should speed up indexed draws a bit.
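
As a rough illustration of the idea (not the actual CL helper shader), the
search can be split into independent lanes whose partial results are merged
at the end. The names `IndexRange` and `min_max_index` and the strided split
are hypothetical, and the lanes run one after another here rather than
concurrently as they would on the GPU:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

struct IndexRange {
    std::uint32_t min;
    std::uint32_t max;
};

// Hypothetical helper: each of `lanes` workers scans a strided slice of the
// index buffer, then the per-lane results are merged into a single range.
IndexRange min_max_index(const std::vector<std::uint32_t> &indices,
                         std::size_t lanes)
{
    const IndexRange empty = {std::numeric_limits<std::uint32_t>::max(), 0};
    std::vector<IndexRange> partial(lanes, empty);

    // On a GPU every lane would run concurrently; here they run in turn.
    for (std::size_t lane = 0; lane < lanes; lane++) {
        for (std::size_t i = lane; i < indices.size(); i += lanes) {
            partial[lane].min = std::min(partial[lane].min, indices[i]);
            partial[lane].max = std::max(partial[lane].max, indices[i]);
        }
    }

    // Final reduction over the per-lane results.
    IndexRange result = empty;
    for (const IndexRange &p : partial) {
        result.min = std::min(result.min, p.min);
        result.max = std::max(result.max, p.max);
    }
    return result;
}
```

Because the range depends only on the index buffer, it can be computed once
and reused for every layer.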
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35724>
Now that we know that indirect draw works, we can switch to the new
indexed draw codepath.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35724>
On JM hardware, we need to allocate a buffer whose size depends on the
vertex count.
As a result, for indirect and indexed draws we allocate a large buffer with
alloc-on-fault set.
The size of that buffer is calculated assuming a maximum of 2 million
vertices and 18 attributes per vertex (16 user attributes, 2 specials).
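
For a sense of scale, the worst-case size can be worked out as below. This is
only a sketch: the message does not state the per-attribute size, so the 16
bytes used here (four 32-bit components) is an assumption, as is reading
"2 million" as a decimal 2000000. All names are hypothetical.

```cpp
#include <cstdint>

// Limits quoted in the commit message.
constexpr std::uint64_t kMaxVertices = 2000000; // assumption: decimal million
constexpr std::uint64_t kUserAttribs = 16;
constexpr std::uint64_t kSpecialAttribs = 2;

// Assumption for illustration only: one attribute is four 32-bit components.
constexpr std::uint64_t kBytesPerAttrib = 4 * sizeof(std::uint32_t);

// Worst-case buffer size in bytes for the given limits.
constexpr std::uint64_t worst_case_buffer_size(std::uint64_t vertices,
                                               std::uint64_t attribs,
                                               std::uint64_t bytes_per_attrib)
{
    return vertices * attribs * bytes_per_attrib;
}

// 2000000 * 18 * 16 = 576000000 bytes under the assumptions above.
constexpr std::uint64_t kWorstCaseSize = worst_case_buffer_size(
    kMaxVertices, kUserAttribs + kSpecialAttribs, kBytesPerAttrib);
```

Under these assumptions the reservation is large, which is why alloc-on-fault
matters: pages are expected to be backed only once they are actually touched.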
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35724>
Most of the cmd_draw logic can be shared, so let's move what we can.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35724>
We are going to integrate the helper CL shader.
This shader requires some extra information that we need to provide; this
patch adds the logic to allocate and fill it.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35724>
For indirect draws, we need to depend on some previous job in the chain,
so we use the grid info to pass this.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35724>
Useful when using C11 atomics with CL C.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35724>
Tracking for those jobs was missing, and they were not reset when the batch
was reissued.
Fixes: d1934e44fc ("panvk: Implement occlusion queries for JM")
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35724>
We could end up in situations where the active count wouldn't match the
varying_count, causing memory corruption.
Fix it by not relying on the active count anymore.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 05020699b9 ("panvk: Move the linking bits to panvk_shader")
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35724>
This catches the case where an app resolves both color and depth buffers.
Previously, the inlining would only catch the first color buffer; the depth
resolve that followed would then cause the whole of rp tracking to desync
and explode, as seen in Transport Fever 2.
Fixes: 8933b3ed39 ("tc: add resolve resource to rp info")
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36521>
Even with a 256x256 map, it allocates over 1 GiB of texture memory.
Also, it was already individually disabled in most of the tests, as it is
either too slow or results in an OOM.
Signed-off-by: Ritesh Raj Sarraf <ritesh.sarraf@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36493>
Signed-off-by: Ritesh Raj Sarraf <ritesh.sarraf@collabora.com>
ci: Downgrade to Linux 6.14 for venus-lavapipe jobs
In Linux 6.16 (and possibly 6.15 as well), the virtio gfx device
initialization seems to have regressed, resulting in device initialization
failure.
```
deqp-runner 0.20.3
+ deqp-runner suite --suite /builds/RickXy/mesa/install/deqp-venus.toml --output /builds/RickXy/mesa/results --skips /builds/RickXy/mesa/install/all-skips.txt /builds/RickXy/mesa/install/venus-skips.txt --flakes /builds/RickXy/mesa/install/venus-flakes.txt --testlog-to-xml /deqp-tools/testlog-to-xml --fraction-start 1 --fraction 60 --jobs 16 --baseline /builds/RickXy/mesa/install/venus-fails.txt
Error: Failed to invoke dEQP for dEQP-VK.info.device:
stdout:
Writing test log into /builds/RickXy/mesa/results/dEQP-VK.info.device
dEQP Core 3299a07b86cf0b15f86d1a441e323e515b15f255 (0x3299a07b) starting..
target implementation = 'Default'
stderr:
MESA-VIRTIO: debug: one of required kernel params (4 or 9) is missing
FATAL ERROR: vk.enumeratePhysicalDevices(instance, &numDevices, nullptr): VK_ERROR_INITIALIZATION_FAILED at vkQueryUtil.cpp:83
```
Signed-off-by: Ritesh Raj Sarraf <ritesh.sarraf@collabora.com>
ci: Drop the test from the fail list
It is reported to pass with Linux 6.16
```
Unexpected results:
07:33:07.167: KHR-GL46.sparse_texture2_tests.UncommittedRegionsAccess_texture_cube_map_r32i,Crash
07:33:07.167: spec@!opengl 1.1@streaming-texture-leak,UnexpectedImprovement(Pass)
```
Signed-off-by: Ritesh Raj Sarraf <ritesh.sarraf@collabora.com>
ci: Update zink-anv-adl flakes list
Signed-off-by: Ritesh Raj Sarraf <ritesh.sarraf@collabora.com>
ci: Add flake to zink-anv-adl skip list
Signed-off-by: Ritesh Raj Sarraf <ritesh.sarraf@collabora.com>
ci: Add api@clgetdeviceinfo to Intel fails list
This API call is failing for Intel as well, like many of the other
types.
Signed-off-by: Ritesh Raj Sarraf <ritesh.sarraf@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36493>
SPIR-V->NIR now inserts this barrier itself.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36513>
Backends probably already deal with this, but these would be needed to
prevent NIR passes from moving accesses outside the critical section.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36513>
GitLab evaluates the variables before the job starts, whereas the intent
here seems to have been to append the path at run time. That cannot work
when done here, but luckily it was not necessary either, so let's drop the
invalid literal `$PYTHONPATH` from the `PYTHONPATH` list.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36078>
Compiling my shader-db with the gallium noop driver produces timing results
that are too noisy to draw a conclusion about the improvement. Theoretical
stat-based results are below, which don't always reflect real results.
When compiling Heaven shaders with the gallium noop driver,
213438 calloc calls are removed.
213438 / ralloc count = 9.6%, so it's roughly the equivalent of 9.6% of
the cost of all ralloc calls that's removed. The shift from calloc to
linear_alloc increases ralloc calls by 0.3%, so the approximate reduction
is 9.6% -> 0.3% overhead change.
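
The arithmetic behind that estimate can be spelled out as follows. This is a
sketch with hypothetical helper names. The total ralloc call count is not
given in this message, so the value it implies is only as accurate as the
rounded 9.6% figure.

```cpp
// Net change in allocation overhead, in percentage points:
// what was removed minus what the shift to linear_alloc added back.
constexpr double net_reduction_pct(double removed_pct, double added_pct)
{
    return removed_pct - added_pct;
}

// Total call count implied by "removed_calls / total = removed_pct".
constexpr double implied_total_calls(double removed_calls, double removed_pct)
{
    return removed_calls / (removed_pct / 100.0);
}
```

With the figures above, `net_reduction_pct(9.6, 0.3)` gives about 9.3
percentage points, and `implied_total_calls(213438, 9.6)` suggests roughly
2.2 million ralloc calls in total.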
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36539>
Compiling my shader-db with the gallium noop driver is 6.8% faster now.
Theoretical stat-based results are below, which don't always reflect real
results.
When compiling Heaven shaders with the gallium noop driver,
134610 calloc calls are removed.
134610 / ralloc count = 6%, so it's roughly the equivalent of 6% of
the cost of all ralloc calls that's removed. The shift from calloc to
linear_alloc increases ralloc calls by 0.4%, so the approximate reduction
is 6% -> 0.4% overhead change.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36539>
Compiling my shader-db with the gallium noop driver is 3.6% faster now.
malloc calls from ralloc+linear_alloc are reduced by 34% when compiling
Heaven shaders with the gallium noop driver. That's due to a shift of
malloc calls from ralloc to linear_alloc.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36539>
The type of the "new operator" parameter determines whether ir_instruction
is allocated with linear_ctx or ralloc. The ralloc operators will be
removed in the next commit.
GCC expects classes with virtual functions to have a virtual destructor,
but linear_ctx has static assertions that expect no destructor to be
present. Remove the assertions, as that's our only option. The destructor
is empty, including in all derived classes, so it doesn't have to execute.
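
A minimal sketch of the overload mechanism, using hypothetical stand-ins
(`arena_ctx`, `heap_ctx`, `instruction`) rather than the real linear_ctx,
ralloc, or ir_instruction definitions. The type of the extra argument passed
to `new` picks the allocator, and the empty virtual destructor is never run
because each context releases its memory in bulk:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical stand-in for a linear (bump) allocator context.
struct arena_ctx {
    std::vector<unsigned char> storage;
    std::size_t used = 0;
    std::size_t allocations = 0;

    explicit arena_ctx(std::size_t capacity) : storage(capacity) {}

    void *alloc(std::size_t size) {
        // Round the offset up so every object stays suitably aligned.
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t start = (used + align - 1) / align * align;
        if (start + size > storage.size())
            throw std::bad_alloc();
        used = start + size;
        allocations++;
        return storage.data() + start;
    }
};

// Hypothetical stand-in for a general-purpose context that owns its blocks.
struct heap_ctx {
    std::vector<void *> blocks;

    ~heap_ctx() {
        for (void *p : blocks)
            std::free(p);
    }

    void *alloc(std::size_t size) {
        void *p = std::malloc(size);
        if (!p)
            throw std::bad_alloc();
        blocks.push_back(p);
        return p;
    }
};

class instruction {
public:
    // The type of the extra argument selects the allocator.
    static void *operator new(std::size_t size, arena_ctx *ctx) {
        return ctx->alloc(size);
    }
    static void *operator new(std::size_t size, heap_ctx *ctx) {
        return ctx->alloc(size);
    }
    // A usual operator delete must exist for the virtual destructor to
    // compile; it does nothing because the contexts own the memory.
    static void operator delete(void *) {}

    virtual ~instruction() {}
    virtual int opcode() const { return 0; }
};
```

Usage would look like `new (&arena) instruction()` or
`new (&heap) instruction()`; a plain `new instruction()` does not compile,
since the class-level overloads hide the global operator.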
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36539>
Also mark those functions as pub while at it, because they are meant to be
exported anyway. The linker script already handles this correctly, but it
is better to do it correctly on the Rust side as well.
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33688>