This can be very slow on dGPU.
I tried a different version that would allocate a full row
and then do a single memcpy per row but the performance
was similar so I kept the simple version.
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18484>
in/out/sv are arrays, so &array[i] is a non-null pointer. Presumably
numSysVals/Inputs/Outputs are only incremented when there's data in the
arrays, anyway.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18536>
It doesn't matter what we put in the swizzle for the unused components,
but if we try to stuff out-of-bounds PIPE_SWIZZLE_0/1/NONE values,
we'll crash in GenXML. Fixes failing tests in
dEQP-GLES3.functional.fragment_out.basic.fixed.*
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18525>
There's no native txs instruction... but we can emulate one :-) This is
heavy on shader ALU, but in the production driver, it'll all be hoisted
up to the preamble shader and so it shouldn't matter much. This
keeps the driver itself simple and low overhead, with a completely
obvious generalization to bindless.
Passes dEQP-GLES3.functional.shaders.texture_functions.texturesize.*
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18525>
Texture offsets and shadow comparison values get grouped into a vector
passed by register. Comparison values are provided as-is (fp32). Texture
offsets are packed into nibbles, but we can do this on the CPU, as
nonconstant offsets are forbidden in GLSL at least. They're also
forbidden in Vulkan/SPIR-V without ImageGatherExtended/
shaderImageGatherExtended. I'm happy kicking the NIR lowering can down
the line, this commit is complicated enough already.
Passes dEQP-GLES3.functional.shaders.texture_functions.texture.* and
dEQP-GLES3.functional.shaders.texture_functions.textureoffset.*
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18525>
For non-bindless textures, get the base address of the texture
descriptor array, so we can crawl descriptors in the shader. For
bindless, this isn't needed (since the bindless handle will be the
address itself).
jekstrand suggested the idea of the descriptor crawl. It worked out
pretty well, all considered.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18525>
This is so dumb. Panfrost port of d98b82a103 ("iris/cs: take buffer offsets
into account for CL")
Fixes buffer.sub_buffers_read_write
Fixes: 80b90a0f2b ("panfrost: Implement panfrost_set_global_binding")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Suggested-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18560>
Invoking this script takes checksums from all failed jobs and update
them in $driver-traces.yml files.
```
.gitlab-ci/bin/update_traces_checksum.py --rev $(git rev-parse HEAD)
```
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18329>
Traces with label `no-perf` will be skipped in performance testing.
This commit adds the yq tool, which preprocesses the traces.yml file
before sending it to the piglit.
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18329>
GE_CNTL is the equivalent of IA_MULTI_VGT_PARAM on GFX9 and older.
Calling this function for every draw shouldn't really hurt in practice
because only non-NGG pipelines need this.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18344>
The number of tessellation patches that is computed from the number
of patch control points might change dynamically too.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18344>
The number of patch control points (TCS) and the number of patches
(TCS/TES) is read from user SGPRs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18344>
This introduces two new user SGPRS:
- tcs_offchip_layout: input patch size and number of patches in TCS
- tes_num_patches: number of patches in TES
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18344>