This is mainly to get perfetto's commit 3e7228376 ("tracing: Clean up
platform TLS state on shutdown").
Acked-by: Rob Clark <robdclark@chromium.org>
Acked-by: Sami Kyöstilä <skyostil@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18502>
zink has a push descriptor template layout that has every possible stage,
which gets used regardless of what stages are in the pipeline. By
skipping over the unused stages, we cut the CP overhead.
Improves TU_DEBUG=sysmem gfxbench gl_driver2 on zink by 6.57% +/-
0.331143% (n=5).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18562>
This was sort of well intentioned, but wrong. bits_per_rgb_value is the
number of significant bits in the color (channel) specification, not the
number of bits used to name that color within the pixel. If you have a
depth 24 visual but the colormap is 11 bits deep then each of those
channels selects one of 256 11-bit color values in the output ramp.
The open source drivers mostly don't expose anything like that, but
nvidia does, and we refuse to work. That's silly. Practically speaking
we can probably render to any TrueColor or DirectColor visual that your
X server exposes, since it is probably not going to have visuals for
non-color-renderable formats. Just check the visual class instead.
Likewise when matching formats to visuals, count the bits in the rgb
masks in the visual.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6995
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18381>
Optimizes patterns which are created by recent versions of vkd3d-proton,
when constant folding doesn't eliminate it entirely:
- ubitfield_extract(value, offset, umin(bits, 32-(offset&0x1f)))
- ibitfield_extract(value, offset, umin(bits, 32-(offset&0x1f)))
- bitfield_insert(base, insert, offset, umin(bits, 32-(offset&0x1f)))
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13225>
st_TexSubImage has this "default to fallback for depth-stencil" since
2013. I think it's time to remove this limitation - hopefully all
drivers will be happy with the change to avoid adding yet another CAP.
This helps CS:GO startup a lot, because the fallback path is very very
slow.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18484>
The spec says:
If the base internal format is DEPTH_STENCIL and format
is not DEPTH_STENCIL, then the values of the stencil
index texture components are undefined.
Which can be translated as: we don't need to bother preserving
the original stencil values.
Suggested by Emma Anholt.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18484>
This can be very slow on dGPU.
I tried a different version that would allocate a full row
and then do a single memcpy per row but the performance
was similar so I kept the simple version.
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18484>
in/out/sv are arrays, so &array[i] is a non-null pointer. Presumably
numSysVals/Inputs/Outputs are only incremented when there's data in the
arrays, anyway.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18536>
It doesn't matter what we put in the swizzle for the unused components,
but if we try to stuff out-of-bounds PIPE_SWIZZLE_0/1/NONE values,
we'll crash in GenXML. Fixes failing tests in
dEQP-GLES3.functional.fragment_out.basic.fixed.*
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18525>
There's no native txs instruction... but we can emulate one :-) This is
heavy on shader ALU, but in the production driver, it'll all be hoisted
up to the preamble shader and so it shouldn't matter much. This
keeps the driver itself simple and low overhead, with a completely
obvious generalization to bindless.
Passes dEQP-GLES3.functional.shaders.texture_functions.texturesize.*
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18525>
Texture offsets and shadow comparison values get grouped into a vector
passed by register. Comparison values are provided as-is (fp32). Texture
offsets are packed into nibbles, but we can do this on the CPU, as
nonconstant offsets are forbidden in GLSL at least. They're also
forbidden in Vulkan/SPIR-V without ImageGatherExtended/
shaderImageGatherExtended. I'm happy kicking the NIR lowering can down
the line, this commit is complicated enough already.
Passes dEQP-GLES3.functional.shaders.texture_functions.texture.* and
dEQP-GLES3.functional.shaders.texture_functions.textureoffset.*
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18525>
For non-bindless textures, get the base address of the texture
descriptor array, so we can crawl descriptors in the shader. For
bindless, this isn't needed (since the bindless handle will be the
address itself).
jekstrand suggested the idea of the descriptor crawl. It worked out
pretty well, all considered.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18525>
This is so dumb. Panfrost port of d98b82a103 ("iris/cs: take buffer offsets
into account for CL")
Fixes buffer.sub_buffers_read_write
Fixes: 80b90a0f2b ("panfrost: Implement panfrost_set_global_binding")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Suggested-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18560>
Invoking this script takes checksums from all failed jobs and update
them in $driver-traces.yml files.
```
.gitlab-ci/bin/update_traces_checksum.py --rev $(git rev-parse HEAD)
```
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18329>
Traces with label `no-perf` will be skipped in performance testing.
This commit adds the yq tool, which preprocesses the traces.yml file
before sending it to the piglit.
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18329>