Avoids later moves with extractions from the vector.
Reduces VALU operation in the raytrace loop by ~6%, increasing
the RT performance in Q2RTX on a 6800 XT by about ~1.3%.
Suggested by Georg.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19148>
Which avoids meson needing to wrap the generator to capture the output,
and makes it faster
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19058>
v2:
- Give variable more descriptive name to avoid shadowing
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
and avoid shadowing
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19058>
This avoids meson creating a wrapper to redirect stdout, and makes the
generator faster
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19058>
problem:
When playing back some clips with loop restoration parameters
enabled, the display image could be corrupted.
solution:
correct loop restoration unit size logic in vaapi interface.
CC: 22.2 <mesa-stable>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19146>
There are three tiers of drivers:
* Drivers that support MRT and support mixed colorbuffer formats. All modern
hardware fits this as it becomes a spec requirement.
* Drivers that do not support MRT. Then this CAP is a no-op, so we might as well
set it by default even here (this commit trivially enables the CAP for lima,
vc4, etanviv).
* Drivers that support MRT but do not support mixed colorbuffer formats! Very
little hardware fits this category as it doesn't suffice for MRT in most APIs.
Unfortunately we have a few drivers that are in this category, preventing us
from bulldozing the CAP altogether.
Given that the CAP only exists for a few legacy drivers, default it to being
enabled to avoid new drivers falling into the trap of forgetting to enable it.
Failing to set this CAP causes failures in
dEQP-GLES3.functional.fbo.completeness.*
Drivers which still do not set this CAP: nv30, r300 (older than r500), virgl in
some cases. r300/r400 is due to a hardware requirement as Emma points out.
v2: Advertise the cap on lima like the commit message claims (delete the special
case).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19079>
Fix defect reported by Coverity Scan.
Evaluation order violation (EVALUATION_ORDER)
write_write_typo: In
zm = zm = create_shader_module_for_stage_optimal(ctx, screen, prog->shaders[i], prog, i, state),
zm is written twice with the same value.
Fixes: 325c703624 ("zink: add 'optimal_keys' handling for shader keys")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19086>
When copying the secondary command buffer's deferred control stream
to the main stream we have to first allocate space in the main
stream. In case the allocation failed we were attempting to
memcpy() to a NULL destination causing a NULL dereference.
This fixes CID 1515977.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19136>
We don't actually really use this on a6xx, since SQE can early-exit IB2
when there are no more remaining primitives in the bin.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19100>
RBBM_PRIMCTR_7 is pre-clipped, whereas RBBM_PRIMCTR_8 is after clipping.
I believe we want pre-clipping, and this is what tu does.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19100>
It is possible to expose ARB_gpu_shader_fp64 without supporting
ARB_vertex_attrib_64bit. The supported GLSL version should tell
us if 64b vertex attribs are also supported or not.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19100>
This results in faster lookups because memcmp and the hash computation
can be unrolled.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19129>
One function is moved to the header file, the other two functions are inlined
manually.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19129>
I think this can't happen in practice, but better safe than sorry.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19129>
Since the key size wasn't a constant expression, all the function inlining
didn't do much. This makes it a constant expression.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Fixes: c4e18cd4dd - mesa/st: add PIPE_QUIRK_TEXTURE_BORDER_COLOR_SWIZZLE_FREEDRENO
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19129>
This is a small optimization for viewperf. See the comment in the code.
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18199>
It's always called for the first and second draw in a row, and if they
can't be merged, the second draw becomes the first draw in the next call,
and simplify_draw_info is called again on that. Thus, it's called twice
on any draw that isn't merged. To prevent that, call it when we add
the draws.
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18199>
This looks simpler and hopefully faster, but it may just move the overhead
to the app thread.
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18199>
Since we had the freedom to write our own marshal and unmarshal Draw
functions, we turned it into a mess by doing whatever we want in those
functions. For example, DrawElementsInstancedBaseVertex actually
implemented DrawArraysInstancedBaseInstanceBaseInstance.
Add 4 internal GL functions that pass user buffers through marshal
/unmarshal functions, and clean up the other names to match their
behavior.
The new functions don't need any parameters in their definitions because
we don't call them directly.
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18199>