Saves emitting a MOV at the end of the program to store the output.
softpipe glmark2 -b buffer +9.73451% +/- 3.17924% (n=6)
softpipe glmark2 -b build +5.57621% +/- 1.35074% (n=9)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8023>
Expand the list of bind flags to cache depth and stencil
buffers.
Extensive testing shows the following changes in performance:
Game | % difference in FPS
Plague Inc | 7
Portal 2 | 21
Overcooked 2 | 1.2
Hollow Knight | -1.1
Civilization V | 3.8
Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8560>
Currently, r600/nir doesn't have a proper scheduler or optimizer backend.
To be able to make use of this code path without performance regressions,
enable the sb optimizer for NIR as well.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8563>
Some emulated fp64 piglit tests create code that seems to make register
allocation with sb worse than for the original shader created from NIR,
so fall back to using the unoptimized shader.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8563>
When blitting with the TFU we are doing an exact copy with no
conversions, so we can choose a supported format that is compatible with
the underlying format's texel size.
This allows using the TFU to blit formats that it does not support
directly, like r8ui or r16ui.
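As an illustration only, the selection can be sketched as a lookup keyed on
texel size. This is Python rather than driver code. The table contents and
the names COPY_FORMAT_BY_TEXEL_SIZE and pick_copy_format are hypothetical
and are not taken from the driver.

```python
# Hypothetical table: texel size in bytes -> a format the copy unit is
# assumed to support. The entries are placeholders, not the driver's list.
COPY_FORMAT_BY_TEXEL_SIZE = {
    1: "R8_UNORM",
    2: "R16_FLOAT",
    4: "R32_FLOAT",
    8: "R16G16B16A16_FLOAT",
    16: "R32G32B32A32_FLOAT",
}


def pick_copy_format(texel_size_bytes):
    """Return a stand-in format with the same texel size, or None.

    Because the blit is an exact copy, only the number of bytes per texel
    matters; how the bits are interpreted is irrelevant.
    """
    return COPY_FORMAT_BY_TEXEL_SIZE.get(texel_size_bytes)


# r8ui has 1-byte texels and r16ui has 2-byte texels, so both can borrow a
# supported format of the same size.
r8ui_copy_format = pick_copy_format(1)
r16ui_copy_format = pick_copy_format(2)
```

A texel size with no entry returns None, in which case a caller would have
to fall back to another blit path.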
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8495>
ARB_enhanced_layouts allows specifying overlapping variable locations
for xfb outputs, so we need to explode the arrays here to a full 128
components so that we can do per-component mapping.
Sometimes this fails, though, as in the case where xfb selects just
a single component from a vec but still considers the whole slot. For
those cases we just decrement our array index until we get to the
base, which will be within 3 components.
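A rough sketch of the idea, in Python rather than driver code: the 128
components are modelled as 32 slots of 4 components each (the 4-per-slot
split is an assumption), and the function names and the tuple layout are
hypothetical.

```python
MAX_SLOTS = 32            # assumed: 128 components / 4 components per slot
COMPONENTS_PER_SLOT = 4   # assumed: one vec4 per location


def build_component_map(outputs):
    """Explode outputs into a flat per-component table of 128 entries.

    outputs: iterable of (name, location, first_component, num_components).
    """
    table = [None] * (MAX_SLOTS * COMPONENTS_PER_SLOT)
    for name, location, first, count in outputs:
        base = location * COMPONENTS_PER_SLOT + first
        for i in range(count):
            table[base + i] = name
    return table


def find_output(table, location, component):
    """Look up a component; if the entry is empty, walk back toward the
    slot base, which is at most 3 entries away."""
    slot_base = location * COMPONENTS_PER_SLOT
    idx = slot_base + component
    while idx > slot_base and table[idx] is None:
        idx -= 1
    return table[idx]


# Two variables share location 0; a third occupies one component of
# location 1.
table = build_component_map([("a", 0, 0, 2), ("b", 0, 2, 2), ("c", 1, 0, 1)])
```

Here find_output(table, 0, 3) resolves directly to "b", while
find_output(table, 1, 3) finds an empty entry and walks back to "c" at the
slot base.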
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8515>
Fix defects reported by Coverity Scan.
uninit_member: Non-static class member array is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member arrayIdx is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member baseAddr is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member arrayLen is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member baseSym is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member vecDim is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member eltSize is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member file is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member regOnly is not initialized in this constructor nor in any functions that it calls.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7764>
The min/max indices are valid. Set the bit to true to indicate that.
Fixes glClear (+ clear_with_quads) on nouveau.
Fixes: 72ff53098c (gallium: add pipe_draw_info::index_bounds_valid)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reported-by: Simon Ser <contact@emersion.fr>
Tested-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8546>
Passing NULL for the views parameter should be the same as passing an
array of NULL, according to the documentation. So let's respect that
detail.
This fixes a crash when using GALLIUM_HUD.
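The intended behaviour can be sketched as follows. This is Python rather
than driver code; the function name and parameters are hypothetical and only
model the rule that a missing views array means "unbind every slot in the
range".

```python
def set_sampler_views(bound, start_slot, count, views=None):
    """Bind `count` views starting at `start_slot`.

    Passing views=None is treated exactly like passing [None] * count,
    so the slots are cleared instead of dereferencing a missing array.
    """
    if views is None:
        views = [None] * count
    for i in range(count):
        bound[start_slot + i] = views[i]
    return bound


# Slots 1 and 2 are unbound, slot 0 is left untouched.
slots = set_sampler_views(["view0", "view1", "view2"], 1, 2, None)
```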
Fixes: 8d46e35d16 ("zink: introduce opengl over vulkan")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8564>
It's incorrect because si_get_vs_state returns gs_copy_shader for legacy
GS. It was harmless, but let's use si_get_vs, which is simpler.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8548>
This radically simplifies the code to decrease CPU overhead in si_draw_vbo.
The generic CP DMA copy function is too complicated.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8548>
This is a great candidate for a template. There are a lot of conditions
that are already templated in si_draw_vbo.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8548>
It's probably not needed and we also have draw merging on gfx10,
so we should be able to use total_driver_count in theory.
(I may be wrong, but I don't know if having avg_direct_count really
improves anything)
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8548>