Remove a legacy workaround where presence of modifiers in framebuffer
state results in `needs_present` to be set without a good reason.
This prevents hitting an assertion for framebuffers that use DRM
modifiers, e.g. via GBM BO alloc -> EGLImage import -> GL FBO bind.
Co-authored-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Heinrich Fink <hfink@snap.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29715>
We were not restoring an outer loop as the current loop after we had
finished processing a nested loop.
Fixes: 9995f336e6 ("nir: add merge loop terminators optimisation")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29686>
Makes it possible to use Virtual Kernel Mode-Setting (VKMS) in combination
with a render-only GPU. Is quite helpful to test the GPU when no kms driver
is ready.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29042>
When llvmpipe adds on a layer it uses mip_offset[0] for it, so it
should still be respected even for multisample.
Fixes KHR-GL45.texture_view.view_sampling
Fixes: 839045bcc8 ("gallivm/lp: merge sample info into normal info")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29685>
This change updates the affected calls to the proper function
which is radeon_set_config_reg().
For instance, this issue is triggered with
"piglit/bin/textureSize tes isampler2DMSArray -auto -fbo":
vertex-program-two-side: ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:4981: void si_emit_spi_ge_ring_state(si_context*, unsigned int): Assertion `(0x008988) >= CIK_UCONFIG_REG_OFFSET && (0x008988) < CIK_UCONFIG_REG_END' failed.
Fixes: bd71d62b8f ("radeonsi: program tessellation rings right before draws")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29645>
nir_lower_frag_coord_to_pixel_coord was adding .5 to work around that the
drivers were mistakenly setting PIXEL_COORD_HALF_INTEGER. With the
setting corrected, the GL frontend handles it appropriately (instead of
subtracting half in the frontend for ARB_fragment_coord_conventions
integer setting and then adding the half back here), and makes the pass
reusable from Intel.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29585>
This is required by the spec and fixes
VUID-VkPipelineMultisampleStateCreateInfo-minSampleShading-00786
when running
spec@arb_stencil_texturing@glblitframebuffer corrupts state@gl_texture_2d_multisample_array
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29625>
Vcn4 tile number calculation doesn't consider
some input case, which could result in output
bitstream corruption, this fixed the issue.
Not using the number_of_tiles directly but from
calculation. Also change to re-use some macros
from local vcn_5_0.c to ac_vcn_enc.h header file.
Updated vcn4 spec_misc_av1 ip package.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29556>
Backend only recognizes the MUL a*0.5 pattern if there is an immediate
as one of the sources, however by then the source would we already
coverted to RC_FILE_NONE with constant half swizzles. So teach
peephole_omod to recognize this pattern as well.
RV530:
total instructions in shared programs: 128860 -> 128750 (-0.09%)
instructions in affected programs: 11942 -> 11832 (-0.92%)
helped: 106
HURT: 17
total presub in shared programs: 8739 -> 8736 (-0.03%)
presub in affected programs: 32 -> 29 (-9.38%)
helped: 3
HURT: 0
total omod in shared programs: 427 -> 1212 (183.84%)
omod in affected programs: 38 -> 823 (2065.79%)
helped: 0
HURT: 160
total temps in shared programs: 17544 -> 17554 (0.06%)
temps in affected programs: 70 -> 80 (14.29%)
helped: 0
HURT: 10
total lits in shared programs: 3153 -> 3159 (0.19%)
lits in affected programs: 9 -> 15 (66.67%)
helped: 0
HURT: 6
total cycles in shared programs: 191334 -> 191253 (-0.04%)
cycles in affected programs: 21240 -> 21159 (-0.38%)
helped: 101
HURT: 27
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28784>
Consider the following case:
0: MUL temp[1].y, input[0]._x__, input[1]._y__;
1: MOV temp[1].x, input[0].x___;
2: MOV temp[1].z, const[0].__x_;
3: MUL temp[2].xyz, const[1].xxx_, temp[1].yxz_;
...
We correctly recognize that we can convert mul into omod for all three
instructions, however the mul swizzle was not handled correctly:
0: MUL temp[2].y / 2, input[0]._x__, input[1]._y__;
1: MOV temp[2].x / 2, input[0].x___;
2: MOV temp[2].z / 2, const[0].__x_;
...
Just create the conversion swizzle from the initial mul swizzle when rewriting
the original instruction writemasks.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28784>
We add a cycles penalty when we see a begin tex and than subtract from
it based on when first alu comes that needs the results. However if the
only instruction in the TEX block is just KIL, we don't have to add any
penalty as nothing waits for it.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28784>
This is faster for 8 samples because it forms a VMEM clause, unlike
the default shader.
It also uses 16-bit types in the shader when possible and averages fewer
components if the format has less than 4.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28917>