Some apps (old FFmpeg, contemporary CTS) send down pMi{Col,Row}Starts in
SB units, not MI units. Instead of dependening on those values which
could be unreliable, derive the tile sizes in SB using other parameters.
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39492>
Added support for externally provided motion hints by reading the
MFSampleExtension_MoveRegions sample attribute.
The motion hint data is converted into pipe_enc_move_info and passed
down to the driver for use during encoding.
Reviewed-by: Yubo Xie <yuboxie@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39515>
This change fixes the clamp to max_texel_buffer_elements
issue related to rv770 and older gpus.
Here are the tests fixed on rv770:
spec/arb_texture_buffer_object/texture-buffer-size-clamp/r8ui_texture_buffer_size_via_sampler: fail pass
spec/arb_texture_buffer_object/texture-buffer-size-clamp/rg8ui_texture_buffer_size_via_sampler: fail pass
spec/arb_texture_buffer_object/texture-buffer-size-clamp/rgba8ui_texture_buffer_size_via_sampler: fail pass
Fixes: 1a441ad5cb ("r600: clamp to max_texel_buffer_elements")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39385>
This is a gl4.3 issue very similar to e8fa3b4950.
The mode r10g10b10a2_sscaled processed as vertex on palm at the
hardware level doesn't follow the current standard. Indeed, the .w
component (2-bits) is not calculated as expected. The table below
describes the situation.
This change fixes this issue by adding two gpu instructions at
the vertex fetch shader stage. An equivalent C representation and
a gpu asm dump of the generated sequence are available below.
.w(2-bits) expected palm cypress
0 0 0 0
1 1 1 1
2 -2 2 -2
3 -1 3 -1
w_out = w_in - (w_in > 1. ? 4. : 0.);
0002 00000024 A0040000 ALU 2 @72
0072 801F2C0A 600004C0 1 w: SETGT*4 __.w, R10.w, 1.0
0074 839FCC0A 61400010 2 w: ADD R10.w, R10.w, -PV.w
Note: cypress returns the expected value, and does not need
this correction.
This change was tested on palm, barts and cayman. Here are the tests fixed:
khr-gl4[3-6]/vertex_attrib_binding/basic-input-case6: fail pass
khr-gles31/core/vertex_attrib_binding/basic-input-case6: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38849>
Using a PV register which is not PV.x, after a dot4 operation,
does not work on rv770. Anyway, this does work on evergreen
but this is not documented.
This change updates this behavior for all the r600 gpus
which fixes the issue on rv770. It adds max4 which has the
same requirement in the case of max4 being implemented.
Here are some of the affected tests on rv770:
piglit/bin/fp-abs-01 -auto -fbo
glcts --deqp-case=KHR-GL31.buffer_objects.triangles
piglit/bin/shader_runner generated_tests/spec/glsl-1.10/execution/built-in-functions/fs-distance-vec2-vec2.shader_test -auto -fbo
Fixes: 942e6af40b ("r600/sfn: use PS and PV inline registers when possible")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39101>
The gamma is not processed by the hardware when processing a one
component format texture (FMT_8). This change triggers a fall back to
the r8g8b8a8_srgb format which is properly supported by the hardware
of these older gpus.
Here are the tests fixed on rv770:
spec/arb_framebuffer_srgb/fbo-fast-clear: fail pass
spec/ext_texture_srgb/fbo-fast-clear: fail pass
spec/!opengl 1.1/teximage-colors gl_sluminance8/gl_sluminance8 texture with gl_.*: fail pass
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39159>
The functionality was working properly at glMinSampleShading(0.)
and glMinSampleShading(1.). The issue was with the intermediary
values. This change makes this function compatible with the
evergreen setup.
Note: this was one of the few functionalities which were working
properly on evergreen but not on cayman.
Here are the tests fixed:
spec/arb_sample_shading/samplemask 4 all/0.500000 partition: fail pass
spec/arb_sample_shading/samplemask 4/0.500000 partition: fail pass
spec/arb_sample_shading/samplemask 6 all/0.250000 partition: fail pass
spec/arb_sample_shading/samplemask 6 all/0.500000 partition: fail pass
spec/arb_sample_shading/samplemask 6/0.250000 partition: fail pass
spec/arb_sample_shading/samplemask 6/0.500000 partition: fail pass
spec/arb_sample_shading/samplemask 8 all/0.250000 partition: fail pass
spec/arb_sample_shading/samplemask 8 all/0.500000 partition: fail pass
spec/arb_sample_shading/samplemask 8/0.250000 partition: fail pass
spec/arb_sample_shading/samplemask 8/0.500000 partition: fail pass
deqp-gles31/functional/shaders/sample_variables/sample_mask_in/bit_count_per_two_samples/multisample_rbo_4: fail pass
deqp-gles31/functional/shaders/sample_variables/sample_mask_in/bit_count_per_two_samples/multisample_rbo_8: fail pass
deqp-gles31/functional/shaders/sample_variables/sample_mask_in/bit_count_per_two_samples/multisample_texture_4: fail pass
deqp-gles31/functional/shaders/sample_variables/sample_mask_in/bit_count_per_two_samples/multisample_texture_8: fail pass
Fixes: f7796a966d ("radeonsi: add basic code for overrasterization")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38615>
We don't actually care if previous suspended RP had depth-only draws
with color writes skipped, we only care if previous RP disabled LRZ
writes due to this; the mere fact of first draws being depth-only
doesn't affect LRZ of next draws in any way.
However, for next RPs in suspend-resume chain we have to assume that
previous RP may have had color writes.
For secondary cmdbufs with ordinary renderpasses it is easy to be
less pessimistic, and that's what we do in order to not regress
DXVK performance.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39293>
radv_dump_shader_stats() printed stats for every shader with a certain
stage, and we called this function each time an RT shader is compiled.
This means we could repeat the stats for a shader.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39484>
This can happen if a loop has no continues, and the later code should work
fine in this situation.
This fixes war_thunder/0013a69e097b2471 on navi21.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 6b9d28ab9b ("aco/insert_fp_mode: insert fp mode in reverse")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39481>
Use wl_display_roundtrip() instead of wl_display_flush() when freeing
a swapchain to ensure the compositor has processed buffer release
events before continuing.
wl_display_flush() only sends pending requests without waiting for
the compositor to process them. When rapidly creating and destroying
large swapchain buffers, buffer references may not be released quickly
enough (e.g., during CTS testing), causing memory to accumulate.
Using wl_display_roundtrip() ensures synchronization with the
compositor, allowing buffers to be released promptly.
Signed-off-by: Wei Zhao <Wei.Zhao@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39475>
We were initializing to a nir_const_value of undefined (in practice on x86
builds, a pointer value), with .b set to 0. Those values would get dumped
in the annotated shader disassembly at the end of a test where all inputs
where unexpectedly skipped, producing very surprising output.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
The mul is 24-bit sign-extended, so in simplifying we should retain that.
If nothing else, this keeps us on the happy path of mul24s.
I didn't fix the other broken pattern, since it's not really part of this
MR.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
We were enumerating enough for a single component, but not all the
combinations. This helps show that our fdots fail pretty consistently.
And triggers more skipping from the fany_equal16s thanks to varied inputs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
As I introduced another layer of iteration for signed zero testing, the
former logic got unwieldy. In fact, it was already unwieldy enough that I
forgot to clear all_skipped when the assert failed, allowing a failing
test to be marked UNSUPPORTED instead of XFAIL.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>