Some HW may have native SSBO instructions that only support a limited
buffer size. It may be beneficial to use those instructions for small
SSBOs and only fall back to global memory accesses for large ones.
This commit adds an option (min_ssbo_size) that, if non-zero, will cause
code like this to be emitted:
    if (@get_ssbo_size >= min_ssbo_size) {
        // global memory access
    } else {
        // original SSBO access
    }
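As a rough illustration only, the runtime selection above can be modelled
in plain C. The names choose_ssbo_path and enum ssbo_access_path are
hypothetical and are not part of the NIR pass; the sketch covers only the
non-zero min_ssbo_size case described here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the emitted branch; not the actual lowering code. */
enum ssbo_access_path {
   SSBO_PATH_NATIVE, /* original SSBO access */
   SSBO_PATH_GLOBAL, /* global memory access */
};

static enum ssbo_access_path
choose_ssbo_path(uint32_t ssbo_size, uint32_t min_ssbo_size)
{
   /* Only the non-zero case is described by the commit message. */
   assert(min_ssbo_size != 0);

   /* Buffers at or above the threshold take the global memory path. */
   if (ssbo_size >= min_ssbo_size)
      return SSBO_PATH_GLOBAL;

   /* Smaller buffers keep the native SSBO instructions. */
   return SSBO_PATH_NATIVE;
}
```

Note that a buffer whose size equals min_ssbo_size takes the global memory
path, since the comparison is >=.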
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41477>
Even if the front buffer isn't locked yet, it will normally get locked,
so we can't reuse it as a back buffer.
Pointed out by Daniel Stone.
Fixes: 4a976b60b1 ("egl_dri2: use gbm_surface as the native window type in drm platform")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42111>
Expose higher limits than the HW supports, plus several
ArrayDynamicIndexing features that are not yet implemented, so that the
Dawn WebGPU implementation can be used as long as it doesn't exercise
these limits or features.
The override is enabled using the V3D_WEBGPU_OVERRIDE=1 envvar. When
it is enabled it:
- Increases the framebuffer dimension limit from the real HW value
  (4096 on RPi4, 7680 on RPi5) to 8192.
- Bumps the advertised maxMipLevels reported per format from 13 to
  14 for 2D images, to match the bumped 8192-wide images, and to 15
  for non-2D images. The TMU HW already supports that for sampling.
- Increases max_varying_components from 64 to 72 (HW limit is 64).
- Exposes features that are not actually implemented; CTS tests that
  exercise them will hit asserts in debug builds:
  - shaderUniformBufferArrayDynamicIndexing
  - shaderSampledImageArrayDynamicIndexing
  - shaderStorageBufferArrayDynamicIndexing
  - shaderStorageImageArrayDynamicIndexing
- Increases maxImageDimension1D from 4096 to 16384
- Increases maxImageDimension2D from 4096 to 8192
- Increases maxImageDimension3D from 4096 to 16384
- Increases maxImageDimensionCube from 4096 to 16384
When V3D_WEBGPU_OVERRIDE is unset (the default), the driver
advertises the real HW limits already set up by the preceding
"use real HW framebuffer limit" change, so Vulkan CTS conformance
is unaffected.
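A minimal sketch of how such an opt-in could gate an advertised limit,
assuming the variable is compared against the literal string "1". All
helper names here are hypothetical, and the driver may well parse the
variable differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical: treat only the exact value "1" as enabling the override. */
static bool
webgpu_override_from_value(const char *value)
{
   return value != NULL && strcmp(value, "1") == 0;
}

/* Reads the environment variable named in this commit message. */
static bool
webgpu_override_enabled(void)
{
   return webgpu_override_from_value(getenv("V3D_WEBGPU_OVERRIDE"));
}

/* Hypothetical: choose the advertised framebuffer dimension limit.
 * hw_limit is the real HW value (4096 on RPi4, 7680 on RPi5) and
 * 8192 is the overridden value quoted above. */
static uint32_t
advertised_fb_limit(uint32_t hw_limit, bool override_enabled)
{
   assert(hw_limit > 0);
   return override_enabled ? 8192u : hw_limit;
}
```

With the variable unset, advertised_fb_limit() simply returns the HW value,
which matches the default behaviour described above.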
To help diagnose applications that hit the over-advertised paths,
mesa_loge errors are emitted from three places:
- lower_vulkan_resource_index() warns before the existing UNREACHABLE
for dynamic descriptor indexing, so the cause is visible in release
builds where the assertion is compiled out.
- create_image() warns when vkCreateImage is called with attachment
usage and dimensions above the real HW framebuffer limit. Storage
and sampled-only images above that limit work fine via the TMU.
- job_compute_frame_tiling() logs an error when a render job's width
  or height exceeds the real HW framebuffer limit.
The per-plane slices[] array in struct v3dv_image is sized at
V3D_MAX_MIP_LEVELS + 2 so the override case (which advertises 14 mip
levels for the bumped 8192-wide 2D images and 15 for the 16384-wide
1D/3D images) still fits without enlarging the default array.
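The mip level counts quoted in this message follow from the usual full
mip chain length, floor(log2(size)) + 1. A small sketch of that
arithmetic, using a hypothetical helper mip_levels_for_size() that is not
taken from the driver:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of levels in a full mip chain whose base
 * level is `size` texels wide, i.e. floor(log2(size)) + 1. */
static uint32_t
mip_levels_for_size(uint32_t size)
{
   assert(size > 0);

   uint32_t levels = 0;
   while (size > 0) {
      levels++;
      size >>= 1; /* each level halves the dimension, rounding down */
   }
   return levels;
}
```

This gives 13 levels for 4096, 14 for 8192 and 15 for 16384. Assuming
V3D_MAX_MIP_LEVELS is 13, as the 13 to 14/15 bump above suggests, an
array of V3D_MAX_MIP_LEVELS + 2 entries holds the largest case.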
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42117>
Add a new max_framebuffer_size to devinfo so V3D 4.2 and V3D 7.1 can
expose different framebuffer dimensions: 4096 on RPi4 and 7680 on RPi5.
This is bounded by the maximum clip size supported by the framebuffer.
Take advantage of this to also raise maxImageDimensions* to
max_framebuffer_size.
A non-power-of-two framebuffer means framebuffer_size_for_pixel_count can
compute a height larger than max_framebuffer_size. Clamp the height to the
maximum and recompute the width from the division so w * h <= num_pixels.
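A simplified sketch of the clamp, assuming the dimensions are found by
repeatedly halving the width and doubling the height. The function name
and the max_size parameter (standing in for max_framebuffer_size) are
hypothetical; this is not the driver's actual
framebuffer_size_for_pixel_count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: pick w x h with w, h <= max_size and
 * w * h <= num_pixels. */
static void
fb_size_for_pixel_count_sketch(uint32_t num_pixels, uint32_t max_size,
                               uint32_t *width, uint32_t *height)
{
   assert(num_pixels > 0 && max_size > 0);

   const uint64_t max_pixels = (uint64_t)max_size * max_size;
   uint32_t w, h;

   if (num_pixels >= max_pixels) {
      /* More pixels than the largest framebuffer can hold. */
      w = max_size;
      h = max_size;
   } else {
      w = num_pixels;
      h = 1;
      while (w > max_size) {
         w >>= 1;
         h <<= 1;
      }
      /* h is a power of two, so with a non-power-of-two max_size it
       * can overshoot. Clamp it and recompute w by division. */
      if (h > max_size) {
         h = max_size;
         w = num_pixels / h;
      }
   }

   *width = w;
   *height = h;
}
```

For example, with max_size 7680 and num_pixels 58982399 the halving loop
ends at 7199 x 8192, which the clamp turns into 7679 x 7680.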
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42117>
Use the new devinfo value instead of the V3D_MAX_RENDER_TARGETS
macro.
The macro is now only used in devinfo initialization and in the
versioned V3D file src/gallium/drivers/v3d/v3dx_state.c
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42117>
pack_null_texture_state(), introduced to support VK_KHR_robustness2
nullDescriptor for image bindings, left the TEXTURE_SHADER_STATE
"Array Stride (64-byte aligned)" field at 0.
On real V3D HW this is fine: a TMU read against a null descriptor
returns zero regardless of the descriptor contents. The V3D simulator,
however, validates the TMU array stride before issuing the read.
Setting array_stride_64_byte_aligned = 1 (64 bytes raw) fixes the failing
dEQP-VK.robustness.robustness2.bind.*.null_descriptor.samples_1.3d.*
test cases under the simulator.
Fixes: 990d76eae6 ("v3dv: Implement and enable nullDescriptor support")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42112>
bindgen up to at least 0.72.1 generates invalid code (see below) and
that function is not used, so simply skip it.
src/gallium/frontends/rusticl/rusticl_mesa_bindings.c:795:81: error: duplicate ‘const’ declaration specifier [-Werror=duplicate-decl-specifier]
795 | void pipe_shader_state_from_tgsi__extern(struct pipe_shader_state *state, const const struct tgsi_token *tokens) { pipe_shader_state_from_tgsi(state, tokens); }
| ^~~~~
Backport-to: *
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41620>
Just use an existing flag to increase the bo size slightly.
Fixes a ring gfx timeout with
dEQP-VK.spirv_assembly.instruction.compute.opfma.fp32.vec3.undef.denorm_flush.directed
on vega10.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: *
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41937>
This delays the waitcnt for has_attr_ring_wait_bug by a few instructions.
fossil-db (gfx1201):
Totals from 9 (0.00% of 208640) affected shaders:
Instrs: 19352 -> 19506 (+0.80%)
CodeSize: 101180 -> 101716 (+0.53%)
Latency: 660221 -> 678782 (+2.81%); split: -0.00%, +2.81%
InvThroughput: 95106 -> 97398 (+2.41%)
fossil-db (navi33):
Totals from 58834 (28.20% of 208626) affected shaders:
Instrs: 22424304 -> 22424571 (+0.00%)
CodeSize: 110198112 -> 110199184 (+0.00%)
Latency: 115894319 -> 126491124 (+9.14%); split: -0.00%, +9.14%
InvThroughput: 19424631 -> 19754358 (+1.70%); split: -0.00%, +1.70%
I don't think the stats are very accurate. This seems to often move the
s_waitcnt down into a divergent branch, but the wait still happens later
if the branch isn't taken, so the wait is counted twice.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41364>
This shouldn't fix anything, because event_vmem_bvh was never used here.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41364>
First, use 64-bit values everywhere since shader_info::outputs_written
is a 64-bit field.
Second, alpha to coverage should only be considered for draw buffer 0
as stated in the GL spec (quoting Version 4.6 (Core Profile), 17.3.1
Alpha To Coverage):
"All alpha values in this section refer only to the alpha component
of the fragment shader output linked to color number zero, index
zero (see section 15.2.3)."
Third, the write message setup in brw_compile_fs.cpp was no longer
taking into account whether alpha-to-coverage is disabled.
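To illustrate the first point only: a bit test on a 64-bit mask needs a
64-bit constant, otherwise bits at position 32 and above are lost. The
helper output_written() below is hypothetical and is not the code touched
by this change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: test bit `slot` of a 64-bit written-outputs mask. */
static bool
output_written(uint64_t outputs_written, unsigned slot)
{
   assert(slot < 64);

   /* UINT64_C(1) keeps the shift in 64 bits; a plain `1 << slot` would
    * be an int shift, which is undefined for slot >= 32. */
   return (outputs_written & (UINT64_C(1) << slot)) != 0;
}
```

Storing the same mask in a 32-bit variable would silently drop every
output at slot 32 or higher.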
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 294644643e ("brw: avoid requiring a valid render target for empty fragment shaders")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15625
Tested-by: Christoph Neuhauser <christoph.neuhauser@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42115>
This helper was only meant to be called once the driver knows it
doesn't have any render target set up, to figure out whether an empty
one needs to be created.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Tested-by: Christoph Neuhauser <christoph.neuhauser@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/42115>