This does two things:
1. Flush the command buffer and associate a fence with each
glLinkProgram().
2. Force the application calling glLinkProgram() to wait on the
associated fence, matching the semantics of native drivers.
This important for some workloads and some environments. For example, on
ChromeOS devices supporting VM-based android (ARCVM), an app's HWUI thread
may be configured to use skiagl, while the app may create its own GLES
context for custom rendering. Virgl+virtio_gpu supports a single fencing
timeline, so all guest GL/GLES contexts are serialized by submission
order to the guest kernel.
If the app's submits multiple heavy shaders for compliation+linking
(glCompileShader + glLinkProgram()), these are batched into a single
virtgpu execbuffer (with one fence). Then rendering performed by the
HWUI thread is blocked until the unrelated heavy host-side work is
finished. To the user, the app appears completely frozen until finished.
With this change, the app is throttled in its calls to glLinkProgram(),
and the HWUI work can fill in the gaps between each while hitting most
display update deadlines. To the user, the UI may render at reduced
framerate, but remains mostly responsive to interaction.
Signed-off-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22341>
This deduplicate two identical bytes_to_hex implementation into one.
The intention is to ease the introduction of a new hash algorithm, which
will also have its formatting helper (to ensure seamless transition from
sha1).
Note that the new functions always take the size of the binary buffer,
unlike the old disk_cache_format_hex_id which took `binary * 2` which was
inconsistent (binary size is `binary` and string size is `binary * 2 + 1`).
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22527>
They use same instruction. Just because when the time
nir_load_smem_buffer_amd was introduced, radeonsi didn't support
pass buffer descriptor to nir_load_ubo directly.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22523>
GFX11 format table is different than GFX10, the change is
required to pass below deqp tests for gfx11:
dEQP-GLES3.functional.texture.specification.teximage2d_pbo*,
texsubimage2d_pbo*, teximage3d_pbo*, texsubimage3d_pbo*.
Signed-off-by: Ikshwaku Chauhan <ikshwaku.chauhan@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22466>
For MUX.v2i16 should be:
`MUX.v2i16.bit A, B, mask` calculates `(A & mask) | (B & ~mask)`
For MUX.v4i8 should be:
`MUX.v4i8.bit A, B, mask` calculates `(A & mask) | (B & ~mask)`
Signed-off-by: Signed-off-by: Aleksey Komarov <q4arus@ya.ru>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20467>
This commit introduces multiple changes:
1. Now we check for mainline artifacts only when NOT running on
the mainline branch
2. if we run on the fork and get 404-like error, it doesn't retry.
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22386>
Same/similar issues are seen on MTL platform as DG2 so disable for both.
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22435>
This puts the CL -> pipe_box logic in one place and also make sure the
pipe_box is filled in correctly so we neither read out of bounds nor do
nothing at all.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22506>
This gives anv the same behavior as turnip in not asserting, and just not
filling out feedback for those stages.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15637>
Since there are concerns that the VK_EXT_GPL implementation may have
issues with mesh shading, disable it by default but give users a knob to
turn it on to experiment.
This doesn't automatically enable GPL use in zink, because we lack
extendedDynamicState2PatchControlPoints, but it means that you only need
to set ZINK_DEBUG=gpl and not both env vars.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15637>
With independent sets, we're not able to compute immediate values for
the index at which to read anv_push_constants::dynamic_offsets to get
the offset of a dynamic buffer. This is because the pipeline layout
may not have all the descriptor set layouts when we compile the
shader.
To solve that issue, we insert a layer of indirection.
This reworks the dynamic buffer offset storage with a 2D array in
anv_cmd_pipeline_state :
dynamic_offsets[MAX_SETS][MAX_DYN_BUFFERS]
When the pipeline or the dynamic buffer offsets are updated, we
flatten that array into the
anv_push_constants::dynamic_offsets[MAX_DYN_BUFFERS] array.
For shaders compiled with independent sets, the bottom 6 bits of
element X in anv_push_constants::desc_sets[] is used to specify the
base offsets into the anv_push_constants::dynamic_offsets[] for the
set X.
The computation in the shader is now something like :
base_dyn_buffer_set_idx = anv_push_constants::desc_sets[set_idx] & 0x3f
dyn_buffer_offset = anv_push_constants::dynamic_offsets[base_dyn_buffer_set_idx + dynamic_buffer_idx]
It was suggested by Faith to use a different push constant buffer with
dynamic_offsets prepared for each stage when using independent sets
instead, but it feels easier to understand this way. And there is some
room for optimization if you are set X and that you know all the sets in
the range [0, X], then you can still avoid the indirection. Separate
push constant allocations per stage do have a CPU cost.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15637>