We have an SSA register allocator now that should be a lot faster on
most (if not all) of these tests now. Let's re-enable them to gain more
CI coverage.
Acked-by: Valentine Burley <valentine.burley@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36310>
This change is inspired by 1021d6fe62 ("dri: deal
with ARGB1555")
This issue is now mostly fixed with
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36081
Anyway, the dri3_cpp_for_fourcc entry is still missing and should
be added.
This change is useful for instance with r600 which
can handle this format.
Note: this mode was generated at the "glx visuals" level
on r600 by default before the commit d709b42180.
This change was tested on r600 palm and cayman with X11
loaded with a version of mesa generating this very mode:
glx/glx-visuals-depth -pixmap: fail pass
glx/glx-visuals-stencil -pixmap: fail pass
Fixes: 00aa095d53 ("dri: Support 1555/4444 formats")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34294>
When the rasterizer state is updated, we only need to update
the scissoring state if the rasterizer scissor state has changed.
This avoids re-sending the same scissor state any time the rasterizer
is changed.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36352>
We need to update the CFG_BITS packet if the early_fragment_test status
changed vs previous draw call. But we don't need to update it every
time the FS is changed, we only need to update it when disable_ez
value is different from previous FS.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36352>
We were not supporting non replicate swizzle and this trigger an
assertion on fossils/parallel-rdp/small_subgroup.foz.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 1481b14fcb ("pan/bi: Lower SWZ.v4i8 to multiple MKVEC.v2i8 on v11+")
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36349>
Now that we have we have the concept of "dummy" registers, we can use it
for descriptor prefetches as well. Currently, they are represented as
having no dst, and a fixup pass during legalization adds the actual
needed dummy dst. This can be prevented by representing their dst using
a dummy register from the start.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36365>
A sam used as descriptor prefetch has a dummy src which is hard-coded
to r48.x. Neither postsched nor legalization are aware this is a dummy
src though, causing false dependencies to be added. This often results
in unnecessary syncs and/or sub-optimal scheduling decisions.
Fix this by adding a "dummy" register flag and teaching postsched and
legalization to ignore registers with that flag set.
Totals from 24681 (15.00% of 164575) affected shaders:
Instrs: 17953856 -> 17953751 (-0.00%); split: -0.00%, +0.00%
CodeSize: 36166530 -> 36163008 (-0.01%); split: -0.09%, +0.08%
NOPs: 3466012 -> 3465943 (-0.00%); split: -0.00%, +0.00%
MOVs: 550649 -> 550613 (-0.01%)
(ss): 460398 -> 460402 (+0.00%)
(ss)-stall: 1780969 -> 1780916 (-0.00%); split: -0.00%, +0.00%
(sy)-stall: 5876641 -> 5876604 (-0.00%); split: -0.00%, +0.00%
Preamble Instrs: 4118242 -> 4087950 (-0.74%); split: -1.13%, +0.39%
Last helper: 7258848 -> 7258837 (-0.00%); split: -0.00%, +0.00%
Cat0: 3795308 -> 3795239 (-0.00%); split: -0.00%, +0.00%
Cat1: 746570 -> 746534 (-0.00%)
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36365>
When invalid registers are passed to `get_ready_slot`, it may cause an
OOB array access. Instead of running into UB when this happens, catch it
early by asserting.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36365>
The programming model matches very closely to that of NVIDIA's NVDLA.
Enough is implemented to run SSDLite MobileDet with roughly the same
performance as the blob (when running on a single NPU core).
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29698>
On Rockchip, we need a tolerance of 8 to pass all tests (especifically
the whole MobileNetV1 model).
Though all other tests pass with a tolerance of just 2, 8 is still not
that high that we would risk letting bugs slip in.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29698>
Remove the preceeding nir as that is generally reserved for helpers
used across files.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36366>
Since we no longer share this with the old glsl ir linker just move
it to where it is called from.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36366>
While these GUIDs make their way through the SDK, add them for now
directly, to be able to perform Input QP related development.
Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36379>
For samplers the type_size() callback can return 0, which triggers
a NIR validation error.
In this case set range to ~0 which means the range is unknown.
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36263>
VCN1 doesn't have FW interface to enable cu_qp_delta with rate control
disabled, which means we can only support either rate control enabled or
disabled. Spec requires VK_VIDEO_ENCODE_RATE_CONTROL_MODE_DISABLED_BIT_KHR
to always be supported, thus the rate control modes needs to be disabled
on VCN1.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36353>