Use the genxml helpers to pack/unpack midgard and bifrost surface
descriptors. Also change how we describe midgard's descriptors to make
them more straightforward. We currently only use SURFACE_WITH_STRIDE for
midgard, but pandecode should still be able to decode other surface
descriptor types.
This commit shouldn't change panfrost/pandecode behavior.
Consequently, this refactor also prepares pandecode to handle the
SURFACE_YUV descriptor for bifrost in the following patch.
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21109>
u_pack_color.h is used in Vulkan drivers, so move it to decouple it from gallium.
dbghelp.h is included in u_debug_symbol.c, which resides in src/util/.
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19522>
sed + ninja clang-format + fix up spacing for common code.
If you are unhappy that I did not manually change the whitespace of your driver,
you need to enable clang-format for it so the formatting would happen
automatically.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24428>
Even on Valhall, vertex_id is zero-based in a transform feedback program. Lower
that for transform feedback programs properly since it wouldn't happen
automatically on Valhall. Fixes assertion failures.
Fixes: 91ffd10351 ("pan/bi: Lower gl_VertexID in NIR")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24198>
This removes all the users of the compiler enums, and is a lot more natural now
that nir_lower_blend speaks PIPE_BLEND enums.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24076>
This avoids the silly compiler versions. Some bits are slightly more
complicated, because they have to account for inverted enum values (rather than
a separate invert bit), but this is a LOT friendlier to drivers using the pass
and it makes the pass itself more readable.
The conversion functions in panfrost/panvk will go away momentarily.
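To illustrate what "inverted enum values (rather than a separate invert bit)" means, here is a minimal sketch. The factor names are modelled on the PIPE_BLENDFACTOR_* naming, but the string representation and the helper split_invert are hypothetical and not the pass's actual code.

```python
def split_invert(factor):
    """Split a blend factor name into (base factor, is_inverted).

    Hypothetical helper: an API with a separate invert bit would pass the
    pair directly, whereas inverted enum values have to be decoded.
    """
    prefix = "INV_"
    if factor.startswith(prefix):
        return factor[len(prefix):], True
    return factor, False
```

For example, split_invert("INV_SRC_ALPHA") yields ("SRC_ALPHA", True), while split_invert("DST_COLOR") yields ("DST_COLOR", False).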
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24076>
Switch to register intrinsics, using the helpers. Since our backend copyprop
chokes on non-SSA moves, we get better coalescing with this approach, hence the
small improvements to instruction count / cycle count in shader-db. Changes to
register pressure seem to be noise from iteration order. I'm not too worried.
total instructions in shared programs: 1508444 -> 1508193 (-0.02%)
instructions in affected programs: 42581 -> 42330 (-0.59%)
helped: 482
HURT: 41
Inconclusive result (value mean confidence interval includes 0).
total bundles in shared programs: 643023 -> 643136 (0.02%)
bundles in affected programs: 16318 -> 16431 (0.69%)
helped: 230
HURT: 85
Inconclusive result (value mean confidence interval includes 0).
total quadwords in shared programs: 1125992 -> 1125600 (-0.03%)
quadwords in affected programs: 125366 -> 124974 (-0.31%)
helped: 507
HURT: 351
Quadwords are helped.
total registers in shared programs: 90632 -> 90554 (-0.09%)
registers in affected programs: 669 -> 591 (-11.66%)
helped: 114
HURT: 31
Registers are helped.
total threads in shared programs: 55607 -> 55600 (-0.01%)
threads in affected programs: 20 -> 13 (-35.00%)
helped: 1
HURT: 7
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 1371 -> 1437 (4.81%)
spills in affected programs: 44 -> 110 (150.00%)
helped: 0
HURT: 2
total fills in shared programs: 5133 -> 5273 (2.73%)
fills in affected programs: 84 -> 224 (166.67%)
helped: 0
HURT: 2
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
It doesn't do anything yet. We leave that to the subsequent patches so we can
keep the tree-wide refactor as simple as possible.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
Since the mesa state tracker can promote RGB texture formats
to RGBA texture formats (among other formats) without exposing
any of that information to a driver, it is more desirable to
have the behaviour of `PIPE_CAP_RGB_OVERRIDE_DST_ALPHA_BLEND`
be the default. This avoids rendering bugs where an application
sets `DST_ALPHA` blending on a format where there is no alpha
channel, that has been promoted to a format that actually has an
alpha channel. The driver can instead rely on the common code
in the state tracker to convert the blending parameter to one
that reflects the limitations of the application requested format,
as long as `PIPE_CAP_INDEP_BLEND_FUNC` is supported.
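A minimal sketch of the kind of conversion described above, assuming a missing alpha channel reads back as 1.0. The factor names mirror the PIPE_BLENDFACTOR_* enums, but the table, the helper name fix_blend_factor and the string representation are hypothetical; the state tracker's actual code may differ.

```python
# Hypothetical rewrite table for formats without an alpha channel.
# Destination alpha is treated as 1.0, so:
#   DST_ALPHA          -> ONE
#   INV_DST_ALPHA      -> ZERO
#   SRC_ALPHA_SATURATE -> ZERO (the RGB factor min(src_alpha, 1 - dst_alpha)
#                         is 0 when dst_alpha == 1.0)
NO_ALPHA_REWRITES = {
    "DST_ALPHA": "ONE",
    "INV_DST_ALPHA": "ZERO",
    "SRC_ALPHA_SATURATE": "ZERO",
}


def fix_blend_factor(factor, format_has_alpha):
    """Return the factor to program for the application-requested format."""
    if format_has_alpha:
        return factor
    return NO_ALPHA_REWRITES.get(factor, factor)
```

With this rewrite, a driver that was handed a promoted RGBA surface never sees a factor that reads the alpha channel the application did not ask for.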
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24044>
Skips are regexes, which means the `*` would've needed to be escaped. As
is, they can't match any existing test.
Since these lines are also all in `-fails.txt` as `Crash`es, let's just
remove them from the skips.
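The escaping problem can be reproduced with any regex engine. The sketch below uses Python's re module; the test name is made up for illustration, and whole-name matching is an assumption about how the runner applies skips.

```python
import re

# Hypothetical test name containing a literal '*'.
test_name = "spec@example@op *= float"

# The skip as written: ' *' means "zero or more spaces", not a literal '*'.
unescaped = "spec@example@op *= float"
# The skip as it would have needed to be written.
escaped = re.escape(test_name)


def skipped(pattern, name):
    # Assumption: a skip applies when the regex matches the whole name.
    return re.fullmatch(pattern, name) is not None
```

Here skipped(unescaped, test_name) is False, while skipped(escaped, test_name) is True. The unescaped pattern only matches names without the asterisk, such as "spec@example@op = float".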
Signed-off-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24022>
If the GEM is closed before the BO in the sparse array is set to zero,
a newly allocated GEM may be associated with the stale BO that is still
in the cache, and so reuse the old BO.
Zero the BO before closing the GEM to make sure that the BO is removed
from the cache and won't be associated with a different GEM.
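The ordering issue can be modelled with a toy handle allocator. Everything in this sketch is hypothetical (ToyKernel, import_bo, the dict standing in for the sparse array); it assumes the kernel hands out the lowest free handle and that another thread can run between the two steps of the free path.

```python
class ToyKernel:
    """Toy GEM handle allocator; assumes the lowest free handle is reused."""

    def __init__(self):
        self.open_handles = set()

    def create(self):
        handle = 1
        while handle in self.open_handles:
            handle += 1
        self.open_handles.add(handle)
        return handle

    def close(self, handle):
        self.open_handles.remove(handle)


def import_bo(kernel, table):
    """A second thread: get a handle, then look it up in the handle table."""
    handle = kernel.create()
    stale = table.get(handle)
    if stale is not None:
        return stale  # wrongly picks up the BO of a closed GEM
    table[handle] = {"handle": handle, "fresh": True}
    return table[handle]


def free_close_first(kernel, table, handle, racing_thread):
    kernel.close(handle)
    result = racing_thread()  # runs in the window between the two steps
    table.pop(handle, None)
    return result


def free_zero_first(kernel, table, handle, racing_thread):
    table.pop(handle, None)
    result = racing_thread()  # same window, but the entry is already gone
    kernel.close(handle)
    return result
```

In the close-first order the racing import receives the recycled handle and finds the stale entry; in the zero-first order it gets a different handle and a fresh entry.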
Signed-off-by: Michael Tretter <m.tretter@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23744>
Sets the float color component type in st_visual_to_context_mode()
ensuring float color values are not clamped.
Fixes dEQP-EGL.functional.wide_color.window_fp16_default_colorspace on
asahi, iris and most likely every other driver having it marked as fail
or flake.
Closes: mesa/mesa#9276
Signed-off-by: Janne Grunau <j@jannau.net>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23914>
It really isn't that hard. This drops the roundmode optimization but otherwise
should be at parity with what there was before, and it's massively more competent
at it anyway.
total instructions in shared programs: 1514477 -> 1508444 (-0.40%)
instructions in affected programs: 645848 -> 639815 (-0.93%)
helped: 2712
HURT: 187
Instructions are helped.
total bundles in shared programs: 645069 -> 642999 (-0.32%)
bundles in affected programs: 136233 -> 134163 (-1.52%)
helped: 1242
HURT: 319
Bundles are helped.
total quadwords in shared programs: 1130469 -> 1125969 (-0.40%)
quadwords in affected programs: 379780 -> 375280 (-1.18%)
helped: 1878
HURT: 376
Quadwords are helped.
total registers in shared programs: 90577 -> 90633 (0.06%)
registers in affected programs: 5627 -> 5683 (1.00%)
helped: 309
HURT: 294
Inconclusive result (value mean confidence interval includes 0).
total threads in shared programs: 55594 -> 55607 (0.02%)
threads in affected programs: 118 -> 131 (11.02%)
helped: 43
HURT: 33
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 1399 -> 1371 (-2.00%)
spills in affected programs: 345 -> 317 (-8.12%)
helped: 10
HURT: 4
total fills in shared programs: 5273 -> 5133 (-2.66%)
fills in affected programs: 1035 -> 895 (-13.53%)
helped: 12
HURT: 4
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23769>
Some instructions are not able to swizzle their sources, so we conservatively
refused to propagate moves into them to avoid needing a swizzle on the source.
This is too conservative: we only need to do this if the move swizzles. If there
is only an identity swizzle on the move, we can propagate it without issue. This
will mitigate some instruction count regression from the later modifier
propagation, which will leave lots of moves that need to be propagated.
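The relaxed rule can be summarised as a small predicate. This is a simplified sketch with a four-component swizzle and hypothetical names, not the compiler's actual code.

```python
IDENTITY = (0, 1, 2, 3)  # simplified 4-component identity swizzle


def can_propagate_move(move_swizzle, consumer_supports_swizzle):
    """Decide whether a move may be propagated into a consuming instruction.

    Old behaviour (as described above): refuse whenever the consumer cannot
    swizzle. Relaxed behaviour: only refuse if the move actually swizzles.
    """
    if consumer_supports_swizzle:
        return True
    return tuple(move_swizzle) == IDENTITY
```

A consumer that cannot swizzle still accepts a move with swizzle (0, 1, 2, 3), but rejects one with, say, (1, 0, 2, 3).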
total instructions in shared programs: 1514834 -> 1514477 (-0.02%)
instructions in affected programs: 132297 -> 131940 (-0.27%)
helped: 349
HURT: 3
Instructions are helped.
total bundles in shared programs: 645093 -> 645069 (<.01%)
bundles in affected programs: 9650 -> 9626 (-0.25%)
helped: 42
HURT: 23
Bundles are helped.
total quadwords in shared programs: 1130751 -> 1130469 (-0.02%)
quadwords in affected programs: 78790 -> 78508 (-0.36%)
helped: 269
HURT: 21
Quadwords are helped.
total registers in shared programs: 90563 -> 90577 (0.02%)
registers in affected programs: 163 -> 177 (8.59%)
helped: 4
HURT: 16
Registers are HURT.
total spills in shared programs: 1400 -> 1399 (-0.07%)
spills in affected programs: 2 -> 1 (-50.00%)
helped: 1
HURT: 0
total fills in shared programs: 5276 -> 5273 (-0.06%)
fills in affected programs: 151 -> 148 (-1.99%)
helped: 1
HURT: 3
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23769>
If we need to insert a mov in order to schedule a branch, we do not schedule
anything that writes to the source of that mov in the same bundle, to avoid a
data race between the read and the write. That's too conservative, though: it is
legitimate to write in the first part of the ALU word (VMUL/SADD stages) and
then read from the second part (VADD/SMUL/VLUT stages). Reset the
predicate.exclude when going from scheduling the latter stages to the former, to
allow a sequence of code like:
    FCMP.vector 0.xyzw, ...
    branch 0.x
to be scheduled as
    vmul.FCMP.vector 0.xyzw
    smul r31.w, 0.x
    branch 0.x
rather than getting split up into two bundles.
This mitigates a cycle count regression from the copyprop change.
total instructions in shared programs: 1514856 -> 1514834 (<.01%)
instructions in affected programs: 3087 -> 3065 (-0.71%)
helped: 5
HURT: 1
Inconclusive result (value mean confidence interval includes 0).
total bundles in shared programs: 645327 -> 645093 (-0.04%)
bundles in affected programs: 40498 -> 40264 (-0.58%)
helped: 230
HURT: 68
Bundles are helped.
total quadwords in shared programs: 1130554 -> 1130751 (0.02%)
quadwords in affected programs: 75323 -> 75520 (0.26%)
helped: 49
HURT: 231
Quadwords are HURT.
total registers in shared programs: 90559 -> 90563 (<.01%)
registers in affected programs: 119 -> 123 (3.36%)
helped: 5
HURT: 8
Inconclusive result (value mean confidence interval includes 0).
total threads in shared programs: 55590 -> 55594 (<.01%)
threads in affected programs: 4 -> 8 (100.00%)
helped: 4
HURT: 0
Threads are helped.
total spills in shared programs: 1402 -> 1400 (-0.14%)
spills in affected programs: 289 -> 287 (-0.69%)
helped: 1
HURT: 1
total fills in shared programs: 5285 -> 5276 (-0.17%)
fills in affected programs: 448 -> 439 (-2.01%)
helped: 2
HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23769>
If we have multiple reads of the same SSA def in the same block, we don't need
to emit multiple copies for it, we can just reuse a copy (OR'ing in the mask,
knowing the source is already fully written since it's SSA). This will prevent
some regressions in moves from the copyprop patch.
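The copy reuse described above amounts to keeping one copy per SSA def and widening its write mask. A rough sketch, with a hypothetical data layout (reads as (def, mask) pairs, copies as dicts):

```python
def emit_copies(reads):
    """Return one copy per SSA def read in a block.

    reads: list of (ssa_def, component_mask) in block order.
    A repeated read reuses the existing copy and ORs in its mask, which is
    safe because an SSA def is fully written before any of its reads.
    """
    copies = {}
    for ssa_def, mask in reads:
        if ssa_def in copies:
            copies[ssa_def]["mask"] |= mask
        else:
            copies[ssa_def] = {"src": ssa_def, "mask": mask}
    return list(copies.values())
```

Three reads of two defs, for instance emit_copies([(5, 0b0011), (7, 0b0001), (5, 0b0100)]), produce two copies, with the copy of def 5 carrying mask 0b0111.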
There is a bit of a tradeoff here between increased pressure and reduced
instruction count but I'm not too worried. The effect on pressure seems all over
the place -- register use decreases overall, threads increase (great!) but a few
shaders that were *already spilling*, spill a bit worse. I'm not terribly
worried there.
total instructions in shared programs: 1518289 -> 1514856 (-0.23%)
instructions in affected programs: 292854 -> 289421 (-1.17%)
helped: 1557
HURT: 232
Instructions are helped.
total bundles in shared programs: 646903 -> 645327 (-0.24%)
bundles in affected programs: 91872 -> 90296 (-1.72%)
helped: 910
HURT: 256
Bundles are helped.
total quadwords in shared programs: 1133728 -> 1130554 (-0.28%)
quadwords in affected programs: 187170 -> 183996 (-1.70%)
helped: 1399
HURT: 44
Quadwords are helped.
total registers in shared programs: 90640 -> 90559 (-0.09%)
registers in affected programs: 2676 -> 2595 (-3.03%)
helped: 202
HURT: 124
Inconclusive result (%-change mean confidence interval includes 0).
total threads in shared programs: 55561 -> 55590 (0.05%)
threads in affected programs: 50 -> 79 (58.00%)
helped: 23
HURT: 6
Threads are helped.
total spills in shared programs: 1386 -> 1402 (1.15%)
spills in affected programs: 231 -> 247 (6.93%)
helped: 2
HURT: 13
total fills in shared programs: 5159 -> 5285 (2.44%)
fills in affected programs: 1282 -> 1408 (9.83%)
helped: 11
HURT: 16
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23769>
1. Always calculate when asked. This is the sort of optimization that just
introduces bugs. Like one I hit when shuffling register indices around with
the register access changes.
2. Ask before using in RA.
3. Account for precoloured blend inputs.
Small shader-db hit, didn't investigate too much.
total instructions in shared programs: 1518017 -> 1518168 (<.01%)
instructions in affected programs: 2895 -> 3046 (5.22%)
helped: 0
HURT: 24
Instructions are HURT.
total bundles in shared programs: 646756 -> 646782 (<.01%)
bundles in affected programs: 1119 -> 1145 (2.32%)
helped: 1
HURT: 19
Bundles are HURT.
total quadwords in shared programs: 1133694 -> 1133728 (<.01%)
quadwords in affected programs: 1736 -> 1770 (1.96%)
helped: 0
HURT: 20
Quadwords are HURT.
total registers in shared programs: 90596 -> 90612 (0.02%)
registers in affected programs: 108 -> 124 (14.81%)
helped: 0
HURT: 16
Registers are HURT.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Cc: mesa-stable
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23769>
mir_prev_op will point to the last instruction of the block in that case because
the block instruction list is circular. That would cause an invalid
write-after-read relationship between the move we insert with the constants and
the CSEL reading them, which DCE "helpfully" optimizes out, leaving a read from
an undefined def. That ends up getting RA'd to an invalid register.
All in all, pretty bad.
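The wrap-around is easy to reproduce with any circular list. The sketch below assumes "that case" is an instruction with no predecessor in its block; the helper names and the list-of-strings representation are hypothetical.

```python
def prev_op_circular(block, index):
    """Previous instruction in a circular list: wraps at the start."""
    return block[(index - 1) % len(block)]


def prev_op_checked(block, index):
    """Previous instruction, or None when there is no predecessor."""
    return block[index - 1] if index > 0 else None
```

With block = ["csel", "fadd", "fmul"], prev_op_circular(block, 0) returns "fmul", the last instruction, which is how a bogus write-after-read relationship can appear. prev_op_checked(block, 0) returns None instead.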
Identified due to a new assert fail after the proper temp_count fix.
Affects dEQP-GLES31.functional.separate_shader.random.12.
No shader-db changes.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23769>
If we start with unscheduled IR:
    0 = comparison
    csel 1, 2, 0
the old code will schedule this as
    r31.w = comparison
    csel 1, 2, 0
leaving 0 as a dangling source, which can confuse the rest of the compiler.
Instead rewrite this to
    r31.w = comparison
    csel 1, 2, r31.w
Note the swizzle is already taken care of (i.e. turned to .x for scalar
conditions) by the time we get to scheduling so we can force to .w.
This keeps register allocation from doing stupid things.
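The rewrite can be sketched as follows. Instructions are modelled as plain dicts and schedule_condition is a hypothetical name; the real scheduler operates on the compiler's own IR.

```python
R31_W = "r31.w"


def schedule_condition(comparison, csel):
    """Retarget the comparison to r31.w and fix up the csel's condition.

    Returns new instruction dicts; the inputs are left untouched.
    """
    old_dest = comparison["dest"]
    new_comparison = dict(comparison, dest=R31_W)
    # Without this step the csel keeps reading old_dest, which nothing
    # writes any more (the dangling source described above).
    new_cond = R31_W if csel["cond"] == old_dest else csel["cond"]
    return new_comparison, dict(csel, cond=new_cond)
```

Applied to the example above, the comparison's destination becomes r31.w and the csel's condition source follows it, so no instruction refers to index 0 any more.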
total instructions in shared programs: 1518138 -> 1518017 (<.01%)
instructions in affected programs: 37714 -> 37593 (-0.32%)
helped: 48
HURT: 42
Instructions are helped.
total bundles in shared programs: 646877 -> 646756 (-0.02%)
bundles in affected programs: 17024 -> 16903 (-0.71%)
helped: 48
HURT: 42
Bundles are helped.
total registers in shared programs: 90624 -> 90596 (-0.03%)
registers in affected programs: 361 -> 333 (-7.76%)
helped: 31
HURT: 5
Registers are helped.
total threads in shared programs: 55561 -> 55566 (<.01%)
threads in affected programs: 5 -> 10 (100.00%)
helped: 4
HURT: 0
Threads are helped.
total spills in shared programs: 1386 -> 1383 (-0.22%)
spills in affected programs: 19 -> 16 (-15.79%)
helped: 3
HURT: 0
total fills in shared programs: 5159 -> 5077 (-1.59%)
fills in affected programs: 1305 -> 1223 (-6.28%)
helped: 20
HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23769>