Decompressing HTILE should also reset the HTILE metadata to initial
state which means that re-initializing it for non-compressed to
compressed transitions is redundant.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30122>
This fixes the compute blitter with compression in the general case, and then
flips the switch since the compute blitter is faster / less buggy than the
traditional path.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
This is like shared_size but for spatial data instead, for compute shaders that
use the tilebuffer.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
I don't know what Apple calls these, so we're using the name "explicit
coordinates".
AGX has instructions for loading/stores register <---> tilebuffer ---> storage
images. Usually these are used in the fragment shader and end-of-tile shader to
implement colour attachments, with implicitly specified coordinates based on the
shader stage. However they can also be used in compute shaders with explicitly
specified coordinates ("imageblocks" in Apple parlance). Model this in NIR.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
for preambles and for peephole selection.
total instructions in shared programs: 2159359 -> 2114124 (-2.09%)
instructions in affected programs: 359763 -> 314528 (-12.57%)
helped: 814
HURT: 6
Instructions are helped.
total alu in shared programs: 1685059 -> 1670200 (-0.88%)
alu in affected programs: 217210 -> 202351 (-6.84%)
helped: 589
HURT: 45
Alu are helped.
total fscib in shared programs: 1681202 -> 1666324 (-0.88%)
fscib in affected programs: 217477 -> 202599 (-6.84%)
helped: 590
HURT: 45
Fscib are helped.
total ic in shared programs: 460856 -> 455502 (-1.16%)
ic in affected programs: 41350 -> 35996 (-12.95%)
helped: 174
HURT: 8
Ic are helped.
total bytes in shared programs: 14302484 -> 14053982 (-1.74%)
bytes in affected programs: 2380614 -> 2132112 (-10.44%)
helped: 814
HURT: 7
Bytes are helped.
total regs in shared programs: 662302 -> 656517 (-0.87%)
regs in affected programs: 26979 -> 21194 (-21.44%)
helped: 432
HURT: 9
Regs are helped.
total uniforms in shared programs: 1651909 -> 1687077 (2.13%)
uniforms in affected programs: 95383 -> 130551 (36.87%)
helped: 17
HURT: 783
Uniforms are HURT.
total threads in shared programs: 20324608 -> 20326592 (<.01%)
threads in affected programs: 16192 -> 18176 (12.25%)
helped: 17
HURT: 3
Threads are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
../src/gallium/drivers/asahi/agx_uniforms.c:60:10: warning: taking address of packed member of ‘struct agx_draw_uniforms’ may result in an unaligned pointer value [-Waddress-of-packed-member]
60 | &batch->uniforms.attrib_base[i]);
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
We previously introduced cross-context sync points to make ordering work
with multiple queues. Unfortunately, that also adds a CPU round trip in
the kernel when a single context flushes and then keeps submitting,
since it introduces a sync against itself. That's pointless.
To fix this without introducing races, on flush, we check the previous sync
point. If it's foreign, we record it, and we also keep track of our last
local sync point. Then, when waiting, if we're about to wait on our last
flush sync point from our own queue, we instead wait for the foreign
one. A foreign sync after that will cause the equality check to fail and
future submits from this queue to sync against the most up to date
point, and the next flush will then record it as the last known foreign
sync point for this queue (and continue flushing against it until
another foreign queue flushes again).
Fixes glmark2 perf regression (particularly with `build` and similar
high-FPS tests).
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
Incompatible changes:
- Make VM layout more flexible to allow for SVM with rusticl
(eventually, hopefully)
Compatible changes:
- Expose soft fault state to userspace as a flag
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
Currently, vulkan doesn't use gbm. So, don't run vulkan
related jobs for gbm changes.
Signed-off-by: Sai Teja <saiteja13427@gmail.com>
Suggested-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30590>
the CL CTS added a new test being printf("\n", "foo"), but we ended up
printing the new line twice. If we can't find a specifier anymore, ignore
the argument as after the loop processing all arguments we'll print the
remaining format string anyway.
Cc: mesa-stable
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30574>