Commit graph

349 commits

Author SHA1 Message Date
Olivia Lee
275ebde06d pan/va: fix bi_is_imm_desc_handle early return
In the bi_emit_load_attr call site, we can use the imm_index value even
if the function returns false. The bifrost path handles this correctly.

Fixes: 652e1c2e13 ("pan/bi: Rework indices for attributes on Valhall")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37464>
2025-09-18 16:38:39 +00:00
Christoph Pillmayer
cc5c1c65ef pan/va: Remove redundant MOVs from va_lower_split_64bit
To lower 64bit sources we emit a COLLECT -> SPLIT pair to force
allocation into consecutive registers. When the sources for COLLECT
are outputs of the same SPLIT already, we can omit the COLLECT + SPLIT
pair.

Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Aksel Hjerpbakk <aksel.hjerpbakk@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37398>
2025-09-17 10:44:14 +00:00
Christoph Pillmayer
a8ff0176de pan/bi: Normalize with pan_model.rates
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37231>
2025-09-16 15:54:48 +00:00
Christoph Pillmayer
f091bdf392 pan: Lift pan_get_model into its own lib
The following commit needs to use it from panfrost/compiler, but the
compiler itself depends on panfrost/lib.

Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37231>
2025-09-16 15:54:48 +00:00
Mary Guillemard
67a662ed05 pan/bi: Propagate MKVEC.v2i8 and V2X8_TO_V2X16 for replicate swizzle
On Valhall, we can end up with a lot of conversions for 8-bit and 16-bit
values.

However, since Valhall, we have access to a lot more swizzles on widen
sources.

The idea of this pass is to propagate replicate swizzle usages to
simplify things.

We do not attempt to propagate MKVEC.v2i16 as it is already handled by
bi_lower_swizzle.

This changes the following:
   9 = V2S8_TO_V2S16 !7.b0
   11 = IADD.v2s16 !9.h00, u4
   88 = MKVEC.v2i8 11.b0, u256.b0, u256
   13 = IMUL.v4i8 !88.b0, 8.b0
   14 = V2S8_TO_V2S16 !13.b0
   15 = IADD.v2s16 14.h00, !11.h00
   89 = MKVEC.v2i8 !15.b0, u256.b0, u256
   17 = IMUL.v4i8 !89.b0, !8.b0

Into this:
   11 = IADD.v2s16 !7.b0, u4
   13 = IMUL.v4i8 11.b0, 8.b0
   15 = IADD.v2s16 13.b0, !11.h00
   17 = IMUL.v4i8 !15.b0, !8.b0

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37167>
2025-09-08 14:25:22 +00:00
Mary Guillemard
59e0a15c47 pan/bi: Make va_optimize_forward run until there is no progress
We are going to do more things here that will likely benefit from
running multiple times.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37167>
2025-09-08 14:25:22 +00:00
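The run-until-no-progress pattern described in 59e0a15c47 can be sketched as follows. This is a minimal illustration and not the Mesa code: `toy_pass` and `run_to_fixed_point` are hypothetical names, and the toy pass just removes one adjacent duplicate per call so that repeated runs have something to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy pass: removes the first adjacent duplicate found in
 * vals[0..*n) and reports whether it changed anything. */
static bool
toy_pass(int *vals, size_t *n)
{
   for (size_t i = 0; i + 1 < *n; i++) {
      if (vals[i] == vals[i + 1]) {
         /* Shift the tail left by one to drop vals[i + 1]. */
         for (size_t j = i + 1; j + 1 < *n; j++)
            vals[j] = vals[j + 1];
         (*n)--;
         return true;
      }
   }
   return false;
}

/* Re-run the pass until it reports no progress. Returns the number of
 * runs that made progress. */
static unsigned
run_to_fixed_point(int *vals, size_t *n)
{
   assert(vals != NULL && n != NULL);

   unsigned rounds = 0;
   while (toy_pass(vals, n))
      rounds++;
   return rounds;
}
```

Each run may expose new opportunities for the next one, which is why a single invocation is not always enough.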
Eric R. Smith
b03cd7bdce panfrost: align spills to reduce TLS memory usage
When spilling registers on Valhall we are careful to leave the TLS
pointer aligned on 16 byte boundaries (so as to avoid accesses
crossing those boundaries). However, within the spill code we don't
need to have 16 byte alignment for spills of 32 or 64 bit values.
In the common case where most spills are 32 bits, we can save nearly
75% of the memory used by just aligning to 32 bit boundaries.

Reviewed-by: Aksel Hjerpbakk <aksel.hjerpbakk@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36676>
2025-09-03 23:54:32 +00:00
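The "nearly 75%" figure in b03cd7bdce follows from simple arithmetic: a 4-byte spill padded to a 16-byte slot wastes 12 of every 16 bytes. A minimal sketch of that calculation is below. The helper names are hypothetical, and the final round-up to 16 bytes is an assumption based on the commit's remark that the TLS pointer stays 16-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round offset up to the next multiple of align (a power of two). */
static uint32_t
align_up(uint32_t offset, uint32_t align)
{
   assert(align != 0 && (align & (align - 1)) == 0);
   return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical helper: total TLS bytes needed when every spill slot is
 * placed at a multiple of slot_align. */
static uint32_t
tls_size(const uint32_t *spill_bytes, size_t count, uint32_t slot_align)
{
   uint32_t offset = 0;
   for (size_t i = 0; i < count; i++)
      offset = align_up(offset, slot_align) + spill_bytes[i];

   /* Assumption: the overall TLS size stays 16-byte aligned. */
   return align_up(offset, 16);
}
```

For sixteen 32-bit spills this gives 256 bytes with 16-byte slots and 64 bytes with 4-byte slots, a 75% reduction.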
Eric R. Smith
e3552c427e panfrost: fix debug print of spilled registers
We were testing some conditions in the wrong order, so spilled
registers were being printed as if they were uniforms. This is
incorrect, but only subtly so, and led to confusion.

Fixes: 6c64ad934f ("panfrost: spill registers in SSA form")
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ashley Smith <ashley.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37092>
2025-09-03 16:19:42 -03:00
Eric R. Smith
d482b6ca68 panfrost: fix typo in register allocation
The intention of the code was to allow PHI values to be propagated
if they were in registers (as opposed to in memory). As written though
values were never propagated. I think this typo was due to some
debug code that wasn't removed properly.

Fixes: 6c64ad934f ("panfrost: spill registers in SSA form")
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ashley Smith <ashley.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37092>
2025-09-03 16:19:29 -03:00
Mary Guillemard
fac8c9def0 pan/bi: Reintroduce bi_fuse_small_int_to_f32 on v11+
On v11+, all small-integer instruction variants are gone; however, we
can now use widen on src0 just fine.

That means we can get rid of the intermediate conversion by relying on
swizzles instead, while respecting the signedness of the inner instruction.

This helps a little bit on clpeak with panvk+clvk, shader-db is also
happy:

Totals:
Instrs: 109541 -> 109354 (-0.17%)
CodeSize: 1110528 -> 1108864 (-0.15%)
Estimated normalized CVT cycles: 667.609375 -> 664.5625 (-0.46%)

Totals from 17 (2.12% of 803) affected shaders:
Instrs: 13637 -> 13450 (-1.37%)
CodeSize: 112256 -> 110592 (-1.48%)
Estimated normalized CVT cycles: 100.203125 -> 97.15625 (-3.04%)

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37125>
2025-09-03 17:32:02 +00:00
Mary Guillemard
e84262b77d pan/bi: Ensure to merge adjacent ifs after bifrost_nir_lower_shader_output
nir_opt_if was unable to optimize some ifs later on, so let's get rid of
them as soon as we generate them, for simplicity.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37030>
2025-09-01 09:42:03 +00:00
Christoph Pillmayer
5acedf5b31 pan/bi: Prioritize consts moved to the FAU
Instead of allocating constants to the FAU entries on a
first-come-first-served basis, it is more efficient to put the most
frequently used constants in the FAU, which saves the largest number of
ADD_IMM instructions otherwise needed to push constants into registers.

This commit does so using a simple pass before the main constant lowering
pass.

Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36872>
2025-08-27 10:48:21 +00:00
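The prioritization described in 5acedf5b31 amounts to ranking constants by use count and keeping the top entries. The sketch below shows that idea only; `struct const_use` and `pick_fau_constants` are hypothetical names, and the real pass may weigh candidates differently.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record: a constant and how often the shader uses it. */
struct const_use {
   uint32_t value;
   unsigned uses;
};

/* Order by descending use count; break ties on value so the result is
 * deterministic. */
static int
cmp_uses_desc(const void *a, const void *b)
{
   const struct const_use *ca = a;
   const struct const_use *cb = b;

   if (ca->uses != cb->uses)
      return ca->uses > cb->uses ? -1 : 1;
   return (ca->value > cb->value) - (ca->value < cb->value);
}

/* Sorts consts in place and returns how many were selected. The selected
 * constants are consts[0..ret). */
static size_t
pick_fau_constants(struct const_use *consts, size_t count, size_t slots)
{
   assert(consts != NULL || count == 0);
   if (count == 0)
      return 0;

   qsort(consts, count, sizeof(*consts), cmp_uses_desc);
   return count < slots ? count : slots;
}
```

With a first-come-first-served policy a rarely used constant seen early could occupy a slot that a hot constant seen later would have used better; sorting first avoids that.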
Christoph Pillmayer
e83ca0e954 pan/va: Pull out constant swizzle handling
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36872>
2025-08-27 10:48:21 +00:00
Christoph Pillmayer
aa43ac4e7c pan/bi: Move some constants into FAU entries
Following on the previous commit, this commit adds support for selecting
and reporting constants to pull from the FAU instead of loading them
into registers first with ADD_IMM. This is beneficial because we can then
use them as a source directly and save ourselves one instruction to move
them into the register first.

Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36872>
2025-08-27 10:48:20 +00:00
Erik Faye-Lund
e098bf399a pan/va: check branch_offset for overflow
The branch offset needs to fit in 8 bits, and with the shr(3) modifier,
this means the max legal value is 2040. Let's verify that while packing.

CID: 1503283
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36724>
2025-08-21 08:45:33 +00:00
Erik Faye-Lund
49183bfb79 pan/bi: use os_read_file-helper
We already have a more robust helper for this, so let's use it rather
than open-coding the same.

While we're at it, return early on error for readability here. There's
no need to continue the logic in those cases.

CID: 1444074
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36724>
2025-08-21 08:45:33 +00:00
Erik Faye-Lund
22ebe3e9e8 pan/bi: use ralloc
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36724>
2025-08-21 08:45:32 +00:00
Erik Faye-Lund
4bedd8c35c pan/bi: bail from optimizing on oom
Allocations can fail, and since this is an optimization pass, let's just
skip the pass and let some other code deal with the OOM situation.

Fixes: 800a861431 ("pan/bi: Fuse FCMP/ICMP on Valhall")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36724>
2025-08-21 08:45:32 +00:00
Erik Faye-Lund
a369800822 pan/bi: plug leak
We need to free the LUT here also.

Fixes: 800a861431 ("pan/bi: Fuse FCMP/ICMP on Valhall")
CID: 1659312
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36724>
2025-08-21 08:45:32 +00:00
Mary Guillemard
1d03897564 pan/bi: Run opt_sink and opt_move in preprocess
We can do some movement for UBO and SSBO after they are lowered in
preprocess.

We already do this in postprocess, but this now also catches SSBOs as they
are lowered in postprocess.

Overall, this reduces fills (fewer loads from TLS) in fossils (excluding
parallel-rdp, as it still crashes):

Totals:
Instrs: 115242 -> 115046 (-0.17%); split: -0.20%, +0.03%
CodeSize: 1168896 -> 1164928 (-0.34%); split: -0.35%, +0.01%
Estimated normalized CVT cycles: 762.015625 -> 757.109375 (-0.64%); split: -0.75%, +0.11%
Estimated normalized Load/Store cycles: 12693.0 -> 12680.0 (-0.10%); split: -0.11%, +0.01%
Number of spill instructions: 358 -> 359 (+0.28%)
Number of fill instructions: 1600 -> 1584 (-1.00%)

Totals from 127 (15.82% of 803) affected shaders:
Instrs: 31753 -> 31557 (-0.62%); split: -0.73%, +0.12%
CodeSize: 335104 -> 331136 (-1.18%); split: -1.22%, +0.04%
Estimated normalized CVT cycles: 205.546875 -> 200.640625 (-2.39%); split: -2.78%, +0.40%
Estimated normalized Load/Store cycles: 3935.0 -> 3922.0 (-0.33%); split: -0.36%, +0.03%
Number of spill instructions: 124 -> 125 (+0.81%)
Number of fill instructions: 452 -> 436 (-3.54%)

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
2025-08-20 12:11:43 +00:00
Mary Guillemard
7e86653a6f pan/bi: remove dead variables in preprocess
This should have no effect apart from cleaning up NIR_DEBUG print outputs
a bit.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
2025-08-20 12:11:43 +00:00
Mary Guillemard
bc8a277551 pan/bi: Split bi_optimize_nir and run bi_optimize_loop_nir in preprocess
We now have bi_optimize_loop_nir following optimize_nir from NAK.

Overall, the more we can clean up early the better; this shouldn't cause
many changes.

For fossils/sascha-willems:
Totals:
Instrs: 40884 -> 40879 (-0.01%); split: -0.02%, +0.01%
Estimated normalized FMA cycles: 588.078125 -> 588.015625 (-0.01%)
Estimated normalized CVT cycles: 249.875 -> 249.859375 (-0.01%); split: -0.04%, +0.04%

Totals from 9 (1.44% of 627) affected shaders:
Instrs: 1521 -> 1516 (-0.33%); split: -0.66%, +0.33%
Estimated normalized FMA cycles: 9.1875 -> 9.125 (-0.68%)
Estimated normalized CVT cycles: 11.125 -> 11.109375 (-0.14%); split: -0.98%, +0.84%

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
2025-08-20 12:11:43 +00:00
Mary Guillemard
6ab7a03aef panfrost: Split texture lowering passes
We now have lower_texture_early and lower_texture.

lower_texture_early handles nir_lower_tex and (in the future) could handle
anything backend-specific that needs to happen before nir_lower_io.

lower_texture handles actual lowering of backend specific things that
must happen after nir_lower_tex and nir_lower_io.

This allows us to finally not run nir_lower_tex two times in panvk.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
2025-08-20 12:11:43 +00:00
Mary Guillemard
310eabacc0 panfrost: Move nir_lower_io outside of postprocess
Moving it out of there will allow us to shuffle and move API specific parts
out of there.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
2025-08-20 12:11:43 +00:00
Mary Guillemard
a3f935c850 panfrost: Split compilers preprocess_nir
As we are going to move texture and IO lowering, this splits the preprocess
function in two: one handling preprocess, the other postprocess.

The split is done right before lower_io and has no functional change for
now.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
2025-08-20 12:11:43 +00:00
Mary Guillemard
5aba96d4ac pan/bi: Stop exposing bifrost_nir_lower_load_output
Unused outside of pan/bi and also remove orphan bifrost_nir_lower_xfb
declaration.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
2025-08-20 12:11:43 +00:00
Mary Guillemard
7ba81b5f95 pan/bi: Move pan_lower_sample_pos to next block
This should only run on fragment shaders, so let's group it the same way
we have it in the Midgard compiler.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36776>
2025-08-20 12:11:43 +00:00
Mary Guillemard
c5c196e6d5 pan/bi: Revamp bi_optimize_nir
This reorders things a bit, ensures we attempt more aggressive
vectorisation, attempts to optimize control flow, and more.

Inspiration from NAK's optimize_nir function.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36629>
2025-08-11 11:38:57 +00:00
Mary Guillemard
ef7095c85b pan/bi: Handle needless conversions in nir_lower_bool_to_bitsize
We can end up with conversion instructions to the same type of integer
with nir_lower_bool_to_bitsize so let's make
bifrost_nir_lower_algebraic_late handle those cases.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36629>
2025-08-11 11:38:57 +00:00
Mary Guillemard
9ec1bb0111 pan/bi: Vectorize UBOs load/store
We can benefit from it, so let's allow it when we aren't forced to follow
robustness2.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36629>
2025-08-11 11:38:57 +00:00
Mary Guillemard
b4ce8998d7 pan/bi: Switch to nir_lower_alu_width
Embrace modernity and consistency between alu and vectorization passes.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36629>
2025-08-11 11:38:57 +00:00
Romaric Jodin
7dc4f28507 pan/bi: schedule simple iterators to avoid extra move
Try to move iterator as close to the end of the block as possible. The
goal is to avoid the iterator being used after being updated, to
prevent the need for an extra move instruction.

shader-db report on `Mali-G725`:
```
total instrs in shared programs: 720530 -> 716482 (-0.56%)
instrs in affected programs: 231842 -> 227794 (-1.75%)
helped: 3804
HURT: 1
helped stats (abs) min: 1.0 max: 8.0 x̄: 1.06 x̃: 1
helped stats (rel) min: 0.14% max: 6.25% x̄: 2.75% x̃: 2.86%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05%
95% mean confidence interval for instrs value: -1.08 -1.04
95% mean confidence interval for instrs %-change: -2.79% -2.71%
Instrs are helped.

total cycles in shared programs: 35295.80 -> 35295.55 (<.01%)
cycles in affected programs: 3.50 -> 3.25 (-7.14%)
helped: 8
HURT: 0
helped stats (abs) min: 0.03125 max: 0.03125 x̄: 0.03 x̃: 0
helped stats (rel) min: 6.90% max: 7.41% x̄: 7.15% x̃: 7.15%
95% mean confidence interval for cycles value: -0.03 -0.03
95% mean confidence interval for cycles %-change: -7.38% -6.92%
Cycles are helped.

total fma in shared programs: 5054.34 -> 5054.34 (0.00%)
fma in affected programs: 0 -> 0
helped: 0
HURT: 0

total cvt in shared programs: 4707.69 -> 4644.44 (-1.34%)
cvt in affected programs: 1471.28 -> 1408.03 (-4.30%)
helped: 3804
HURT: 1
helped stats (abs) min: 0.015625 max: 0.125 x̄: 0.02 x̃: 0
helped stats (rel) min: 0.37% max: 12.50% x̄: 6.01% x̃: 6.67%
HURT stats (abs)   min: 0.015625 max: 0.015625 x̄: 0.02 x̃: 0
HURT stats (rel)   min: 0.13% max: 0.13% x̄: 0.13% x̃: 0.13%
95% mean confidence interval for cvt value: -0.02 -0.02
95% mean confidence interval for cvt %-change: -6.07% -5.94%
Cvt are helped.

total sfu in shared programs: 1878.25 -> 1878.25 (0.00%)
sfu in affected programs: 0 -> 0
helped: 0
HURT: 0

total v in shared programs: 2353 -> 2353 (0.00%)
v in affected programs: 0 -> 0
helped: 0
HURT: 0

total t in shared programs: 5530 -> 5530 (0.00%)
t in affected programs: 0 -> 0
helped: 0
HURT: 0

total ls in shared programs: 27975 -> 27975 (0.00%)
ls in affected programs: 0 -> 0
helped: 0
HURT: 0

total code size in shared programs: 6386560 -> 6289664 (-1.52%)
code size in affected programs: 508544 -> 411648 (-19.05%)
helped: 757
HURT: 0
helped stats (abs) min: 128.0 max: 128.0 x̄: 128.00 x̃: 128
helped stats (rel) min: 0.83% max: 33.33% x̄: 31.09% x̃: 33.33%
95% mean confidence interval for code size value: -128.00 -128.00
95% mean confidence interval for code size %-change: -31.57% -30.60%
Code size are helped.

total threads in shared programs: 14698 -> 14698 (0.00%)
threads in affected programs: 0 -> 0
helped: 0
HURT: 0

total loops in shared programs: 166 -> 166 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 37 -> 37 (0.00%)
spills in affected programs: 0 -> 0
helped: 0
HURT: 0

total fills in shared programs: 111 -> 111 (0.00%)
fills in affected programs: 0 -> 0
helped: 0
HURT: 0
```

Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36021>
2025-08-07 15:09:56 +00:00
Romaric Jodin
4a53eca97d pan/bi: add pass to simplify control flow
That pass tries to remove blocks that are not needed:
blocks with only 1 predecessor and 1 successor containing no
instruction or only 1 branch instruction.

Also remove an unnecessary branch at the end of a block whose only
successor is the next block.

Run this pass at the end of the compiler flow once all optimisations
have been applied.

Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36021>
2025-08-07 15:09:56 +00:00
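The two conditions listed in 4a53eca97d can be written as small predicates. The sketch below models blocks with a hypothetical `struct toy_block` instead of the compiler's real block type, and it assumes blocks are laid out in index order so that "the next block" is `idx + 1`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a basic block. */
struct toy_block {
   unsigned num_preds;   /* number of predecessors */
   unsigned num_succs;   /* number of successors */
   unsigned num_instrs;  /* instructions in the block */
   bool ends_in_branch;  /* last instruction is a branch */
   int branch_target;    /* index of the branch target, or -1 */
};

/* A block is not needed if it has exactly one predecessor and one
 * successor and is empty or holds only a single branch. */
static bool
block_is_removable(const struct toy_block *b)
{
   assert(b != NULL);

   if (b->num_preds != 1 || b->num_succs != 1)
      return false;
   return b->num_instrs == 0 || (b->num_instrs == 1 && b->ends_in_branch);
}

/* A trailing branch is unnecessary if the block has one successor and
 * that successor is the block that follows it anyway. */
static bool
branch_is_redundant(const struct toy_block *blocks, size_t count, size_t idx)
{
   assert(blocks != NULL && idx < count);

   const struct toy_block *b = &blocks[idx];
   return b->ends_in_branch && b->num_succs == 1 && idx + 1 < count &&
          b->branch_target == (int)(idx + 1);
}
```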
Romaric Jodin
6b693e281a pan/va: improve lowering of SWZ_V4I8
Use bi_make_vec_to so that only 1 MKVEC.v2i8 is used when possible.

Also add support for all swizzles instead of only mono-byte ones,
using bi_swizzle_to_byte_channels.

Update assert in bi_byte.

Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35643>
2025-08-07 13:48:33 +00:00
Romaric Jodin
857f29d67b pan/bi: use only 1 MKVEC.v2i8 to generate v4i8 when possible
When making a vector of 4 elements, try to build it with only 1
instruction instead of the current 2, if the last 2 elements match
the pattern supported by MKVEC.v2i8.

Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35643>
2025-08-07 13:48:32 +00:00
John Anthony
c46407de88 pan/va: Add support for SPV_ARM_core_builtins
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36019>
2025-08-07 11:46:33 +02:00
Qiang Yu
c135ed1eb9 all: rename gl_shader_stage_name to mesa_shader_stage_name
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
2025-08-06 10:28:41 +08:00
Qiang Yu
7a91473192 all: rename gl_shader_stage_is_compute to mesa_shader_stage_is_compute
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
2025-08-06 10:28:41 +08:00
Qiang Yu
196569b1a4 all: rename gl_shader_stage to mesa_shader_stage
It's not only for GL, change to a generic name.

Use command:
  find . -type f -not -path '*/.git/*' -exec sed -i 's/\bgl_shader_stage\b/mesa_shader_stage/g' {} +

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
2025-08-06 10:28:40 +08:00
Alyssa Rosenzweig
a52cdc08b7 pan/bi: replace specialize_idvs with nir_inline_sysval
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: John Anthony <john.anthony@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516>
2025-08-03 21:27:48 +00:00
Antonio Ospite
ddf2aa3a4d build: avoid redefining unreachable() which is standard in C23
In the C23 standard unreachable() is now a predefined function-like
macro in <stddef.h>

See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in

And this causes build errors when building for C23:

-----------------------------------------------------------------------
In file included from ../src/util/log.h:30,
                 from ../src/util/log.c:30:
../src/util/macros.h:123:9: warning: "unreachable" redefined
  123 | #define unreachable(str)    \
      |         ^~~~~~~~~~~
In file included from ../src/util/macros.h:31:
/usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition
  456 | #define unreachable() (__builtin_unreachable ())
      |         ^~~~~~~~~~~
-----------------------------------------------------------------------

So don't redefine it with the same name, but use the name UNREACHABLE()
to also signify it's a macro.

Using a different name also makes sense because the behavior of the
macro was extending the one of __builtin_unreachable() anyway, and it
also had a different signature, accepting one argument, compared to the
standard unreachable() with no arguments.

This change improves the chances of building mesa with the C23 standard,
which for instance is the default in recent AOSP versions.

All the instances of the macro, including the definition, were updated
with the following command line:

  git grep -l '[^_]unreachable(' -- "src/**" | sort | uniq | \
  while read file; \
  do \
    sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \
  done && \
  sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>
2025-07-31 17:49:42 +00:00
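The shape of the renamed macro in ddf2aa3a4d can be illustrated as below. This is a sketch under assumptions, not a copy of `src/util/macros.h`: it relies on the GCC/Clang builtin `__builtin_unreachable()`, and `toy_stage` and `toy_stage_name` are hypothetical names used only for the example.

```c
#include <assert.h>

/* Upper-case macro taking a message, unlike C23's zero-argument
 * unreachable(). In debug builds the assert fires with the message;
 * in release builds only the compiler hint remains. */
#define UNREACHABLE(str)       \
   do {                        \
      assert(!"" str);         \
      __builtin_unreachable(); \
   } while (0)

/* Hypothetical enum, for illustration only. */
enum toy_stage { TOY_VERTEX, TOY_FRAGMENT, TOY_COMPUTE };

static const char *
toy_stage_name(enum toy_stage stage)
{
   switch (stage) {
   case TOY_VERTEX:
      return "vertex";
   case TOY_FRAGMENT:
      return "fragment";
   case TOY_COMPUTE:
      return "compute";
   }

   /* Only reached if stage holds a value outside the enum. */
   UNREACHABLE("invalid toy_stage");
}
```

Because the name no longer collides with the standard macro, the same source builds whether or not `<stddef.h>` provides `unreachable()`.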
Marek Olšák
8d3e76c250 nir: split nir_move_load_frag_coord from nir_move_load_input
It's a pure system value on AMD, not an input.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36357>
2025-07-29 16:20:48 -04:00
Mary Guillemard
525f2972a6 pan/bi: Properly handle SWZ.v4i8 lowering on v11+
We were not supporting non-replicate swizzles, and this triggered an
assertion on fossils/parallel-rdp/small_subgroup.foz.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 1481b14fcb ("pan/bi: Lower SWZ.v4i8 to multiple MKVEC.v2i8 on v11+")
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36349>
2025-07-28 08:00:07 +00:00
Mary Guillemard
800a861431 pan/bi: Fuse FCMP/ICMP on Valhall
We have a lot of patterns like this on Valhall:

FCMP_OR.f32.lt.m1 r28, ^r28, r27, 0x0
FCMP_OR.f32.lt.m1 r29, r27, r25, 0x0
LSHIFT_AND.i32 r28, ^r28, 0x0.b00, ^r29

That can be simplified into:

FCMP_OR.f32.lt.m1 r29, r27, r25, 0x0
FCMP_AND.f32.lt.m1 r28, ^r28, r27, ^r29

This pass merges those specific cases while setting the appropriate
logical variant of the CMP instruction.

We do not try to merge the srcs that do not originate from a matching CMP
instruction with matching result type as the logical operation is
performed before the result type transformation.

Now this is enough to optimise a lot of common cases anyway so it is
still a win.

Results on fossils/sascha-willems:

Totals:
Instrs: 42157 -> 42059 (-0.23%)
CodeSize: 582784 -> 581760 (-0.18%)
Estimated normalized SFU cycles: 159.9375 -> 153.75 (-3.87%)

Totals from 13 (2.07% of 627) affected shaders:
Instrs: 3490 -> 3392 (-2.81%)
CodeSize: 29696 -> 28672 (-3.45%)
Estimated normalized SFU cycles: 15.8125 -> 9.625 (-39.13%)

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36327>
2025-07-24 18:23:07 +00:00
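The rewrite shown in 800a861431 can be checked with a small scalar model. The instruction semantics below are assumptions made for this sketch: the `.m1` result is taken to be all ones when the comparison holds, `FCMP_OR`/`FCMP_AND` are modelled as combining that result with the last source by OR/AND, and `LSHIFT_AND` as `(a << shift) & b`. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* "m1" comparison result: all ones when a < b, zero otherwise. */
static uint32_t
lt_m1(float a, float b)
{
   return a < b ? UINT32_MAX : 0u;
}

/* Assumed model of FCMP_OR.f32.lt.m1 dst, a, b, c */
static uint32_t
fcmp_or_lt(float a, float b, uint32_t c)
{
   return lt_m1(a, b) | c;
}

/* Assumed model of FCMP_AND.f32.lt.m1 dst, a, b, c */
static uint32_t
fcmp_and_lt(float a, float b, uint32_t c)
{
   return lt_m1(a, b) & c;
}

/* Assumed model of LSHIFT_AND.i32 dst, a, shift, b */
static uint32_t
lshift_and(uint32_t a, uint32_t shift, uint32_t b)
{
   assert(shift < 32);
   return (a << shift) & b;
}

/* Three-instruction sequence before the pass. */
static uint32_t
unfused(float x, float y, float z)
{
   uint32_t r28 = fcmp_or_lt(x, y, 0);
   uint32_t r29 = fcmp_or_lt(y, z, 0);
   return lshift_and(r28, 0, r29);
}

/* Two-instruction sequence after the pass. */
static uint32_t
fused(float x, float y, float z)
{
   uint32_t r29 = fcmp_or_lt(y, z, 0);
   return fcmp_and_lt(x, y, r29);
}
```

Under these assumptions both sequences compute `(x < y) && (y < z)` as a mask, so the separate `LSHIFT_AND` is redundant.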
Mary Guillemard
621f334a4c panvk: Wire robustness2 buffer info down to pan/bi
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36198>
2025-07-21 10:36:16 +00:00
Eric R. Smith
65bcae096a panfrost: fix SSA register allocation
We were allocating a fixed number of temporary registers; this isn't
always enough, and in fact we should have calculated the number of
temporaries required.

Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 6c64ad934f ("panfrost: spill registers in SSA form")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36135>
2025-07-16 12:16:50 +00:00
Ashley Smith
c88c66754c pan/va: Add support for 64-bit atomic operations
Adds support for 64-bit atomic operations for KHR_shader_atomic_int64
using 64-bit atomic instructions. Valhall is working but Bifrost will
require some more work to implement as it requires two instructions to
execute a 64-bit atomic.

Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Signed-off-by: Ashley Smith <ashley.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35789>
2025-07-11 12:42:30 +00:00
Ashley Smith
c3a21fb0af bi/va: Add instructions required for KHR_shader_atomic_int64
Add 64-bit atomic instructions for bifrost/valhall

Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Signed-off-by: Ashley Smith <ashley.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35789>
2025-07-11 12:42:30 +00:00
Mary Guillemard
db5ad8e3d2 pan/bi: Disallow FAU for CLPER in bi_check_fau_src
Previously this was allowing invalid forms like
"CLPER.i32.subgroup8.zero lane-id, src1" to reach bi_pack.

This fixes the assert that can be seen with
"dEQP-VK.glsl.derivate.dfdxsubgroup.*" but doesn't fix failures.

Fixes: 0acc6b564e ("pan/bi: Rework FAU lowering")
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36006>
2025-07-10 08:01:25 +00:00
Mary Guillemard
48d716a05f pan/bi: Do not allow passthrough for instructions disallowing temps
Previously we were allowing passthrough to temps without using
bi_reads_temps.

This was causing instructions like CLPER to create undefined encodings.

We now check if the instruction supports temps.

Fixes: 4252fb84f4 ("pan/bi: Add passthrough register rewriting helper")
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36006>
2025-07-10 08:01:25 +00:00