Commit graph

334 commits

Author SHA1 Message Date
Radu Costas
721c1b8f65 pco: Amend errant nir_move_option
Move options were bit or-ing from the wrong enum, causing undefined
behaviour when the number of intrinsics changed.
Replaced it with the values from the right nir_move_options enum that
were previously working. (Further refinement needed on these after
extensive testing.)

Fixes: f1b24267d2 ("pco: rework nir processing and passes")
Signed-off-by: Radu Costas <radu.costas@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40568>
2026-03-23 22:27:02 +00:00
Connor Abbott
22a061fb91 nir: Use better calculation for alpha-to-coverage mask
The old calculation depended on the sample count, and gave subpar
results for 8x MSAA with standard sample locations. The new calculation
is based on the Intel pass, with some changing of the constants so that
the sample count is always proportional to alpha for 2xMSAA and 4xMSAA
and the addition of rotating the sample mask based on the pixel.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39335>
2026-03-20 18:09:48 +00:00
Simon Perretta
8eee60fa78 pco: use vm/icm for tile buffer store coverage mask
Use the valid/input coverage masks for tile buffer store coverage masks
when running single/multi-sampled fragment shaders respectively.

Fixes: 297a0c269a ("pvr, pco: tile buffer support")

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reported-by: Nick Hamilton <nick.hamilton@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40456>
2026-03-20 14:57:21 +00:00
Georg Lehmann
ec331cc48a nir: replace lower_ldexp with has_ldexp
I can be bothered to fix all the backends that don't set lower_ldexp,
and only two backends have ldexp anyway.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33900>
2026-03-20 08:15:08 +00:00
Radu Costas
d089947266 pco: Commonize atomic sync operations
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Replace loop with macros
Rewrite channel op to multi channel select to avoid extra swizzle

Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Signed-off-by: Radu Costas <radu.costas@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40320>
2026-03-13 13:06:11 +02:00
Faith Ekstrand
f2f792996d Revert "nir: Add a type parameter to nir_lower_point_size()"
This reverts commit 6ee4ea5ea3.

Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38681>
2026-03-12 22:59:13 +00:00
Georg Lehmann
a25f00eaed nir: merge xfb and xfb2 into one 64bit intrinsic index
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40299>
2026-03-10 07:46:22 +00:00
Eric R. Smith
8521051cfa pco: fix a typo in the check for optimization looping
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The count isn't incremented anywhere else.

Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Fixes: f1b24267d2 ("pco: rework nir processing and passes")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40240>
2026-03-09 11:27:27 +00:00
Radu Costas
9e8faa6eea pco: Add hwinfo check for features in sampler code
Add checks for integer coordinates and array indexing HW features
Features require HW support and the PCO_DEBUG env var to contain the
adv_smp entry
Integer coordinates are supported for images and textures without an LOD
setting
Array indexing is not supported and will trigger an abort

Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Signed-off-by: Radu Costas <radu.costas@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40139>
2026-03-05 15:33:10 +00:00
Simon Perretta
99c8d88be1 pco: add encodings and mappings for smp integer and array flags
Acked-by: Frank Binns <frank.binns@imgtec.com>
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40139>
2026-03-05 15:33:10 +00:00
Luigi Santivetti
df6d398d45 pco: fix Mesa-CI regression in pco texture packed formats
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Fixes: 9caa563bc9 ("pvr, pco: Commonize texture packing code")
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40156>
2026-02-28 20:25:53 +00:00
Radu Costas
9caa563bc9 pvr, pco: Commonize texture packing code
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Format extraction moved to separate function
Removed some magic numbers

Signed-off-by: Radu Costas <radu.costas@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40067>
2026-02-27 09:08:29 +00:00
Duncan Brawley
0ea39c6305 pco: Use vertex input registers in register allocation
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Add support for the use of vertex input registers as additional general
purpose registers which previously was restricted to temporary
registers. Use of vertex input registers as additional general purpose
registers is not available for fragment shaders.

Vertex input registers are similar to temporary registers. The only
difference is that vertex input registers can contain pre-initialised
data when the shader starts.

By default, the number of vertex input registers used for register
allocation is the number of vertex input registers used for their
pre-initialised data rounded up to the nearest multiple of 4, as vertex
input registers are allocated in blocks of 4.

If PCO_DEBUG=alloc_extra_vtxins is used, a mimimum of 12 vertex input
registers are available for register allocation.

Signed-off-by: Duncan Brawley <duncan.brawley@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39886>
2026-02-24 16:27:45 +00:00
Nick Hamilton
cf0786c766 pco: Fix multiview sampling of subpass input attachments
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
When sampling a subpass input attachment when multiview is been used
the the view index is required.

This becomes an extra output from the vertex shader which is then
iterated into the fragment shader input where it is then used as the
array layer index when sampling the subpass input attachment.

However this extra output was not having its interpolation mode
configured correctly leading to incorrect instructions being added
causing the view index to always be zero and thus sampling the
subpass input attachment incorrectly.

Fix is to make sure the view index interpolation mode is set to flat.

Fix:
dEQP-VK.multiview.input_attachments.no_queries.1_2_4_8_16_32
dEQP-VK.multiview.input_attachments.no_queries.1_2_4_8
dEQP-VK.multiview.input_attachments.no_queries.15_15_15_15
dEQP-VK.multiview.input_attachments.no_queries.15
dEQP-VK.multiview.input_attachments.no_queries.5_10_5_10
dEQP-VK.multiview.input_attachments.no_queries.8_1_1_8
dEQP-VK.multiview.input_attachments.no_queries.8
dEQP-VK.multiview.input_attachments.no_queries.max_multi_view_view_count
dEQP-VK.multiview.renderpass2.input_attachments.no_queries.1_2_4_8_16_32
dEQP-VK.multiview.renderpass2.input_attachments.no_queries.1_2_4_8
dEQP-VK.multiview.renderpass2.input_attachments.no_queries.15_15_15_15
dEQP-VK.multiview.renderpass2.input_attachments.no_queries.15
dEQP-VK.multiview.renderpass2.input_attachments.no_queries.5_10_5_10
dEQP-VK.multiview.renderpass2.input_attachments.no_queries.8_1_1_8
dEQP-VK.multiview.renderpass2.input_attachments.no_queries.8
dEQP-VK.multiview.renderpass2.input_attachments.no_queries.max_multi_view_view_count

Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39715>
2026-02-09 14:14:55 +00:00
Alyssa Rosenzweig
3b20cfc589 pvr,pan,agx: drop cargo-culted nir_opt_loop calls
The comment claims this was to unroll loops, but nir_opt_loop doesn't do that.
Whatever issue the AGX code was originally working around, it doesn't apply now
(I confirmed we produce similar code with or without the pass). In the meantime,
Panfrost and PowerVR cargo-culted the same broken logic. Drop it all.

Closes: #14732
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39588>
2026-02-02 23:16:22 +00:00
Nick Hamilton
079377c767 pco: Fix for atomic operations on an image buffer
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Within the driver buffers are treated as 2D as sampling them as 1D
will run into HW restrictions on max size.

The compiler does the same however for atomic image ops the address
is manually calculated and doing this via the 2D path leads to
incorrect offsets.

The fix is to treat buffers as 1D for atomic ops which calculates
the correct offsets for the operations.

Fix deqp:
dEQP-VK.image.atomic_operations.add.buffer.*
dEQP-VK.image.atomic_operations.and.buffer.*
dEQP-VK.image.atomic_operations.compare_exchange.buffer.*
dEQP-VK.image.atomic_operations.dec.buffer.*
dEQP-VK.image.atomic_operations.exchange.buffer.*
dEQP-VK.image.atomic_operations.inc.buffer.*
dEQP-VK.image.atomic_operations.max.buffer.*
dEQP-VK.image.atomic_operations.min.buffer.*
dEQP-VK.image.atomic_operations.or.buffer.*
dEQP-VK.image.atomic_operations.sub.buffer.*
dEQP-VK.image.atomic_operations.xor.buffer.*

Fixes: 6dc5e1e109 ("pco: fully support Vulkan 1.2 image atomics")

Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39521>
2026-01-28 08:54:28 +00:00
Duncan Brawley
b8889f5eaa pvr: add basic support for shader statistics framework
Mesa now has a statistics framework. This adds support for emitting
additional statistics about PowerVR shaders for the Rogue architecture.

Add support for emitting the following statistics: Code size, scratch
size, spill count, temp count, loop count, number of inst groups, number
of main inst groups, number of bitwise inst groups and number of control
inst groups.
Add support for new PCO_DEBUG_PRINT option "stats" to emit shader stats.

Signed-off-by: Duncan Brawley <duncan.brawley@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39523>
2026-01-27 16:58:30 +00:00
Simon Perretta
c5b70dcb48 pco: update formatless skip check
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The skip check should only be checking the format rather than the entire
packed word.

Fixes: 52ddc40a75 ("pco: restrict shadow sampler comparator clamping to unorm formats")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39428>
2026-01-22 08:09:36 +00:00
Simon Perretta
52ddc40a75 pco: restrict shadow sampler comparator clamping to unorm formats
Only clamp shadow sampler comparators for "unsigned normalized
fixed-point format[s]" as per the Vulkan spec.

Fixes: 69a56d33de ("pco: Fix for shadow sampler comparison not clamping the compare value")
Reported-by: Georg Lehmann <dadschoorse@gmail.com>
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39389>
2026-01-20 17:51:07 +00:00
Icenowy Zheng
aa6208b6b7 pco: add NIR global_atomic lowering
PCO has global_atomic_{,_swap}pco intrinsics that are different from
generic 2x32 ones, but the 2x32 ones are emitted by
nir_lower_explicit_io.

Add code to lower the generic 2x32 intrinsics to the PCO variant.

Passed Vulkan CTS dEQP-VK.glsl.atomic_operations.* and fixes crash of
dEQP-VK.glsl.atomic_operations.*_reference .

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39054>
2026-01-20 15:37:34 +00:00
Nick Hamilton
69a56d33de pco: Fix for shadow sampler comparison not clamping the compare value
When doing sample comparisons on shadow images the compare value
should be clamped to [0,1]

Fix:
dEQP-VK.glsl.texture_functions.texture.samplercubearrayshadow_fragment

Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39327>
2026-01-15 21:15:48 +00:00
Nick Hamilton
68cb76de5d pco: Fix encoding of branch to an empty block
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
When calculating the relative offset for a branch the pco_first_igrp
function is used to find the first instruction of a block.

However if the block is empty the function does not return NULL as it
description implies but returns a pointer to the list head which is not a
valid node. Using this leads to a garbage relative offset been calculated
which leads to unexpected behaviour.

Fix is to add a check for the list been empty and return NULL (the same
issue also exists in pco_last_igrp). This leads to the calling function,
pco_cf_node_offset, searching for the next none empty block which is the
expected behaviour.

Fix deqp:
dEQP-VK.graphicsfuzz.cov-two-nested-loops-switch-case-matrix-array-increment
dEQP-VK.graphicsfuzz.stable-binarysearch-tree-false-if-discard-loop

Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39287>
2026-01-15 12:09:26 +00:00
Nick Hamilton
eccdfb397e pvr: Fix missing sample mask test instructions
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The sample mask test instructions were only being added if the sample
mask was an input or no other discard test was being done.

In the failing test cases, instructions to handle alpha to one and/or
alpha to coverage were being added and the program marked as having
a discard test. This would lead to the instructions for the sample
mask test to not be added.

The fix is to specifically check for whether the instructions for sample mask
have been added instead of check for any discard test.

Fix deqp:

dEQP-VK.pipeline.monolithic.extended_dynamic_state.after_pipelines.multi_sample_sample_mask_disable
dEQP-VK.pipeline.monolithic.extended_dynamic_state.after_pipelines.single_sample_sample_mask_disable
dEQP-VK.pipeline.monolithic.extended_dynamic_state.before_draw.large_static_rasterization_samples_off
dEQP-VK.pipeline.monolithic.extended_dynamic_state.before_draw.multi_sample_sample_mask_disable
dEQP-VK.pipeline.monolithic.extended_dynamic_state.before_draw.single_sample_sample_mask_disable
dEQP-VK.pipeline.monolithic.extended_dynamic_state.before_good_static.multi_sample_sample_mask_disable
dEQP-VK.pipeline.monolithic.extended_dynamic_state.before_good_static.single_sample_sample_mask_disable
dEQP-VK.pipeline.monolithic.extended_dynamic_state.between_pipelines.multi_sample_sample_mask_disable
dEQP-VK.pipeline.monolithic.extended_dynamic_state.between_pipelines.single_sample_sample_mask_disable
dEQP-VK.pipeline.monolithic.extended_dynamic_state.cmd_buffer_start.large_static_rasterization_samples_off
dEQP-VK.pipeline.monolithic.extended_dynamic_state.cmd_buffer_start.multi_sample_sample_mask_disable
dEQP-VK.pipeline.monolithic.extended_dynamic_state.cmd_buffer_start.single_sample_sample_mask_disable
dEQP-VK.pipeline.monolithic.extended_dynamic_state.three_draws_dynamic.multi_sample_sample_mask_disable
dEQP-VK.pipeline.monolithic.extended_dynamic_state.three_draws_dynamic.single_sample_sample_mask_disable
dEQP-VK.pipeline.monolithic.extended_dynamic_state.two_draws_dynamic.multi_sample_sample_mask_disable
dEQP-VK.pipeline.monolithic.extended_dynamic_state.two_draws_dynamic.single_sample_sample_mask_disable
dEQP-VK.pipeline.monolithic.extended_dynamic_state.two_draws_static.multi_sample_sample_mask_disable
dEQP-VK.pipeline.monolithic.extended_dynamic_state.two_draws_static.single_sample_sample_mask_disable

Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Tested-by: Icenowy Zheng <uwu@icenowy.me>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39040>
2026-01-06 12:37:20 +00:00
Emma Anholt
059d301c79 nir: Drop the mode argument of nir_lower_vars_to_scratch().
It only makes sense for function temps, and that's the only way it's been
used.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37245>
2025-12-17 19:50:28 +00:00
Emma Anholt
10ba7675c8 nir/uub: Use an optional max_samples from drivers for sample counts.
This triggers some unrolling in Fallout 4, GTAV, and Rocky Planet in my
shader-db.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38585>
2025-12-11 14:26:11 +00:00
Arcady Goldmints-Orlov
0df8aa940c nir: Use nir_shader_intrinsics_pass in nir_lower_io_to_scalar
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38816>
2025-12-05 22:30:22 +00:00
Marek Olšák
fa0bea5ff8 nir: remove nir_io_add_const_offset_to_base
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
nir_opt_constant_folding does it now.

Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38277>
2025-11-29 00:16:38 +00:00
Marek Olšák
9e339f4b32 nir: rename nir_lower_indirect_derefs -> nir_lower_indirect_derefs_to_if_else_trees
This describes better what it does.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38471>
2025-11-20 05:42:11 +00:00
Boris Brezillon
ea4d4d2a77 nir: Prepare nir_lower_io_vars_to_temporaries() for optional PLS lowering
Rather than adding another boolean to optionally lower PLS vars, pass
the types we want to lowers through a nir_variable_mode bitmask.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>
2025-11-18 20:25:42 +00:00
Erik Faye-Lund
8233f77caa pvr: split idep_pco_uscgen_programs_h in two
When we do multiarch, we want to be able to refer to the headers
separately from the sources here, so let's split this dependency in two.

Reviewed-by: Ashish Chauhan <ashish.chauhan@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38423>
2025-11-17 16:04:30 +00:00
Marek Olšák
e372365cf4 nir: rename nir_copy_prop -> nir_opt_copy_prop
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38411>
2025-11-15 02:16:38 +00:00
Konstantin Seurer
de32f9275f treewide: add & use parent instr helpers
We add a bunch of new helpers to avoid the need to touch >parent_instr,
including the full set of:

* nir_def_is_*
* nir_def_as_*_or_null
* nir_def_as_* [assumes the right instr type]
* nir_src_is_*
* nir_src_as_*
* nir_scalar_is_*
* nir_scalar_as_*

Plus nir_def_instr() where there's no more suitable helper.

Also an existing helper is renamed to unify all the names, while we're
churning the tree:

* nir_src_as_alu_instr -> nir_src_as_alu

..and then we port the tree to use the helpers as much as possible, using
nir_def_instr() where that does not work.

Acked-by: Marek Olšák <maraeo@gmail.com>

---

To eliminate nir_def::parent_instr we need to churn the tree anyway, so I'm
taking this opportunity to clean up a lot of NIR patterns.

Co-authored-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38313>
2025-11-12 21:22:13 +00:00
Faith Ekstrand
6ee4ea5ea3 nir: Add a type parameter to nir_lower_point_size()
On Mali, we need not only clamp but also convert to float16 on Valhall+.
We could have a separate pass for this but it fits in nicely with the
rest of nir_lower_point_size() so we might as well put it there.

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38379>
2025-11-12 01:34:36 +00:00
Alyssa Rosenzweig
9c2a2deee6 treewide: use BITSET_BYTES, BITSET_RZALLOC
Via Coccinelle patches:

    @@
    expression bits;
    typedef BITSET_WORD;
    @@

    -BITSET_WORDS(bits) * sizeof(BITSET_WORD)
    +BITSET_BYTES(bits)

    @@
    expression memctx, bits;
    typedef BITSET_WORD;
    @@

    -rzalloc_array(memctx, BITSET_WORD, BITSET_WORDS(bits))
    +BITSET_RZALLOC(memctx, bits)

     @@
     expression memctx, bits;
     @@

     -rzalloc_size(memctx, BITSET_BYTES(bits))
     +BITSET_RZALLOC(memctx, bits)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38245>
2025-11-05 18:44:23 +00:00
Alyssa Rosenzweig
17355f716b treewide: use UTIL_DYNARRAY_INIT
Instead of util_dynarray_init(&dynarray, NULL), just use
UTIL_DYNARRAY_INIT instead. This is more ergonomic.

Via Coccinelle patch:

    @@
    identifier dynarray;
    @@

    -struct util_dynarray dynarray = {0};
    -util_dynarray_init(&dynarray, NULL);
    +struct util_dynarray dynarray = UTIL_DYNARRAY_INIT;

    @@
    identifier dynarray;
    @@

    -struct util_dynarray dynarray;
    -util_dynarray_init(&dynarray, NULL);
    +struct util_dynarray dynarray = UTIL_DYNARRAY_INIT;

    @@
    expression dynarray;
    @@

    -util_dynarray_init(&(dynarray), NULL);
    +dynarray = UTIL_DYNARRAY_INIT;

    @@
    expression dynarray;
    @@

    -util_dynarray_init(dynarray, NULL);
    +(*dynarray) = UTIL_DYNARRAY_INIT;

Followed by sed:

    bash -c "find . -type f -exec sed -i -e 's/util_dynarray_init(&\(.*\), NULL)/\1 = UTIL_DYNARRAY_INIT/g' \{} \;"
    bash -c "find . -type f -exec sed -i -e 's/util_dynarray_init( &\(.*\), NULL )/\1 = UTIL_DYNARRAY_INIT/g' \{} \;"
    bash -c "find . -type f -exec sed -i -e 's/util_dynarray_init(\(.*\), NULL)/*\1 = UTIL_DYNARRAY_INIT/g' \{} \;"

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38189>
2025-11-04 13:39:48 +00:00
Alyssa Rosenzweig
b824ef83ab util/dynarray: infer type in append
Most of the time, we can infer the type to append in
util_dynarray_append using __typeof__, which is standardized in C23 and
support in Jesse's MSMSVCV. This patch drops the type argument most of
the time, making util_dynarray a little more ergonomic to use.

This is done in four steps.

First, rename util_dynarray_append -> util_dynarray_append_typed

    bash -c "find . -type f -exec sed -i -e 's/util_dynarray_append(/util_dynarray_append_typed(/g' \{} \;"

Then, add a new append that infers the type. This is much more ergonomic
for what you want most of the time.

Next, use type-inferred append as much as possible, via Coccinelle
patch (plus manual fixup):

    @@
    expression dynarray, element;
    type type;
    @@

    -util_dynarray_append_typed(dynarray, type, element);
    +util_dynarray_append(dynarray, element);

Finally, hand fixup cases that Coccinelle missed or incorrectly
translated, of which there were several because we can't used the
untyped append with a literal (since the sizeof won't do what you want).

All four steps are squashed to produce a single patch changing every
util_dynarray_append call site in tree to either drop a type parameter
(if possible) or insert a _typed suffix (if we can't infer). As such,
the final patch is best reviewed by hand even though it was
tool-assisted.

No Long Linguine Meals were involved in the making of this patch.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38038>
2025-10-24 18:32:07 +00:00
Simon Perretta
ff51e6dc9e nir: commonize barycentric intrinsic opt pass
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Introduces an opt pass that attempts to optimize
load_barycentric_at_{sample,offset} with simpler load_barycentric_*
equivalents where possible, and optionally lowers
load_barycentric_at_sample to load_barycentric_at_offset with a position
derived from the sample ID instead.

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37658>
2025-10-22 16:48:01 +00:00
Simon Perretta
7b222d532b pco/ra: abort if spilling fails
In release builds, the assertion checking whether spilling failed isn't
evaluated, which can result in shader compilation continuing despite this.

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37872>
2025-10-14 21:47:30 +00:00
Simon Perretta
38c3c89b0d pco: add support for global memory
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37726>
2025-10-14 17:22:00 +00:00
Job Noorman
0b82b803d9 nir,ir3: rename umul_low to umul_16x16
This is more in line with similar opcodes like umul_32x16.

Also change its const expr: the masking based on bit size was
unnecessary as it is only defined for 32 bits. Use simple casts instead.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37863>
2025-10-14 12:54:54 +00:00
Simon Perretta
d41c34c5ca pco: ensure a variable exists for the multiview index
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37724>
2025-10-11 20:28:16 +01:00
Simon Perretta
b0609a30b1 pco: improve early and late algebraic pass ordering
Ensures early algebraic passes aren't called again following late
algebraic passes, so that the latter's opts aren't undone (e.g.
unfusing ffmas).

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37724>
2025-10-11 20:28:16 +01:00
Simon Perretta
e637d01ef2 pco: tidy and commonize conversion ops
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37724>
2025-10-11 20:28:16 +01:00
Simon Perretta
34b4b35ca8 pco: apply rounding mode to relevant conversion ops
The rounding behaviour on [iu]2f32 ops needs to be explicitly set in
order to match the implicit behaviour described in the
KHR_shader_float_controls properties.

Fixes: e306abc6e6 ("pvr: implement KHR_shader_float_controls")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37724>
2025-10-11 20:28:16 +01:00
Rhys Perry
0dd09a292b nir: add ACCESS_ATOMIC
This is so that passes and backends can tell if a coherent load/store is
atomic or not, instead of having to assume it could be either.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36602>
2025-10-07 17:41:30 +00:00
Erik Faye-Lund
478e8f11c4 pvr: split out rogue hw-defs to separate folder
This just carves out some space for future architectures in the source
tree.

Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37675>
2025-10-06 11:02:04 +02:00
Christian Gmeiner
553e4252ba pvr, pco: Set has_f2i32_rtne to true
Set the has_f2i32_rtne shader compiler option to indicate hardware
support for it. This enables NIR's late algebraic optimization pass to
generate more efficient code for float-to-int conversions.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37680>
2025-10-03 11:12:44 +02:00
Simon Perretta
61a9c4f63d pvr, pco: add primitive support for terminate,demote_to_helper}_invocation
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37512>
2025-09-30 12:15:54 +00:00
Simon Perretta
a1acd6f8d1 pvr, pco: add primitive support for VK_KHR_robustness2.nullDescriptor
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37512>
2025-09-30 12:15:54 +00:00
Simon Perretta
2a7ebf2ae0 nir/lower_alpha: extend to support dynamic a2c
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37512>
2025-09-30 12:15:53 +00:00