Commit graph

12080 commits

Author SHA1 Message Date
Marek Olšák
b63a9a8b39 nir: add direct lowered frag_coord building to replace lowering passes
Instead of lowering frag_coord 4 times during compilation,
just use this.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
2026-05-03 13:03:00 +00:00
Marek Olšák
9c5ad16819 nir/opt_frag_coord_to_pixel_coord: handle frag_coord_xy
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
2026-05-03 13:03:00 +00:00
Marek Olšák
076b0aaf1d nir/lower_wpos_ytransform: handle frag_coord_xy
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
2026-05-03 13:03:00 +00:00
Marek Olšák
e49f29f25e nir: add frag_coord_xy
to strengthen and simplify pixel_coord lowering

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
2026-05-03 13:03:00 +00:00
Daniel Schürmann
012d72f2b0 nir/opt_algebraic: add some imul24_relaxed pattern
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41178>
2026-05-01 10:07:26 +00:00
Daniel Schürmann
708093d830 nir/opt_algebraic: use imul24_relaxed for lowered dot4x8_add
Totals from 28 (0.04% of 72819) affected shaders: (Navi10)

MaxWaves: 181 -> 186 (+2.76%)
Instrs: 406735 -> 338360 (-16.81%)
CodeSize: 2913588 -> 2469712 (-15.23%)
VGPRs: 5520 -> 5468 (-0.94%)
SpillVGPRs: 32 -> 0 (-inf%)
LDS: 64512 -> 62464 (-3.17%)
Scratch: 10240 -> 0 (-inf%)
Latency: 11028252 -> 4357120 (-60.49%)
InvThroughput: 11004126 -> 4079018 (-62.93%)
VClause: 1686 -> 2055 (+21.89%); split: -0.89%, +22.78%
SClause: 890 -> 852 (-4.27%)
Copies: 4516 -> 2644 (-41.45%); split: -41.59%, +0.13%
PreSGPRs: 982 -> 974 (-0.81%)
PreVGPRs: 5356 -> 4284 (-20.01%)
VALU: 370529 -> 330201 (-10.88%)
SALU: 28850 -> 1170 (-95.94%)
VMEM: 2616 -> 2560 (-2.14%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41178>
2026-05-01 10:07:25 +00:00
Marek Olšák
5db0493a04 glsl,gallium: add pipe_caps::glsl_bindless_handles_are_32bit
to lower bindless handles to 32 bits before nir_opt_varyings, so that
the high 32 bits of (input) loads of bindless handles are eliminated early.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41170>
2026-05-01 03:00:17 +00:00
Lorenzo Rossi
63aceb07ff nir/opt_sink: Add pan-specific load_input
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
2026-04-30 18:26:10 +00:00
Lorenzo Rossi
30d8f9c554 nir/lower_point_size: Handle 16-bit point sizes
panfrost has float16 point size, handling that precision too allows the
compiler to call lower_point_size later in the compilation pipeline

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
2026-04-30 18:26:10 +00:00
Lorenzo Rossi
2a7d817591 nir/opt_algebraic: optimize fadd/fmul with 16-bit source and constant
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41096>
2026-04-30 17:33:09 +00:00
Lorenzo Rossi
89436db611 nir: Extract float_is_half tests in common code
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41096>
2026-04-30 17:33:09 +00:00
Karol Herbst
4e67582ddf nir: add fmul_rtz optimizations
NVK is only going to use it for `fmul_rtz(frcp(ipa), ipa)` patterns, so
try not too hard to optimize this.

Totals from 10 (0.00% of 1212873) affected shaders:
CodeSize: 34480 -> 34288 (-0.56%); split: -0.60%, +0.05%
Static cycle count: 6225 -> 6132 (-1.49%); split: -1.57%, +0.08%

Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41179>
2026-04-30 15:42:40 +00:00
Karol Herbst
2e09b4ac68 nir: handle fmul_rtz in a couple of places
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41179>
2026-04-30 15:42:40 +00:00
Karol Herbst
4e520f671c nir: add fmul_rtz
It's needed in NVK for correctness with interpolation.

Backport-to: 26.1
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41179>
2026-04-30 15:42:40 +00:00
Marek Olšák
a3e3bf0ac2 nir/opt_dce: add shader_info::assert_inputs_not_dead
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41166>
2026-04-30 07:07:32 +00:00
Marek Olšák
7bd5856cc6 nir/opt_dce: factor out dead instruction removal into a helper
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41166>
2026-04-30 07:07:32 +00:00
Icenowy Zheng
2bffc653ec isaspec: decode: manually print the sign when printing NaN float values
The IEEE754-2019 standard declaring the preceding sign "optional" when
converting NaN values to strings because the standard tries to not
regulate how sign bits in NaNs are interpreted.

In the real world, when using printf-series function to print a number
with type `float` on RISC-V, the sign of NaNs is wiped during the
conversion from `float` to `double` (defined as part of the default
argument promotions rule for variable arguments in the C spec).

Change the code to stop relying on isa_print() to print the negative
sign, instead parse it from the highest bit of value and manually print
it before "nan" string.

This fixes the `etnaviv_isa_disasm` unit test on RISC-V.

Suggested-by: Christian Gmeiner <cgmeiner@igalia.com>
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40887>
2026-04-29 11:39:12 +00:00
Xinju Li
0319be8b02 nir: resolve functions: only resolve functions that are reachable from main
Use DFS traversal from main to resolve reachable functions. Avoid spurious
"unresolved reference" linker errors for dead helper functions.
It avoid reporting linking error for following shader test. The shader test used
to pass before merge_requests/31137:

[require]
GLSL >= 1.50

[vertex shader]
/* declared but not defined */
vec4 transform_color(vec3 color, float alpha);

/* calls transform_color — but this function is never called from main */
vec4 apply_transform(vec3 color, float alpha)
{
    return transform_color(color, alpha);
}

[vertex shader]
in vec4 piglit_vertex;

void main()
{
    /* apply_transform is never called here */
    gl_Position = piglit_vertex;
}

Signed-off-by: Xinju Li <xinju.li@broadcom.com>

use pass_flags to mark function as reachable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41065>
2026-04-28 23:35:17 +00:00
Alyssa Rosenzweig
0c49738211 nir/opt_reassociate: fix exactness bug
For an inexact-associative operation (fadd or fmul), can_reassociate ensures the
root of the chain is inexact to allow reassociating. However, build_chain just
checks for opcodes to match up after, although we do sum up exactness across the
chain. Although an Effort Was Made, it still seems incorrect to reassociate

   %3 = fadd! %0, %1
   %4 = fadd %3, %2

to instead be (ex.)

   %3 = fadd! %0, %2
   %4 = fadd! %3, %1

Closes: #14418
Fixes: e0b0f7e73c ("nir: add ALU reassocation pass")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41162>
2026-04-28 21:14:56 +00:00
Georg Lehmann
599a52174b nir: disable fp class analysis for 64bit transcendentals
Some backends have terrible precision for these fp64 opcodes, so don't try to
do anything clever.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15334
Fixes: 5a298f3560 ("nir: rewrite fp range analysis as a fp class analysis")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41206>
2026-04-28 13:26:42 +00:00
Simon Perretta
57791c4a99 pco: track how many tg4/raw sample comps are needed
Rather than always emitting and swizzling 16 components for raw samples,
scale it by the number actually needed as defined by the selected tg4
channel/components.

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40687>
2026-04-28 12:04:03 +01:00
Marek Olšák
3dcba87ca3 nir/opt_licm: hoist instructions across multiple levels of nested loops
radv gfx12:

Totals:
Instrs: 42861311 -> 42861476 (+0.00%); split: -0.00%, +0.00%
CodeSize: 227917476 -> 227918160 (+0.00%); split: -0.00%, +0.00%
Latency: 265381068 -> 265373506 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 42954018 -> 42952350 (-0.00%)
VClause: 819026 -> 819024 (-0.00%)
SClause: 1210348 -> 1210293 (-0.00%)
Copies: 2919525 -> 2919597 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 2889432 -> 2889406 (-0.00%)
VALU: 23757371 -> 23757377 (+0.00%); split: -0.00%, +0.00%
SALU: 5981417 -> 5981485 (+0.00%); split: -0.00%, +0.00%
VOPD: 8966 -> 8964 (-0.02%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41220>
2026-04-27 23:58:21 +00:00
Marek Olšák
8e036fcaec nir/opt_licm: use nir_metadata_control_flow
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41220>
2026-04-27 23:58:21 +00:00
Marek Olšák
e0112be522 nir/opt_licm: add a private state structure for the pass
The structure will grow in later commits.

The major change is that the preheader and exit blocks are replaced
by tracking just the innermost optimized nir_loop * and getting the
predecessor and successor blocks out of it.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41220>
2026-04-27 23:58:20 +00:00
Timothy Arceri
a42c55da46 amd/radeonsi: dont clamp packed user varyings
ac_nir_optimize_outputs() might pack user varyings into the color
built-ins. If this happens we skip adding clamping to the
components that contain the user varying.

This change also fixes a second bug where a color built-in can be
packed into a non-color slot and was no longer being clamped.

Fixes: 3777a5d7 ("radeonsi: assign param export indices before compilation")
Closes: #14443

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40594>
2026-04-27 22:59:58 +00:00
Simon Perretta
af1669d9e2 pco: reserve additional outputs for trilinear sampled coeffs
Sampling coeffs with trilinear filtering will output 2x sets of data.
Whether bilinear or trilinear filtering is in use can't be determined
without checking state words, so unconditionally reserve 2x to avoid
clobbering output regs.

Fixes: 7df32ba09d ("pco: initial texture/sampler compiler support")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Tested-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41051>
2026-04-27 11:32:29 +00:00
squidbus
a41f0e62bb asahi,nir: Move asahi dynamic clipz pass to common.
Acked-by: Alyssa Rosenzweig <alyssa@rosenz.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41088>
2026-04-27 11:00:59 +00:00
Rhys Perry
91d555c2cb radv: lower indirect derefs after linking
Scratch access isn't very optimizable, so more stores are optimized away
if we lower indirect derefs after both linking and radv_optimize_nir.

fossil-db (navi21):
Totals from 1264 (0.62% of 202427) affected shaders:
Instrs: 1504703 -> 1504708 (+0.00%); split: -0.02%, +0.02%
CodeSize: 8031388 -> 8031020 (-0.00%); split: -0.02%, +0.02%
SpillSGPRs: 1865 -> 1869 (+0.21%)
Latency: 12106362 -> 12106464 (+0.00%); split: -0.01%, +0.01%
InvThroughput: 4056269 -> 4056044 (-0.01%); split: -0.01%, +0.00%
VClause: 13927 -> 13940 (+0.09%)
SClause: 32382 -> 32396 (+0.04%); split: -0.03%, +0.08%
Copies: 188004 -> 187897 (-0.06%); split: -0.17%, +0.11%
Branches: 39045 -> 39052 (+0.02%); split: -0.01%, +0.03%
PreSGPRs: 79885 -> 79814 (-0.09%); split: -0.11%, +0.02%
VALU: 1072639 -> 1072532 (-0.01%); split: -0.01%, +0.00%
SALU: 187317 -> 187375 (+0.03%); split: -0.11%, +0.14%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31265>
2026-04-24 11:01:03 +00:00
Eric Guo
ba92143ef2 compiler: Add missing MESA_SHADER_KERNEL case for SPIR-V dump
Fixes assertion failure when MESA_SPIRV_DUMP_PATH is set for OpenCL
programs.

Signed-off-by: Eric Guo <eric.guo@nxp.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41097>
2026-04-23 20:36:55 +00:00
Alyssa Rosenzweig
6a43e6c9e0 nir/opt_algebraic: add redundant u2u32/unpack_64_2x32_split_x patterns
reduces hello world kernel 57 -> 44 inst on jay. why do we have two opcodes that
do literally the same thing? :/

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41085>
2026-04-23 19:54:21 +00:00
Samuel Pitoiset
34b8ce948a spirv: add support for SPV_KHR_constant_data
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40722>
2026-04-23 11:12:06 +00:00
Emma Anholt
f7d1f59948 spirv: Demote the SPIRV 1.6 OpTypeSampledImage on Buffer failure to a warning.
The hangover DXVK builds we want to use for arm64 CI hit this path, and we
have a perfectly reasonable fallback for handling this case (ignore the
sampler, as glslang should have done).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
2026-04-22 17:39:30 +00:00
Daniel Schürmann
806fcc6193 nir/opt_loop: always try to peel initial break from loops with unrolling hint
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This allows to unroll these loops, even if loop analyze is unable
to calculate the iteration count.
As always with loops, the throughput stats are meaningless.

Totals from 6 (0.00% of 202440) affected shaders: (Navi48)
Instrs: 7825 -> 6201 (-20.75%)
CodeSize: 37056 -> 30412 (-17.93%)
Latency: 21563 -> 16934 (-21.47%)
InvThroughput: 144649 -> 77962 (-46.10%)
SClause: 139 -> 133 (-4.32%)
Copies: 536 -> 388 (-27.61%)
Branches: 156 -> 84 (-46.15%)
PreVGPRs: 298 -> 296 (-0.67%); split: -1.01%, +0.34%
VALU: 2493 -> 2378 (-4.61%); split: -4.65%, +0.04%
SALU: 3263 -> 2199 (-32.61%)
SMEM: 188 -> 183 (-2.66%)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40349>
2026-04-22 10:34:58 +00:00
Daniel Schürmann
738cc6a7db nir/opt_loop: stop recursion at loop header phi in can_constant_fold()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40349>
2026-04-22 10:34:58 +00:00
Daniel Schürmann
1f9a0490c6 nir/opt_loop: Don't peel initial break from do-while loops
As the main purpose of this optimization is to transform
while- into do-while loops, don't apply on loops which are
already in do-while form. Also set nir_loop::do_while after
this transformation, so that it is only applied once.

Totals from 576 (0.28% of 202440) affected shaders: (Navi48)
Instrs: 1337529 -> 1253438 (-6.29%); split: -6.36%, +0.07%
CodeSize: 8390852 -> 7837328 (-6.60%); split: -6.61%, +0.01%
VGPRs: 50856 -> 50844 (-0.02%)
SpillSGPRs: 42198 -> 35395 (-16.12%); split: -16.13%, +0.01%
SpillVGPRs: 47608 -> 44620 (-6.28%)
Latency: 31043828 -> 44143753 (+42.20%); split: -0.06%, +42.26%
InvThroughput: 6973433 -> 10079000 (+44.53%); split: -0.08%, +44.61%
VClause: 26839 -> 24718 (-7.90%); split: -7.91%, +0.00%
SClause: 21831 -> 21583 (-1.14%); split: -1.52%, +0.38%
Copies: 183503 -> 150040 (-18.24%); split: -18.84%, +0.61%
Branches: 27738 -> 26848 (-3.21%); split: -5.12%, +1.91%
PreSGPRs: 40233 -> 39083 (-2.86%); split: -2.88%, +0.02%
PreVGPRs: 38745 -> 38903 (+0.41%); split: -0.02%, +0.43%
VALU: 688396 -> 645948 (-6.17%); split: -6.17%, +0.01%
SALU: 189792 -> 177642 (-6.40%); split: -6.97%, +0.57%
VMEM: 121500 -> 112748 (-7.20%)
SMEM: 38765 -> 37767 (-2.57%); split: -2.58%, +0.00%
VOPD: 102488 -> 89071 (-13.09%); split: +0.24%, -13.33%

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40349>
2026-04-22 10:34:58 +00:00
Daniel Schürmann
a9a2edfbb6 glsl_to_nir: set nir_loop::do_while
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40349>
2026-04-22 10:34:58 +00:00
Daniel Schürmann
42663165a2 vtn: set nir_loop::do_while during spirv_to_nir()
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40349>
2026-04-22 10:34:58 +00:00
Daniel Schürmann
32436731a3 nir: add nir_loop::do_while to indicate do-while loops
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40349>
2026-04-22 10:34:58 +00:00
Samuel Pitoiset
9d17a7bdb4 spirv,treewide: rework specialization constant
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
With SPV_KHR_constant_data, it's allowed to specialize array of
constants.

RustiCL changes are from Karol Herbst <kherbst@redhat.com>.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41046>
2026-04-22 06:57:55 +00:00
Timothy Arceri
5f37490855 glcpp: fix paste within macro function expansion
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Note the tests added in 89cd6df034 were wrong (confirmed in gcc)
I've updated them to the expected outcome and enabled the paste
test from 475222b022.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13863
Fixes: d5cd40343f ("Expand macro arguments before performing argument substitution.")

Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40062>
2026-04-21 23:53:19 +00:00
Timothy Arceri
35eda3f3e2 glcpp: update out of date comment
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40062>
2026-04-21 23:53:19 +00:00
Eric R. Smith
4ae192a3d9 glsl, spirv: Improve accuracy of asin() and acos()
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The polynomial used for asin_expr() was suboptimal (and its source was
not documented).

A better approximation is found in the _Handbook_of_Mathematical_Functions_
by Abramowitz and Stegun, which is used in Nvidia's Cg toolkit. However,
while this approximation gives a good absolute error bound, its relative
error exceeds the 4096 ulp allowed by the Vulkan spec. Taking a page
from the spirv implementation of asin(), we implement a piecewise
approximation where a Taylor series is used for small values of |x|.
This patch also harmonizes the GLSL and Vulkan implementations by moving
the implementation to common code (nir_builder).

Running tests on asin() with a grid of 64000 samples between 0.0 and +1.0,
the original asin() at 32 bits has:
```
                       glsl                       spirv
  RMSE:            1.756451e-04                 1.609091e-04
  worst abs error: 3.904104e-04 at 0.937001     3.904104e-04 at 0.937001
  worst ulp error: 11800 at 6.2499e-05          3826 at 0.841331
```
whereas the new implementation has for both:
```
  RMSE:            2.528056e-05
  worst abs error: 4.962087e-05 at 0.451149
  worst ulp error: 2379 at 0.215106
```

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40862>
2026-04-21 21:10:22 +00:00
Brandon Jones
d1dd65d425 nir/opt_algebraic: fix fabs optimization
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This fixes a regression found in blender's unit testing, which called
fabs(-0.0) and invoked an NIR optimization that is was not valid for
the parameter -0.0. IEEE 754 requires that abs clear the sign bit
for the value -0.0.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41060>
2026-04-21 04:10:29 +00:00
Lionel Landwerlin
bbeb6be6eb nir: expose nir_opt_dce_impl
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41047>
2026-04-20 21:53:35 +03:00
Patrick Lerda
9815901f86 r600: implement tes and tcs instanced gl_PrimitiveID support
This change extends r600_lds_constant_buffer to
implement a fully conformant gl_PrimitiveID at
the tes and tcs stages.

This change was tested on cayman and barts. Here are the tests fixed:
spec/arb_tessellation_shader/execution/tcs-primitiveid-instanced: fail pass
spec/arb_tessellation_shader/execution/tes-no-tcs-primitiveid-instanced: fail pass
spec/arb_tessellation_shader/execution/tes-primitiveid-instanced: fail pass
khr-gl4[4-6]/tessellation_shader/tessellation_shader_tessellation/gl_invocationid_patchverticesin_primitiveid: fail pass
khr-gles31/core/tessellation_shader/tessellation_shader_tessellation/gl_invocationid_patchverticesin_primitiveid: fail pass
khr-glesext/tessellation_shader/tessellation_shader_tessellation/gl_invocationid_patchverticesin_primitiveid: fail pass

Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40297>
2026-04-20 13:21:55 +00:00
Janne Grunau
98a97cb413 nir/gather_info: clear interpolation qualifiers only in fragment stage
Asahi wants the the interpolation qualifiers from the shader info in the
vertex shader. Clear them only in the fragment stage so they can
propagate back.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15288
Backport-to: 26.0
Fixes: a72704d0fb ("nir/gather_info: clear interpolation qualifiers before gathering")
Signed-off-by: Janne Grunau <j@jannau.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41040>
2026-04-19 10:10:15 +00:00
Alyssa Rosenzweig
4b81cb6206 nir/opt_generate_bfi: avoid trivial instructions
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
With the pass order shuffling, code like `(x & 0xf) + (x & 0xfffffff0)` gets
optimized to bitfield_select(0xF, x, x). But it would be much better to optimize
simply to x. nir_opt_algebraic would do that for us but we run this pass too
late for algebraic to save us from ourselves, so be smarter.

Observed on dEQP-GLES31.functional.compute.basic.image_atomic_op_local_size_8
with Jay, this saves an instruction there.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40956>
2026-04-16 13:54:41 +00:00
Georg Lehmann
f949e3b819 nir: remove nir_link_xfb_varyings
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
RADV was the last user.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40977>
2026-04-16 08:49:23 +00:00
Marek Olšák
835f5faf14 nir: add back color0/1 system values and VARYING_SLOT_PARAM_GEN_AMD
It turns out we need the color sysvals recorded in system_values_read,
and PARAM_GEN is for point smoothing.

Acked-by: Pierre-Eric
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40556>
2026-04-15 18:12:07 +00:00
Natalie Vock
57f796752d nir/deref: Elide loads/stores from deref cast of undef
These can never be meaningful. DOOM: The Dark Ages also relies on this.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40799>
2026-04-15 08:42:12 +00:00