Commit graph

4403 commits

Author SHA1 Message Date
Caio Oliveira
f8db53ccae brw: Fix comparison with unordered_mode when making baked dependency
The unordered mode stored in dependencies might be a bitmask and not
only a single mode.  In practice, only the "stronger" mode will stick.
Make sure that the code testing for the mode uses "&" instead of "==",
to avoid prevent some valid combinations to happen, e.g.

```
   // ...
   add(16)         g104<1>F        g94<1,1,0>F     g34<1,1,0>F     { align1 1H @7 $7.dst compacted };
```

which without the fix ends up as

```
   // ...
   sync nop(1)                     null<0,1,0>UB                   { align1 WE_all 1N F@7 };
   add(16)         g104<1>F        g94<1,1,0>F     g34<1,1,0>F     { align1 1H $7.dst compacted };
```

Enables two tests for the scoreboard pass that illustrate this case.

For measuring the effect, re-enabled the sync.nop accounting on total of
instructions and got the following results.

```
   Totals:
   Instrs: 322041261 -> 321748285 (-0.09%)
   Cycle count: 22864587567 -> 22863073741 (-0.01%)
   Max dispatch width: 7989040 -> 7989024 (-0.00%); split: +0.00%, -0.00%

   Totals from 88212 (9.78% of 902056) affected shaders:
   Instrs: 102282050 -> 101989074 (-0.29%)
   Cycle count: 12787629859 -> 12786116033 (-0.01%)
   Max dispatch width: 525336 -> 525320 (-0.00%); split: +0.01%, -0.01%
```

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36096>
2025-07-14 20:28:54 +00:00
Caio Oliveira
1e18a2d1a8 brw: Add scoreboard test for edge case involving baked dependency
This is disable because it is adding a `sync.nop` instead of baking
together both "@3 $0.dst".

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36096>
2025-07-14 20:28:54 +00:00
jhananit
debd903a00 intel: Update all NIR_PASS_V to NIR_PASS
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35889>
2025-07-14 19:25:52 +00:00
Sagar Ghuge
36172c41dc intel/compiler: Drop unused param from set_memory_address
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36092>
2025-07-14 03:46:21 +00:00
Caio Oliveira
887642b0f2 intel: Add INTEL_DEBUG=no-vrt
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Add support for disabling the VRT (Variable Register Thread) feature.
The strategy here is to force the old BRW_MAX_GRF limit for the
register allocator (locks the upper limit) and make sure
ptl_register_blocks() always return that amount of blocks (locks
the lower limit).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35781>
2025-07-13 21:11:02 +00:00
Ian Romanick
5adab50283 brw/nir: Use nir_opt_reassociate_matrix_mul
This needs to be called before intel_nir_opt_peephole_ffma, so I
arbitrarilly decided to call it right before.

All Intel platforms had similar results. (Lunar Lake shown)
total instructions in shared programs: 17120227 -> 17118227 (-0.01%)
instructions in affected programs: 5854 -> 3854 (-34.16%)
helped: 51 / HURT: 0

total cycles in shared programs: 895497762 -> 894733940 (-0.09%)
cycles in affected programs: 4603518 -> 3839696 (-16.59%)
helped: 95 / HURT: 21

LOST:   1
GAINED: 0

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35925>
2025-07-09 19:28:49 +00:00
Sviatoslav Peleshko
8d22eb960b brw/disasm: Fix Gfx11 3src-instructions dst register disassembly
The conversion from bit value to register file type is already done
by the brw_eu_inst_3src_a1_dst_reg_file in the FFC macro now, so doing it
again produced incorrect results.

Fixes: e7179232 ("intel/brw: Move encoding of Gfx11 3-src inside the inst helpers")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13141
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35960>
2025-07-08 19:49:09 +00:00
Daniel Schürmann
2c51a8870d nir: add nir_vectorize_cb callback parameter to nir_lower_phis_to_scalar()
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Similar to nir_lower_alu_width(), the callback can return the
desired number of components for a phi, or 0 for no lowering.

The previous behavior of nir_lower_phis_to_scalar() with lower_all=true
can be elicited via nir_lower_all_phis_to_scalar() while the previous
behavior with lower_all=false now corresponds to nir_lower_phis_to_scalar()
with NULL callback.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>
2025-07-08 15:33:59 +00:00
Marek Olšák
8def3f865d agx,freedreno,intel,lima,panfrost,svga,virgl,zink: fix supports_indirect_inputs
The GLSL compiler always lowers inputs to temps for VS and GS, so exclude
them from driver support because the GLSL compiler will no longer do that
unconditionally. Thus, indirect VS and GS inputs are completely untested
and broken in a lot of drivers.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35945>
2025-07-08 06:11:42 +00:00
Alyssa Rosenzweig
d31cb824df treewide: use VARYING_BIT_*
Some checks failed
macOS-CI / macOS-CI (dri) (push) Has been cancelled
macOS-CI / macOS-CI (xlib) (push) Has been cancelled
Via Coccinelle patch generated by the following Python:

  varys = [ "POS", "COL0", "COL1", "FOGC", "TEX0", "TEX1", "TEX2", "TEX3", "TEX4",
           "TEX5", "TEX6", "TEX7", "PSIZ", "BFC0", "BFC1", "EDGE", "CLIP_VERTEX",
           "CLIP_DIST0", "CLIP_DIST1", "CULL_DIST0", "CULL_DIST1", "PRIMITIVE_ID",
           "PRIMITIVE_COUNT", "LAYER", "VIEWPORT", "FACE",
           "PRIMITIVE_SHADING_RATE", "PNTC", "TESS_LEVEL_OUTER",
           "TESS_LEVEL_INNER", "PRIMITIVE_INDICES", "BOUNDING_BOX0",
           "BOUNDING_BOX1", "VIEWPORT_MASK", "CULL_PRIMITIVE" ]
  t = """
  @@
  @@

  -(1 << VARYING_SLOT_${V})
  +VARYING_BIT_${V}

  @@
  @@

  -BITFIELD_BIT(VARYING_SLOT_${V})
  +VARYING_BIT_${V}

  @@
  @@

  -(1ull << VARYING_SLOT_${V})
  +VARYING_BIT_${V}

  @@
  @@

  -BITFIELD64_BIT(VARYING_SLOT_${V})
  +VARYING_BIT_${V}

  """
  for v in varys:
      from mako.template import Template
      print(Template(t).render(V = v))

Closes: #13453
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> [panfrost, common]
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [broadcom]
Reviewed-by: Corentin Noël <corentin.noel@collabora.com> [virgl]
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> [zink]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35917>
2025-07-04 19:01:04 +00:00
Matt Turner
e6242fb958 brw: Handle bfloat16 dest and src0 operands for DPAS
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35320>
2025-07-02 20:06:59 +00:00
Caio Oliveira
c006bee22d brw: Don't use simd_select for BS shaders
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Since there's only one possible SIMD, don't need to use
the helpers to decide which one to compile.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35799>
2025-07-02 19:48:04 +00:00
Caio Oliveira
c733f07378 brw: Use the right width in brw_nir_apply_key for BS shaders
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Fixes: 23c7142cd6 ("anv: disable SIMD16 for RT shaders")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35798>
2025-07-02 15:32:23 +00:00
Lionel Landwerlin
343f3dd3c1 brw: fix non constant BTI accesses with offsets
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e103afe7be ("brw: run the nir_opt_offsets pass and set the maximum offset size")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35822>
2025-07-02 01:04:06 +03:00
Lionel Landwerlin
89f3ee4cb2 brw: remove debug printf
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Fixes: fcf4401824 ("brw: handle wa_18019110168 with independent shader compilation")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35815>
2025-06-29 12:39:03 +03:00
Lionel Landwerlin
a742b859bd anv: add support for handling wa_18019110168 with gfx-libs
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:35 +00:00
Lionel Landwerlin
fcf4401824 brw: handle wa_18019110168 with independent shader compilation
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:35 +00:00
Lionel Landwerlin
bc8d18aee2 brw: make a helper for vertex attribute offset computation
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:34 +00:00
Lionel Landwerlin
8fabcd754f brw: move primitive_id_index field in fs_msaa
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:34 +00:00
Lionel Landwerlin
6336cf0ea2 brw: store the remapping table for wa_18019110168 in constant data
That way it can be accessed at runtime.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:33 +00:00
Lionel Landwerlin
e1a7eb1718 brw: extract out attribute register remapping
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:33 +00:00
Lionel Landwerlin
5cc66e2c8d anv/brw: move Wa_18019110168 handling to backend
We simplify the implementation by assuming the worse case, copying
entire per-vertex regions if necessary.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:32 +00:00
Lionel Landwerlin
f0f4f9c566 brw: fix vertex attribute offset computation
The formula uses scalar indices (4bytes), not slots (16bytes).

We also incorrectly passed a scalar (vertex case) & slot (mesh case)
offset in the push constants. Use slots instead so that the value is
smaller and we can pack more stuff into fs_msaa_flags.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 18bbcf9a63 ("intel: introduce new VUE layout for separate compiled shader with mesh")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:31 +00:00
Lionel Landwerlin
4b5539a0cb brw: fix set_range on load_per_primitive_output
load intrinsics don't have range

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 18bbcf9a63 ("intel: introduce new VUE layout for separate compiled shader with mesh")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:31 +00:00
Matt Turner
6d786a0e4b brw: Use convert_cmat_intel intrinsic
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35616>
2025-06-27 01:26:22 +00:00
Matt Turner
41cd196886 brw: Implement convert_cmat_intel intrinsic
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35616>
2025-06-27 01:26:22 +00:00
Marek Olšák
1754507d49 nir: rename nir_lower_io_to_temporaries -> nir_lower_io_vars_to_temporaries
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:54 +00:00
Marek Olšák
1e03827c77 nir: rename nir_lower_io_arrays_to_elements -> nir_lower_io_array_vars_to_elements
same for *_no_indirects

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:54 +00:00
Marek Olšák
12df9b3def nir: rename nir_vectorize_tess_levels -> nir_lower_tess_level_array_vars_to_vec
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:50 +00:00
Marek Olšák
2aa94caf82 nir: rename nir_lower_io_to_vector -> nir_opt_vectorize_io_vars
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:50 +00:00
Marek Olšák
439d805291 nir: rename nir_lower_io_to_scalar_early -> nir_lower_io_vars_to_scalar
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:49 +00:00
Ian Romanick
b83f618fb2 brw: Fully write temporary destinations
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Consider an innocuous instruction like:

    and(1) v250:UD, g0.3<0,1,0>:UD, 4294967264u NoMask group0

If register allocation decides to spill v250, it will see this
instruction and say, "Oh no! The other components of v250 aren't set, so
I'd better add a fill before that instruction!"

But it gets even worse than that... if register coalesce decided to
merge two of these, the live range gets massively extended because the
writes don't fully initialize the value. This causes the need to spill
these registers in the first place.

Changing that instruction to SIMD16 on Xe2 or SIMD8 on other platforms
alleviates these issues.

shader-db:

Lunar Lake
total instructions in shared programs: 17118324 -> 17113191 (-0.03%)
instructions in affected programs: 93701 -> 88568 (-5.48%)
helped: 42 / HURT: 6

total cycles in shared programs: 895422566 -> 895079488 (-0.04%)
cycles in affected programs: 30111338 -> 29768260 (-1.14%)
helped: 35 / HURT: 40

total spills in shared programs: 3588 -> 3304 (-7.92%)
spills in affected programs: 285 -> 1 (-99.65%)
helped: 10 / HURT: 0

total fills in shared programs: 2218 -> 1663 (-25.02%)
fills in affected programs: 556 -> 1 (-99.82%)
helped: 10 / HURT: 0

Meteor Lake, DG2, Tiger Lake, and Ice Lake had similar results.  (Meteor Lake shown)
total instructions in shared programs: 20059218 -> 20053563 (-0.03%)
instructions in affected programs: 96938 -> 91283 (-5.83%)
helped: 43 / HURT: 6

total cycles in shared programs: 884174588 -> 883536475 (-0.07%)
cycles in affected programs: 22105268 -> 21467155 (-2.89%)
helped: 35 / HURT: 27

total spills in shared programs: 5032 -> 4679 (-7.02%)
spills in affected programs: 355 -> 2 (-99.44%)
helped: 12 / HURT: 0

total fills in shared programs: 4782 -> 4113 (-13.99%)
fills in affected programs: 671 -> 2 (-99.70%)
helped: 12 / HURT: 0

Skylake
total instructions in shared programs: 19097658 -> 19097665 (<.01%)
instructions in affected programs: 14202 -> 14209 (0.05%)
helped: 0 / HURT: 5

total cycles in shared programs: 862058109 -> 862058267 (<.01%)
cycles in affected programs: 3450244 -> 3450402 (<.01%)
helped: 7 / HURT: 11

fossil-db:

Lunar Lake
Totals:
Cycle count: 31439652246 -> 31439652272 (+0.00%)

Totals from 2 (0.00% of 707091) affected shaders:
Cycle count: 2602 -> 2628 (+1.00%)

No other Intel platforms had any fossil-db changes.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35721>
2025-06-26 17:59:47 +00:00
Eric Engestrom
99e8d804bf intel/compiler tests: fix variable type for getopt_long() return value
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
`getopt_long()` returns an `int`, not a `char`; putting the value in
a `char` before comparing it to `-1` was making the comparison always
fail, resulting in the invalid codepath taken that then fails with:

    option `-' is invalid: ignored

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34756>
2025-06-23 08:26:29 +00:00
Eric Engestrom
f545f9eed4 intel/compiler tests: fix "is there something after the options" check
cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34756>
2025-06-23 08:26:29 +00:00
Eric Engestrom
729922cdae intel/compiler tests: fix path-to-string conversion
cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34756>
2025-06-23 08:26:29 +00:00
Eric Engestrom
de6ab1beda intel/compiler tests: rewrite subprocess handling in run-test.py
`subprocess.Popen()` returns immediately, and the subprocess might not
have finished by the time `stdout` is read on the next line, spuriously
failing the tests.

`subprocess.check_output()` makes sure the output is available before
returning, solving this issue; it additionally raises an error if the
subprocess failed, giving a better error than a failed diff later in the
script.

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34756>
2025-06-23 08:26:29 +00:00
Georg Lehmann
9da23499ff compiler: add float8 glsl types
e4m3fn: 8bit floating point format with 4bit exponent, 3bit mantissa
        and no infinities (finite only)
e5m2:   8bit floating point format with 5bit exponent, 2bit mantissa
        and with infinities.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>
2025-06-23 07:59:24 +00:00
Rohan Garg
e103afe7be brw: run the nir_opt_offsets pass and set the maximum offset size
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Perf A/B testing on DG2: no changes
Perf A/B testing on BMG: +2.1% Blackops3, +1.5% Cyberpunk

DG2 stats (mostly insignificant):
Assassins Creed Valhalla:
 Totals from 1169 (55.67% of 2100) affected shaders:
 Instrs: 509237 -> 509215 (-0.00%)
 Cycle count: 30614325 -> 30607419 (-0.02%); split: -0.03%, +0.00%
 Non SSA regs after NIR: 83434 -> 85909 (+2.97%)

Blackops 3:
 Totals from 1045 (64.63% of 1617) affected shaders:
 Instrs: 527312 -> 527310 (-0.00%)
 Cycle count: 496912222 -> 496902846 (-0.00%); split: -0.00%, +0.00%
 Non SSA regs after NIR: 106883 -> 109095 (+2.07%)

Cyberpunk:
 Totals from 706 (56.03% of 1260) affected shaders:
 Instrs: 345976 -> 345974 (-0.00%); split: -0.00%, +0.00%
 Cycle count: 9775138 -> 9775472 (+0.00%); split: -0.00%, +0.00%
 Max live registers: 40295 -> 40297 (+0.00%)
 Non SSA regs after NIR: 93245 -> 94718 (+1.58%)

Fortnite:
 Totals from 4210 (55.98% of 7521) affected shaders:
 Instrs: 2205471 -> 2205469 (-0.00%)
 Cycle count: 91451040 -> 91450956 (-0.00%); split: -0.00%, +0.00%
 Non SSA regs after NIR: 952354 -> 961664 (+0.98%)

LNL stats (notable changes):
Assassins Creed Valhalla:
 Totals from 1684 (83.57% of 2015) affected shaders:
 Instrs: 774305 -> 764501 (-1.27%); split: -1.27%, +0.01%
 Cycle count: 58845842 -> 58699250 (-0.25%); split: -0.98%, +0.73%
 Spill count: 625 -> 638 (+2.08%)
 Fill count: 1490 -> 1503 (+0.87%)
 Scratch Memory Size: 41984 -> 44032 (+4.88%)
 Max live registers: 196424 -> 197561 (+0.58%); split: -0.10%, +0.68%

Blackops 3:
 Totals from 1125 (76.53% of 1470) affected shaders:
 Instrs: 781749 -> 773275 (-1.08%); split: -1.08%, +0.00%
 Subgroup size: 22896 -> 22912 (+0.07%)
 Cycle count: 659864454 -> 654641032 (-0.79%); split: -1.10%, +0.31%
 Max live registers: 116772 -> 116854 (+0.07%); split: -0.01%, +0.08%
 Non SSA regs after NIR: 172648 -> 168260 (-2.54%); split: -2.55%, +0.01%

Control:
 Totals from 378 (51.50% of 734) affected shaders:
 Instrs: 148184 -> 147544 (-0.43%)
 Cycle count: 6905200 -> 6913366 (+0.12%); split: -0.30%, +0.42%
 Max live registers: 41271 -> 41281 (+0.02%)
 Non SSA regs after NIR: 44964 -> 43868 (-2.44%); split: -2.45%, +0.01%

Cyberpunk:
 Totals from 1141 (92.46% of 1234) affected shaders:
 Instrs: 636744 -> 629333 (-1.16%)
 Subgroup size: 24256 -> 24272 (+0.07%)
 Cycle count: 24952258 -> 24801298 (-0.60%); split: -1.39%, +0.78%
 Max live registers: 125848 -> 126855 (+0.80%); split: -0.00%, +0.80%
 Non SSA regs after NIR: 127399 -> 119837 (-5.94%); split: -5.95%, +0.02%

Fortnite:
 Totals from 5497 (83.52% of 6582) affected shaders:
 Instrs: 4072831 -> 4041852 (-0.76%); split: -0.77%, +0.01%
 Subgroup size: 103296 -> 103312 (+0.02%)
 Cycle count: 133046874 -> 132789242 (-0.19%); split: -0.67%, +0.48%
 Spill count: 7218 -> 7254 (+0.50%); split: -0.33%, +0.83%
 Fill count: 11724 -> 11749 (+0.21%); split: -0.34%, +0.55%
 Scratch Memory Size: 591872 -> 599040 (+1.21%)
 Max live registers: 816530 -> 818522 (+0.24%); split: -0.01%, +0.26%
 Non SSA regs after NIR: 1610296 -> 1560284 (-3.11%); split: -3.11%, +0.00%

Hitman3:
 Totals from 4713 (92.39% of 5101) affected shaders:
 Instrs: 2731598 -> 2698224 (-1.22%)
 Cycle count: 186422098 -> 185472640 (-0.51%); split: -1.12%, +0.61%
 Spill count: 3244 -> 3242 (-0.06%)
 Fill count: 9937 -> 9933 (-0.04%)
 Max live registers: 585035 -> 589801 (+0.81%); split: -0.00%, +0.82%
 Non SSA regs after NIR: 347681 -> 324314 (-6.72%); split: -6.73%, +0.01%

Hogwarts Legacy:
 Totals from 930 (59.81% of 1555) affected shaders:
 Instrs: 464146 -> 459526 (-1.00%); split: -1.00%, +0.01%
 Subgroup size: 19104 -> 19120 (+0.08%)
 Cycle count: 24062460 -> 24078964 (+0.07%); split: -0.49%, +0.56%
 Spill count: 2068 -> 1964 (-5.03%); split: -5.22%, +0.19%
 Fill count: 2342 -> 2205 (-5.85%); split: -6.40%, +0.56%
 Scratch Memory Size: 147456 -> 141312 (-4.17%)
 Max live registers: 112384 -> 112787 (+0.36%); split: -0.08%, +0.44%
 Non SSA regs after NIR: 80293 -> 79161 (-1.41%); split: -1.72%, +0.32%

Metro Exodus:
 Totals from 29755 (78.62% of 37846) affected shaders:
 Instrs: 11495578 -> 11492951 (-0.02%); split: -0.02%, +0.00%
 Subgroup size: 644688 -> 644704 (+0.00%)
 Cycle count: 301572068 -> 301548054 (-0.01%); split: -0.03%, +0.02%
 Max live registers: 3369504 -> 3370454 (+0.03%); split: -0.00%, +0.03%
 Non SSA regs after NIR: 2476561 -> 2396090 (-3.25%); split: -3.27%, +0.02%

Red Dead Redemption 2:
 Totals from 4161 (78.61% of 5293) affected shaders:
 Instrs: 2428782 -> 2409032 (-0.81%); split: -0.82%, +0.00%
 Subgroup size: 85344 -> 85360 (+0.02%)
 Cycle count: 8514984142 -> 8533415324 (+0.22%); split: -0.02%, +0.23%
 Spill count: 4659 -> 4674 (+0.32%); split: -0.02%, +0.34%
 Fill count: 11236 -> 11231 (-0.04%); split: -0.19%, +0.14%
 Scratch Memory Size: 398336 -> 397312 (-0.26%)
 Max live registers: 473946 -> 475798 (+0.39%); split: -0.08%, +0.47%
 Non SSA regs after NIR: 616820 -> 567706 (-7.96%); split: -8.09%, +0.12%

Rise Of The Tomb Raider:
 Totals from 68 (46.58% of 146) affected shaders:
 Instrs: 28209 -> 27801 (-1.45%)
 Subgroup size: 1584 -> 1600 (+1.01%)
 Cycle count: 16182992 -> 16249364 (+0.41%); split: -0.97%, +1.38%
 Max live registers: 7320 -> 7296 (-0.33%); split: -0.38%, +0.05%
 Non SSA regs after NIR: 8438 -> 8207 (-2.74%); split: -2.82%, +0.08%

Spiderman Remastered:
 Totals from 6403 (93.87% of 6821) affected shaders:
 Instrs: 5662713 -> 5597949 (-1.14%); split: -1.28%, +0.14%
 Cycle count: 282861519016 -> 279806958122 (-1.08%); split: -1.26%, +0.18%
 Spill count: 61150 -> 60754 (-0.65%); split: -1.13%, +0.48%
 Fill count: 162597 -> 163190 (+0.36%); split: -0.84%, +1.21%
 Scratch Memory Size: 5834752 -> 5804032 (-0.53%); split: -0.70%, +0.18%
 Max live registers: 901926 -> 903820 (+0.21%); split: -0.01%, +0.22%
 Non SSA regs after NIR: 555053 -> 521016 (-6.13%); split: -6.14%, +0.01%

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:24 +00:00
Rohan Garg
8a5e062e5e brw: store the buffer offset for load/store intrinsics
This will later be encoded by the backend into the
LSC extended descriptor message.

Reworks:
  * Sagar: Add nir_intrinsic_ssbo_atomic_swap

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:24 +00:00
Rohan Garg
0186113640 brw: encode the offset into the message descriptor for Xe2
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:24 +00:00
Rohan Garg
937d37f0b1 brw: introduce MEMORY_LOGICAL_ADDRESS_OFFSET to encode address offsets
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:24 +00:00
Lionel Landwerlin
d5a58364b1 brw: add new helper for immediate integer register with type
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:24 +00:00
Lionel Landwerlin
16fca611d7 nir: add new intel ssbo intrinsics
Similar to ir3 ones, to optimize offsets in the backend.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:23 +00:00
Lionel Landwerlin
ba119c73c6 intel: replace RANGE_BASE by BASE for uniform block loads
We're not currently using RANGE_BASE and we'll use BASE for offset
optimizations on Xe2+.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:23 +00:00
Lionel Landwerlin
098249ba66 brw: print descriptor & extended descriptors
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:22 +00:00
Emma Anholt
cd981e27f7 intel/elk: Move wpos_w setup right into nir_intrinsic_load_frag_w.
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Given that the intrinsic will be CSEed at the NIR level, we don't need to
preemptively set it up at the top of the shader.  No change in HSW shader-db.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:43 +00:00
Emma Anholt
269fbcb144 intel/elk: Use pixel_z for gl_FragCoord.z on pre-gen6.
Unless I've seriously missed something, we have the Z in the payload
(which we can always request if we need access to it and it's not already
passed to us due other WM IZ settings).

total instructions in shared programs: 4408303 -> 4408186 (<.01%)
instructions in affected programs: 1164 -> 1047 (-10.05%)
total cycles in shared programs: 142485036 -> 142484566 (<.01%)
cycles in affected programs: 26820 -> 26350 (-1.75%)

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:43 +00:00
Emma Anholt
dc55b47a58 intel/elk: Move pre-gen6 smooth interpolation 1/w multiply to NIR.
NIR catches that if you're just doing something like adding two smooth
inputs, we can do the multiply once on the result instead of on each
input.  BRW shader-db results:

total instructions in shared programs: 4409146 -> 4408303 (-0.02%)
instructions in affected programs: 800761 -> 799918 (-0.11%)
total cycles in shared programs: 143203198 -> 142485036 (-0.50%)
cycles in affected programs: 79081682 -> 78363520 (-0.91%)
total sends in shared programs: 363044 -> 363042 (<.01%)
sends in affected programs: 33 -> 31 (-6.06%)

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:42 +00:00
Emma Anholt
fb9b2261a1 intel/elk: Move pre-gen6 gl_FragCoord.w -> interpolation lowering to NIR.
BRW shader-db:
total instructions in shared programs: 4409143 -> 4409146 (<.01%)
instructions in affected programs: 330 -> 333 (0.91%)

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:41 +00:00
Emma Anholt
17ab39fbf8 intel/elk: Fix some tabs in gen4 URB setup.
This formatted terribly in my editor, just use spaces.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:40 +00:00