Commit graph

4397 commits

Author SHA1 Message Date
Sviatoslav Peleshko
8d22eb960b brw/disasm: Fix Gfx11 3src-instructions dst register disassembly
The conversion from bit value to register file type is already done
by the brw_eu_inst_3src_a1_dst_reg_file in the FFC macro now, so doing it
again produced incorrect results.

Fixes: e7179232 ("intel/brw: Move encoding of Gfx11 3-src inside the inst helpers")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13141
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35960>
2025-07-08 19:49:09 +00:00
Daniel Schürmann
2c51a8870d nir: add nir_vectorize_cb callback parameter to nir_lower_phis_to_scalar()
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Similar to nir_lower_alu_width(), the callback can return the
desired number of components for a phi, or 0 for no lowering.

The previous behavior of nir_lower_phis_to_scalar() with lower_all=true
can be elicited via nir_lower_all_phis_to_scalar() while the previous
behavior with lower_all=false now corresponds to nir_lower_phis_to_scalar()
with NULL callback.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>
2025-07-08 15:33:59 +00:00
Marek Olšák
8def3f865d agx,freedreno,intel,lima,panfrost,svga,virgl,zink: fix supports_indirect_inputs
The GLSL compiler always lowers inputs to temps for VS and GS, so exclude
them from driver support because the GLSL compiler will no longer do that
unconditionally. Thus, indirect VS and GS inputs are completely untested
and broken in a lot of drivers.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35945>
2025-07-08 06:11:42 +00:00
Alyssa Rosenzweig
d31cb824df treewide: use VARYING_BIT_*
Some checks failed
macOS-CI / macOS-CI (dri) (push) Has been cancelled
macOS-CI / macOS-CI (xlib) (push) Has been cancelled
Via Coccinelle patch generated by the following Python:

  varys = [ "POS", "COL0", "COL1", "FOGC", "TEX0", "TEX1", "TEX2", "TEX3", "TEX4",
           "TEX5", "TEX6", "TEX7", "PSIZ", "BFC0", "BFC1", "EDGE", "CLIP_VERTEX",
           "CLIP_DIST0", "CLIP_DIST1", "CULL_DIST0", "CULL_DIST1", "PRIMITIVE_ID",
           "PRIMITIVE_COUNT", "LAYER", "VIEWPORT", "FACE",
           "PRIMITIVE_SHADING_RATE", "PNTC", "TESS_LEVEL_OUTER",
           "TESS_LEVEL_INNER", "PRIMITIVE_INDICES", "BOUNDING_BOX0",
           "BOUNDING_BOX1", "VIEWPORT_MASK", "CULL_PRIMITIVE" ]
  t = """
  @@
  @@

  -(1 << VARYING_SLOT_${V})
  +VARYING_BIT_${V}

  @@
  @@

  -BITFIELD_BIT(VARYING_SLOT_${V})
  +VARYING_BIT_${V}

  @@
  @@

  -(1ull << VARYING_SLOT_${V})
  +VARYING_BIT_${V}

  @@
  @@

  -BITFIELD64_BIT(VARYING_SLOT_${V})
  +VARYING_BIT_${V}

  """
  for v in varys:
      from mako.template import Template
      print(Template(t).render(V = v))

Closes: #13453
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> [panfrost, common]
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [broadcom]
Reviewed-by: Corentin Noël <corentin.noel@collabora.com> [virgl]
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> [zink]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35917>
2025-07-04 19:01:04 +00:00
Matt Turner
e6242fb958 brw: Handle bfloat16 dest and src0 operands for DPAS
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35320>
2025-07-02 20:06:59 +00:00
Caio Oliveira
c006bee22d brw: Don't use simd_select for BS shaders
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Since there's only one possible SIMD, don't need to use
the helpers to decide which one to compile.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35799>
2025-07-02 19:48:04 +00:00
Caio Oliveira
c733f07378 brw: Use the right width in brw_nir_apply_key for BS shaders
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Fixes: 23c7142cd6 ("anv: disable SIMD16 for RT shaders")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35798>
2025-07-02 15:32:23 +00:00
Lionel Landwerlin
343f3dd3c1 brw: fix non constant BTI accesses with offsets
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e103afe7be ("brw: run the nir_opt_offsets pass and set the maximum offset size")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35822>
2025-07-02 01:04:06 +03:00
Lionel Landwerlin
89f3ee4cb2 brw: remove debug printf
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Fixes: fcf4401824 ("brw: handle wa_18019110168 with independent shader compilation")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35815>
2025-06-29 12:39:03 +03:00
Lionel Landwerlin
a742b859bd anv: add support for handling wa_18019110168 with gfx-libs
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:35 +00:00
Lionel Landwerlin
fcf4401824 brw: handle wa_18019110168 with independent shader compilation
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:35 +00:00
Lionel Landwerlin
bc8d18aee2 brw: make a helper for vertex attribute offset computation
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:34 +00:00
Lionel Landwerlin
8fabcd754f brw: move primitive_id_index field in fs_msaa
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:34 +00:00
Lionel Landwerlin
6336cf0ea2 brw: store the remapping table for wa_18019110168 in constant data
That way it can be accessed at runtime.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:33 +00:00
Lionel Landwerlin
e1a7eb1718 brw: extract out attribute register remapping
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:33 +00:00
Lionel Landwerlin
5cc66e2c8d anv/brw: move Wa_18019110168 handling to backend
We simplify the implementation by assuming the worse case, copying
entire per-vertex regions if necessary.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:32 +00:00
Lionel Landwerlin
f0f4f9c566 brw: fix vertex attribute offset computation
The formula uses scalar indices (4bytes), not slots (16bytes).

We also incorrectly passed a scalar (vertex case) & slot (mesh case)
offset in the push constants. Use slots instead so that the value is
smaller and we can pack more stuff into fs_msaa_flags.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 18bbcf9a63 ("intel: introduce new VUE layout for separate compiled shader with mesh")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:31 +00:00
Lionel Landwerlin
4b5539a0cb brw: fix set_range on load_per_primitive_output
load intrinsics don't have range

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 18bbcf9a63 ("intel: introduce new VUE layout for separate compiled shader with mesh")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:31 +00:00
Matt Turner
6d786a0e4b brw: Use convert_cmat_intel intrinsic
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35616>
2025-06-27 01:26:22 +00:00
Matt Turner
41cd196886 brw: Implement convert_cmat_intel intrinsic
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35616>
2025-06-27 01:26:22 +00:00
Marek Olšák
1754507d49 nir: rename nir_lower_io_to_temporaries -> nir_lower_io_vars_to_temporaries
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:54 +00:00
Marek Olšák
1e03827c77 nir: rename nir_lower_io_arrays_to_elements -> nir_lower_io_array_vars_to_elements
same for *_no_indirects

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:54 +00:00
Marek Olšák
12df9b3def nir: rename nir_vectorize_tess_levels -> nir_lower_tess_level_array_vars_to_vec
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:50 +00:00
Marek Olšák
2aa94caf82 nir: rename nir_lower_io_to_vector -> nir_opt_vectorize_io_vars
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:50 +00:00
Marek Olšák
439d805291 nir: rename nir_lower_io_to_scalar_early -> nir_lower_io_vars_to_scalar
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:49 +00:00
Ian Romanick
b83f618fb2 brw: Fully write temporary destinations
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Consider an innocuous instruction like:

    and(1) v250:UD, g0.3<0,1,0>:UD, 4294967264u NoMask group0

If register allocation decides to spill v250, it will see this
instruction and say, "Oh no! The other components of v250 aren't set, so
I'd better add a fill before that instruction!"

But it gets even worse than that... if register coalesce decided to
merge two of these, the live range gets massively extended because the
writes don't fully initialize the value. This causes the need to spill
these registers in the first place.

Changing that instruction to SIMD16 on Xe2 or SIMD8 on other platforms
alleviates these issues.

shader-db:

Lunar Lake
total instructions in shared programs: 17118324 -> 17113191 (-0.03%)
instructions in affected programs: 93701 -> 88568 (-5.48%)
helped: 42 / HURT: 6

total cycles in shared programs: 895422566 -> 895079488 (-0.04%)
cycles in affected programs: 30111338 -> 29768260 (-1.14%)
helped: 35 / HURT: 40

total spills in shared programs: 3588 -> 3304 (-7.92%)
spills in affected programs: 285 -> 1 (-99.65%)
helped: 10 / HURT: 0

total fills in shared programs: 2218 -> 1663 (-25.02%)
fills in affected programs: 556 -> 1 (-99.82%)
helped: 10 / HURT: 0

Meteor Lake, DG2, Tiger Lake, and Ice Lake had similar results.  (Meteor Lake shown)
total instructions in shared programs: 20059218 -> 20053563 (-0.03%)
instructions in affected programs: 96938 -> 91283 (-5.83%)
helped: 43 / HURT: 6

total cycles in shared programs: 884174588 -> 883536475 (-0.07%)
cycles in affected programs: 22105268 -> 21467155 (-2.89%)
helped: 35 / HURT: 27

total spills in shared programs: 5032 -> 4679 (-7.02%)
spills in affected programs: 355 -> 2 (-99.44%)
helped: 12 / HURT: 0

total fills in shared programs: 4782 -> 4113 (-13.99%)
fills in affected programs: 671 -> 2 (-99.70%)
helped: 12 / HURT: 0

Skylake
total instructions in shared programs: 19097658 -> 19097665 (<.01%)
instructions in affected programs: 14202 -> 14209 (0.05%)
helped: 0 / HURT: 5

total cycles in shared programs: 862058109 -> 862058267 (<.01%)
cycles in affected programs: 3450244 -> 3450402 (<.01%)
helped: 7 / HURT: 11

fossil-db:

Lunar Lake
Totals:
Cycle count: 31439652246 -> 31439652272 (+0.00%)

Totals from 2 (0.00% of 707091) affected shaders:
Cycle count: 2602 -> 2628 (+1.00%)

No other Intel platforms had any fossil-db changes.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35721>
2025-06-26 17:59:47 +00:00
Eric Engestrom
99e8d804bf intel/compiler tests: fix variable type for getopt_long() return value
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
`getopt_long()` returns an `int`, not a `char`; putting the value in
a `char` before comparing it to `-1` was making the comparison always
fail, resulting in the invalid codepath taken that then fails with:

    option `-' is invalid: ignored

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34756>
2025-06-23 08:26:29 +00:00
Eric Engestrom
f545f9eed4 intel/compiler tests: fix "is there something after the options" check
cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34756>
2025-06-23 08:26:29 +00:00
Eric Engestrom
729922cdae intel/compiler tests: fix path-to-string conversion
cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34756>
2025-06-23 08:26:29 +00:00
Eric Engestrom
de6ab1beda intel/compiler tests: rewrite subprocess handling in run-test.py
`subprocess.Popen()` returns immediately, and the subprocess might not
have finished by the time `stdout` is read on the next line, spuriously
failing the tests.

`subprocess.check_output()` makes sure the output is available before
returning, solving this issue; it additionally raises an error if the
subprocess failed, giving a better error than a failed diff later in the
script.

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34756>
2025-06-23 08:26:29 +00:00
Georg Lehmann
9da23499ff compiler: add float8 glsl types
e4m3fn: 8bit floating point format with 4bit exponent, 3bit mantissa
        and no infinities (finite only)
e5m2:   8bit floating point format with 5bit exponent, 2bit mantissa
        and with infinities.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>
2025-06-23 07:59:24 +00:00
Rohan Garg
e103afe7be brw: run the nir_opt_offsets pass and set the maximum offset size
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Perf A/B testing on DG2: no changes
Perf A/B testing on BMG: +2.1% Blackops3, +1.5% Cyberpunk

DG2 stats (mostly insignificant):
Assassins Creed Valhalla:
 Totals from 1169 (55.67% of 2100) affected shaders:
 Instrs: 509237 -> 509215 (-0.00%)
 Cycle count: 30614325 -> 30607419 (-0.02%); split: -0.03%, +0.00%
 Non SSA regs after NIR: 83434 -> 85909 (+2.97%)

Blackops 3:
 Totals from 1045 (64.63% of 1617) affected shaders:
 Instrs: 527312 -> 527310 (-0.00%)
 Cycle count: 496912222 -> 496902846 (-0.00%); split: -0.00%, +0.00%
 Non SSA regs after NIR: 106883 -> 109095 (+2.07%)

Cyberpunk:
 Totals from 706 (56.03% of 1260) affected shaders:
 Instrs: 345976 -> 345974 (-0.00%); split: -0.00%, +0.00%
 Cycle count: 9775138 -> 9775472 (+0.00%); split: -0.00%, +0.00%
 Max live registers: 40295 -> 40297 (+0.00%)
 Non SSA regs after NIR: 93245 -> 94718 (+1.58%)

Fortnite:
 Totals from 4210 (55.98% of 7521) affected shaders:
 Instrs: 2205471 -> 2205469 (-0.00%)
 Cycle count: 91451040 -> 91450956 (-0.00%); split: -0.00%, +0.00%
 Non SSA regs after NIR: 952354 -> 961664 (+0.98%)

LNL stats (notable changes):
Assassins Creed Valhalla:
 Totals from 1684 (83.57% of 2015) affected shaders:
 Instrs: 774305 -> 764501 (-1.27%); split: -1.27%, +0.01%
 Cycle count: 58845842 -> 58699250 (-0.25%); split: -0.98%, +0.73%
 Spill count: 625 -> 638 (+2.08%)
 Fill count: 1490 -> 1503 (+0.87%)
 Scratch Memory Size: 41984 -> 44032 (+4.88%)
 Max live registers: 196424 -> 197561 (+0.58%); split: -0.10%, +0.68%

Blackops 3:
 Totals from 1125 (76.53% of 1470) affected shaders:
 Instrs: 781749 -> 773275 (-1.08%); split: -1.08%, +0.00%
 Subgroup size: 22896 -> 22912 (+0.07%)
 Cycle count: 659864454 -> 654641032 (-0.79%); split: -1.10%, +0.31%
 Max live registers: 116772 -> 116854 (+0.07%); split: -0.01%, +0.08%
 Non SSA regs after NIR: 172648 -> 168260 (-2.54%); split: -2.55%, +0.01%

Control:
 Totals from 378 (51.50% of 734) affected shaders:
 Instrs: 148184 -> 147544 (-0.43%)
 Cycle count: 6905200 -> 6913366 (+0.12%); split: -0.30%, +0.42%
 Max live registers: 41271 -> 41281 (+0.02%)
 Non SSA regs after NIR: 44964 -> 43868 (-2.44%); split: -2.45%, +0.01%

Cyberpunk:
 Totals from 1141 (92.46% of 1234) affected shaders:
 Instrs: 636744 -> 629333 (-1.16%)
 Subgroup size: 24256 -> 24272 (+0.07%)
 Cycle count: 24952258 -> 24801298 (-0.60%); split: -1.39%, +0.78%
 Max live registers: 125848 -> 126855 (+0.80%); split: -0.00%, +0.80%
 Non SSA regs after NIR: 127399 -> 119837 (-5.94%); split: -5.95%, +0.02%

Fortnite:
 Totals from 5497 (83.52% of 6582) affected shaders:
 Instrs: 4072831 -> 4041852 (-0.76%); split: -0.77%, +0.01%
 Subgroup size: 103296 -> 103312 (+0.02%)
 Cycle count: 133046874 -> 132789242 (-0.19%); split: -0.67%, +0.48%
 Spill count: 7218 -> 7254 (+0.50%); split: -0.33%, +0.83%
 Fill count: 11724 -> 11749 (+0.21%); split: -0.34%, +0.55%
 Scratch Memory Size: 591872 -> 599040 (+1.21%)
 Max live registers: 816530 -> 818522 (+0.24%); split: -0.01%, +0.26%
 Non SSA regs after NIR: 1610296 -> 1560284 (-3.11%); split: -3.11%, +0.00%

Hitman3:
 Totals from 4713 (92.39% of 5101) affected shaders:
 Instrs: 2731598 -> 2698224 (-1.22%)
 Cycle count: 186422098 -> 185472640 (-0.51%); split: -1.12%, +0.61%
 Spill count: 3244 -> 3242 (-0.06%)
 Fill count: 9937 -> 9933 (-0.04%)
 Max live registers: 585035 -> 589801 (+0.81%); split: -0.00%, +0.82%
 Non SSA regs after NIR: 347681 -> 324314 (-6.72%); split: -6.73%, +0.01%

Hogwarts Legacy:
 Totals from 930 (59.81% of 1555) affected shaders:
 Instrs: 464146 -> 459526 (-1.00%); split: -1.00%, +0.01%
 Subgroup size: 19104 -> 19120 (+0.08%)
 Cycle count: 24062460 -> 24078964 (+0.07%); split: -0.49%, +0.56%
 Spill count: 2068 -> 1964 (-5.03%); split: -5.22%, +0.19%
 Fill count: 2342 -> 2205 (-5.85%); split: -6.40%, +0.56%
 Scratch Memory Size: 147456 -> 141312 (-4.17%)
 Max live registers: 112384 -> 112787 (+0.36%); split: -0.08%, +0.44%
 Non SSA regs after NIR: 80293 -> 79161 (-1.41%); split: -1.72%, +0.32%

Metro Exodus:
 Totals from 29755 (78.62% of 37846) affected shaders:
 Instrs: 11495578 -> 11492951 (-0.02%); split: -0.02%, +0.00%
 Subgroup size: 644688 -> 644704 (+0.00%)
 Cycle count: 301572068 -> 301548054 (-0.01%); split: -0.03%, +0.02%
 Max live registers: 3369504 -> 3370454 (+0.03%); split: -0.00%, +0.03%
 Non SSA regs after NIR: 2476561 -> 2396090 (-3.25%); split: -3.27%, +0.02%

Red Dead Redemption 2:
 Totals from 4161 (78.61% of 5293) affected shaders:
 Instrs: 2428782 -> 2409032 (-0.81%); split: -0.82%, +0.00%
 Subgroup size: 85344 -> 85360 (+0.02%)
 Cycle count: 8514984142 -> 8533415324 (+0.22%); split: -0.02%, +0.23%
 Spill count: 4659 -> 4674 (+0.32%); split: -0.02%, +0.34%
 Fill count: 11236 -> 11231 (-0.04%); split: -0.19%, +0.14%
 Scratch Memory Size: 398336 -> 397312 (-0.26%)
 Max live registers: 473946 -> 475798 (+0.39%); split: -0.08%, +0.47%
 Non SSA regs after NIR: 616820 -> 567706 (-7.96%); split: -8.09%, +0.12%

Rise Of The Tomb Raider:
 Totals from 68 (46.58% of 146) affected shaders:
 Instrs: 28209 -> 27801 (-1.45%)
 Subgroup size: 1584 -> 1600 (+1.01%)
 Cycle count: 16182992 -> 16249364 (+0.41%); split: -0.97%, +1.38%
 Max live registers: 7320 -> 7296 (-0.33%); split: -0.38%, +0.05%
 Non SSA regs after NIR: 8438 -> 8207 (-2.74%); split: -2.82%, +0.08%

Spiderman Remastered:
 Totals from 6403 (93.87% of 6821) affected shaders:
 Instrs: 5662713 -> 5597949 (-1.14%); split: -1.28%, +0.14%
 Cycle count: 282861519016 -> 279806958122 (-1.08%); split: -1.26%, +0.18%
 Spill count: 61150 -> 60754 (-0.65%); split: -1.13%, +0.48%
 Fill count: 162597 -> 163190 (+0.36%); split: -0.84%, +1.21%
 Scratch Memory Size: 5834752 -> 5804032 (-0.53%); split: -0.70%, +0.18%
 Max live registers: 901926 -> 903820 (+0.21%); split: -0.01%, +0.22%
 Non SSA regs after NIR: 555053 -> 521016 (-6.13%); split: -6.14%, +0.01%

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:24 +00:00
Rohan Garg
8a5e062e5e brw: store the buffer offset for load/store intrinsics
This will later be encoded by the backend into the
LSC extended descriptor message.

Reworks:
  * Sagar: Add nir_intrinsic_ssbo_atomic_swap

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:24 +00:00
Rohan Garg
0186113640 brw: encode the offset into the message descriptor for Xe2
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:24 +00:00
Rohan Garg
937d37f0b1 brw: introduce MEMORY_LOGICAL_ADDRESS_OFFSET to encode address offsets
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:24 +00:00
Lionel Landwerlin
d5a58364b1 brw: add new helper for immediate integer register with type
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:24 +00:00
Lionel Landwerlin
16fca611d7 nir: add new intel ssbo intrinsics
Similar to ir3 ones, to optimize offsets in the backend.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:23 +00:00
Lionel Landwerlin
ba119c73c6 intel: replace RANGE_BASE by BASE for uniform block loads
We're not currently using RANGE_BASE and we'll use BASE for offset
optimizations on Xe2+.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:23 +00:00
Lionel Landwerlin
098249ba66 brw: print descriptor & extended descriptors
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:22 +00:00
Emma Anholt
cd981e27f7 intel/elk: Move wpos_w setup right into nir_intrinsic_load_frag_w.
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Given that the intrinsic will be CSEed at the NIR level, we don't need to
preemptively set it up at the top of the shader.  No change in HSW shader-db.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:43 +00:00
Emma Anholt
269fbcb144 intel/elk: Use pixel_z for gl_FragCoord.z on pre-gen6.
Unless I've seriously missed something, we have the Z in the payload
(which we can always request if we need access to it and it's not already
passed to us due other WM IZ settings).

total instructions in shared programs: 4408303 -> 4408186 (<.01%)
instructions in affected programs: 1164 -> 1047 (-10.05%)
total cycles in shared programs: 142485036 -> 142484566 (<.01%)
cycles in affected programs: 26820 -> 26350 (-1.75%)

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:43 +00:00
Emma Anholt
dc55b47a58 intel/elk: Move pre-gen6 smooth interpolation 1/w multiply to NIR.
NIR catches that if you're just doing something like adding two smooth
inputs, we can do the multiply once on the result instead of on each
input.  BRW shader-db results:

total instructions in shared programs: 4409146 -> 4408303 (-0.02%)
instructions in affected programs: 800761 -> 799918 (-0.11%)
total cycles in shared programs: 143203198 -> 142485036 (-0.50%)
cycles in affected programs: 79081682 -> 78363520 (-0.91%)
total sends in shared programs: 363044 -> 363042 (<.01%)
sends in affected programs: 33 -> 31 (-6.06%)

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:42 +00:00
Emma Anholt
fb9b2261a1 intel/elk: Move pre-gen6 gl_FragCoord.w -> interpolation lowering to NIR.
BRW shader-db:
total instructions in shared programs: 4409143 -> 4409146 (<.01%)
instructions in affected programs: 330 -> 333 (0.91%)

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:41 +00:00
Emma Anholt
17ab39fbf8 intel/elk: Fix some tabs in gen4 URB setup.
This formatted terribly in my editor, just use spaces.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:40 +00:00
Emma Anholt
9d7a016ed1 intel/elk: Retire the global float pixel_x/y values.
Nothing used them any more.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:40 +00:00
Emma Anholt
e1bf014b6e intel/elk: Reduce this->pixel_x/y usage in gfx4 interp setup.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:40 +00:00
Emma Anholt
241bc5da70 intel/elk: Use the pixel_coord UW x/y values for noncoherent FB reads.
No need to force generating the float cast just to turn it back to an int.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:39 +00:00
Emma Anholt
1134cdc198 intel/elk: Lower load_frag_coord to load_{pixel_coord,frag_coord_z/w} in NIR.
This moves some conversions to NIR that may get eliminated, and also
distinguishes gl_FragCoord.z/w loads at the shader info level so we don't
need to flag uses_src_depth/uses_src_w when only gl_FragCoord.xy get used
(as is typical).  This reduces thread payload setup on many shaders.
Also, interestingly, blorp shaders stop reserving space for z/w despite
not putting them in the payload (since PS_EXTRA isn't filled out for z/w).

HSW shader-db is noise:

total instructions in shared programs: 9942649 -> 9942997 (<.01%)
instructions in affected programs: 143167 -> 143515 (0.24%)

total cycles in shared programs: 314768862 -> 314299112 (-0.15%)
cycles in affected programs: 62951452 -> 62481702 (-0.75%)

LOST:   44
GAINED: 26

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:39 +00:00
Emma Anholt
88f1656133 intel/elk: Save the UW pixel x/y as a temp.
This will be used for representing gl_FragCoord in NIR and reducing
payload registers pushed.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:38 +00:00
Emma Anholt
5222c35924 intel/elk: Save the UW pixel x/y as a temp on gfx6+.
This will be used for representing gl_FragCoord in NIR and reducing
payload registers pushed.

HSW results:

total instructions in shared programs: 9940636 -> 9948574 (0.08%)
instructions in affected programs: 852560 -> 860498 (0.93%)

total cycles in shared programs: 314804525 -> 314900080 (0.03%)
cycles in affected programs: 39786599 -> 39882154 (0.24%)

LOST:   5
GAINED: 11
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:38 +00:00