Commit graph

39 commits

Author SHA1 Message Date
Kenneth Graunke
e5ed6f64d9 brw: Stop checking inst->is_send_from_grf() for g127 register hack
Every case but SHADER_OPCODE_SEND and SHADER_OPCODE_BARRIER will be
lowered to SEND before register allocation happens.  And the barrier
send has a null destination, so the restriction doesn't apply.

Note that this hack is for Gfx9 only, so we don't need to worry about
Xe3's SHADER_OPCODE_SEND_GATHER feature.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
2025-08-08 22:12:05 +00:00
Kenneth Graunke
b0eb90ddb1 brw: Assert that EOT is always SHADER_OPCODE_SEND on pre-Xe3
We used to have other opcodes as well, but we've since transitioned
entirely to logical send lowering prior to register allocation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34040>
2025-08-08 22:12:05 +00:00
Marek Olšák
db26597f8d intel: fork exec_node/list -> brw_exec_node/list as a private Intel utility
NIR is going to use exec_node/list without the C++ code, and may switch to
a different linked list implementation in the future.

GLSL is going to use ir_exec_node/list, which we want to keep private
for GLSL, so that we can change it easily.

Thus, it's better to fork the C++ version of list.h for Intel.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36425>
2025-07-31 20:23:02 +00:00
Antonio Ospite
ddf2aa3a4d build: avoid redefining unreachable() which is standard in C23
In the C23 standard unreachable() is now a predefined function-like
macro in <stddef.h>

See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in

And this causes build errors when building for C23:

-----------------------------------------------------------------------
In file included from ../src/util/log.h:30,
                 from ../src/util/log.c:30:
../src/util/macros.h:123:9: warning: "unreachable" redefined
  123 | #define unreachable(str)    \
      |         ^~~~~~~~~~~
In file included from ../src/util/macros.h:31:
/usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition
  456 | #define unreachable() (__builtin_unreachable ())
      |         ^~~~~~~~~~~
-----------------------------------------------------------------------

So don't redefine it with the same name, but use the name UNREACHABLE()
to also signify it's a macro.

Using a different name also makes sense because the behavior of the
macro was extending the one of __builtin_unreachable() anyway, and it
also had a different signature, accepting one argument, compared to the
standard unreachable() with no arguments.

This change improves the chances of building mesa with the C23 standard,
which for instance is the default in recent AOSP versions.

All the instances of the macro, including the definition, were updated
with the following command line:

  git grep -l '[^_]unreachable(' -- "src/**" | sort | uniq | \
  while read file; \
  do \
    sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \
  done && \
  sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>
2025-07-31 17:49:42 +00:00
Caio Oliveira
ac2b072312 brw: Add more specific brw_builder helpers
Replace uses of brw_builder::at() with various more descriptive
variants.  Use block pointer from instruction when possible.

A couple of special cases remained and will be handled in separate patches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34681>
2025-07-19 17:49:47 +00:00
Ian Romanick
f6da6399d7 brw/reg_allocate: Don't access out of bounds in non-debug builds
In debug builds, the assertion should be preferred as it will highlight
the actual problem. In non-debug builds, it is possible to fail register
allocation more gracefully. If the problem only occurs in, for example,
a SIMD32 version of a shader, the application may even continue to
function.

Closes: #13239
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36202>
2025-07-18 19:04:01 +00:00
Ian Romanick
b57bad1fd7 brw/reg_allocate: Check source / destination hazard for all larger SIMD
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
All platforms needs this check for SIMD32. Xe2+ do not need this for
SIMD16.

Also... delete some really stale comments about Gfx4/Gfx5. This compiler
doesn't even support those platforms.

No shader-db changes on any pre-Xe2 Intel platforms:

shader-db:

Lunar Lake
total instructions in shared programs: 17108867 -> 17108855 (<.01%)
instructions in affected programs: 35211 -> 35199 (-0.03%)
helped: 19 / HURT: 6

total cycles in shared programs: 885026794 -> 885805580 (0.09%)
cycles in affected programs: 140449880 -> 141228666 (0.55%)
helped: 903 / HURT: 1142

LOST:   0
GAINED: 25

fossil-db:

Lunar Lake
Totals:
Instrs: 208578317 -> 208574097 (-0.00%); split: -0.00%, +0.00%
Cycle count: 31268800798 -> 31259914590 (-0.03%); split: -0.10%, +0.07%
Spill count: 504472 -> 504102 (-0.07%); split: -0.09%, +0.02%
Fill count: 606581 -> 606079 (-0.08%); split: -0.13%, +0.05%
Scratch Memory Size: 35001344 -> 34957312 (-0.13%)

Totals from 60714 (8.59% of 706970) affected shaders:
Instrs: 48923370 -> 48919150 (-0.01%); split: -0.01%, +0.01%
Cycle count: 11830486210 -> 11821600002 (-0.08%); split: -0.27%, +0.20%
Spill count: 397150 -> 396780 (-0.09%); split: -0.12%, +0.02%
Fill count: 469651 -> 469149 (-0.11%); split: -0.17%, +0.06%
Scratch Memory Size: 25971712 -> 25927680 (-0.17%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35903>
2025-07-15 19:35:44 +00:00
Ian Romanick
7e98ca89f2 brw/reg_allocate: Adjust source / destination hazard conditions for broadcast
Broadcast selects one lane from the source to write to all the lanes
of the destination. This makes it possible for the first half to
overwrite the source used by the second half.

No shader-db changes on any Intel platform.

fossil-db:

Lunar Lake
Totals:
Instrs: 208705405 -> 208705374 (-0.00%); split: -0.00%, +0.00%
Cycle count: 31274597098 -> 31273711544 (-0.00%); split: -0.00%, +0.00%

Totals from 77 (0.01% of 707133) affected shaders:
Instrs: 220177 -> 220146 (-0.01%); split: -0.02%, +0.00%
Cycle count: 461694212 -> 460808658 (-0.19%); split: -0.33%, +0.14%

No fossil-db changes on any other Intel platforms.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35903>
2025-07-15 19:35:44 +00:00
Ian Romanick
67dc02acc2 brw/reg_allocate: Only add interference for the source with the hazard
shader-db:

Lunar Lake
total instructions in shared programs: 17105892 -> 17105732 (<.01%)
instructions in affected programs: 55720 -> 55560 (-0.29%)
helped: 29 / HURT: 24

total cycles in shared programs: 884342344 -> 884663448 (0.04%)
cycles in affected programs: 154776382 -> 155097486 (0.21%)
helped: 719 / HURT: 761

total spills in shared programs: 3278 -> 3262 (-0.49%)
spills in affected programs: 320 -> 304 (-5.00%)
helped: 4 /HURT: 0

total fills in shared programs: 1632 -> 1616 (-0.98%)
fills in affected programs: 368 -> 352 (-4.35%)
helped: 4 / HURT: 0

LOST:   3
GAINED: 4

No shader-db changes on any other Intel platforms.

fossil-db:

Lunar Lake
Totals:
Instrs: 208696275 -> 208692511 (-0.00%); split: -0.00%, +0.00%
Cycle count: 31325252074 -> 31274118190 (-0.16%); split: -0.27%, +0.11%
Spill count: 504809 -> 504472 (-0.07%); split: -0.07%, +0.01%
Fill count: 607047 -> 606581 (-0.08%); split: -0.08%, +0.01%
Scratch Memory Size: 35037184 -> 35001344 (-0.10%); split: -0.11%, +0.01%

Totals from 44135 (6.24% of 707112) affected shaders:
Instrs: 39570465 -> 39566701 (-0.01%); split: -0.01%, +0.00%
Cycle count: 11140437886 -> 11089304002 (-0.46%); split: -0.76%, +0.30%
Spill count: 279756 -> 279419 (-0.12%); split: -0.13%, +0.01%
Fill count: 354706 -> 354240 (-0.13%); split: -0.14%, +0.01%
Scratch Memory Size: 18758656 -> 18722816 (-0.19%); split: -0.20%, +0.01%

Meteor Lake, DG2, Tiger Lake, Ice Lake, and Skylake had similar results. (Meteor Lake shown)
Totals:
Cycle count: 25377247343 -> 25377246251 (-0.00%); split: -0.00%, +0.00%

Totals from 11 (0.00% of 806166) affected shaders:
Cycle count: 899080 -> 897988 (-0.12%); split: -0.48%, +0.36%

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35903>
2025-07-15 19:35:43 +00:00
Ian Romanick
4e05de7c3d brw/reg_allocate: Require SIMD32 for destination / source interference on Xe2
No platforms other than Lunar Lake were affected in shader-db or
fossil-db for obvious reasons.

shader-db:

Lunar Lake
total instructions in shared programs: 17070074 -> 17069908 (<.01%)
instructions in affected programs: 151939 -> 151773 (-0.11%)
helped: 61 / HURT: 60

total cycles in shared programs: 891338314 -> 880188516 (-1.25%)
cycles in affected programs: 550482120 -> 539332322 (-2.03%)
helped: 8053 / HURT: 7183

total spills in shared programs: 3294 -> 3278 (-0.49%)
spills in affected programs: 138 -> 122 (-11.59%)
helped: 8 / HURT: 0

total fills in shared programs: 1653 -> 1632 (-1.27%)
fills in affected programs: 212 -> 191 (-9.91%)
helped: 8 / HURT: 0

LOST:   96
GAINED: 70

fossil-db:

Lunar Lake
Totals:
Instrs: 208555066 -> 208509387 (-0.02%); split: -0.03%, +0.00%
Cycle count: 31487691872 -> 31318442816 (-0.54%); split: -0.88%, +0.34%
Spill count: 508701 -> 504809 (-0.77%); split: -0.86%, +0.10%
Fill count: 612583 -> 607047 (-0.90%); split: -1.03%, +0.13%
Scratch Memory Size: 35311616 -> 35037184 (-0.78%); split: -0.81%, +0.04%

Totals from 214417 (30.33% of 706852) affected shaders:
Instrs: 123732970 -> 123687291 (-0.04%); split: -0.04%, +0.01%
Cycle count: 27410928904 -> 27241679848 (-0.62%); split: -1.01%, +0.39%
Spill count: 452458 -> 448566 (-0.86%); split: -0.97%, +0.11%
Fill count: 550991 -> 545455 (-1.00%); split: -1.15%, +0.14%
Scratch Memory Size: 31138816 -> 30864384 (-0.88%); split: -0.92%, +0.04%

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35903>
2025-07-15 19:35:43 +00:00
Ian Romanick
e9ae997ffc brw: Only apply GRF 127 send workaround to Gfx9
The portion of the Bspec dedicated to Gfx6-Gfx11 says that this
workaround applies to "Pre-CNL" (with CNL being Gfx10). There is no
mention of this workaround in the sections for Xe or Xe2.

No shader-db or fossil-db changes on Skylake or older Intel platforms.

shader-db:

Lunar Lake, Meteor Lake, DG2, Tiger Lake, and Ice Lake (Lunar Lake shown)
total instructions in shared programs: 17107031 -> 17107027 (<.01%)
instructions in affected programs: 32182 -> 32178 (-0.01%)
helped: 16 / HURT: 14

total cycles in shared programs: 895016760 -> 894975410 (<.01%)
cycles in affected programs: 312774834 -> 312733484 (-0.01%)
helped: 9279 / HURT: 8091

LOST:   40
GAINED: 33

The pre-Xe2 platforms had a lot more lost / gained shaders. This appears
to be due to churn in the cycle counts and the SIMD32 heuristic.

fossil-db:

Lunar Lake
Totals:
Instrs: 208667436 -> 208671853 (+0.00%); split: -0.00%, +0.01%
Subgroup size: 14241168 -> 14241200 (+0.00%)
Cycle count: 31495149690 -> 31481397970 (-0.04%); split: -0.17%, +0.13%
Spill count: 508467 -> 508701 (+0.05%); split: -0.10%, +0.14%
Fill count: 611979 -> 612583 (+0.10%); split: -0.07%, +0.17%
Scratch Memory Size: 35288064 -> 35311616 (+0.07%); split: -0.07%, +0.14%

Totals from 205773 (29.10% of 707019) affected shaders:
Instrs: 103153541 -> 103157958 (+0.00%); split: -0.01%, +0.01%
Subgroup size: 4563584 -> 4563616 (+0.00%)
Cycle count: 12979963010 -> 12966211290 (-0.11%); split: -0.42%, +0.32%
Spill count: 494741 -> 494975 (+0.05%); split: -0.10%, +0.15%
Fill count: 597988 -> 598592 (+0.10%); split: -0.07%, +0.17%
Scratch Memory Size: 33351680 -> 33375232 (+0.07%); split: -0.08%, +0.15%

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 233063764 -> 233057897 (-0.00%); split: -0.01%, +0.00%
Subgroup size: 9892840 -> 9892856 (+0.00%)
Cycle count: 25387597341 -> 25373885583 (-0.05%); split: -0.36%, +0.31%
Spill count: 518469 -> 517940 (-0.10%); split: -0.19%, +0.09%
Fill count: 559444 -> 558537 (-0.16%); split: -0.29%, +0.13%
Scratch Memory Size: 19694592 -> 19658752 (-0.18%); split: -0.21%, +0.03%
Max dispatch width: 7135248 -> 7131672 (-0.05%); split: +0.13%, -0.18%

Totals from 301996 (37.49% of 805603) affected shaders:
Instrs: 144535999 -> 144530132 (-0.00%); split: -0.01%, +0.01%
Subgroup size: 3768528 -> 3768544 (+0.00%)
Cycle count: 18687102311 -> 18673390553 (-0.07%); split: -0.50%, +0.42%
Spill count: 515687 -> 515158 (-0.10%); split: -0.20%, +0.09%
Fill count: 557638 -> 556731 (-0.16%); split: -0.29%, +0.13%
Scratch Memory Size: 18662400 -> 18626560 (-0.19%); split: -0.22%, +0.03%
Max dispatch width: 2029872 -> 2026296 (-0.18%); split: +0.44%, -0.62%

Tiger Lake
Totals:
Instrs: 238813279 -> 238766482 (-0.02%); split: -0.04%, +0.02%
Subgroup size: 9851320 -> 9851328 (+0.00%)
Cycle count: 23668877036 -> 23646286421 (-0.10%); split: -0.51%, +0.42%
Spill count: 559060 -> 554241 (-0.86%); split: -1.12%, +0.26%
Fill count: 595926 -> 591843 (-0.69%); split: -1.46%, +0.78%
Scratch Memory Size: 19929088 -> 19764224 (-0.83%); split: -1.19%, +0.36%
Max dispatch width: 7102184 -> 7101840 (-0.00%); split: +0.13%, -0.13%

Totals from 284125 (35.42% of 802235) affected shaders:
Instrs: 144695094 -> 144648297 (-0.03%); split: -0.06%, +0.03%
Subgroup size: 3567312 -> 3567320 (+0.00%)
Cycle count: 11303753658 -> 11281163043 (-0.20%); split: -1.07%, +0.87%
Spill count: 554624 -> 549805 (-0.87%); split: -1.13%, +0.26%
Fill count: 592252 -> 588169 (-0.69%); split: -1.47%, +0.78%
Scratch Memory Size: 19553280 -> 19388416 (-0.84%); split: -1.21%, +0.37%
Max dispatch width: 1895488 -> 1895144 (-0.02%); split: +0.48%, -0.50%

Ice Lake
Totals:
Instrs: 239034316 -> 239049108 (+0.01%); split: -0.03%, +0.04%
Subgroup size: 9926440 -> 9926448 (+0.00%)
Cycle count: 24944253156 -> 24919967386 (-0.10%); split: -0.25%, +0.15%
Spill count: 575498 -> 571612 (-0.68%); split: -1.18%, +0.51%
Fill count: 709760 -> 716665 (+0.97%); split: -1.31%, +2.28%
Scratch Memory Size: 20699136 -> 20599808 (-0.48%); split: -1.45%, +0.97%
Max dispatch width: 7140856 -> 7143568 (+0.04%); split: +0.15%, -0.12%

Totals from 233451 (29.01% of 804669) affected shaders:
Instrs: 127440610 -> 127455402 (+0.01%); split: -0.07%, +0.08%
Subgroup size: 2835784 -> 2835792 (+0.00%)
Cycle count: 11818511030 -> 11794225260 (-0.21%); split: -0.53%, +0.32%
Spill count: 559557 -> 555671 (-0.69%); split: -1.22%, +0.52%
Fill count: 694460 -> 701365 (+0.99%); split: -1.34%, +2.33%
Scratch Memory Size: 19774464 -> 19675136 (-0.50%); split: -1.52%, +1.02%
Max dispatch width: 1602736 -> 1605448 (+0.17%); split: +0.69%, -0.52%

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35903>
2025-07-15 19:35:42 +00:00
Caio Oliveira
887642b0f2 intel: Add INTEL_DEBUG=no-vrt
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Add support for disabling the VRT (Variable Register Thread) feature.
The strategy here is to force the old BRW_MAX_GRF limit for the
register allocator (locks the upper limit) and make sure
ptl_register_blocks() always return that amount of blocks (locks
the lower limit).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35781>
2025-07-13 21:11:02 +00:00
Caio Oliveira
80fb555718 brw: Fix MAD instruction usage in spilling logic
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The intention here is to build a SIMD8 value, that will be expanded
as needed -- just like the SHL/ADD case, but with a single instruction.

Found when the was triggering invalid MAD with SIMD32 (that gets compressed)
*and* with overlapping destination and source *and* which would cause
conflict when divided into two SIMD16.

Fixes: 338273dedd ("brw/reg_allocate: Optimize spill offset calculation using integer MAD")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35302>
2025-06-06 15:31:50 +00:00
Ian Romanick
338273dedd brw/reg_allocate: Optimize spill offset calculation using integer MAD
Gfx12.5 and later allow the use of two 16-bit immediate values in
integer MAD. Gfx11 and Gfx12 allow a single immediate for integer MAD,
but that is not helpful where.

v2: brw_reg_alloc::build_lane_offsets is only used on Gfx12.5+, so the
check around using integer MAD is unnecessary.

No shader-db or fossil-db changes on any pre-Gfx12.5 platforms.

shader-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
total instructions in shared programs: 17119962 -> 17118441 (<.01%)
instructions in affected programs: 65398 -> 63877 (-2.33%)
helped: 32 / HURT: 0

total cycles in shared programs: 895433316 -> 895425578 (<.01%)
cycles in affected programs: 13437376 -> 13429638 (-0.06%)
helped: 30 / HURT: 2

fossil-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
Totals:
Instrs: 210052706 -> 209550074 (-0.24%)
Cycle count: 31486266412 -> 31436238696 (-0.16%); split: -0.16%, +0.00%

Totals from 7081 (1.00% of 707082) affected shaders:
Instrs: 16864614 -> 16361982 (-2.98%)
Cycle count: 6323185782 -> 6273158066 (-0.79%); split: -0.79%, +0.00%

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34886>
2025-05-09 21:31:09 +00:00
Ian Romanick
3db8dbfdc3 brw/reg_allocate: Optimize spill offset calculation using more SIMD8
Re-associate the calculation. The current calcuation is

    ((lane + zero_or_8) << 2) + offset

The first addition is SIMD8, and the shift and second addition are
SIMD16. By switching to

    ((lane << 2) + offset) + zero_or_32

All operations are SIMD8.

The SHL operates directly on the UW 0x76543210UV value, and that
eliminates the MOV to expand the UW to UD.

v2: Switch to alternate method. Update for SIMD32 on Xe2.

No shader-db or fossil-db changes on any pre-Gfx12.5 platforms.

shader-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
total instructions in shared programs: 17121519 -> 17119962 (<.01%)
instructions in affected programs: 73208 -> 71651 (-2.13%)
helped: 36
HURT: 0
helped stats (abs) min: 1 max: 129 x̄: 43.25 x̃: 56
helped stats (rel) min: 0.05% max: 4.92% x̄: 2.50% x̃: 2.79%
95% mean confidence interval for instructions value: -56.02 -30.48
95% mean confidence interval for instructions %-change: -3.24% -1.75%
Instructions are helped.

total cycles in shared programs: 895450146 -> 895433316 (<.01%)
cycles in affected programs: 13709400 -> 13692570 (-0.12%)
helped: 31
HURT: 2
helped stats (abs) min: 26 max: 1654 x̄: 543.10 x̃: 672
helped stats (rel) min: <.01% max: 3.43% x̄: 0.43% x̃: 0.51%
HURT stats (abs)   min: 2 max: 4 x̄: 3.00 x̃: 3
HURT stats (rel)   min: <.01% max: <.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -652.42 -367.58
95% mean confidence interval for cycles %-change: -0.61% -0.19%
Cycles are helped.

fossil-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
Totals:
Instrs: 210566294 -> 210052706 (-0.24%)
Cycle count: 31582309052 -> 31486266412 (-0.30%); split: -0.30%, +0.00%

Totals from 7091 (1.00% of 707082) affected shaders:
Instrs: 17408115 -> 16894527 (-2.95%)
Cycle count: 6443785290 -> 6347742650 (-1.49%); split: -1.49%, +0.00%

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34886>
2025-05-09 21:31:09 +00:00
Caio Oliveira
7457c4ecfd brw: Make brw_range use half-open ranges
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:49 +00:00
Caio Oliveira
6509f8139d brw: Use brw_range::last() to explicit get the last valid IP
This is a preparation to change what is stored in brw_range::end.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:49 +00:00
Caio Oliveira
0b4a3c0ff6 brw: Use brw_range to store VGRF ranges
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:49 +00:00
Caio Oliveira
e644b42e59 brw: Use brw_range when operating with live ranges
Makes the intention of some comparisons clearer by using the named
helper functions.  Add commentary when the straightforward range is not
the one used, e.g. VGRF interference.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:49 +00:00
Caio Oliveira
f56a5cf1eb brw: Use brw_range in IP ranges analysis
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:49 +00:00
Caio Oliveira
7ae638c0fe brw: Add brw_builder::uniform()
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34355>
2025-04-04 23:07:21 +00:00
Caio Oliveira
a6b0783375 brw: Use brw_ip_ranges in scheduling / regalloc
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>
2025-03-29 00:25:51 +00:00
Caio Oliveira
fd6045cca9 brw: Track total_instructions in a shader
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>
2025-03-29 00:25:50 +00:00
Lionel Landwerlin
c60180ba63 brw: fix spilling for Xe2+
The problem occurs with a series of instructions build the subgroup
invocation value :

mov(8)          g23<1>UW        0x76543210V
add(8)          g23.8<1>UW      g23<8,8,1>UW    0x0008UW
add(16)         g23.16<1>UW     g23<16,16,1>UW  0x0010UW

Our register spilling code operates on physical registers (64B on
Xe2+) and using the brw_inst::is_partial_write() helper only considers
32B registers. So the spiller doesn't see that the add(16) instruction
is doing a partial write and ends up discarding the previous value.

You can reproduce the issue by running a test like :

INTEL_DEBUG=spill_fs ./deqp-vk -n dEQP-VK.compute.pipeline.cooperative_matrix.khr_a.subgroupscope.constant.uint8_uint8.buffer.rowmajor.linear

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: aa494cbacf ("brw: align spilling offsets to physical register sizes")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33642>
2025-03-13 15:29:22 +00:00
Caio Oliveira
32e562ae01 brw: Simplify brw_builder "insert before inst" constructor
Since brw_inst now has the block it belongs and the block can
reach the shader, the only necessary information to create a
builder is the brw_inst itself.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>
2025-03-06 23:33:38 +00:00
Kenneth Graunke
88309a9818 brw: Rename shared function enums for clarity
Our name for this enum was brw_message_target, but it's better known as
shared function ID or SFID.  Call it brw_sfid to make it easier to find.

Now that brw only supports Gfx9+, we don't particularly care whether
SFIDs were introduced on Gfx4, Gfx6, or Gfx7.5.  Also, the LSC SFIDs
were confusingly tagged "GFX12" but aren't available on Gfx12.0; they
were introduced with Alchemist/Meteorlake.

GFX6_SFID_DATAPORT_SAMPLER_CACHE in particular was confusing.  It sounds
like the SFID to use for the sampler on Gfx6+, however it has nothing to
do with the sampler at all.  BRW_SFID_SAMPLER remains the sampler SFID.
On Haswell, we ran out of messages on the main data cache data port, and
so they introduced two additional ones, for more messages.  The modern
Tigerlake PRMs simply call these DP_DC0, DP_DC1, and DP_DC2.  I think
the "sampler" name came from some idea about reorganizing messages that
never materialized (instead, the LSC came as a much larger cleanup).

Recently we've adopted the term "HDC" for the legacy data cluster, as
opposed to "LSC" for the modern Load/Store Cache.  To make clear which
SFIDs target the legacy HDC dataports, we use BRW_SFID_HDC0/1/2.

We were also citing the G45, Sandybridge, and Ivybridge PRMs for a
compiler that supports none of those platforms.  Cite modern docs.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33650>
2025-02-27 08:49:24 +00:00
Caio Oliveira
ff44f4d278 intel/brw: Update outdated comments
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32536>
2025-02-11 09:13:28 +00:00
Caio Oliveira
5c55b29d1a intel/brw: Rename a few remaining functions to remove fs prefix
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32536>
2025-02-11 09:13:28 +00:00
Caio Oliveira
cf3bb77224 intel/brw: Rename fs_visitor to brw_shader
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32536>
2025-02-11 09:13:28 +00:00
Caio Oliveira
352a63122f intel/brw: Rename files brw_fs.cpp/h to brw_shader.cpp/h
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32536>
2025-02-11 09:13:28 +00:00
Caio Oliveira
f82bcd56fc intel/brw: Add functions to allocate VGRF space
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33334>
2025-02-06 08:33:03 -08:00
Caio Oliveira
ea87bab4ce intel/brw: Remove 'using namespace brw' directives
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33418>
2025-02-06 07:58:55 -08:00
Caio Oliveira
1ade9a05d8 intel/brw: Use brw prefix instead of namespace for analysis implementations
Also drop the 'fs' prefix when applicable.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33048>
2025-02-05 21:47:07 +00:00
Caio Oliveira
2b92eb0b2c intel/brw: Use brw prefix instead of namespace for dep analysis enum
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33048>
2025-02-05 21:47:07 +00:00
Caio Oliveira
d59bd421a2 intel/brw: Rename fs_inst to brw_inst
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33114>
2025-01-31 00:57:21 +00:00
Francisco Jerez
d03eac3133 intel/brw/xe3+: Disable round-robin allocation heuristic on Xe3+.
Xe3+ benefits from packing register allocations tightly in order to
make optimal use of the GRF space.  The round-robin heuristic
previously in use often causes the whole GRF space to be used even if
register pressure is substantially lower, which would severely
decrease thread-level parallelism on Xe3+.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32664>
2025-01-29 23:39:32 +00:00
Francisco Jerez
8d2331fe4b intel/brw/xe3: Extend regalloc sets to maximum Xe3 GRF size.
Extend our regalloc sets to 256 registers to match the maximum
capacity of the GRF file on Gfx30.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32664>
2025-01-29 23:39:32 +00:00
Caio Oliveira
fb09dac988 intel/brw: Remove 'fs' prefix from reg alloc code
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33112>
2025-01-21 07:33:49 -08:00
Caio Oliveira
62dd470d0a intel/brw: Rename brw_fs_reg_allocate.cpp to brw_reg_allocate.cpp
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33112>
2025-01-21 07:33:49 -08:00
Renamed from src/intel/compiler/brw_fs_reg_allocate.cpp (Browse further)