Commit graph

7 commits

Author SHA1 Message Date
Ian Romanick
d1614cd6db brw/cmod: Don't propagate from CMP to ADD if there is a write between
If either source of the CMP is modified before an appropriate ADD is
found, the ADD and the CMP will not have the same result.

shader-db:

Lunar Lake
total instructions in shared programs: 17098815 -> 17098818 (<.01%)
instructions in affected programs: 1187 -> 1190 (0.25%)
helped: 0 / HURT: 3

total cycles in shared programs: 876858960 -> 876858968 (<.01%)
cycles in affected programs: 6878 -> 6886 (0.12%)
helped: 0 / HURT: 1

Meteor Lake, DG2, Tiger Lake, Ice Lake, and Skylake had similar results. (Meteor Lake shown)
total instructions in shared programs: 20034973 -> 20034984 (<.01%)
instructions in affected programs: 4599 -> 4610 (0.24%)
helped: 0 / HURT: 11

total cycles in shared programs: 881033088 -> 881033108 (<.01%)
cycles in affected programs: 57872 -> 57892 (0.03%)
helped: 0 / HURT: 5

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 918873064 -> 918873269 (+0.00%)
CodeSize: 14747338416 -> 14747339360 (+0.00%); split: -0.00%, +0.00%
Cycle count: 104141836677 -> 104141840371 (+0.00%); split: -0.00%, +0.00%

Totals from 205 (0.01% of 2011421) affected shaders:
Instrs: 290415 -> 290620 (+0.07%)
CodeSize: 4280704 -> 4281648 (+0.02%); split: -0.01%, +0.03%
Cycle count: 18166526 -> 18170220 (+0.02%); split: -0.00%, +0.02%

Closes: #14874
Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39967>
2026-02-19 21:28:54 +00:00
Ian Romanick
08d71730ca brw/cmod: Propagate to an instruction with same source
Detect cases like

    mov.nz.f0.0(8)  null<1>D       g66<8,8,1>D
    (+f0.0) sel(8)  g123<1>UD      g87<8,8,1>UD   g84<8,8,1>UD
    mov.nz.f0.0(8)  null<1>D       g66<8,8,1>D
    (+f0.0) sel(8)  g124<1>UD      g88<8,8,1>UD   g85<8,8,1>UD

Either MOV instruction could also be an equivalent CMP.

v2: Require no predicate, groups match, and flags written match.

v3: Add some more unit tests. Suggested by Caio.

shader-db:

All Intel platforms had similar results. (Lunar Lake shown)
total instructions in shared programs: 17203627 -> 17203590 (<.01%)
instructions in affected programs: 51432 -> 51395 (-0.07%)
helped: 37 / HURT: 0

total cycles in shared programs: 879884982 -> 879884670 (<.01%)
cycles in affected programs: 6014730 -> 6014418 (<.01%)
helped: 25 / HURT: 4

fossil-db:
Lunar Lake
Totals:
Instrs: 925092938 -> 925071952 (-0.00%); split: -0.00%, +0.00%
Cycle count: 105972157149 -> 105966120894 (-0.01%); split: -0.01%, +0.00%
Spill count: 3423592 -> 3423582 (-0.00%)
Fill count: 4876743 -> 4877121 (+0.01%); split: -0.00%, +0.01%
Max live registers: 193525293 -> 193525251 (-0.00%)
Max dispatch width: 49047056 -> 49047088 (+0.00%); split: +0.00%, -0.00%

Totals from 17714 (0.88% of 2018791) affected shaders:
Instrs: 56708169 -> 56687183 (-0.04%); split: -0.04%, +0.00%
Cycle count: 4560530879 -> 4554494624 (-0.13%); split: -0.15%, +0.01%
Spill count: 434846 -> 434836 (-0.00%)
Fill count: 807443 -> 807821 (+0.05%); split: -0.02%, +0.07%
Max live registers: 4332542 -> 4332500 (-0.00%)
Max dispatch width: 295248 -> 295280 (+0.01%); split: +0.02%, -0.01%

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 995075628 -> 995051291 (-0.00%); split: -0.00%, +0.00%
Cycle count: 92060967154 -> 92059311640 (-0.00%); split: -0.00%, +0.00%
Spill count: 3664664 -> 3664675 (+0.00%); split: -0.00%, +0.00%
Fill count: 4961929 -> 4961874 (-0.00%); split: -0.00%, +0.00%
Max live registers: 121480292 -> 121480184 (-0.00%)
Max dispatch width: 37947528 -> 37947496 (-0.00%)

Totals from 20569 (0.90% of 2278279) affected shaders:
Instrs: 57437989 -> 57413652 (-0.04%); split: -0.04%, +0.00%
Cycle count: 4297505238 -> 4295849724 (-0.04%); split: -0.06%, +0.03%
Spill count: 487508 -> 487519 (+0.00%); split: -0.00%, +0.00%
Fill count: 869228 -> 869173 (-0.01%); split: -0.01%, +0.00%
Max live registers: 2413028 -> 2412920 (-0.00%)
Max dispatch width: 239280 -> 239248 (-0.01%)

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
Totals:
Instrs: 1012570598 -> 1012546137 (-0.00%); split: -0.00%, +0.00%
Cycle count: 85579989052 -> 85589116671 (+0.01%); split: -0.00%, +0.01%
Spill count: 3901755 -> 3901748 (-0.00%)
Fill count: 6799383 -> 6799367 (-0.00%)
Max live registers: 122288761 -> 122288658 (-0.00%)

Totals from 20595 (0.90% of 2280449) affected shaders:
Instrs: 57764192 -> 57739731 (-0.04%); split: -0.04%, +0.00%
Cycle count: 3899898675 -> 3909026294 (+0.23%); split: -0.04%, +0.27%
Spill count: 481262 -> 481255 (-0.00%)
Fill count: 1057996 -> 1057980 (-0.00%)
Max live registers: 2412395 -> 2412292 (-0.00%)

Skylake
Totals:
Instrs: 516619178 -> 516617390 (-0.00%)
Cycle count: 57593545602 -> 57592502019 (-0.00%); split: -0.00%, +0.00%
Fill count: 860403 -> 860402 (-0.00%)
Max live registers: 87553761 -> 87553649 (-0.00%)

Totals from 1357 (0.08% of 1730068) affected shaders:
Instrs: 3575640 -> 3573852 (-0.05%)
Cycle count: 1772148559 -> 1771104976 (-0.06%); split: -0.06%, +0.00%
Fill count: 68917 -> 68916 (-0.00%)
Max live registers: 131237 -> 131125 (-0.09%)

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>
2025-12-18 15:15:20 -08:00
Ian Romanick
24cd8aa3b8 brw/cmod: Allow FIXED_GRF
Later commits will call cmod prop after register allocation. At that
time, there is only FIXED_GRF.

No shader-db or fossil-db changes on any Intel platform.

v2: FIXED_GRF uses subnr instead of offset. Add a unit test to
demonstrate the issue.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>
2025-12-18 15:15:20 -08:00
Ian Romanick
ba30794847 brw/cmod: Don't propagate between instructions in different groups
The group implicity selects which flags the instruction can write. This
was discovered while working on another set of changes that could change
some logical operations into predicated MOV instructions.

Prevents regressions later in the series in
dEQP-VK.graphicsfuzz.cov-loop-fragcoord-identical-condition.

No shader-db or fossil-db changes on any Intel platform.

v2: Update the comment in the test case. Suggested by Caio.

Fixes: 95ac3b1dae ("i965/fs: don't propagate cmod when the exec sizes differ")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38315>
2025-12-18 15:15:20 -08:00
Lionel Landwerlin
efcba73b49 brw: switch to new sampler payload description scheme
Instead of having abstracted opcodes, we target directly the HW format
at the NIR translation.

The payload description gives us the order of the payload sources (we
can use that for pretty printing) and we don't have to have a
complicated scheme in the logical send lowering for the ordering. All
we have to do is build the header if needed as well as the descriptors.

PTL Fossil-db stats:
 Totals from 66759 (13.54% of 492917) affected shaders:
 Instrs: 44289221 -> 43957404 (-0.75%); split: -0.81%, +0.06%
 Send messages: 2050378 -> 2042607 (-0.38%)
 Cycle count: 3878874713 -> 3712848434 (-4.28%); split: -4.44%, +0.16%
 Max live registers: 8773179 -> 8770104 (-0.04%); split: -0.06%, +0.03%
 Max dispatch width: 1677408 -> 1707952 (+1.82%); split: +1.85%, -0.03%
 Non SSA regs after NIR: 11407821 -> 11421041 (+0.12%); split: -0.03%, +0.15%
 GRF registers: 5686983 -> 5838785 (+2.67%); split: -0.24%, +2.91%

LNL Fossil-db stats:

 Totals from 57911 (15.72% of 368381) affected shaders:
 Instrs: 39448036 -> 38923650 (-1.33%); split: -1.41%, +0.08%
 Subgroup size: 1241360 -> 1241392 (+0.00%)
 Send messages: 1846696 -> 1845137 (-0.08%)
 Cycle count: 3834818910 -> 3784003027 (-1.33%); split: -2.33%, +1.00%
 Spill count: 21866 -> 22168 (+1.38%); split: -0.07%, +1.45%
 Fill count: 59324 -> 60339 (+1.71%); split: -0.00%, +1.71%
 Scratch Memory Size: 1479680 -> 1483776 (+0.28%)
 Max live registers: 7521376 -> 7447841 (-0.98%); split: -1.04%, +0.06%
 Non SSA regs after NIR: 9744605 -> 10113728 (+3.79%); split: -0.01%, +3.80%

Only 2 titles negatively impacted (spilling) :
  - Shadow of the Tomb Raider
  - Red Dead Redemption 2

All impacted shaders were already spilling.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37171>
2025-10-16 12:08:15 +00:00
Ian Romanick
4193895145 brw/cmod: Enable limited cmod propagation for BFN
cmod propagation needs more work. Since the result type is always UD,
BRW_CONDITION_G should be able to substitute for NZ. Either that or
users of the condition could be rewritten to use an inverted condition.

v2: Add a couple more unit tests. Suggested by Matt.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37186>
2025-10-10 17:25:11 +00:00
Kenneth Graunke
73cbb35442 brw: Move into a new src/intel/compiler/brw subdirectory
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This keeps the directory structure a bit more organized:
- brw specific code
- elk specific code
- common NIR passes that could be used in both places

It also means that you can now 'git grep' in the brw directory without
finding a bunch of elk code, or having to "grep thing b*".

Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37755>
2025-10-09 07:01:47 +00:00
Renamed from src/intel/compiler/test_opt_cmod_propagation.cpp (Browse further)