Commit graph

4253 commits

Author SHA1 Message Date
Ian Romanick
cb69d019cf brw/nir: Use offset() for all uses of offs in emit_pixel_interpolater_alu_at_offset
This is necessary to appropriately uniformize the first component
access of a convergent vector. Without this, this is produced:

    load_payload(16) %18:D, 0d, 0d NoMask group0
    add(32) %21:F, %18+0.0:F, 0.5f
    add(32) %22:F, %18+2.0<0>:F, 0.5f

This is the correct code:

    load_payload(16) %18:D, 0d, 0d NoMask group0
    add(32) %21:F, %18+0.0<0>:F, 0.5f
    add(32) %22:F, %18+2.0<0>:F, 0.5f

Without 38b58e286f, the code generated was more incorrect, but happened
to work for this test case:

    load_payload(16) %18:D, 0d, 0d NoMask group0
    add(32) %21:F, %18+0.0<0>:F, 0.5f
    add(32) %22:F, %18+0.4<0>:F, 0.5f

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 38b58e286f ("brw/nir: Fix source handling of nir_intrinsic_load_barycentric_at_offset")
Closes: #12969
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34427>
2025-04-09 22:21:18 +00:00
Caio Oliveira
7457c4ecfd brw: Make brw_range use half-open ranges
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:49 +00:00
Caio Oliveira
6509f8139d brw: Use brw_range::last() to explicitly get the last valid IP
This is a preparation to change what is stored in brw_range::end.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:49 +00:00
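
For readers following the brw_range commits above (6509f8139d and 7457c4ecfd), here is a minimal sketch of a half-open range and why a `last()` helper is useful once `end` no longer names the last valid IP. The struct name `ip_range`, the `start` member, and the helpers other than `last()` are assumptions made for illustration; the real brw_range may look different.

```cpp
#include <cassert>

// Hypothetical stand-in for brw_range; member and helper names other than
// end and last() are illustrative assumptions, not the Mesa definitions.
struct ip_range {
   int start; // first IP covered by the range
   int end;   // one past the last valid IP, i.e. the range is [start, end)

   bool is_empty() const { return end <= start; }
   int len() const { return is_empty() ? 0 : end - start; }

   // Last valid IP. Only meaningful for a non-empty range.
   int last() const {
      assert(!is_empty());
      return end - 1;
   }

   bool contains(int ip) const { return start <= ip && ip < end; }
};

// Two non-empty half-open ranges overlap when each starts before the
// other ends. Ranges that merely touch (a.end == b.start) do not overlap.
inline bool overlaps(const ip_range &a, const ip_range &b) {
   return !a.is_empty() && !b.is_empty() &&
          a.start < b.end && b.start < a.end;
}
```

With this layout, callers that need the final instruction ask for `last()` instead of reading `end` directly, so the meaning of `end` can change without touching them.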
Caio Oliveira
596bbb2c95 brw: Use brw_range to store Vars ranges
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:49 +00:00
Caio Oliveira
0b4a3c0ff6 brw: Use brw_range to store VGRF ranges
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:49 +00:00
Caio Oliveira
e644b42e59 brw: Use brw_range when operating with live ranges
Makes the intention of some comparisons clearer by using the named
helper functions.  Add commentary when the straightforward range is not
the one used, e.g. VGRF interference.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:49 +00:00
Caio Oliveira
f56a5cf1eb brw: Use brw_range in IP ranges analysis
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:49 +00:00
Caio Oliveira
fb50461220 brw: Add brw_range struct
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:48 +00:00
Caio Oliveira
8d9155e34d brw: Clean up saturate propagation after non-defs version removal
Remove the now-unused analysis. There is also no need to walk blocks in
reverse now that the non-defs version of the pass has been removed.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:48 +00:00
Caio Oliveira
cfc4067b0e brw: Add a few basic tests for register coalesce
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34253>
2025-04-09 19:06:48 +00:00
Lionel Landwerlin
19e4dda9a2 brw: fix shuffle with scalar/uniform index
The fixes commit isn't actually the source of the bug but likely the
biggest enabler because it creates scalar values that more easily end
up in the shuffle operations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 1b24612c57 ("brw/nir: Treat load_*_uniform_block_intel as convergent")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12927
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12688
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12570
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12905
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12734
Reviewed-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34393>
2025-04-08 20:14:11 +00:00
Felix DeGrood
7a3de9e877 intel/brw: support for dumping shader line numbers
Add support for dumping shader asm annotated with instruction line
numbers that match offsets within the instruction state pool buffer.
Offsets should match the values collected from EU stall sampling. This is
required to match EU stall data with individual shader instructions.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30142>
2025-04-08 19:39:53 +00:00
Faith Ekstrand
436f175187 intel/compiler: Use nir_split_conversions()
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34266>
2025-04-07 17:45:21 -05:00
Caio Oliveira
bf9ad36f2d brw: Properly handle cooperative matrices created with constants
Expand constant sources to cover the region read by DPAS, and also
use NULL register as accumulator when possible.

Reviewed-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34373>
2025-04-07 14:27:43 -07:00
Ian Romanick
f33faa4648 brw/nir: Allow b2f(not(X)) optimization on Gfx12.5+
Since there are no type conversions, no restrictions are violated.

No shader-db or fossil-db changes on any Gfx12 or older Intel
platforms.

shader-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
total instructions in shared programs: 16956077 -> 16944933 (-0.07%)
instructions in affected programs: 1957573 -> 1946429 (-0.57%)
helped: 4629 / HURT: 35

total cycles in shared programs: 915668518 -> 915684808 (<.01%)
cycles in affected programs: 341925598 -> 341941888 (<.01%)
helped: 3040 / HURT: 1305
helped stats (abs) min: 2 max: 23034 x̄: 205.36 x̃: 16
helped stats (rel) min: <.01% max: 41.21% x̄: 1.28% x̃: 0.48%
HURT stats (abs)   min: 2 max: 68820 x̄: 490.88 x̃: 22
HURT stats (rel)   min: <.01% max: 103.69% x̄: 2.29% x̃: 0.37%
95% mean confidence interval for cycles value: -50.28 57.78
95% mean confidence interval for cycles %-change: -0.35% -0.07%
Inconclusive result (value mean confidence interval includes 0).

LOST:   40
GAINED: 42

fossil-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
Totals:
Instrs: 209828027 -> 209790349 (-0.02%); split: -0.03%, +0.01%
Cycle count: 30504938008 -> 30514045408 (+0.03%); split: -0.06%, +0.09%
Spill count: 512182 -> 512168 (-0.00%)
Fill count: 623432 -> 623426 (-0.00%); split: -0.00%, +0.00%
Max live registers: 65465029 -> 65464959 (-0.00%)

Totals from 57895 (8.19% of 706589) affected shaders:
Instrs: 50144907 -> 50107229 (-0.08%); split: -0.11%, +0.03%
Cycle count: 7549692606 -> 7558800006 (+0.12%); split: -0.25%, +0.37%
Spill count: 58834 -> 58820 (-0.02%)
Fill count: 102324 -> 102318 (-0.01%); split: -0.01%, +0.01%
Max live registers: 9129045 -> 9128975 (-0.00%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33931>
2025-04-07 17:42:05 +00:00
Ian Romanick
853ead2073 brw/nir: Optimize b2f(not(X)) using logical operations instead of arithmetic
Funny story... this is how regular b2f was implemented before Curro
implemented the `MOV dst:F -src:D` method 9 years ago (see
3ee2daf23d).

Eliminating the type conversion in the arithmetic operation enables the
next commit.

No shader-db or fossil-db changes on any Intel platform.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33931>
2025-04-07 17:42:05 +00:00
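
As an illustration of commit 853ead2073 above, the sketch below contrasts a logical and an arithmetic way to compute b2f(not(X)). It assumes booleans are 32-bit integers holding 0 for false and ~0 for true; the function names are hypothetical and this is scalar C++, not the code the compiler emits.

```cpp
#include <cstdint>
#include <cstring>

// Logical form: invert the boolean and mask with the bit pattern of 1.0f.
// No integer-to-float type conversion is involved.
// Assumes b is 0 (false) or ~0 (true).
inline float b2f_not_logical(int32_t b) {
   const uint32_t one_bits = 0x3f800000u; // IEEE-754 encoding of 1.0f
   uint32_t bits = ~static_cast<uint32_t>(b) & one_bits;
   float f;
   std::memcpy(&f, &bits, sizeof(f));
   return f;
}

// Arithmetic form, shown only for comparison: 1 + b is 1 for false (0)
// and 0 for true (-1), followed by an int-to-float conversion.
inline float b2f_not_arithmetic(int32_t b) {
   return static_cast<float>(1 + b);
}
```

Both forms agree for the two valid boolean encodings. The logical form is the one that avoids a type conversion, which is the property the commit message says enables the follow-up change.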
Ian Romanick
3d23496fd9 brw/copy: Copy prop -X into Y&1
This commit prevents code quality regressions in the next
commit. Without this, some fragment shaders in Batman: Arkham Origins
have code like:

    shr(8)          g51<1>UW        g1.28<1,8,0>UB  0x76543210V
    ...
    and(8)          g52<1>UD        ~g51<8,8,1>UW   0x0001UW
    ...
    add(8)          g56<1>D         -g52<8,8,1>D    1D

transformed to

    shr(8)          g51<1>UW        g1.28<1,8,0>UB  0x76543210V
    ...
    and(8)          g52<1>UD        ~g51<8,8,1>UW   0x0001UW
    ...
    mov(8)          g56<1>D         -g52<8,8,1>D
    ...
    and(8)          g57<1>UD        ~g56<8,8,1>D    0x00000001UD

Propagating through the negation allows the added MOV to be deleted.

shader-db:

All Intel platforms had similar results. (Lunar Lake shown)
total instructions in shared programs: 16968020 -> 16968019 (<.01%)
instructions in affected programs: 281 -> 280 (-0.36%)
helped: 1 / HURT: 0

total cycles in shared programs: 914598850 -> 914598832 (<.01%)
cycles in affected programs: 5398 -> 5380 (-0.33%)
helped: 1 / HURT: 0

A single Blender vertex shader was affected.

fossil-db:

Lunar Lake, Tiger Lake, Ice Lake, and Skylake had similar results. (Lunar Lake shown)
Totals:
Instrs: 209894650 -> 209894651 (+0.00%)
Cycle count: 30545958586 -> 30545952860 (-0.00%)

Totals from 2 (0.00% of 706657) affected shaders:
Instrs: 3582 -> 3583 (+0.03%)
Cycle count: 1875100 -> 1869374 (-0.31%)

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Subgroup size: 9906400 -> 9906416 (+0.00%)

Totals from 2 (0.00% of 805770) affected shaders:
Subgroup size: 16 -> 32 (+100.00%)

Two compute shaders in Hogwarts Legacy were affected.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33931>
2025-04-07 17:42:05 +00:00
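
The message for 3d23496fd9 above does not spell out why a negated value can be propagated into an AND with 1. One identity that would make it safe, assuming two's-complement integers, is that negation never changes the low bit, so `(-x) & 1` equals `x & 1`. The helpers below are hypothetical and only demonstrate that identity; they are not the copy propagation code.

```cpp
#include <cstdint>

// Low bit of x.
inline uint32_t low_bit(uint32_t x) { return x & 1u; }

// Low bit of the two's-complement negation of x (0 - x wraps modulo 2^32).
inline uint32_t low_bit_of_negation(uint32_t x) { return (0u - x) & 1u; }

// Low bit of the bitwise NOT of x, and of the NOT of the negation.
inline uint32_t low_bit_of_not(uint32_t x) { return ~x & 1u; }
inline uint32_t low_bit_of_not_negation(uint32_t x) { return ~(0u - x) & 1u; }
```

Under that assumption, the intermediate `mov` that only applies a negation before an `and ... 1` carries no information the AND can observe, which is consistent with the commit's note that the added MOV can be deleted.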
Ian Romanick
e82464e6e0 brw/copy: Refactor source modifier type checking
This simplifies the next commit.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33931>
2025-04-07 17:42:05 +00:00
Ian Romanick
dee49f4206 brw/algebraic: Optimize derivative of convergent value
This is mostly defensive. If a convergent value ever ended up as a
source of a DDX or DDY, the eu_emit code will ignore the stride. This
will result in bad code being generated.

No shader-db or fossil-db changes on any Intel platform.

v2: DDX and DDY will always be float, but brw_imm_for_type only works
with integer types.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Suggested-by: Ken
Fixes: d5d7ae22ae ("brw/nir: Fix up handling of sources that might be convergent vectors")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33007>
2025-04-07 17:16:34 +00:00
Ian Romanick
5656682344 brw/nir: Eliminate default parameter to get_nir_src
The vast majority of the callers want channel = 0. During the
development process, using this default parameter value saved a lot of
pain in rebasing. However, it seems to be more trouble than it's worth.

Issue #12464 occurred because LNL was merged while this code was in
review. As a result, one caller of get_nir_src that wanted channel = -1
was not inspected closely, and it got the default channel = 0 instead.

To prevent this happening in the future (with possible branches still
yet to be merged, for example), remove the default parameter. This will
force the inspection of any callers that don't have an explicit channel
parameter. Hopefully that will prevent more problems.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33007>
2025-04-07 17:16:34 +00:00
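
The reasoning in 5656682344 above can be illustrated with a small sketch of how a default parameter hides a choice that an explicit parameter exposes. The types and signatures here are hypothetical; the real get_nir_src takes different arguments. Only the convention that -1 is passed for a vector source comes from the commit messages.

```cpp
// Hypothetical result type recording which channel a caller asked for.
struct src_ref {
   int index;
   int channel; // -1 requests the whole vector, per the commit messages
};

// Before: a caller that forgets the second argument silently gets channel 0,
// even when it actually needs the whole vector.
inline src_ref get_src_with_default(int index, int channel = 0) {
   return {index, channel};
}

// After: omitting the channel is a compile error, so every call site must
// state its intent and a reviewer can check it.
inline src_ref get_src_explicit(int index, int channel) {
   return {index, channel};
}
```

A call such as `get_src_with_default(3)` compiles and yields channel 0, while `get_src_explicit(3)` does not compile until the caller writes `get_src_explicit(3, -1)` or `get_src_explicit(3, 0)`.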
Ian Romanick
38b58e286f brw/nir: Fix source handling of nir_intrinsic_load_barycentric_at_offset
The source of nir_intrinsic_load_barycentric_at_offset is a vector, so
-1 should be passed to get_nir_src. This is also done for texture
sampling intrinsics.

I skimmed the other users of get_nir_src, and I believe they are
correct. This one was just missed as LNL support landed and many, many
rebases of the original MR occurred.

v2: Fix another get_nir_src call. Suggested by Lionel.

Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: d5d7ae22ae ("brw/nir: Fix up handling of sources that might be convergent vectors")
Closes: #12464
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33007>
2025-04-07 17:16:34 +00:00
Caio Oliveira
9845693912 brw: Fix memory leak in EU validation tests
Fixes: 62323a934b ("brw: Add BRW_TYPE_BF validation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34395>
2025-04-06 06:26:03 +00:00
Caio Oliveira
c33ee4adae brw: Fix invalid memory access in scoreboard test
Fixes: 03aca2d248 ("brw: Use new bld/exp style in scoreboard tests")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34394>
2025-04-05 22:58:23 -07:00
Caio Oliveira
7ae638c0fe brw: Add brw_builder::uniform()
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34355>
2025-04-04 23:07:21 +00:00
Caio Oliveira
f33d93da11 brw: Remove HSW specific code from brw_compile_cs.cpp
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34355>
2025-04-04 23:07:21 +00:00
Caio Oliveira
03aca2d248 brw: Use new bld/exp style in scoreboard tests
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34354>
2025-04-04 20:14:53 +00:00
Caio Oliveira
7ee673c195 brw: Add parser of SWSB annotations to use in tests
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34354>
2025-04-04 20:14:53 +00:00
Caio Oliveira
81dd3e1527 brw: Return actual progress in brw_lower_scoreboard
This will be useful later for tests to be used in conjunction with the
EXPECT_PROGRESS / EXPECT_NO_PROGRESS helpers.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34354>
2025-04-04 20:14:53 +00:00
Caio Oliveira
3e727000dd brw: Stop setting SFID in scoreboard tests
They won't affect the scoreboard, and will get in the
way of a later change.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34354>
2025-04-04 20:14:53 +00:00
Caio Oliveira
bcea076aca brw: Use SIMD16 shaders in scoreboard tests for Xe2+
Some tests changed to avoid unintended overlap between operands which
would change the SWSB assigned.  In some cases also changed the Gfx12
matching test so they remain equal.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34354>
2025-04-04 20:14:52 +00:00
Caio Oliveira
cd486cda48 brw: Use control flow helpers in scoreboard tests
Also update WHILE to optionally take a predicate (default to NONE).  And
make the predicate in the IF optional (default to NORMAL).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34354>
2025-04-04 20:14:52 +00:00
Ian Romanick
20cce95ce5 brw/opt: Don't call brw_opt_copy_propagation before brw_lower_load_reg
On a 36c/72t Xeon system, performance of replaying
hogwarts_legacy.dx12vk-ultra.foz was improved 1.3% +/- 0.77% (n=10).

I picked MTL for the fossil-db results because it was the most negative.

shader-db:

All Intel platforms had fairly similar results. (Lunar Lake shown)
total instructions in shared programs: 16964217 -> 16964216 (<.01%)
instructions in affected programs: 51777 -> 51776 (<.01%)
helped: 20 / HURT: 27

total cycles in shared programs: 892934916 -> 893041912 (0.01%)
cycles in affected programs: 51245298 -> 51352294 (0.21%)
helped: 96 / HURT: 78

fossil-db:

All Intel platforms had similar results. (Meteor Lake shown)
Totals:
Instrs: 233678547 -> 233678944 (+0.00%); split: -0.00%, +0.00%
Cycle count: 24398049850 -> 24400490877 (+0.01%); split: -0.01%, +0.02%
Max live registers: 42145052 -> 42145038 (-0.00%); split: -0.00%, +0.00%

Totals from 1141 (0.14% of 805934) affected shaders:
Instrs: 1546001 -> 1546398 (+0.03%); split: -0.01%, +0.03%
Cycle count: 1201746062 -> 1204187089 (+0.20%); split: -0.14%, +0.34%
Max live registers: 84247 -> 84233 (-0.02%); split: -0.03%, +0.01%

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
2025-04-04 06:45:02 +00:00
Ian Romanick
991a2f510b brw/sat: Eliminate non-defs saturate propagation
The intervening_saturating_copy test is removed. The defs version of the
pass does not handle this case. It should not occur often in practice
anyway. Copy propagation and brw_nir_opt_fsat should prevent this
scenario from happening.

No shader-db changes on any Intel platform.

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 212677275 -> 212677278 (+0.00%)
Cycle count: 30466062848 -> 30466056040 (-0.00%)

Totals from 1 (0.00% of 706300) affected shaders:
Instrs: 1343 -> 1346 (+0.22%)
Cycle count: 411664 -> 404856 (-1.65%)

v2: Stop counting ip. The non-defs part of the pass was the only thing
that used it.

v3: Also delete "if (block != def->block) continue;" code. I noticed
this while working on some other changes to this function. It's the last
thing in the loop, so it's totally useless. Delete some other spurious
continues too.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> [v2]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
2025-04-04 06:45:02 +00:00
Ian Romanick
cc5a6a5ae8 brw/sat: Convert tests to use load_reg
This is in preparation for a commit that removes the non-defs version of
the pass.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
2025-04-04 06:45:02 +00:00
Ian Romanick
2d13acf9d9 brw: Add passes to generate and lower load_reg
v2: Add support for WE_all instructions... this already just worked, so
I only had to delete the check and the FINISHME comment.

v3: Use logic more like def_analysis::update_for_reads to determine when
to not insert LOAD_REG instructions. Based on a suggestion by Ken.

v4: Eliminate "store" from all the names since STORE_REG does not exist
anymore. Fold insert_load_reg into brw_insert_load_reg. Eliminate extra
call to s.def_analysis.require() after progress. Pull a loop-invariant
check out of the inst->sources loop. Drop call to
brw_opt_split_virtual_grfs after lowering load_reg. All suggested by
Caio.

v5: Assert that LOAD_REG doesn't already exist in
brw_insert_load_reg. Update comment before fully_defines. Both
suggested by Caio.

v6: Don't explicitly special-case SHADER_OPCODE_MEMORY_STORE_LOGICAL.
Move the inst->dst.file != VGRF check earlier to avoid the loop over
sources. Both suggested by Ken. Move the call to brw_insert_load_reg
a little bit later, and explain why it's at that location. Suggested
by Caio.

v7: Many changes to the for-each-source loop in brw_insert_load_reg.
Removes incorrect multiplication of s.alloc.sizes with reg_unit. Adds
checks for matching SIMD size and NoMask in the search for pre-existing
LOAD_REG of same value.

v8: Add some unit tests. Suggested by Caio.

shader-db:

Lunar Lake
total instructions in shared programs: 16923237 -> 16921895 (<.01%)
instructions in affected programs: 450565 -> 449223 (-0.30%)
helped: 251 / HURT: 377

total cycles in shared programs: 910428418 -> 889920590 (-2.25%)
cycles in affected programs: 719248184 -> 698740356 (-2.85%)
helped: 9076 / HURT: 9082

total fills in shared programs: 2242 -> 2218 (-1.07%)
fills in affected programs: 116 -> 92 (-20.69%)
helped: 2 / HURT: 0

total sends in shared programs: 848635 -> 848421 (-0.03%)
sends in affected programs: 810 -> 596 (-26.42%)
helped: 10 / HURT: 0

LOST:   82
GAINED: 78

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
total instructions in shared programs: 19875784 -> 19871694 (-0.02%)
instructions in affected programs: 1050091 -> 1046001 (-0.39%)
helped: 251 / HURT: 2403

total cycles in shared programs: 905328238 -> 882446458 (-2.53%)
cycles in affected programs: 682736344 -> 659854564 (-3.35%)
helped: 7869 / HURT: 7911

total spills in shared programs: 5512 -> 5032 (-8.71%)
spills in affected programs: 1830 -> 1350 (-26.23%)
helped: 8 / HURT: 0

total fills in shared programs: 5648 -> 4782 (-15.33%)
fills in affected programs: 3312 -> 2446 (-26.15%)
helped: 8 / HURT: 0

total sends in shared programs: 1032942 -> 1032722 (-0.02%)
sends in affected programs: 572 -> 352 (-38.46%)
helped: 10 / HURT: 0

LOST:   138
GAINED: 53

Tiger Lake
total instructions in shared programs: 19711930 -> 19715591 (0.02%)
instructions in affected programs: 1040623 -> 1044284 (0.35%)
helped: 317 / HURT: 2474

total cycles in shared programs: 862988990 -> 860573870 (-0.28%)
cycles in affected programs: 612392461 -> 609977341 (-0.39%)
helped: 7447 / HURT: 7686

total sends in shared programs: 1034763 -> 1034555 (-0.02%)
sends in affected programs: 784 -> 576 (-26.53%)
helped: 8 / HURT: 0

LOST:   56
GAINED: 143

Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 20545461 -> 20545220 (<.01%)
instructions in affected programs: 422405 -> 422164 (-0.06%)
helped: 180 / HURT: 459

total cycles in shared programs: 872697345 -> 866874523 (-0.67%)
cycles in affected programs: 573117917 -> 567295095 (-1.02%)
helped: 6783 / HURT: 6980

total spills in shared programs: 4335 -> 4336 (0.02%)
spills in affected programs: 90 -> 91 (1.11%)
helped: 1 / HURT: 2

total fills in shared programs: 4194 -> 4196 (0.05%)
fills in affected programs: 463 -> 465 (0.43%)
helped: 1 / HURT: 2

total sends in shared programs: 1079446 -> 1079238 (-0.02%)
sends in affected programs: 784 -> 576 (-26.53%)
helped: 8 / HURT: 0

LOST:   117
GAINED: 37

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 209708136 -> 209695617 (-0.01%); split: -0.02%, +0.01%
Send messages: 10927753 -> 10927640 (-0.00%)
Cycle count: 30540172048 -> 30427084732 (-0.37%); split: -0.99%, +0.62%
Spill count: 511621 -> 510932 (-0.13%); split: -0.22%, +0.08%
Fill count: 621166 -> 618440 (-0.44%); split: -0.56%, +0.12%
Scratch Memory Size: 35574784 -> 35648512 (+0.21%); split: -0.06%, +0.26%
Max live registers: 65453860 -> 65453140 (-0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 75374990 -> 35195764 (-53.31%)

Totals from 503284 (71.25% of 706391) affected shaders:
Instrs: 180203778 -> 180191259 (-0.01%); split: -0.02%, +0.01%
Send messages: 9699732 -> 9699619 (-0.00%)
Cycle count: 30080349592 -> 29967262276 (-0.38%); split: -1.01%, +0.63%
Spill count: 511584 -> 510895 (-0.13%); split: -0.22%, +0.08%
Fill count: 621120 -> 618394 (-0.44%); split: -0.56%, +0.12%
Scratch Memory Size: 35443712 -> 35517440 (+0.21%); split: -0.06%, +0.27%
Max live registers: 52566092 -> 52565372 (-0.00%); split: -0.01%, +0.00%
Non SSA regs after NIR: 70110949 -> 29931723 (-57.31%)

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
2025-04-04 06:45:02 +00:00
Ian Romanick
8b2be206f3 brw/algebraic: Constant folding for BROADCAST and SHUFFLE
This prevents assertion failures in brw_eu_emit in a later commit in
this MR. Even though they have not been previously observed, these
assertion failures could happen even without that commit.

No shader-db or fossil-db changes on any Intel platform.

Fixes: 04e1783278 ("brw: Call brw_fs_opt_algebraic less often")

v2: Add SHUFFLE. Suggested by Ken. Fixed indentation.

v3: Update BROADCAST exec_size after rebasing on "brw/build: Use SIMD8
temporaries in emit_uniformize".

v4: Explain why munging the exec_size is correct.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
2025-04-04 06:45:02 +00:00
Ian Romanick
1b997c7bcc brw/coalesce: Prepare brw_opt_register_coalesce for load_reg
v2: Explain the problematic situation a little better in the
comment. Suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
2025-04-04 06:45:02 +00:00
Ian Romanick
15637334ce brw/copy: Prepare copy_propagation for load_reg
The changes to try_copy_propagate will be removed later in the series.

v2: Fix up some comments to note that offset != 0 is allowed only when
stride == 0. Apply same offset=0 restriction in try_copy_propagate_def
too. Allow copy propagation if the source is either a def or
UNIFORM. Don't copy prop a load_reg through a non-def value.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
2025-04-04 06:45:02 +00:00
Ian Romanick
cfc50390fb brw: Add basic infrastructure for load_reg pseudo op
load_reg is something like load_payload except it has a single
source. It copies the entire source to the destination. Its purpose is
to convert a non-SSA VGRF into an SSA value. This copy is marked as
volatile so that it will act as a scheduling barrier.

v2: Fix some typos in the commit message. Eliminate the
brw_builder::LOAD_REG overload that returns a brw_inst*. This is
unlikely to ever be used. Add some checks to brw_validate. All
suggested by Caio.

v3: Force the source and destination types of the LOAD_REG to be
integer. This will (eventually) simplify the creation of unit tests for
the pass that adds LOAD_REG instructions.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
2025-04-04 06:45:02 +00:00
Ian Romanick
b9656d51c0 brw/opt: Move non-SSA register accounting after first brw_opt_split_virtual_grfs
v2: Move to immediately before the main optimization loop. Most
importantly, this is after the first call to DCE.

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Non SSA regs after NIR: 237045283 -> 100183460 (-57.74%); split: -58.12%, +0.39%

Totals from 701423 (99.26% of 706657) affected shaders:
Non SSA regs after NIR: 236868848 -> 100007025 (-57.78%); split: -58.17%, +0.39%

Suggested-by: Ken
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>
2025-04-04 06:45:02 +00:00
Caleb Callaway
5ad00bae8b intel/compiler: fix lingering i965 references
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34351>
2025-04-03 03:17:25 +00:00
Ian Romanick
e210b79ce3 brw/nir: Lower fsign again after last call to brw_nir_optimize
No shader-db or fossil-db changes on any Intel platform.

Fixes: 13332c23 ("intel/brw: Unconditionally run optimizations after nir_opt_uniform_subgroup")
Closes: #12888
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34251>
2025-04-02 01:59:49 +00:00
Ian Romanick
ca95cb8178 brw: Fix typo in comment
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34251>
2025-04-02 01:59:49 +00:00
irql-notlessorequal
255166a349 elk: always write the VUE header
ELK equivalent of !34211, also required to avoid potential rendering errors with hasvk.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34298>
2025-03-31 16:56:13 +00:00
irql-notlessorequal
fe7e0fd4f1 elk: ensure VUE header writes in HS/DS/GS stages
ELK equivalent of !34041, required to avoid potential rendering errors with VK_KHR_maintenance5

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34298>
2025-03-31 16:56:13 +00:00
Lionel Landwerlin
4346210ae6 brw: move texture offset packing to NIR
That way we can deal with upcoming non constant values for
VK_KHR_maintenance8.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33138>
2025-03-29 02:15:18 +00:00
Lionel Landwerlin
67ae49dede intel: move lower_texture to brw
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33138>
2025-03-29 02:15:18 +00:00
Lionel Landwerlin
86773b2ba6 brw: don't lower tg4 offsets without LOD
The problem this fixes is currently hidden because of the order in
which we run nir_lower_tex & intel_nir_lower_texture. The issue is
that nir_lower_tex removes the LOD source in some cases and the second
run of nir_lower_tex can add it back.

This is also only needed on Gfx12.5+ if the LOD is present.

Finally move all of the texture lowering to the postprocess phase. No
need to run this multiple times.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33138>
2025-03-29 02:15:18 +00:00
Lionel Landwerlin
b87dccc64c elk: stop using intel_nir_lower_texture
It's not doing anything.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33138>
2025-03-29 02:15:18 +00:00
Caio Oliveira
63224f64cc brw: Remove adjust_block_ips and brw_inst::remove() with defer
Now that the brw_ip_ranges analysis is being used, there's no
need to track start_ip/end_ips in the blocks as they are mutated.  And
also no need to call adjust_block_ips at the end of some passes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>
2025-03-29 00:25:51 +00:00