Commit graph

109 commits

Author SHA1 Message Date
Caio Oliveira
2bd7592b0b intel/brw: Add SHADER_OPCODE_BALLOT
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31052>
2024-11-21 19:32:59 +00:00
Ian Romanick
80a5d158ae brw/copy: Don't copy propagate through smaller entry dest size
Copy propagation would incorrectly occur in this code

    mov(16) v4+2.0:UW, u0<0>:UW NoMask
    ...
    mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0

to create

    mov(16) v4+2.0:UW, u0<0>:UW NoMask
    ...
    mov(8) v6+2.0:UD, u0<0>:UD NoMask group0

This has different behavior. I think I just made a mistake when I
changed this condition in e3f502e007.

It seems like this condition could be relaxed to cover cases like (note
the change of destination stride)

    mov(16) v4+2.0<2>:UW, u0<0>:UW NoMask
    ...
    mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0

I'm not sure it's worth it.

No shader-db or fossil-db changes on any Intel platform. Even the code
for the test case mentioned in the original commit did not change.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: e3f502e007 ("intel/fs: Allow copy propagation between MOVs of mixed sizes")
Closes: #12116
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>
2024-11-08 17:46:45 +00:00
Ian Romanick
ac64b78f1f brw/copy: Perform constant folding with constant propagation
No shader-db or fossil-db changes on any Intel platform.

v2: Simlify the logic for when to try constant folding. Do
commute_immediates before constant folding. Both suggested by Ken.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31729>
2024-10-25 23:39:36 +00:00
Ian Romanick
8329c04521 brw/copy: Don't remove instructions w/ conditional modifier
Fixes: 9e750f00c3 ("intel/brw: Make opt_copy_propagation_defs clean up its own trash")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31834>
2024-10-25 20:31:44 +00:00
Kenneth Graunke
02482604e5 intel/brw: Delete old-style surface and A64 message opcodes
These have now been replaced by the MEMORY_*_LOGICAL opcodes.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Acked-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>
2024-09-12 20:54:36 +00:00
Kenneth Graunke
d5f38be713 intel/brw: Introduce new MEMORY_*_LOGICAL opcodes
This is a new unified set of opcodes for memory access loosely patterned
after the new LSC-style data port messages introduced on Alchemist GPUs.

Rather than creating an opcode for every type of memory access, it has
only three opcodes: load, store, and atomic.  It has various sources to
indicate the rest:

- Binding type (raw pointer, pointer to surface state, or BT index)
- Address size (A64, A32, A16)
- Data size (bit size, number of components)
- Opcode (atomic opcode, or LOAD/STORE vs. LOAD_CMASK/STORE_CMASK)
- Mode (typed vs. untyped vs. shared-local vs. scratch)
- Address (and its dimensionality)
- Data (0 for loads, 1 for stores, 2 for atomics)
- Whether we want block access

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>
2024-09-12 20:54:36 +00:00
Ian Romanick
fef175de09 intel/brw: Enable constant propagation for a couple more logical sends
This prevents some regressions later in the MR. Once load_const
operations are marked as is_scalar, they will cesase to get the
automatic constant propagation that occurs in try_rebuild_source.

No shader-db or fossil-db changes on any Intel platform.

v2: Slightly relax source restrictions on
SHADER_OPCODE_UNALIGNED_OWORD_BLOCK_READ_LOGICAL. Add a comment
explaining the restriction.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
2024-08-30 03:39:31 +00:00
Ian Romanick
572e00dd66 intel/brw: Copy prop from raw integer moves with mismatched types
The specific pattern from the unit test was observed in ray tracing
trampoline shaders.

v2: Refactor the is_raw_move tests out to a utility function. Suggested
by Ken.

v3: Fix a regression caused by being too picky about source
modifiers. This was introduced somewhere between when I did initial
shader-db runs an v2.

v4: Fix typo in comment. Noticed by Caio.

shader-db:

All Intel platforms had similar results. (Meteor Lake shown)
total instructions in shared programs: 19734086 -> 19733997 (<.01%)
instructions in affected programs: 135388 -> 135299 (-0.07%)
helped: 76 / HURT: 2

total cycles in shared programs: 916290451 -> 916264968 (<.01%)
cycles in affected programs: 41046002 -> 41020519 (-0.06%)
helped: 32 / HURT: 29

fossil-db:

Meteor Lake, DG2, and Skylake had similar results. (Meteor Lake shown)
Totals:
Instrs: 151531355 -> 151513669 (-0.01%); split: -0.01%, +0.00%
Cycle count: 17209372399 -> 17208178205 (-0.01%); split: -0.01%, +0.00%
Max live registers: 32016490 -> 32016493 (+0.00%)

Totals from 17361 (2.75% of 630198) affected shaders:
Instrs: 2642048 -> 2624362 (-0.67%); split: -0.67%, +0.00%
Cycle count: 79803066 -> 78608872 (-1.50%); split: -1.75%, +0.25%
Max live registers: 421668 -> 421671 (+0.00%)

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
Totals:
Instrs: 149995644 -> 149977326 (-0.01%); split: -0.01%, +0.00%
Cycle count: 15567293770 -> 15566524840 (-0.00%); split: -0.02%, +0.01%
Spill count: 61241 -> 61238 (-0.00%)
Fill count: 107304 -> 107301 (-0.00%)
Max live registers: 31993109 -> 31993112 (+0.00%)

Totals from 17813 (2.83% of 629912) affected shaders:
Instrs: 3738236 -> 3719918 (-0.49%); split: -0.49%, +0.00%
Cycle count: 4251157049 -> 4250388119 (-0.02%); split: -0.06%, +0.04%
Spill count: 28268 -> 28265 (-0.01%)
Fill count: 50377 -> 50374 (-0.01%)
Max live registers: 470648 -> 470651 (+0.00%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>
2024-08-30 03:39:31 +00:00
Kenneth Graunke
da395e6985 intel/brw: Fix extract_imm for subregion reads of 64-bit immediates
We could be trying to extract a D/UD from a Q/UQ, for example.  We were
ignoring the top 32-bits, which is incorrect.

Fixes: 580e1c592d ("intel/brw: Introduce a new SSA-based copy propagation pass")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30884>
2024-08-28 12:33:26 -07:00
Kenneth Graunke
51c85e0363 intel/brw: Drop misguided sign extension attempts in extract_imm()
This function never expands a type - it only narrows it.  As such, we
don't need to ever sign extend to fill additional new bits.  I think
this code was left over from earlier versions of my optimization pass
that was buggy and trying to handle cases it should not have.

Fixes: 580e1c592d ("intel/brw: Introduce a new SSA-based copy propagation pass")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30884>
2024-08-28 12:33:26 -07:00
Caio Oliveira
8ba8e33c39 intel/brw: Simplify @file annotations
Doxygen documentation says

> If the file name is omitted (i.e. the line after \file is left
> blank) then the documentation block that contains the \file command will
> belong to the file it is located in.

so we can omit the filename itself when using the annotation.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30168>
2024-07-22 22:48:03 +00:00
Caio Oliveira
71ccf8e4cd intel/brw: Rename fs_reg_* helpers to brw_reg_*
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29791>
2024-07-03 02:53:19 +00:00
Caio Oliveira
3670c24740 intel/brw: Replace uses of fs_reg with brw_reg
And remove the fs_reg alias.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29791>
2024-07-03 02:53:19 +00:00
Kenneth Graunke
9e750f00c3 intel/brw: Make opt_copy_propagation_defs clean up its own trash
Copy propagation often eliminates all uses of an instruction.  If we
detect that we've done so, we can eliminate the instruction ourselves
rather than leaving it hanging until the next DCE pass.

This saves some CPU time as other passes don't see dead code.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>
2024-06-18 09:02:25 +00:00
Kenneth Graunke
580e1c592d intel/brw: Introduce a new SSA-based copy propagation pass
(Quite a few of the restrictions here are ported from the old pass.)

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28666>
2024-06-18 09:02:25 +00:00
Kenneth Graunke
3da444b79e intel/brw: Refactor code to commute immediates into legal positions
This will let us reuse this in a new pass shortly.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29624>
2024-06-08 02:19:12 -07:00
Kenneth Graunke
d45da713e7 intel/brw: Refactor try_constant_propagate()
This will let us reuse the bulk of this code in a new copy propagation
pass without replicating it.  We retain a wrapper function for dealing
with ACP entries, which the new pass won't have.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29624>
2024-06-08 02:19:10 -07:00
Kenneth Graunke
85aa6f80af intel/brw: Drop BRW_OPCODE_IF from try_constant_propagate
This was for Sandybridge's IF with embedded comparison, which only
existed for a single generation of hardware.  Since the compiler fork,
we no longer support Sandybridge here.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29624>
2024-06-08 02:19:08 -07:00
Kenneth Graunke
7019bc4469 intel/brw: Drop compiler parameter from try_constant_propagate()
This is unused.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29624>
2024-06-08 02:19:06 -07:00
Ian Romanick
033405cd4b intel/brw: Combine constants and constant propagation for CSEL
No shader-db or fossil-db changes on any Intel platform. This ends up
begin helpful in "intel/brw: Use range analysis to optimize fsign."

v2: Add integer CSEL support

v3: Massive simplification (-20 lines!) of constant propagation
logic. Suggested by Ken. Add missing CSEL case in supports_src_as_imm.
Noticed by Ken.

v4: While MAD can mix F and HF sources on some platforms, CSEL
cannot. Found by skqp on TGL.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v3]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095>
2024-05-14 01:28:20 +00:00
Kenneth Graunke
545bb8fb6f intel/brw: Replace type_sz and brw_reg_type_to_size with brw_type_size_*
Both of these helpers do the same thing.  We now have brw_type_size_bits
and brw_type_size_bytes and can use whichever makes sense in that place.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Kenneth Graunke
873fcdff38 intel/brw: Stop using long BRW_REGISTER_TYPE enum names
s/BRW_REGISTER_TYPE/BRW_TYPE/g

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>
2024-04-25 11:41:48 +00:00
Francisco Jerez
62aab1437e intel/fs/gfx20+: Handle subdword integer regioning restrictions in copy propagation.
This makes sure that copy propagation doesn't undo the lowering of
restricted sub-dword integer regions done by brw_fs_lower_regioning().

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28698>
2024-04-22 18:02:32 -07:00
Ian Romanick
0e817ba548 intel/brw/xe2+: Implement Wa 22016140776
HF sources to math instructions cannot be scalar. This is very similar
to an old Gfx6 restriction on POW, so let's fix it in a similar way.

As an extra bit of saftey, lower any occurances that might slip through
in brw_fs_lower_regioning.

The primary change is to prevent copy propagation from violating the
restriction. With that change, nothing should be able to generate these
invalid source strides. The modification to fs_visitor::validate should
detect potential problems sooner rather than later.

Previous attempts to implement this Wa when emitting the math
instruction (in brw_eu_emit.c gfx6_math) didn't work for several
reasons. The lowering happens after the SWSB pass, so the scoreboarding
was incorrect (thanks to Curro for finding that). In addition, the
lowering happens after register allocation, so it's impossible to
allocate a non-scalar register to expand the scalar value.

Fixes 113 tests in the dEQP-VK.spirv_assembly.* group on LNL.

v2: Add changes to brw_fs_lower_regioning. Suggested by Curro.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28480>
2024-04-04 21:04:09 -07:00
Rohan Garg
a715512177 intel/brw: adjust the copy propgation pass to account for wider GRF's on Xe2+
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27235>
2024-03-28 19:53:40 +00:00
Kenneth Graunke
c0c05c1041 intel/brw: Fix destination stride assertion in copy propagation
We were asserting that entry->dst.offset % REG_SIZE == 0, which is
easily tripped by a simple LOAD_PAYLOAD that writes a 16-bit vec2:

    load_payload(8) vgrf1:UW, vgrf2+0.0:UW, vgrf3+0.0:UW

We create separate ACP entries corresponding to the values coming from
vgrf2 and vgrf3, with entry->dst set to the location within vgrf1 where
those sources get written to.  So the second entry will have offset 16,
which is not REG_SIZE aligned.

It looks like this assert was originally added back in 2014 (see commit
1728e74957) and adjusted through the ages,
including at a point when we combined reg and subreg offsets into a
single byte offset, and over time also extended copy propagation.

Here the destination offset is already accounted for via rel_offset,
at the byte offset level, so things ought to work and there is no need
to assert that this is the case.  Ian had already noted that the assert
tripped in commit e3f502e007, but checking
for inst->opcode == MOV here doesn't really make sense - it's just the
case that he found that broke.

Remove the erroneous assertion.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28067>
2024-03-27 04:52:17 +00:00
Ian Romanick
d9674cbe7d intel/brw: Combine constants for src0 of POW instructions too
I tried this when I was working on MR !7698, and it didn't have much
affect back then. Maybe I've added more stuff to my fossil-db?

Gfx12 platforms (Tiger Lake and DG2) are unaffected because the POW
instruction was removed.

shader-db:

Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 20301933 -> 20301900 (<.01%)
instructions in affected programs: 9077 -> 9044 (-0.36%)
helped: 33 / HURT: 0

total cycles in shared programs: 842797624 -> 842799471 (<.01%)
cycles in affected programs: 1361911 -> 1363758 (0.14%)
helped: 35 / HURT: 111

LOST:   0
GAINED: 9

fossil-db:

Ice Lake and Skylake had similar results. (Ice Lake shown)
Totals:
Instrs: 165510222 -> 165510163 (-0.00%)
Cycles: 15125195835 -> 15125194484 (-0.00%); split: -0.00%, +0.00%
Spill count: 45204 -> 45196 (-0.02%)
Fill count: 74157 -> 74149 (-0.01%)

Totals from 65 (0.01% of 656118) affected shaders:
Instrs: 57426 -> 57367 (-0.10%)
Cycles: 1667918 -> 1666567 (-0.08%); split: -0.11%, +0.03%
Spill count: 137 -> 129 (-5.84%)
Fill count: 515 -> 507 (-1.55%)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27552>
2024-03-12 21:31:30 +00:00
Ian Romanick
e7480f94c1 intel/brw: Combine constants for src0 of integer multiply too
The majority of cases that would have been affected by this actually
had both sources as integer constants. The earlier commit "intel/rt:
Don't directly generate umul_32x16" allowed those to be constant
folded.

v2: Move the a*-1 block to be near the existing a*-1 block.

No shader-db changes on any Intel platform.

fossil-db results:

All Intel platforms had similar results. (Ice Lake shown)
Totals:
Instrs: 165510246 -> 165510222 (-0.00%)
Cycles: 15125198238 -> 15125195835 (-0.00%); split: -0.00%, +0.00%

Totals from 46 (0.01% of 656118) affected shaders:
Instrs: 36010 -> 35986 (-0.07%)
Cycles: 2613658 -> 2611255 (-0.09%); split: -0.17%, +0.07%

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27552>
2024-03-12 21:31:30 +00:00
Kenneth Graunke
f6ac6c94a9 intel/brw: Handle SHADER_OPCODE_SEND without src[3] in copy prop
We construct some SENDs with only 3 sources (such as FB writes).
This code could read out of bounds.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27876>
2024-03-05 11:39:26 +00:00
Kenneth Graunke
49606ab067 intel/brw: Avoid copy propagating any fixed registers into EOTs
We were handling FIXED_GRF, but we probably also ought to handle ATTR
(pushed inputs) and UNIFORM (pushed constants).  Just check if file
isn't VGRF to handle everything.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27876>
2024-03-05 11:39:26 +00:00
Kenneth Graunke
45a5e4c0c4 intel/brw: Delete SHADER_OPCODE_TXF_UMS
Nothing seems to generate this anymore.  I guess we always use CMS.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27908>
2024-03-01 22:19:51 +00:00
Kenneth Graunke
601ef12467 intel/brw: Delete SHADER_OPCODE_TXF_CMS[_LOGICAL]
We always use the wide variant (_W) on hardware this compiler supports.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27908>
2024-03-01 22:19:50 +00:00
Caio Oliveira
082735750b intel/brw: Simplify usage of reg immediate helpers
Use fs_reg and don't take the type as argument.  In all uses the type
passed is the type of the register.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27904>
2024-03-01 17:52:09 +00:00
Caio Oliveira
8f3c52c1da intel/brw: Remove MRF type
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>
2024-02-28 05:45:39 +00:00
Caio Oliveira
5c93a0e125 intel/brw: Remove Gfx8- remaining opcodes
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>
2024-02-28 05:45:39 +00:00
Caio Oliveira
7ac5696157 intel/brw: Remove Gfx8- code from backend passes
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>
2024-02-28 05:45:38 +00:00
Sagar Ghuge
6f0ab5e4d5 intel/compiler: Add texture gather offset LOD/Bias message support
v2: (Ian)
- Space formatting on conditional statement

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27447>
2024-02-27 00:22:46 +00:00
Sagar Ghuge
79af0ac29a intel/compiler: Add gather4_i/l/[_c]/b sampler message
v2: (Ian)
- Format comment

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27447>
2024-02-27 00:22:46 +00:00
Caio Oliveira
6a3329a6c4 intel/brw: Pull opt_copy_propagation out of fs_visitor
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>
2024-02-26 20:54:24 +00:00
Caio Oliveira
b5cd91501d intel/fs: Use linear allocator in opt_copy_propagation
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25670>
2024-01-04 23:06:07 +00:00
Caio Oliveira
6d2503e935 intel/fs: Only allocate acp_entry if we are adding one
In practice it seems we are always entering here, haven't looked
in detail whether at this point we could just assert.  But for now
only allocate a new acp_entry if we are going to add it.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25670>
2024-01-04 23:06:07 +00:00
Francisco Jerez
6bf99e6a45 intel/compiler: Don't change types for copies from ATTR file.
Since the <8;8,0> regions they use in multipolygon mode could violate
regioning restrictions in some cases, depending on the execution type
of the instruction.  Note that the assertion is removed from
try_copy_propagate() since a more accurate check is used within that
function than what fs_inst::can_change_types() can do.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>
2023-12-22 18:05:31 +00:00
Francisco Jerez
2ed36050fb intel/fs: Don't copy-propagate ATTR registers in multi-polygon FS shaders when invalid.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>
2023-12-22 18:05:31 +00:00
Jordan Justen
3f89fa63e6 intel/compiler: Pass max_polygons to copy-prop from fs_visitor.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>
2023-12-22 18:05:31 +00:00
Ian Romanick
92f5442489 intel/fs: Merge copy prop dataflow loops
This is kept as a separate commit because the change looks like a lot
more than it it. The order of the two loops is swapped, then the two
loops are merged.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>
2023-09-14 22:31:23 +00:00
Ian Romanick
fa2757aa97 intel/fs: Use rb_tree for copy prop dataflow
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>
2023-09-14 22:31:23 +00:00
Ian Romanick
35644bb483 intel/fs: Use rb_tree to store ACP entries by destination
Using a single data structure seems better. There's no appreciable
performance change. On batman_arkham_city_goty.foz, the difference
reported was 0.48%±0.36% (n=20). Several commits in the MR, including
some that should have no effect at all, reported similar changes. I
attribute this primarily changing of loop alignments and similar.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>
2023-09-14 22:31:23 +00:00
Ian Romanick
c28bf1a249 intel/fs: Use rb_tree to store ACP entries by source
On batman_arkham_city_goty.foz, this improves fossil-db time by
-3.83%±0.24% (n=20). This fossil takes the longest time of any in my
database.

v2: Add some comments for cmp_entry_src_entry_src and
cmp_entry_src_nr. Suggested by Ken.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>
2023-09-14 22:31:23 +00:00
Ian Romanick
06bdd3eac0 intel/fs: Encapsulate per-block ACP in a structure
This simplifies some later changes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>
2023-09-14 22:31:23 +00:00
Ian Romanick
c262752d74 intel/fs: Make opt_copy_propagation_local file private
This annoyed me durning development of this MR. Every time I changed the
parameters to this internal function, I had to modify a public header
file... and trigger a much large rebuild.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25091>
2023-09-14 22:31:23 +00:00