Commit graph

3556 commits

Author SHA1 Message Date
Caio Oliveira
bbb12dbbf9 intel/compiler: Add a few tests to opt_predicated_break
v2 (idr): Fix expectations BottomBreakWithContinue. opt_predicated_break
will remove the IF and make the CONTINUE predicated.

v3 (idr): Temporarily disable the one test that fails.

v4 (idr): Free strings allocated by open_memstream. Fixes gitlab CI
failures in debian-testing-asan.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
2023-11-30 20:58:05 +00:00
Caio Oliveira
0b072c5351 intel/compiler: Sort lists of succs and preds in CFG dump output
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
2023-11-30 20:58:05 +00:00
Caio Oliveira
47c5656f0e intel/compiler: Allow dumping CFG to a specific FILE*
Add optional argument for both cfg and block dump() function to pass
a FILE*.  Default behavior remains dumping to stderr.

v2 (idr): Don't add the new test framework in this commit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
2023-11-30 20:58:05 +00:00
Caio Oliveira
21cf9323f0 intel/compiler: Add a few more helpers to fs_builder
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
2023-11-30 20:58:05 +00:00
Ian Romanick
c0ecc0d70b intel/compiler: Don't promote CFG link types when removing a block
Imagine 3 blocks A, B, and C. A has a physical link to B, and B has a
logical link to C. Previous to this commit, if B were removed, A would
get a logical link to C. This is not correct.

This was specifically observed to occur when block A was a DO block and
B was the WHILE block. The DO block would have two logical successors,
and that is completely invalid.

v2: Assert that the links from A-to-B and B-back-to-A are the same
kind. Suggested by Caio.

v3: Assume the successor and predecessor lists are well formed. Use this
to simplify the logic. Suggested by Caio. Add checks to cfg_t::validate
to ensure the lists are well formed.

v4: Remove (now unused) bblock_link_invalid. Suggested by Curro.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
2023-11-30 20:58:05 +00:00
Ian Romanick
77c0c1ce54 intel/compiler: Don't create extra CFG links when deleting a block
The previous is_successor_of and is_predecessor_of checks prevented
creating a physical link when a logical link already existed. However, a
logical link could be added when a physical link already existed. This
change causes an existing physical link to be "promoted" to a logical
link.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
2023-11-30 20:58:05 +00:00
Ian Romanick
7e842a75ac intel/compiler: Don't create extra CFG links in opt_predicated_break
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
2023-11-30 20:58:05 +00:00
Ian Romanick
bbd7729993 intel/compiler: Delete bidirectional block links in opt_predicated_break
Previously when earlier_block->children.make_empty() was called, the
child blocks would still have links back to earlier_block.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
2023-11-30 20:58:05 +00:00
Ian Romanick
5842829380 intel/compiler: Limit scope of cur_endif variable
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
2023-11-30 20:58:05 +00:00
Ian Romanick
02f9bbf6f3 intel/compiler: Add basic CFG validation
v2: Use _mesa_shader_stage_to_abbrev(stage) instead of
stage_abbrev. Noticed by Caio and GCC. That's what I get for not
recompiling after rebasing. Wrap cfg_t::validate in NDEBUG
magic. Suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
2023-11-30 20:58:05 +00:00
Ian Romanick
19db6f1cd9 intel/vec4: Don't emit an empty ELSE
This matches the behavior of fs_visitor::nir_emit_if.

This is not technically wrong, but the cfg_t generates some invalid
parent / child links in this case.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25216>
2023-11-30 20:58:05 +00:00
Caio Oliveira
5de5a0d475 intel/compiler: Don't use fs_visitor::bld in thread payload classes
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26301>
2023-11-28 19:53:51 +00:00
Caio Oliveira
2d6240ab14 intel/compiler: Don't use fs_visitor::bld in fs_reg_alloc
Just set up the builder without relying on the pre-existing one.  Moves
one step close to remove bld from fs_visitor.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26301>
2023-11-28 19:53:51 +00:00
Caio Oliveira
f55867b56c intel/compiler: Don't use fs_visitor::bld in tests
Tests create their own fs_builder now.  Moves one step closer to remove
bld from fs_visitor.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26301>
2023-11-28 19:53:51 +00:00
Caio Oliveira
9540259e1c intel/compiler: Prefer ctor/dtors in some Google Tests
Per Google Test FAQ recommendation, prefer consutrctors and destructors
unless there's a need to use SetUp/TearDown.

We will take advantage of this later to initialize an fs_builder.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26301>
2023-11-28 19:53:51 +00:00
Lionel Landwerlin
e22e88f8ce intel/fs: reuse set_predicate()
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26306>
2023-11-28 13:40:07 +00:00
Lionel Landwerlin
83a1657b6c intel/fs: fix incorrect register flag interaction with dynamic interpolator mode
Once NIR code is lowered and a few optimization passes have run, there
might be flag register interactions between instructions quite far
away from one another.

In the following case :

   f0 = and r0, r1
   ...
   fs_interpolate r2, r3
   ...
   if f0
      ...
   endif

If we lower fs_inteporlate while using the f0 register, we completely
garble the value meant for the if block.

To fix this, emit the predication for fs_interpolate in brw_fs_nir.cpp
when doing the NIR translation to the backend IR. This will guarantee
that the flag register interactions are visible to the optimization
passes, avoiding the problem above.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 68027bd38e ("intel/fs: implement dynamic interpolation mode for dynamic persample shaders")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9757
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26306>
2023-11-28 13:40:07 +00:00
Daniel Schürmann
1179d83a89 nir: remove info.fs.needs_all_helper_invocations
Use info.uses_wide_subgroup_intrinsics instead.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26026>
2023-11-22 11:31:11 +01:00
Caio Oliveira
e8220b9319 intel/compiler: Simplify allocation of NIR related arrays
Those are not reused, so this will be the first and only allocation, so
no need to use the "realloc" variants.

For the fs_reg arrays, there's currently no particular reason to keep
them uninitialized, so zero-initialize them too -- not ideal but better
than random values.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26302>
2023-11-21 18:31:05 +00:00
Lionel Landwerlin
4eb4197d27 intel/nir/rt: fix reportIntersection() hitT handling
We're currently updating the hitT value in the traversal result with
the hitT value from reportIntersection(), but this is not correct.

First the hitT value of reportIntersection() should update the
gl_RayTmaxEXT value (maps to brw_nir_rt_mem_ray_defs::t_far).

Second the hitT determined by traversal should only be updated if the
reportIntersection() hitT value has updated the gl_RayTmaxEXT and that
the new gl_RayTmaxEXT is smaller than the determined hitT value from
traversal.

Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Fixes: 303378e1dd ("intel/rt: Add lowering for combined intersection/any-hit shaders")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25146>
2023-11-17 07:06:30 +00:00
Lionel Landwerlin
6dbb5f1e07 intel/fs: rerun divergence analysis prior to convert_from_ssa
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9964
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26235>
2023-11-17 06:40:49 +00:00
Rhys Perry
f695a9fed2 intel/compiler: use nir_lower_fp16_casts
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25566>
2023-11-16 11:02:31 +00:00
Lionel Landwerlin
295734bf88 intel/fs: fix residency handling on Xe2
We're missing a few reg_unit() scaling when dealing with residency data.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26208>
2023-11-15 20:06:12 +00:00
Faith Ekstrand
80376146ed nak: Encode program headers
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24998>
2023-11-14 00:48:06 +00:00
Caio Oliveira
dcb68de656 intel/compiler: Clear up block instructions before re-adding them
Avoids fixing up list pointers that we don't care about anymore -- since
all the instructions will be re-added in a different order anyway.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
2023-11-13 23:05:47 +00:00
Caio Oliveira
a9f95bf687 intel/compiler: Reuse same scheduler for all pre-RA scheduling modes
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
2023-11-13 23:05:47 +00:00
Caio Oliveira
0dd5378ffe intel/compiler: Make scheduler classes take an external mem_ctx
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
2023-11-13 23:05:47 +00:00
Caio Oliveira
04aa2df461 intel/compiler: Separate schedule_node temporary data
Some fields in schedule_node will need to be reset each time they are
used.  The `cand_generation` needs to be back to zero, and both
`unblocked_time` and `parent_count` need to be back to their initial
values, which were pre-calculated.

Rename the initial data fields and add new ones for the temporary data.

Note the helper function is `per node` to allow it "tag along" with an
existing loops.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
2023-11-13 23:05:47 +00:00
Caio Oliveira
81594d0db1 intel/compiler: Move earlier scheduler code that is not mode-specific
This will be useful later on when we reuse the same scheduler for
multiple modes.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
2023-11-13 23:05:47 +00:00
Caio Oliveira
73d4e4118a intel/compiler: Tidy up code in scheduler related to reads_remaining
- Just assert in functions we expect it to exist
- Predicate usage with `!post_reg_alloc` to avoid suggest there are more
  combinations.
- Reuse an existing loop to call the count function.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
2023-11-13 23:05:47 +00:00
Caio Oliveira
4f246cf4e7 intel/compiler: Merge child/latency arrays in schedule_node
Values are used together, saves one pointer in schedule_node,
reduces amount of reallocations when children count grows.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
2023-11-13 23:05:47 +00:00
Caio Oliveira
e59a054203 intel/compiler: Move FS specific fields to fs_instruction_scheduler
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
2023-11-13 23:05:47 +00:00
Caio Oliveira
a6297d05ca intel/compiler: Remove virtual calls from scheduler
Pull run() and schedule_instructions() for fs, and pull a very
simplified version of those into a run() for vec4.  Because of the
previous patches the duplication is small.

Since we are touching these, change run() implementations to use the
cfg from the existing reference to the visitor/shader instead of taking
one as argument.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
2023-11-13 23:05:47 +00:00
Caio Oliveira
d76d58cf50 intel/compiler: Cache issue_time information
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
2023-11-13 23:05:47 +00:00
Caio Oliveira
ecd7ffcf78 intel/compiler: Extract scheduling related basic functions
Those will be used in multiple places later.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
2023-11-13 23:05:47 +00:00
Caio Oliveira
8a8dd2db0c intel/compiler: Add only available instructions to scheduling list
The list was used for iterating through all instructions and then
later also to track the available ones.  Now that the array iteration
is used, change how we fill it and rename it to reflect its only job.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
2023-11-13 23:05:47 +00:00
Caio Oliveira
ddff6428c5 intel/compiler: Use array to iterate the scheduler nodes
For all the preparation data collection before the scheduling
actually happens, it is possible to walk the schedule nodes
in order by iterating on the range of the array dedicated to
a given block.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
2023-11-13 23:05:47 +00:00
Caio Oliveira
fe6ac5a184 intel/compiler: Allocate all schedule_nodes at once
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
2023-11-13 23:05:47 +00:00
Caio Oliveira
be012055da intel/compiler: Remove reference to brw_isa_info from schedule_node
It is always the same for all nodes, so use the one available in the
scheduler itself.

Also, per Matt's suggestion, collect is_haswell from devinfo instead of
from a function argument.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
2023-11-13 23:05:47 +00:00
Caio Oliveira
6987571737 intel/compiler: Use linear allocator in parts of brw_schedule_instructions
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>
2023-11-13 23:05:47 +00:00
Caio Oliveira
fcd025c1ce intel/compiler: Remove is_tex()
The current name doesn't cover all the tex related instructions and
in all usages, we already have a switch statement to dispatch
per instruction type, so is more natural to list the instructions we
care there.

In fs::is_send_from_grf() we can simply ignore them since the
instructions are either lowered directly to SEND (Gfx7+) or use
MRF (Gfx6-).

With this change, the fs_inst::size_read() generated code gets
simplified (the "tex" entries get added to the switch jump table
in gcc) and the default case loses the conditional handling tex.

This reduces shader compilation time, as illustrated by replaying
fossils (tested on my TGL laptop):

```
// Rise of the Tomb Raider (N=13)
Difference at 95.0% confidence
	-1.32231 +/- 0.0170138
	-4.37605% +/- 0.0563054%
	(Student's t, pooled s = 0.0210159)

// Cyberpunk 2077 (N=7)
Difference at 95.0% confidence
	-3.64 +/- 0.114993
	-2.95188% +/- 0.0932544%
	(Student's t, pooled s = 0.09873)
```

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25721>
2023-11-10 15:43:31 +00:00
Francisco Jerez
073b876539 intel/fs/xe2+: Don't special case SEL_EXEC in inferred_exec_pipe().
This is lowered to 32-bit integer execution type by the regioning
lowering pass now, so the existing special casing is redudant for
Gfx12 and buggy for Xe2+, since SEL_EXEC is now emitted without
lowering for 64-bit integers.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25514>
2023-11-08 23:17:42 -08:00
Francisco Jerez
23e14a6c27 intel/eu/xe2+: Add definition for size of GRF space on Xe2.
And use it in various places in the compiler that require knowledge
about the size of the register file.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25514>
2023-11-08 23:17:24 -08:00
Francisco Jerez
ff3814abdd intel/fs/xe2+: Handle extended math instructions as in-order in SWSB pass.
Extended math instructions are now synchronized as in-order
instructions like other ALU operations, which is more efficient than
the out-of-order tracking we had to do in previous generations, and
avoids false dependencies introduced due to SBID aliasing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25514>
2023-11-08 23:17:12 -08:00
Francisco Jerez
5fb6760f11 intel/fs/xe2+: Teach SWSB pass about the behavior of double precision instructions.
Xe2 hardware has a "long" EU pipeline specifically for FP64
instructions, so these are handled as in-order instructions which
require RegDist synchronization.  64-bit integer instructions are now
handled by the normal integer pipeline, so the existing special-casing
inherited from ATS needs to be disabled.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25514>
2023-11-08 23:17:03 -08:00
Francisco Jerez
9e446c9282 intel/fs/xe2+: Add comment reminding us to take advantage of the 32 SBID tokens.
The additional SBID tokens will be useful when large GRF mode is implemented.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25514>
2023-11-08 23:16:54 -08:00
Francisco Jerez
15d6c6ab11 intel/eu/xe2+: Add support for 10-bit SWSB representation on Xe2+ platforms.
This implements the extended 10-bit encoding of the software
scoreboard information used by Xe2 platforms.  The new encoding is
different enough that there are few opportunities for sharing code
during translation to machine code, but the high-level tgl_swsb
representation remains roughly the same.

Among other changes the 10-bit SWSB format provides 5 bits worth of
SBID tokens (though they're only usable in large GRF mode) instead of
4 bits, the extended math pipeline is handled as an in-order (RegDist)
pipeline instead of as an out-of-order one, and the dual-argument
encodings support additional combinations of RegDist and SBID
synchronization modes.  A new encoding is introduced for preventing
the accumulator hardware scoreboard from being updated, but this is
currently not needed.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25514>
2023-11-08 23:12:32 -08:00
Caio Oliveira
40416850f1 intel/compiler: Re-enable opt_zero_samples() in many cases for Gfx12.5
The workaround applies specifically to Cube and Cube Arrays, so we can
still apply the optimization for the others.

Ideally we would like to pull opt_zero_samples logic into the lowering
sends -- to avoid adding a bit to communicate between passes.  However
the texture coordinates for the LOGICAL backend instructions, which
are a common target for the optimization, are combined into offsets over
a single VGRF, so we can't easily identify the constant cases.  The
copy-prop pass make this more visible for opt_zero_samples.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25742>
2023-11-09 03:56:28 +00:00
Caio Oliveira
daeab51a62 intel/compiler: Re-enable opt_zero_samples() for Gfx7+
Inadvertently, because of a sequence of changes elsewhere, this pass
ended up not having any effect:

- Before Gfx5 the optimization is not applicable.

- On Gfx5-6 it doesn't apply because it sampler operations don't
  currently use LOAD_PAYLOAD, but write the MOVs directly.  Not clear to
  me whether they ever did.

- On Gfx7+ it doesn't apply anymore because now the logical sampler
  operations are now lowered directly to SENDs, and the is_tex() check
  would skip them.

Since the LOAD_PAYLOAD implementation applies for Gfx7+ only, rework the
pass to work again by handling SEND instructions.  To make the pass
easier, the optimization will happen before opt_split_sends() so only
one LOAD_PAYLOAD needs to be cared for.

Update the code to accept BAD_FILE sources in addition to zeros, these
are added in some cases as padding and effectively are don't care
values, so we can assume them zeros.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25742>
2023-11-09 03:56:28 +00:00
Caio Oliveira
ef8553082e intel/compiler: Rework opt_split_sends to not rely/modify LOAD_PAYLOAD
This is a preparation to (re-)enable opt_zero_samples(), which will reduce
a SEND mlen before we split it.  When that happen, opt_split_sends()
won't be able to rely on the fact that mlen covers the entire
LOAD_PAYLOAD.

Since we are changing that, take the opportunity to also not modify the
existing LOAD_PAYLOAD, just create two new ones with the exact set of
sources.  This allows the pass to be further simplified by iterating
forward and not require live_variables analysis.

The helper function was added so can be used later for
opt_zero_samples().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25742>
2023-11-09 03:56:28 +00:00