Commit graph

719 commits

Author SHA1 Message Date
Caio Oliveira
ce10130f09 intel/brw: Pull remove_extra_rounding_modes out of fs_visitor
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>
2024-02-26 20:54:25 +00:00
Caio Oliveira
73fe658456 intel/brw: Pull eliminate_find_live_channel out of fs_visitor
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>
2024-02-26 20:54:25 +00:00
Caio Oliveira
4f314f89f7 intel/brw: Pull opt_zero_samples out of fs_visitor
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>
2024-02-26 20:54:25 +00:00
Caio Oliveira
838d6d5cd2 intel/brw: Pull opt_split_sends out of fs_visitor
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>
2024-02-26 20:54:25 +00:00
Caio Oliveira
8dcbdc8fac intel/brw: Pull split/compact virtual_grf opts out of fs_visitor
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>
2024-02-26 20:54:25 +00:00
Caio Oliveira
1947a38680 intel/brw: Pull opt_algebraic out of fs_visitor
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>
2024-02-26 20:54:25 +00:00
Caio Oliveira
d849c2ecff intel/brw: Pull redundant_halt out of fs_visitor
And call it "remove redundant halts".

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>
2024-02-26 20:54:25 +00:00
Caio Oliveira
96c4aa8545 intel/brw: Pull peephole_sel out of fs_visitor
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>
2024-02-26 20:54:25 +00:00
Caio Oliveira
10489b418c intel/brw: Pull bank_conflicts out of fs_visitor
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>
2024-02-26 20:54:25 +00:00
Caio Oliveira
13c312431c intel/brw: Pull opt_cse out of fs_visitor
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>
2024-02-26 20:54:25 +00:00
Caio Oliveira
4f09ad9dee intel/brw: Pull opt_combine_constants out of fs_visitor
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>
2024-02-26 20:54:24 +00:00
Caio Oliveira
59bff8adf4 intel/brw: Pull dead_code_eliminate out of fs_visitor
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>
2024-02-26 20:54:24 +00:00
Caio Oliveira
1bd175f458 intel/brw: Pull opt_saturate_propagation out of fs_visitor
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>
2024-02-26 20:54:24 +00:00
Caio Oliveira
dc33a8fb06 intel/brw: Pull opt_cmod_propagation out of fs_visitor
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>
2024-02-26 20:54:24 +00:00
Caio Oliveira
6a3329a6c4 intel/brw: Pull opt_copy_propagation out of fs_visitor
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>
2024-02-26 20:54:24 +00:00
Caio Oliveira
0b73d163d4 intel/brw: Remove Gfx8- passes from optimize()
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26887>
2024-02-26 20:54:24 +00:00
Caio Oliveira
373130a66c intel/compiler: Remove has_render_target_reads from wm_prog_data
This was used only by the classic i965 driver.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27772>
2024-02-24 02:34:59 +00:00
Kenneth Graunke
c12300844d intel/fs: Don't rely on CSE for VARYING_PULL_CONSTANT_LOAD
In the past, we didn't have a good solution for combining scalar loads
with a variable index plus a constant offset.  To handle that, we took
our load offset and rounded it down to the nearest vec4, loaded an
entire vec4, and trusted in the backend CSE pass to detect loads from
the same address and remove redundant ones.

These days, nir_opt_load_store_vectorize() does a good job of taking
those scalar loads and combining them into vector loads for us, so we
no longer need to do this trick.  In fact, it can be better not to:
our offset need only be 4 byte (scalar) aligned, but we were making it
16 byte (vec4) aligned.  So if you wanted to load an unaligned vec2,
we might actually load two vec4's (___X | Y___) instead of doing a
single load at the starting offset.

This should also reduce the work the backend CSE pass has to do,
since we just emit a single VARYING_PULL_CONSTANT_LOAD instead of 4.

shader-db results on Alchemist:
- No changes in SEND count or spills/fills
- Instructions: helped 95, hurt 100, +/- 1-3 instructions
- Cycles: helped 3411 hurt 1868, -0.01% (-0.28% in affected)
- SIMD32: gained 5, lost 3

fossil-db results on Alchemist:
- Instrs: 161381427 -> 161384130 (+0.00%); split: -0.00%, +0.00%
- Cycles: 14258305873 -> 14145884365 (-0.79%); split: -0.95%, +0.16%
- SIMD32: Gained 42, lost 26

- Totals from 56285 (8.63% of 652236) affected shaders:
- Instrs: 13318308 -> 13321011 (+0.02%); split: -0.01%, +0.03%
- Cycles: 7464985282 -> 7352563774 (-1.51%); split: -1.82%, +0.31%

From this we can see that we aren't doing more loads than before
and the change is pretty inconsequential, but it requires less
optimizing to produce similar results.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27568>
2024-02-20 23:16:27 -08:00
Caio Oliveira
d8f9a05f32 intel/compiler: Rename the passes and files related to intel_nir.h
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27644>
2024-02-16 22:35:05 +00:00
Caio Oliveira
dc76cfc781 intel/compiler: Collect NIR-only passes in intel_nir.h
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27644>
2024-02-16 22:35:05 +00:00
Caio Oliveira
5732c9d269 intel/compiler: Rename brw_cs_dispatch_info to intel_cs_dispatch_info
And move to the intel_shader_enums.h file.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27475>
2024-02-14 22:31:23 -08:00
Caio Oliveira
c5b80de583 intel/compiler: Rename brw_vue_map to intel_vue_map
And move to the intel_shader_enums.h file.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27475>
2024-02-14 22:31:23 -08:00
Caio Oliveira
7d85d2c7fd intel/compiler: Rename DISPATCH_MODE_* enums to INTEL_DISPATCH_MODE_*
And move to the intel_shader_enums.h file.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27475>
2024-02-14 22:31:23 -08:00
Caio Oliveira
26dd1f0bba intel/compiler: Rename BRW_WM_MSAA_* enums to INTEL_MSAA_*
And move to the intel_shader_enums.h file.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27475>
2024-02-14 22:31:23 -08:00
Jordan Justen
b533bf7361 intel/compiler: Set branch shader required-width as 16 for xe2
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27529>
2024-02-14 20:07:13 +00:00
Ian Romanick
7441af803f intel/compiler/xe2: Update get_sampler_lowered_simd_width
The Bspec also says, "The table below describes the SIMD modes which
are supported. SIMD32 and SIMD64 are used for media-type operations
only."  Perhaps this commit should just add

    if (devinfo->ver >= 20)
        return 16;

instead.

v2: Use reg_unit in get_sampler_lowered_simd_width. Suggested by Sagar.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>
2024-02-02 02:39:09 +00:00
Francisco Jerez
c3a64f8dd1 intel/fs/xe2+: Allow SIMD16 MULH instructions.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27165>
2024-01-20 19:55:31 +00:00
Francisco Jerez
54f3d5a00c intel/fs: Emit QUAD_SWIZZLE instructions with WE_all for derivative lowering.
Otherwise the code generator will attempt to emit SIMD-lowered
QUAD_SWIZZLE instructions with an execution group not multiple of 8,
which is invalid on Xe2+.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27165>
2024-01-20 19:55:31 +00:00
Francisco Jerez
f974eacab3 intel/compiler/xe2: Fix for the removal of most predication modes.
Reworks:
* Remove changes to fixup_nomask workaround since it applies only for
  Gfx12 family.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26860>
2024-01-12 20:18:03 +00:00
Lionel Landwerlin
4b30b46ffd intel/fs: fix depth compute state for unchanged depth layout
There is no VK CTS exercising this case. If there was we would run
into hangs as noticed in
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26876

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26923>
2024-01-08 17:28:12 +00:00
Sagar Ghuge
96e0d979a7 intel/fs: Check fs_visitor instance before using it
On Xe2+, we don't build the SIMD8 shader so this check makes sure we
don't execute the uninitialized invocations.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26886>
2024-01-04 22:24:07 +00:00
Dave Airlie
37366fef68 intel/compiler: fix release build unused variable.
This is only used in an assert.

Fixes: 158ac265df ("intel/fs: Make helpers for saving/restoring instruction order")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26863>
2024-01-03 23:52:11 +00:00
Ian Romanick
3756f60558 intel/fs: DPAS lowering
Implements integer dot product lowering both with and without
DP4A. Implements half-float dot product lowering.

There are a couple FINISHME comments describing future optimizations.

v2: Add a brw_compiler::lower_dpas flag to track when the lowering
should be applied.

v3: Use is_null() instead of checking file != ARF. Suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
2023-12-29 20:27:15 -08:00
Ian Romanick
e666872c75 intel/compiler: Initial bits for DPAS instruction
v2: Add brw_ir_performance.cpp and brw_fs_generator.cpp changes. Fix
overlapping register allocation (via has_source_and_destination_hazard). Fix
incorrect destination register file encoding.

v3: Prevent lower_regioning from trying to "fix" DPAS sources.

v4: Add instruction latency information for scheduling and perf
estimates.

v5: Remove all mention of DPASW. Suggested by Curro and Caio. Update
the comment in fs_inst::has_source_and_destination_hazard. Suggested
by Caio.

v6: Add some comments near the src2 calculation in
fs_inst::size_read. Suggested by Caio.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
2023-12-29 20:24:16 -08:00
Francisco Jerez
ab0eff4388 intel/fs/xe2+: Attempt to build quad-SIMD8 and dual-SIMD16 FS variants on Xe2+ platforms.
Extend the pre-existing dual-SIMD8 compilation path in
brw_compile_fs() to attempt quad-SIMD8 and dual-SIMD16 compiles.
Instead of building every possible dispatch mode and then picking one
based on cycle-count heuristics, this attempts to only build a single
multipolygon kernel -- The different mulipolygon dispatch modes are
tried in the expected order of decreasing performance (quad-SIMD8,
dual-SIMD16 then dual-SIMD8), the first one that successfully compiles
without spills is taken as a simple heuristic, and no further
multipolygon builds are attempted.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
2023-12-28 14:12:59 -08:00
Francisco Jerez
50d084ec29 intel/fs/xe2+: Lower SIMD width of instructions that access ATTR file from SIMD2x8/4x8 FS.
This is needed because the information stored on the ATTR file for
multipolygon fragment shaders isn't stored as a contiguous sequence in
the GRF, instead the ATTR source may be lowered by assign_urb_setup()
to use a <16;8,0> region, which reads 4 SIMD16 GRFs for a SIMD32
instruction, even though the result of fs_inst::size_read() is
expected to be 2 GRFs.  Special case ATTR sources for multipolygon PS
shaders to calculate the number of physical GRFs that will actually be
read by the instruction after lowering, based on the number of
polygons processed by the instruction.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
2023-12-28 14:12:59 -08:00
Francisco Jerez
0d332d0c49 intel/fs: Plumb shader instead of compiler to get_lowered_simd_width() and friends.
This will allow making lowering decisions based on properties of the
shader, like the multipolygon dispatch mode used.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
2023-12-28 14:12:59 -08:00
Francisco Jerez
bd634bef12 intel/fs/xe2+: Implement layout of mesh shading per-primitive inputs in PS thread payloads.
This is based on a previous patch by Marcin Ślusarz addressing the
same issue, though it's largely rewritten, simplified and includes
additional fixes.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
2023-12-28 14:12:59 -08:00
Francisco Jerez
4cebfaadf7 intel/fs/xe2+: Implement support for multi-polygon vertex setup data in PS payload.
This fixes a number of assumptions made by the multipolygon input
attribute handling code from assign_urb_setup() so it also works on
Xe2+, which has additional multipolygon dispatch modes (like SIMD4x8
and SIMD2x16) and uses a different more compact representation of the
plane parameters.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
2023-12-28 14:12:59 -08:00
Francisco Jerez
702eabaaae intel/fs/xe2+: Update for new layout of vertex setup data in PS payload.
The interpolation deltas of PS inputs now show up as a 12B vec3 (A0,
A1-A0, A2-A0) in the ATTR file, instead of the previously used 16B
format with an unused component.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
2023-12-28 11:07:03 -08:00
Francisco Jerez
d622e19f00 intel/fs/xe2+: Enable new format of barycentrics in PS payload.
The X and Y barycentric vectors are no longer interleaved in SIMD8
chunks (yay), so this is mostly a matter of disabling the
lower_barycentrics() pass and switching to a simpler implementation of
fetch_barycentric_reg() that simply calls fetch_payload_reg() instead
of the SIMD8 shuffling we had to do in previous generations.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
2023-12-28 11:07:03 -08:00
Francisco Jerez
49a867f67e intel/fs: Add support for vector payload values to fetch_payload_reg().
This extends fetch_payload_reg() to support fetching vector registers
like barycentrics stored on the payload as a contiguous sequence of
SIMD-wide vectors.  In the SIMD32 case, both halves of the SIMD16
vector registers specified as regs[0] and regs[1] are zipped to
construct a single SIMD32-wide vector.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
2023-12-28 11:07:03 -08:00
Francisco Jerez
a0ae3c0dba intel/fs/xe2+: Update uses of pixel/sample mask from PS thread payload.
Note from Caio: proper handling of brw_sample_mask_reg
will appear in later patches.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>
2023-12-28 11:07:03 -08:00
Rohan Garg
3e46ee61d5 intel/fs/xe2+: Lift CPS dispatch width restrictions on Xe2+.
These restrictions don't seem to be applicable anymore, and limiting
to SIMD8 wouldn't work since we're no longer building shaders with
that dispatch width.

[ Francisco: This one-liner change was squashed by Rohan Garg into a
  previous version of my patch "Stop building SIMD8 programs", but it
  makes more sense as a separate commit -- Formatted as a separate
  patch. ]

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26605>
2023-12-22 10:37:00 -08:00
Francisco Jerez
6877916155 intel/fs/xe2+: Stop building SIMD8 fragment shaders.
They are no longer suppored by the fixed-function hardware.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26605>
2023-12-22 10:37:00 -08:00
Francisco Jerez
7397ba61c2 intel/fs/xe2+: Stop building SIMD8 compute-like shaders (CS/BS/TS/MS).
SIMD8 kernels are no longer able to utilize the ALUs efficiently,
since they have twice the vector width as previous platforms.  However
even though there aren't many reasons to use it, SIMD8 is still
supported by the instruction set technically, and it will still be
used for some SIMD-lowering sequences.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26605>
2023-12-22 10:37:00 -08:00
Francisco Jerez
1f2c44dc21 intel/compiler: Attempt to build dual-SIMD8 variant of fragment shaders on gfx12+ platforms.
Similar to other FS dispatch modes, attempt to build a dual-SIMD8
program if the regular SIMD8 program didn't spill and doubling the
amount of space for varyings doesn't cause us to go over the thread
payload limit.  Dual-SIMD8 builds in combination with coarse pixel
shading are currently not handled.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>
2023-12-22 18:05:31 +00:00
Francisco Jerez
09ea840987 intel/fs: No need to copy null destinations in lower_simd_width.
The copy would be discarded immediately.  Until now we were relying on
DCE to eliminate these, but it seems like in some cases MOVs into the
null register emitted by lower_simd_width() are never eliminated,
likely because a lower_simd_width() call has been introduced close to
the bottom of optimize() which isn't follow by any additional DCE
passes.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>
2023-12-22 18:05:31 +00:00
Francisco Jerez
5e0760a993 intel/fs/gfx12: Don't consider multipolygon PS to have packed dispatch.
This fixes a number of regressions and hangs in multipolygon fragment
shaders that have FIND_LIVE_CHANNEL sequences which would otherwise
lead to access of a dead channel.  Note that the failures don't seem
to be reproducible in simulation.

Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>
2023-12-22 18:05:31 +00:00
Francisco Jerez
6bf99e6a45 intel/compiler: Don't change types for copies from ATTR file.
Since the <8;8,0> regions they use in multipolygon mode could violate
regioning restrictions in some cases, depending on the execution type
of the instruction.  Note that the assertion is removed from
try_copy_propagate() since a more accurate check is used within that
function than what fs_inst::can_change_types() can do.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>
2023-12-22 18:05:31 +00:00