Commit graph

4318 commits

Author SHA1 Message Date
Lionel Landwerlin
5c7c1eceb5 anv/brw: handle pipeline libraries with mesh
I always thought there was a massive issue with pipeline libraries &
mesh shaders. Indeed recent CTS tests have exposed a number of issues.

Some values delivered to the fragment shader are coming from different
places depending on whether the preceding shader is Mesh or not. For
example PrimitiveID is delivered in the per-primitive block in Mesh
pipelines whereas for other pipelines it's coming as a VUE slot (which
is per-vertex). Those are 2 different locations in the payload.

We have to find a layout for fragment shaders that is compatible with
everything. Leaving gaps here and there in the thread payload.

Fixes the following test pattern :

  dEQP-VK.mesh_shader.ext.smoke.fast_lib.shared_*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:35 +00:00
Lionel Landwerlin
18bbcf9a63 intel: introduce new VUE layout for separate compiled shader with mesh
Mesh shaders have per vertex block in URB pretty much identical to the
VUE format. Let's just reuse that concept to do all of our layout in
the payload attribute registers. This will ensure that we have
consistent VUE layout between Mesh & non-Mesh pipelines.

We need a new way of laying out the VUE though as we have to
accomodate a HW constraint of maximum (per-primitive + per-vertex) of
32 varying. This means we cannot have 2 locations in the payload for
things like PrimitiveID which can come from either the per-primitive
or the per-vertex block. The new layout places the PrimitiveID at the
end of the per-vertex attributes and shrinks the delivery dynamically
if the mesh stage is active. The shader is compiled with a
MOV_INDIRECT to read the PrimitiveID from the right location in the
attributes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:35 +00:00
Lionel Landwerlin
2d396f6085 intel: prepare VUE layout for more than 2 layouts
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:35 +00:00
Lionel Landwerlin
95efdca00b brw: add documentation pointers to FS attribute layout
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:35 +00:00
Lionel Landwerlin
9d342081e7 brw/nir: add intrinsics to read attribute payload register indirectly
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:35 +00:00
Lionel Landwerlin
ef17fbf8e5 anv/brw: use separate_shader to deduced MUE compaction
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:35 +00:00
Lionel Landwerlin
6230f3029f brw: fix brw_nir_move_interpolation_to_top
In a case like this :
   block_0:
      %5 = ...
      %6 = ...
      block_1:
         %7 = load_interpolated_input %5, %6

The current logic would move load_interpolated_input to block_0 before
%5 but not move %5 & %6 which are sources of that instruction.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
5ff1b31c3f brw: document some brw_wm_prog_data fields
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
2f654ddd03 brw: use VARYING_BIT_* macros more
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
62d2e323ba anv/brw: shrink FS varying payload
We're currently allocating payload spots for 3 fields already
delivered somewhere else in the payload.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
c467444670 brw/nir: use a new intrinsic for fs_msaa_flag
Avoid NIR code doing offset computations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
dd1ef73aae brw: use newer NIR constructs
nir_shader_intrinsics_pass() & NIR_PASS()

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
b64f237dc4 brw: move helper to brw_nir.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
cbbe7ff66e brw: add new helper to print out FS URB setup
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
b8a80c88cb brw: improve VUE printout
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
4f10a1f618 anv: switch to brw helpers to figure out if a fragment is dynamic
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
7f500cc6e4 brw: store input_vertices on tcs_prog_data
Will allow the driver to know if the vertices count is dynamic.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
a9ee498347 brw: add helpers to check if a fragment shader execution is dynamic
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
4717382f84 anv: lower input vertices for TCS unconditionally
Take the opportunity to reuse the backend pass.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34109>
2025-05-08 06:48:34 +00:00
Lionel Landwerlin
c434050a00 brw: add pre ray trace intrinsic moves
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Some intrinsics are implemented by reading memory location that could
be rewritten by a further tracing calls. So we need to move those
reads prior to tracing operations in the shaders.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8979
Tested-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34214>
2025-05-06 13:34:53 +00:00
Lionel Landwerlin
63f633557f intel: fix null render target setup logic
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Or current render target cache setting is to key on the binding table
index, meaning the HW associates a number in the range [0, 7] to a
RENDER_SURFACE_STATE description. If you want change the render target
0 between 2 draw calls, you need to insert a PIPE_CONTROL in between
the 2 draw calls with pb-stall + rt-flush in order to flush an writes
to a previous RENDER_SURFACE_STATE that has now becomed disassociated
with the [0, 7] number.

This PIPE_CONTROL taking care of the flush is dealt with in
cmd_buffer_maybe_flush_rt_writes(). This function diffs the current
BTI setup for render targets (first 0 to 7 BTIs) with what the next
fragment shader wants.

The issue here is we might have a render pass with 0 color attachments
and yet in 98cdb9349a we added one pointing to the render target 0,
but in the emit_binding_table() when we finally program the BTI, we
check the render pass color count and program a null surface state
instead of an actual surface state. And this leads to hangs because
the render target cache will end up with inconsistent state data.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 98cdb9349a ("anv: ensure null-rt bit in compiler isn't used when there is ds attachment")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12955
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34603>
2025-05-01 11:25:18 +00:00
Iván Briano
29d7b90cfc brw: make HALT instruction act as barrier in new CSE pass
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This brings back c9e33e5cbf ("intel/fs/cse: Make HALT instruction act
as CSE barrier."), from the old CSE pass into the new one.

Fixes new CTS test: dEQP-VK.subgroups.shader_quad_control.terminated_invocation

Fixes: 9690bd369d ("intel/brw: Delete old local common subexpression elimination pass")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34643>
2025-04-29 20:28:24 +00:00
Sagar Ghuge
821c1bfa7e intel/compiler: Fix stackIDs on Xe2+
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
For Xe2+, from Bspec 64643, bit field "StackID": The maximum number of
StackIDs can be 2^12- 1.

Cc: mesa-stable
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34709>
2025-04-29 17:03:35 +00:00
Caio Oliveira
07fa3b3785 intel: Add support for BFloat16 as cooperative matrix source
Re-organize the configuration lists to make easier to include BFloat16
only for the Gfx125+ that support it, while keeping MTL supporting the
"lowered" configurations from pre-Gfx125.

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
2025-04-29 16:29:37 +00:00
Caio Oliveira
d4381c0908 brw/cmat: Implement conversion from/to BFloat16
When converting BFloat16 from/to non-Float32 type, use
the Float32 conversion as an intermediate step.  Take the
opportunity to separate the unary_op/convert code-paths.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
2025-04-29 16:29:37 +00:00
Caio Oliveira
de88184ab6 brw/cmat: Support different src/dst packing factors in emit_packed_alu1
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
2025-04-29 16:29:37 +00:00
Caio Oliveira
7fa7be970d brw/cmat: Extract emit_packed_alu1() function
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
2025-04-29 16:29:37 +00:00
Caio Oliveira
4b4500ad35 brw/cmat: Store more information about cmat slices
Store the cmat_description and packing_factor so that various
functions don't need to extract and recalculate them.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
2025-04-29 16:29:37 +00:00
Caio Oliveira
a7ff177a88 brw: Consider bfloat16 in lower simd width pass
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
2025-04-29 16:29:37 +00:00
Caio Oliveira
2c31516b3e brw: Consider bfloat16 in lower regioning pass
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
2025-04-29 16:29:37 +00:00
Caio Oliveira
5936768ce0 brw: Consider bfloat16 in copy propagation
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
2025-04-29 16:29:37 +00:00
Caio Oliveira
129c074811 brw: Implement support for BFloat16 ALU opcodes
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
2025-04-29 16:29:37 +00:00
Caio Oliveira
a38960e8f3 brw, nir: Use glsl_base_type instead of nir_alu_type for @dpas_intel
This will allow including types that don't have a nir_alu_type
equivalent, like bfloat16.

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
2025-04-29 16:29:37 +00:00
Rohan Garg
9e5d7eb88d compiler/types: add a bfloat16 type
Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
2025-04-29 16:29:36 +00:00
Ian Romanick
c2ac7fa77b brw/cmod: Allow integer CMP to ADD propagation only for Z and NZ
No shader-db chnages on any Intel platform.

v2: Add a note about integer types in the saturate handling path.

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 210743769 -> 210743727 (-0.00%)
Cycle count: 30377699060 -> 30377700318 (+0.00%); split: -0.00%, +0.00%

Totals from 36 (0.01% of 706776) affected shaders:
Instrs: 17032 -> 16990 (-0.25%)
Cycle count: 291716 -> 292974 (+0.43%); split: -0.01%, +0.44%

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34509>
2025-04-28 19:44:23 +00:00
Ian Romanick
e26270249b brw/cmod: Don't propagate from CMP to possible Inf + (-Inf)
Most of the churn in this commit is changing unit tests that were
testing things that are now invalid.

shader-db:

All Intel platforms had similar results. (Lunar Lake shown)
total instructions in shared programs: 17122204 -> 17122669 (<.01%)
instructions in affected programs: 120669 -> 121134 (0.39%)
helped: 0 / HURT: 124

total cycles in shared programs: 895602370 -> 895613210 (<.01%)
cycles in affected programs: 17868974 -> 17879814 (0.06%)
helped: 35 / HURT: 85

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 210736518 -> 210743769 (+0.00%)
Cycle count: 30377733040 -> 30377699060 (-0.00%); split: -0.00%, +0.00%
Max live registers: 66056852 -> 66056966 (+0.00%)

Totals from 1505 (0.21% of 706776) affected shaders:
Instrs: 1890151 -> 1897402 (+0.38%)
Cycle count: 48397408 -> 48363428 (-0.07%); split: -0.11%, +0.04%
Max live registers: 256821 -> 256935 (+0.04%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34509>
2025-04-28 19:44:23 +00:00
Ian Romanick
0dab520a19 brw/cmod: Fix some errors when propagating from CMP to ADD.SAT
When I originally wrote that code, I didn't understand what a jerk NaN
can be.

v2: Remove the brw_type_is_uint stuff. This function is currently only
called for float types. In a later commit, integer types will be
supported but only for NZ and Z conditions. Noticed by Matt.

shader-db:

All Intel platforms had similar results. (Lunar Lake shown)
total instructions in shared programs: 17122197 -> 17122204 (<.01%)
instructions in affected programs: 1691 -> 1698 (0.41%)
helped: 0 / HURT: 4

total cycles in shared programs: 895602484 -> 895602370 (<.01%)
cycles in affected programs: 912964 -> 912850 (-0.01%)
helped: 2 / HURT: 2

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 210736388 -> 210736518 (+0.00%)
Cycle count: 30377728900 -> 30377733040 (+0.00%); split: -0.00%, +0.00%

Totals from 130 (0.02% of 706776) affected shaders:
Instrs: 169911 -> 170041 (+0.08%)
Cycle count: 18021210 -> 18025350 (+0.02%); split: -0.00%, +0.02%

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34509>
2025-04-28 19:44:23 +00:00
Ian Romanick
8f0fd0e66e brw/cmod: Remove special handling of NOT
The previous commit converts any NOT that might have been affected by
this path into a simple MOV. Those MOVs are handled by other paths.

No shader-db or fossil-db changes on any Intel platform.

v2: Fix a bad squash. Changes that were accidentally in this commit were
supposed to be in the previous commit. Noticed by Ivan.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34509>
2025-04-28 19:44:23 +00:00
Ian Romanick
08fe7988d7 brw/algebraic: Convert some NOT to MOV
On Xe platforms, many fragment shaders have patterns like:

    asr(8)          g21<2>W         g1.2<0,1,0>W    15D
    ...
    mov(8)          g11<1>UW        g21<16,8,2>UW
    ...
    not.nz.f0.0(8)  null<1>D        g11<8,8,1>W

Converting the NOT.NZ to MOV.Z enables copy propagation to eliminate the
original MOV. Then cmod propagation is able to eliminate the
NOT-converted-to-MOV.

It might be possible to cover this case by adding more opcodes to the
list NOT can propagate to. The next commit will show that just
converting to MOV is a better approach anyway.

v2: Fix a bad squash. Changes that were supposed to be in this commit
were accidentally in the next commit. Noticed by Ivan.

shader-db:

Meteor Lake, DG2, and Tiger Lake had similar results. (Meteor Lake shown)
total instructions in shared programs: 20069804 -> 20065167 (-0.02%)
instructions in affected programs: 592450 -> 587813 (-0.78%)
helped: 2300 / HURT: 0

total cycles in shared programs: 884534032 -> 884496201 (<.01%)
cycles in affected programs: 13064194 -> 13026363 (-0.29%)
helped: 1285 / HURT: 790

LOST:   18
GAINED: 15

fossil-db:

Meteor Lake, DG2, and Tiger Lake had similar results. (Meteor Lake shown)
Totals:
Instrs: 234506495 -> 234468664 (-0.02%)
Cycle count: 24444825202 -> 24445710703 (+0.00%); split: -0.01%, +0.01%
Max live registers: 42349793 -> 42349789 (-0.00%)
Max dispatch width: 7131344 -> 7131744 (+0.01%); split: +0.05%, -0.04%

Totals from 16673 (2.07% of 805781) affected shaders:
Instrs: 6497669 -> 6459838 (-0.58%)
Cycle count: 435877770 -> 436763271 (+0.20%); split: -0.54%, +0.74%
Max live registers: 1122972 -> 1122968 (-0.00%)
Max dispatch width: 151528 -> 151928 (+0.26%); split: +2.19%, -1.92%

No shader-db or fossil-db on any other Intel platforms.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34509>
2025-04-28 19:44:23 +00:00
Ian Romanick
9ce869aef5 brw/cmod: Delete some stale comment text
Stale like the mummified remains of Ötzi, The Iceman.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34509>
2025-04-28 19:44:23 +00:00
Ian Romanick
12a022cf45 brw/algebraic: Greatly simplify brw_opt_constant_fold_instruction
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
brw_opt_constant_fold_instruction can either do nothing or replace the
instruction with a MOV of an immediate value. Previously each opcode
case would perform this replacement, and code at the bottom of the
function would verify the results.

It is much simpler if each opcode case calculates a result in a brw_reg,
and code at the bottom of the function performs the replacement. There
are two outlier cases that cannot use this pattern: MAD and
BROADCAST. These cases simply return directly from the switch-statement
after performing the replacement.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34707>
2025-04-28 18:33:42 +00:00
Lionel Landwerlin
1f6cca0800 intel: fixup a few debugging option checks
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ad328bc58d ("intel: Switch uint64_t intel_debug to a bitset")
Reviewed-by: Michael Cheng <michael.cheng@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34667>
2025-04-23 18:47:42 +00:00
Michael Cheng
ad328bc58d intel: Switch uint64_t intel_debug to a bitset
We are reaching our limit of adding flags to intel_debug
(apporaching 64 flags). Switch intel_debug to a bitset,
which gives us almost "unlimited" bits to use in the future.

v2(Michael Cheng): Fixed a few ci errors

Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Casey Bowman <casey.g.bowman@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34596>
2025-04-22 23:09:26 +00:00
Sagar Ghuge
36433e932b intel/rt: Update BVH instance leaf load for Xe3+
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33047>
2025-04-21 20:10:45 +00:00
Sagar Ghuge
5cd0f4ba2f intel/compiler: Update MemRay data structure to 64-bit
Rework: (Kevin)
- Fix miss_shader_index offset
- Handle hit group index

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33047>
2025-04-21 20:10:45 +00:00
Kevin Chuang
7b526de18f intel/compiler/rt: Calculate barycentrics on demand
This commit moves the calculation of tri_bary out of
brw_nir_rt_load_mem_hit_from_addr(), and only do the calculation on
demand, since unorm_float_convert can be expensive. We do this for both
Xe1/2 and Xe3+ for consistency.

Signed-off-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33047>
2025-04-21 20:10:45 +00:00
Sagar Ghuge
afc23dffa4 intel/compiler: Update MemHit data structure to 64-bit version
Rework (Kevin):
- Fix inst leaf ptr
- Handle 24bit unorm barycentric coord

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33047>
2025-04-21 20:10:45 +00:00
Kevin Chuang
40fb95d51a intel/compiler: Use 24bits for hit_kind on Xe3+
For Xe3+, the upper 8 bits of the second dword of a potential hit is
used to store hitGroupIndex0, which is stuffed by the HW. This
hitGroupIndex0 will later be used by the HW again to reconstruct the
whole hitGroupIndex when driver issues a TRACE_RAY_COMMIT.

We were corrupting this hitGroupIndex0 at the driver by setting the
whole dword to hit_kind, which will cause the HW to read a wrong
hitGroupIndex and therefore invoke a wrong closest hit shader. The
behavior can be seen in
dEQP-VK.ray_tracing_pipeline.pipeline_no_null_shaders_flag.gpu.boxes.\*
and dEQP-VK.ray_tracing_pipeline.pipeline_library.configurations.\*

This commit changes the driver to only use lower 24bits to store the
hit_kind, and leave the upper 8bits as it.

Signed-off-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33047>
2025-04-21 20:10:45 +00:00
Sagar Ghuge
64fd66407b intel/compiler: Pass around intel_device_info parameter in helper
This will help us to handle code path separately for Xe3+ for updated
64bit memory data structure for RT.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33047>
2025-04-21 20:10:45 +00:00
Sagar Ghuge
6deb1950a4 anv: Update RT dispatch globals to use 64bit data structure
Rework (Kevin)
- Fix Hit/Miss/Resume shader group table value

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33047>
2025-04-21 20:10:45 +00:00