Commit graph

214263 commits

Author SHA1 Message Date
Simon Perretta
bbdd688bc5 docs/pvr: update hardware list
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Ashish Chauhan <ashish.chauhan@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37790>
2025-10-11 19:45:10 +00:00
Alessio Belle
1db1038a61 pvr: add device info for BXM-4-64 (36.56.104.183)
Requested by the community [1].

[1] https://gitlab.freedesktop.org/imagination/linux-firmware/-/issues/12

Signed-off-by: Alessio Belle <alessio.belle@imgtec.com>
Reviewed-by: Ashish Chauhan <ashish.chauhan@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37790>
2025-10-11 19:45:10 +00:00
Frank Binns
0dd5db3478 pvr: add device info for GE8300 (22.68.54.30)
Requested by the community [1].

[1] https://gitlab.freedesktop.org/imagination/linux-firmware/-/issues/6

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Ashish Chauhan <ashish.chauhan@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37790>
2025-10-11 19:45:10 +00:00
Frank Binns
6c42d7eb01 pvr: add device info for GE8300 (22.102.54.38)
Requested by the community [1].

[1] https://gitlab.freedesktop.org/imagination/linux-firmware/-/issues/5

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Ashish Chauhan <ashish.chauhan@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37790>
2025-10-11 19:45:10 +00:00
Frank Binns
e60e0c96ba pvr: add device info for BXE-2-32 (36.29.52.182)
Requested by the community [1].

[1] https://gitlab.freedesktop.org/imagination/linux-firmware/-/issues/2

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Ashish Chauhan <ashish.chauhan@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37790>
2025-10-11 19:45:10 +00:00
Frank Binns
2743363a57 pvr: add device info for BXM-4-64 (36.52.104.182)
Requested by the community [1].

[1] https://gitlab.freedesktop.org/imagination/linux-firmware/-/issues/1

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Ashish Chauhan <ashish.chauhan@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37790>
2025-10-11 19:45:09 +00:00
Frank Binns
5914d1146f pvr: add device info for GX6650 (4.46.6.62)
Requested by the community [1].

[1] https://gitlab.freedesktop.org/mesa/mesa/-/issues/7032

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Ashish Chauhan <ashish.chauhan@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37790>
2025-10-11 19:45:09 +00:00
Frank Binns
9358c65c3d pvr: add device info for G6110 (5.9.1.46)
Requested by the community [1][2].

[1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15243#note_1306552
[2] https://gitlab.freedesktop.org/frankbinns/linux-firmware/-/issues/1

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Ashish Chauhan <ashish.chauhan@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37790>
2025-10-11 19:45:08 +00:00
Frank Binns
4a245d9f57 pvr: add device info for GX6250 (4.45.2.58)
Requested by the community [1].

[1] https://lists.freedesktop.org/archives/dri-devel/2023-June/409639.html

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Ashish Chauhan <ashish.chauhan@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37790>
2025-10-11 19:45:08 +00:00
Frank Binns
ea28791d40 pvr: add device info for BXE-4-32 (36.50.54.182)
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Ashish Chauhan <ashish.chauhan@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37790>
2025-10-11 19:45:08 +00:00
Simon Perretta
d41c34c5ca pco: ensure a variable exists for the multiview index
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37724>
2025-10-11 20:28:16 +01:00
Simon Perretta
e7c409cd29 pvr: amend num temps calculation when wg_size is not provided
Fixes: 7a32dc673b ("pvr: add device info and functions for calculating ava...")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37724>
2025-10-11 20:28:16 +01:00
Simon Perretta
1c1bc876fb pvr: amend tile buffer size calculation for eot
Fixes: a67120cda3 ("pvr, pco: full support for tile buffer eot handling")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37724>
2025-10-11 20:28:16 +01:00
Simon Perretta
b0609a30b1 pco: improve early and late algebraic pass ordering
Ensures early algebraic passes aren't called again following late
algebraic passes, so that the latter's opts aren't undone (e.g.
unfusing ffmas).

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37724>
2025-10-11 20:28:16 +01:00
Simon Perretta
e637d01ef2 pco: tidy and commonize conversion ops
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37724>
2025-10-11 20:28:16 +01:00
Simon Perretta
34b4b35ca8 pco: apply rounding mode to relevant conversion ops
The rounding behaviour on [iu]2f32 ops needs to be explicitly set in
order to match the implicit behaviour described in the
KHR_shader_float_controls properties.

Fixes: e306abc6e6 ("pvr: implement KHR_shader_float_controls")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37724>
2025-10-11 20:28:16 +01:00
Mel Henning
a89ab2993a nvk: Reduce subc switches with events
Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37671>
2025-10-11 16:58:24 +00:00
Mel Henning
a3ed200300 nvk/cmd_copy: Pipeline user copy_rect operations
Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37671>
2025-10-11 16:58:24 +00:00
Mel Henning
e9432eb3e0 nvk/cmd_copy: Use PIPELINED for user transfers
Vulkan requires applications to insert any necessary pipeline barriers.

Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37671>
2025-10-11 16:58:24 +00:00
Mel Henning
08861bad46 nvk: WFI on the most recent subc
This should be a bit faster. It also matches what the proprietary driver
generates, based on the reverse engineering done here:
https://gitlab.freedesktop.org/mhenning/re/-/tree/main/vk_test_overlap_exec

Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37671>
2025-10-11 16:58:24 +00:00
Mel Henning
8447dba5b3 nvk: INVALIDATE_SHADER_CACHES on most recent subc
This should be a bit faster.

Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37671>
2025-10-11 16:58:24 +00:00
Mohamed Ahmed
7a0e7d24bb nvk: Use the compute MME for compute dispatch
Switching from compute to 3D and vice versa leads to a long stall which
destroys compute performance. This switches compute dispatches to the
compute MME on Ampere onwards (where it was added), which eliminates
stalling from sub-channel switching in these cases.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37671>
2025-10-11 16:58:24 +00:00
Mohamed Ahmed
146a64524d nouveau/mme: Add unit tests for sharing between compute and 3D scratch registers
Co-developed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Tested-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37671>
2025-10-11 16:58:24 +00:00
Faith Ekstrand
0bfe27553d nvk: Actually reserve 1/2 for FALCON
In 03f785083f ("nvk: Reserve MME scratch area for communicating with
FALCON"), we said we reserved these but actually only reserved 0.  Only
0 is actually used today but if we're going to claim to reserve
registers we should actually do it.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37671>
2025-10-11 16:58:24 +00:00
Mohamed Ahmed
17ab1d463f nouveau/headers: Add AMPERE_B compute subchannel definition
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37671>
2025-10-11 16:58:24 +00:00
Mel Henning
0e3781df7f vulkan: Drop vk_pipeline_stage_flags2_has_*_shader
These are no longer used anywhere. Moreover, it's not clear that they
can be used for a correct implementation of pipeline barriers since a
correct implementation cannot ignore execution deps in non-shader
stages.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37671>
2025-10-11 16:58:24 +00:00
Mel Henning
2eeef34e35 nvk/cmd_buffer: Remove redundant tests for access
In each of these cases, the spec mandates that apps pair a memory barrier
specified with access with a relevant exec barrier specified by stages.
We therefore don't need to wfi based on access - the tests on stage are
sufficient.

Acked-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37671>
2025-10-11 16:58:24 +00:00
Mel Henning
515793d5bb nvk: Fix execution deps in pipeline barriers
We were under-synchronizing before. In particular, `stages` form
execution barriers even in the absence of a memory barrier in the
`access` flags.

The particular issue that prompted this was one where we weren't waiting
on a pipeline barrier in Baldur's Gate 3 with:

    srcStageMask == VK_PIPELINE_STAGE_2_FRAGMENT_SHADER_BIT &&
    srcAccessMask == VK_ACCESS_2_SHADER_READ_BIT &&
    dstStageMask == (VK_PIPELINE_STAGE_2_EARLY_FRAGMENT_TESTS_BIT |
                     VK_PIPELINE_STAGE_2_LATE_FRAGMENT_TESTS_BIT) &&
    dstAccessMask == (VK_ACCESS_2_DEPTH_STENCIL_ATTACHMENT_READ_BIT |
                      VK_ACCESS_2_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT)

Based on the spec and discussion in
https://github.com/KhronosGroup/Vulkan-Docs/issues/131 the read bit in
srcAccessMask doesn't really matter here - what matters is that there's
an execution barrier on the fragment stage which needs to prevent the
fragment shader execution from overlapping with the later call's
fragment tests (which write to the depth attachment).

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13909
Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37671>
2025-10-11 16:58:24 +00:00
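
The barrier quoted in 515793d5bb can be modelled in a few lines. This is only a toy sketch of the reasoning in the commit message, not NVK's actual code: the two `needs_wait_*` helpers and `WRITE_ACCESS_BITS` are hypothetical names, and the numeric flag values are assumed to match the Vulkan headers.

```python
# Flag values assumed to match VkPipelineStageFlagBits2 / VkAccessFlagBits2.
VK_PIPELINE_STAGE_2_FRAGMENT_SHADER_BIT = 0x00000080
VK_PIPELINE_STAGE_2_EARLY_FRAGMENT_TESTS_BIT = 0x00000100
VK_PIPELINE_STAGE_2_LATE_FRAGMENT_TESTS_BIT = 0x00000200
VK_ACCESS_2_SHADER_READ_BIT = 0x00000020
VK_ACCESS_2_DEPTH_STENCIL_ATTACHMENT_READ_BIT = 0x00000200
VK_ACCESS_2_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT = 0x00000400

# Hypothetical: the access bits this toy model treats as writes.
WRITE_ACCESS_BITS = VK_ACCESS_2_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT


def needs_wait_access_only(src_stage, src_access, dst_stage, dst_access):
    """Under-synchronizing model: wait only if the source access wrote something."""
    return (src_access & WRITE_ACCESS_BITS) != 0


def needs_wait_with_exec_deps(src_stage, src_access, dst_stage, dst_access):
    """Stage masks alone form an execution dependency, whatever the access masks say."""
    return src_stage != 0 and dst_stage != 0


# The barrier from the commit message: (srcStage, srcAccess, dstStage, dstAccess).
BG3_BARRIER = (
    VK_PIPELINE_STAGE_2_FRAGMENT_SHADER_BIT,
    VK_ACCESS_2_SHADER_READ_BIT,
    VK_PIPELINE_STAGE_2_EARLY_FRAGMENT_TESTS_BIT
    | VK_PIPELINE_STAGE_2_LATE_FRAGMENT_TESTS_BIT,
    VK_ACCESS_2_DEPTH_STENCIL_ATTACHMENT_READ_BIT
    | VK_ACCESS_2_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,
)

if __name__ == "__main__":
    print("access-only model waits:", needs_wait_access_only(*BG3_BARRIER))
    print("exec-dep model waits:   ", needs_wait_with_exec_deps(*BG3_BARRIER))
```

In this model the access-only check skips the wait because the source access is a read, while the stage-based check still orders the fragment shader before the later fragment tests.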
Mel Henning
895bbb7601 nvk: Combine BARRIER_{COMPUTE,RENDER}_WFI
When we want to WFI, we only need to do so on a single channel. The
others will implicitly get a WFI from the channel switch.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37671>
2025-10-11 16:58:24 +00:00
Mel Henning
6c44390e80 nvk: Only run one INVALIDATE_SHADER_CACHES
This is presumably the same cache across compute and 3d, so we only need
to run one of these, not two.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37671>
2025-10-11 16:58:23 +00:00
Lorenzo Rossi
b56b5b90f7 nvk: Fix QMD buffer length on upload
Current code allocates the maximum QMD data for all generations and
uploads everything, even on generations where a smaller QMD buffer
suffices. This is not only wasteful, but actually crashes Kepler GPUs
due to complications with the QMD queue.

Only upload the useful bytes of the QMD buffer.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14070
Fixes: 0e268dad00 ("nvk: Allow for larger QMDs")
Signed-off-by: Lorenzo Rossi <git@rossilorenzo.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37815>
2025-10-11 08:20:22 +00:00
Surafel Assefa
a219308867 wsi: Implements scaling controls for DRI3 presentation.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30701>
2025-10-11 06:59:37 +00:00
Caio Oliveira
74859c19fb intel/executor: Add a matrix multiplication example
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37805>
2025-10-11 01:02:45 +00:00
Caio Oliveira
1e0ee84841 intel/executor: Add DPAS examples for HF/F, UB/UD and BF/F
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37805>
2025-10-11 01:02:45 +00:00
Caio Oliveira
62f07dc5e3 intel/executor: Add script directory to package.path
In Lua, modules (i.e. files with lua code) are loaded by using
the standard library require(), e.g.

```
local mylib = require("mylib")

mylib.do_something()
```

require() decides where to look by consulting `package.path`.  By default
it doesn't include the scripts directory, so running executor from the
script directory vs. from the root of the repo would yield different
results (require works vs. require fails to find the module).  This patch
adds the script directory to `package.path` to avoid this issue.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37805>
2025-10-11 01:02:45 +00:00
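
The lookup problem described in 62f07dc5e3 can be illustrated by emulating how a Lua-style search path expands. This is a rough Python sketch, not the executor's code: `candidate_paths` and `with_script_dir` are hypothetical helpers, the default path is a simplified assumption, and `/repo/scripts` is a made-up directory.

```python
def candidate_paths(module_name, package_path):
    """Expand a Lua-style search path into the file names require() would try.

    Templates are separated by ';' and each '?' is replaced by the module
    name, with '.' in the name mapped to a directory separator ('/' assumed).
    """
    file_stem = module_name.replace(".", "/")
    return [t.replace("?", file_stem) for t in package_path.split(";") if t]


def with_script_dir(package_path, script_dir):
    """Prepend a template for script_dir, mirroring the idea behind the patch."""
    return f"{script_dir}/?.lua;{package_path}"


if __name__ == "__main__":
    # Simplified stand-in for a default package.path; "./?.lua" is relative
    # to the current working directory, which is why results depend on it.
    default_path = "/usr/share/lua/5.4/?.lua;./?.lua"
    print(candidate_paths("mylib", default_path))
    print(candidate_paths("mylib", with_script_dir(default_path, "/repo/scripts")))
```

With only the default path, `mylib` is found through `./mylib.lua` when the working directory happens to be the script directory; once an absolute template for the script directory is added, the result no longer depends on where the executor is started.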
Caio Oliveira
86947062e9 intel/executor: Expose a devinfo table
So we can pull other values from devinfo struct.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37805>
2025-10-11 01:02:44 +00:00
Caio Oliveira
5987269750 intel/executor: Drop check_ver and check_verx10 functions
Favor explicit version checks, which can use different types of
comparisons other than equality on a list.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37805>
2025-10-11 01:02:44 +00:00
Emma Anholt
f8729ee920 ir3: Use bitset range operations.
This sped up the debugoptimized compile of a fossil I was looking at by
7%.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37777>
2025-10-10 23:13:04 +00:00
Emma Anholt
aa85e3331f ir3/parser: Make sure relative accesses have a size set.
This will avoid assertion failures about a size==0 in the upcoming change
to regmask bitset handling, when collect_info() uses them to track
references into the current alias table.  We know that relative accesses
won't go to the alias table, but that code doesn't.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37777>
2025-10-10 23:13:04 +00:00
Emma Anholt
30b7772ae4 ir3: Move the big block of C support code out of the parser .y file.
This way you get nice syntax highlighting and clang-formatting and all
that when trying to edit the C code.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37777>
2025-10-10 23:13:04 +00:00
Lionel Landwerlin
febac6d9bd anv: fix query copy with shaders
First, this is only possible on RCS or CCS engines.

Second, on CCS we need to use a compute shader; 3D won't work.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37818>
2025-10-10 21:31:09 +00:00
Jesse Natalie
c2d288bf97 microsoft/compiler: Respect write masks when lowering unaligned loads and stores
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37778>
2025-10-10 19:53:15 +00:00
Jesse Natalie
b3242516ad microsoft/compiler: Use lower_mem_access_bit_sizes for scratch/shared
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37778>
2025-10-10 19:53:15 +00:00
Emma Anholt
f7cbc7b1c5 radv: Allocate BOs as implicit sync even if the WSI is doing implicit sync.
As noted, the flag we allocate with controls whether *anyone* can implicit
sync on the BO through amdgpu interfaces, not just whether our fd does.
This restores radv to the behavior before the regressing commit.

Fixes: 4dcf32c56e ("wsi/drm: Don't request implicit sync if we're doing implicit sync ourselves.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37772>
2025-10-10 19:17:04 +00:00
Emma Anholt
38ac55ebff radv: Restore marking WSI image's mem->buffer as uncached.
Prior to 4dcf32c56e, radv was getting a request for implicit sync, even
when we were doing the work to do implicit sync in the WSI.  Once that was
turned off, we incidentally dropped flagging WSI's mem->buffer as
uncached, due to it being under the wrong condition.

Fixes: 4dcf32c56e ("wsi/drm: Don't request implicit sync if we're doing implicit sync ourselves.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37772>
2025-10-10 19:17:04 +00:00
Ian Romanick
ca493b5c45 brw: elk: Fix name of function in comment
Trivial.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37186>
2025-10-10 17:25:11 +00:00
Ian Romanick
1e691e68e2 nir/algebraic: Optimize bfi with odd-valued mask to bitfield_select
shader-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
total instructions in shared programs: 17181254 -> 17181046 (<.01%)
instructions in affected programs: 35834 -> 35626 (-0.58%)
helped: 130 / HURT: 2

total cycles in shared programs: 888543370 -> 888554248 (<.01%)
cycles in affected programs: 7443984 -> 7454862 (0.15%)
helped: 95 / HURT: 87

fossil-db:

Lunar Lake
Totals:
Instrs: 233260196 -> 233259474 (-0.00%); split: -0.00%, +0.00%
Cycle count: 32754567116 -> 32754515890 (-0.00%); split: -0.00%, +0.00%
Max live registers: 71738442 -> 71738398 (-0.00%); split: -0.00%, +0.00%

Totals from 6842 (0.87% of 790721) affected shaders:
Instrs: 5566926 -> 5566204 (-0.01%); split: -0.01%, +0.00%
Cycle count: 512487046 -> 512435820 (-0.01%); split: -0.20%, +0.19%
Max live registers: 1100656 -> 1100612 (-0.00%); split: -0.00%, +0.00%

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 264071212 -> 264066944 (-0.00%); split: -0.00%, +0.00%
Cycle count: 26552458051 -> 26553286277 (+0.00%); split: -0.00%, +0.01%
Spill count: 530380 -> 530084 (-0.06%)
Fill count: 613416 -> 612900 (-0.08%)
Scratch Memory Size: 20089856 -> 20075520 (-0.07%)
Max live registers: 46558852 -> 46558811 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 8034616 -> 8034584 (-0.00%)

Totals from 6653 (0.73% of 905545) affected shaders:
Instrs: 5750844 -> 5746576 (-0.07%); split: -0.08%, +0.00%
Cycle count: 416414845 -> 417243071 (+0.20%); split: -0.20%, +0.40%
Spill count: 1953 -> 1657 (-15.16%)
Fill count: 3556 -> 3040 (-14.51%)
Scratch Memory Size: 92160 -> 77824 (-15.56%)
Max live registers: 566003 -> 565962 (-0.01%); split: -0.01%, +0.00%
Max dispatch width: 55768 -> 55736 (-0.06%)

No shader-db or fossil-db changes on any previous Intel platforms.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37186>
2025-10-10 17:25:11 +00:00
Ian Romanick
b948e6d503 brw: Use BFN to implement nir_opt_bitfield_select
shader-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
total instructions in shared programs: 17181559 -> 17181254 (<.01%)
instructions in affected programs: 250921 -> 250616 (-0.12%)
helped: 303 / HURT: 0

total cycles in shared programs: 888542568 -> 888543370 (<.01%)
cycles in affected programs: 49861772 -> 49862574 (<.01%)
helped: 181 / HURT: 110

fossil-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
Totals:
Instrs: 233260591 -> 233260196 (-0.00%); split: -0.00%, +0.00%
Cycle count: 32754501248 -> 32754567116 (+0.00%); split: -0.00%, +0.00%
Max live registers: 71738476 -> 71738442 (-0.00%)
Non SSA regs after NIR: 67837262 -> 67837108 (-0.00%); split: -0.00%, +0.00%

Totals from 226 (0.03% of 790721) affected shaders:
Instrs: 382227 -> 381832 (-0.10%); split: -0.15%, +0.05%
Cycle count: 72863878 -> 72929746 (+0.09%); split: -0.65%, +0.74%
Max live registers: 36557 -> 36523 (-0.09%)
Non SSA regs after NIR: 60427 -> 60273 (-0.25%); split: -0.26%, +0.00%

No shader-db or fossil-db changes on any previous Intel platforms.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37186>
2025-10-10 17:25:11 +00:00
Ian Romanick
4193895145 brw/cmod: Enable limited cmod propagation for BFN
cmod propagation needs more work. Since the result type is always UD,
BRW_CONDITION_G should be able to substitute for NZ. Either that or
users of the condition could be rewritten to use an inverted condition.

v2: Add a couple more unit tests. Suggested by Matt.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37186>
2025-10-10 17:25:11 +00:00
Ian Romanick
fb193ac190 brw/builder: Add BFN
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37186>
2025-10-10 17:25:10 +00:00