Commit graph

189224 commits

Author SHA1 Message Date
Rob Clark
471961d0ca ir3: Comment re-indent
To make this more readable.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33433>
2025-04-08 15:38:38 +00:00
Patrick Lerda
e4a60c216a r600: clean up not used fields detected by clang
../src/gallium/drivers/r600/sfn/sfn_shader_gs.h:54:9: warning: private field 'm_first_vertex_emitted' is not used [-Wunused-private-field]
   54 |    bool m_first_vertex_emitted{false};
      |         ^
...

Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34153>
2025-04-08 13:23:47 +00:00
Patrick Lerda
bd88a92dde r600: enable ARB_compute_variable_group_size
This change was tested and passes the piglit tests (20/20)
on cypress, palm and cayman.

Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34404>
2025-04-08 13:04:17 +00:00
Patrick Lerda
58ddf6aaf0 r600: fix points clipping
This is the backport of eca57f85ee ("radeonsi: fix
gl_ClipDistance and gl_ClipVertex for points").

This change was tested on rv770, palm, barts and cayman. It
fixes 450 khr-gl tests and 64 khr-gles tests on evergreen
and cayman gpus. Here is the list:
spec/glsl-1.20/execution/clipping/vs-clip-vertex-primitives: fail pass
spec/glsl-1.30/execution/clipping/vs-clip-distance-primitives: fail pass
spec/glsl-1.50/execution/compatibility/clipping/gs-clip-vertex-primitives-points: fail pass
khr-gl(3[0-3]|4[0-5])/clip_distance/functional: fail pass
khr-gl(33|4[0-5])/cull_distance/functional_test_item_[0-8]_primitive_mode_points_max_culldist_[0-7]: fail pass
khr-gles3/clip_distance/functional: fail pass
khr-gles3/cull_distance/functional_test_item_[0-8]_primitive_mode_points_max_culldist_[0-7]: fail pass

Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34403>
2025-04-08 12:41:10 +00:00
Patrick Lerda
8fc01db1ac r600: fix pa_su_vtx_cntl rounding mode
This is the backport of 9c49550163. This rounding functionality
is available on all the gpus of the r600 family.

This change was tested on rv770, palm and cayman. This change fixes
at least the "turn-on-off" tests on all these gpus and it does not
add any regression. Here are the tests fixed on palm:
spec/ext_framebuffer_multisample/interpolation 6 centroid-edges: fail pass
spec/ext_framebuffer_multisample/interpolation 8 centroid-edges: fail pass
spec/ext_framebuffer_multisample/turn-on-off 2: fail pass
spec/ext_framebuffer_multisample/turn-on-off 4: fail pass
spec/ext_framebuffer_multisample/turn-on-off 6: fail pass
spec/ext_framebuffer_multisample/turn-on-off 8: fail pass

Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34403>
2025-04-08 12:41:10 +00:00
Patrick Lerda
4d17f8d10a r600: fallback to util_blitter_draw_rectangle when required
This is the backport of dc293ffe50 ("radeonsi:
fallback to util_blitter_draw_rectangle").

This change was tested on rv770, palm and cayman. Here is
the test fixed:
spec/ext_framebuffer_blit/fbo-blit-check-limits: fail pass

Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34403>
2025-04-08 12:41:10 +00:00
Patrick Lerda
9b95e4181e r600: remove deprecated NIR_PASS_V
This change is done in two steps:
find src/gallium/drivers/r600 -type f -exec grep -l NIR_PASS_V {} + | xargs sed -r -i "s/NIR_PASS_V[(]/NIR_PASS(_, /"
git clang-format <previous_commit>

Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33976>
2025-04-08 12:21:24 +00:00
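For illustration, the sed substitution quoted in 9b95e4181e can be reproduced on a single line with Python's re module. This is only a sketch: the sample input line and the helper name `rewrite_nir_pass_v` are made up here, and only the pattern and replacement come from the commit message.

```python
import re

# Same pattern and replacement as the sed expression in the commit message:
#   s/NIR_PASS_V[(]/NIR_PASS(_, /
PATTERN = re.compile(r"NIR_PASS_V[(]")
REPLACEMENT = "NIR_PASS(_, "


def rewrite_nir_pass_v(line: str) -> str:
    """Rewrite the first NIR_PASS_V( on the line to NIR_PASS(_, .

    The sed expression has no 'g' flag, so it touches only the first
    match per line; count=1 mirrors that behaviour.
    """
    return PATTERN.sub(REPLACEMENT, line, count=1)


# Hypothetical input line, not taken from the r600 sources.
sample = "NIR_PASS_V(sh, some_pass, 32);"
print(rewrite_nir_pass_v(sample))
```

Lines without the macro pass through unchanged, which is why the commit can apply the expression to every file that grep lists.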
Xaver Hugl
0c1f2b90c9 vulkan/wsi: warn once when HDR metadata is skipped because of protocol errors
Signed-off-by: Xaver Hugl <xaver.hugl@kde.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34000>
2025-04-08 10:30:42 +00:00
Xaver Hugl
cb7726bb2c vulkan/wsi: validate HDR metadata to not cause protocol errors
If it would trigger a protocol error, we must not use it.

Signed-off-by: Xaver Hugl <xaver.hugl@kde.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34000>
2025-04-08 10:30:42 +00:00
Georg Lehmann
64cae5c48d aco: form mixed MTBUF/MUBUF clauses
This should be one clause (all of the instructions load from the same vertex buffer)

s_clause 0x2                                                ; bfa10002
tbuffer_load_format_xyzw v[8:11], v5, s[4:7], 0 format:[BUF_FMT_8_8_8_8_UNORM] idxen offset:36 ; e9c32024 80010805
tbuffer_load_format_xyzw v[12:15], v5, s[4:7], 0 format:[BUF_FMT_8_8_8_8_UNORM] idxen offset:16 ; e9c32010 80010c05
tbuffer_load_format_xyzw v[16:19], v5, s[4:7], 0 format:[BUF_FMT_8_8_8_8_UNORM] idxen offset:12 ; e9c3200c 80011005
s_clause 0x2                                                ; bfa10002
buffer_load_dwordx3 v[20:22], v5, s[4:7], 0 idxen           ; e03c2000 80011405
buffer_load_dwordx3 v[23:25], v5, s[4:7], 0 idxen offset:20 ; e03c2014 80011705
buffer_load_dwordx4 v[28:31], v5, s[4:7], 0 idxen offset:48 ; e0382030 80011c05
tbuffer_load_format_xy v[0:1], v5, s[4:7], 0 format:[BUF_FMT_8_8_UNORM] idxen offset:32 ; e8712020 80010005

Foz-DB Navi21:
Totals from 5624 (7.08% of 79395) affected shaders:
MaxWaves: 149894 -> 149898 (+0.00%)
Instrs: 3032697 -> 3034853 (+0.07%); split: -0.05%, +0.12%
CodeSize: 15907852 -> 15915752 (+0.05%); split: -0.05%, +0.10%
VGPRs: 216248 -> 216144 (-0.05%)
Latency: 10955137 -> 11008760 (+0.49%); split: -0.22%, +0.70%
InvThroughput: 2032857 -> 2033916 (+0.05%); split: -0.03%, +0.08%
VClause: 50120 -> 41778 (-16.64%); split: -16.66%, +0.02%
SClause: 62034 -> 62004 (-0.05%); split: -0.33%, +0.29%
Copies: 253836 -> 254505 (+0.26%); split: -0.17%, +0.43%
VALU: 1621606 -> 1622274 (+0.04%); split: -0.03%, +0.07%
SALU: 653251 -> 653252 (+0.00%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34379>
2025-04-08 09:22:04 +00:00
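The percentages in the Foz-DB report of 64cae5c48d are consistent with a plain relative change, (after - before) / before. The check below assumes that is how the report computes them, which the commit message does not state; `pct_change` is a name chosen for this sketch.

```python
def pct_change(before: int, after: int) -> str:
    """Relative change from before to after, with an explicit sign and two decimals."""
    return f"{(after - before) / before * 100:+.2f}%"


# Figures copied from the report above.
print("VClause:", pct_change(50120, 41778))
print("Instrs: ", pct_change(3032697, 3034853))
print("Latency:", pct_change(10955137, 11008760))

# Share of affected shaders: 5624 out of 79395.
print("Affected:", f"{5624 / 79395 * 100:.2f}%")
```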
Georg Lehmann
babe7f3e12 aco/gfx10: simpler solution to avoid store instructions in clauses
Foz-DB Navi21 has no changes.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34379>
2025-04-08 09:22:04 +00:00
Samuel Pitoiset
0ba3a8b3cc radv: add clip rects state bit for emitting discard rectangles
Better match the hw naming.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34361>
2025-04-08 08:42:17 +00:00
Samuel Pitoiset
08918f0880 radv: regroup emitting all MSAA states in one function
All register writes are optimized out. This will also allow using
paired context register writes on GFX12.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34361>
2025-04-08 08:42:17 +00:00
Samuel Pitoiset
e8d787e1ef radv: track more MSAA related register writes
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34361>
2025-04-08 08:42:17 +00:00
Samuel Pitoiset
a327bc677a radv: configure COVERAGE_TO_SHADER_SELECT only if conservative rast is enabled
When conservative rasterization isn't enabled, FullyCoveredEXT is
expected to return 0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34361>
2025-04-08 08:42:17 +00:00
Samuel Pitoiset
6e9782b39c radv: emit conservative raster mode as part of the MSAA state
From the hw perspective, it's more like a MSAA state.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34361>
2025-04-08 08:42:17 +00:00
Samuel Pitoiset
ed744b5c68 radv: move emitting raster and depth/stencil state slightly earlier
To avoid a redundant check if no dynamic states are dirtied.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34361>
2025-04-08 08:42:17 +00:00
Lars-Ivar Hesselberg Simonsen
37595775a0 panvk: Add barrier for interleaved ZS copy cmds
When executing CopyBufferToImage or CopyImage with multiple regions of
both depth and stencil aspects targeting an interleaved depth stencil
image, we must split the regions into one copy-command for each aspect
and add a barrier between them to avoid a write-after-write race.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 5067921349 ("panvk: Switch to vk_meta")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34384>
2025-04-08 08:08:35 +00:00
Samuel Pitoiset
ef9e7cb3f5 radv: add before/after draw functions for DGC
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34381>
2025-04-08 08:15:05 +02:00
Samuel Pitoiset
d2da54e6f3 radv: apply the workaround for buggy HiZ/HiS on GFX12 for DGC
Backport-to: 25.0
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34381>
2025-04-08 08:15:04 +02:00
Samuel Pitoiset
6388db03c8 radv: add a workaround for buggy HiZ/HiS on GFX12
HiZ/HiS is buggy and can cause random GPU hangs when stencil is enabled.
There are basically two alternatives, but RADV follows RadeonSI and emits
a dummy RELEASE_MEM packet after every draw, which should work around the
issue and maintain performance.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12944
Backport-to: 25.0
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34381>
2025-04-08 08:09:13 +02:00
Samuel Pitoiset
11b6d2ba60 radv: determine if HiZ/HiS is enabled earlier on GFX12
To lower CPU overhead of the hardware workaround.

Backport-to: 25.0
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34381>
2025-04-08 08:03:11 +02:00
Faith Ekstrand
2ff22de626 nak: Use suld.b on Kepler if we have a format
This works on all GPU generations but we don't actually need it since we
have formatted image loads on everything Maxwell+.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34336>
2025-04-08 04:06:45 +00:00
Faith Ekstrand
6aa2c152b8 nak,nir: Add an image_load_raw_nv intrinsic
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34336>
2025-04-08 04:06:45 +00:00
Faith Ekstrand
e7843720c2 nak: Add support for suld/st.b
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34336>
2025-04-08 04:06:45 +00:00
Faith Ekstrand
3d9185f17e nak: Add a ChannelMask type
We use this for tex and image ops instead of a u8.  This lets us assert
some variants up-front as well as pretty print them.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34336>
2025-04-08 04:06:45 +00:00
Lars-Ivar Hesselberg Simonsen
c2570055d5 vulkan/wsi/wayland: Avoid duplicate colorspace entry
The colorspace SRGB_NONLINEAR could be added twice when querying
available formats, leading to duplicate entries and VulkanCTS WSI test
failures.

Fixes: 789507c99c ("vulkan/wsi: implement the Wayland color management protocol")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34410>
2025-04-07 23:55:25 +00:00
Faith Ekstrand
436f175187 intel/compiler: Use nir_split_conversions()
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34266>
2025-04-07 17:45:21 -05:00
Caio Oliveira
bf9ad36f2d brw: Properly handle cooperative matrices created with constants
Expand constant sources to cover the region read by DPAS, and also
use NULL register as accumulator when possible.

Reviewed-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34373>
2025-04-07 14:27:43 -07:00
Mel Henning
16e3e0d93b nvk: Support blackwell in max_warps_per_mp_for_sm
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34161>
2025-04-07 20:28:48 +00:00
Mel Henning
f2aac0f96a nvk: SET_PS_{REGISTER,WARP}_WATERMARKS
Brings Baldur's Gate 3 from 32 to 35 fps on the character creator. (+9%)
Brings Horizon Zero Dawn from 7098 to 7872 points in its benchmark. (+11%)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34161>
2025-04-07 20:28:48 +00:00
Marek Olšák
39d2a1e773 radeonsi: add a VOP3P swizzle requirement for 16-bit packed math
Otherwise ACO fails an assertion.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016>
2025-04-07 19:44:23 +00:00
Marek Olšák
15b0198d7f radeonsi: lower load/store bit sizes before load/store vectorization
to match RADV and also to reduce code size by -2.33% in 178 affected shaders.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016>
2025-04-07 19:44:23 +00:00
Marek Olšák
5e5b04cb27 radeonsi/ci: don't run GTF tests (they have been removed from glcts)
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016>
2025-04-07 19:44:22 +00:00
Marek Olšák
5039feb192 radeonsi/ci: update gfx11 failures
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016>
2025-04-07 19:44:22 +00:00
Marek Olšák
a4b71e5b2d radeonsi: expose 16-bit NIR types for ALU, MEM, and LDS (no inputs/outputs)
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016>
2025-04-07 19:44:22 +00:00
Marek Olšák
58f3d6fa20 radeonsi: always use ACO callbacks to scalarize/vectorize 16-bit ALU
This fixes 16-bit ALU with LLVM.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016>
2025-04-07 19:44:22 +00:00
Marek Olšák
a82705911e radeonsi: work around a primitive restart bug on gfx10-10.3
Using the GE instead of the VGT register has no effect because it's
the same value. SQ_NON_EVENT is the fix.

Discovered by Samuel Pitoiset.

Cc: mesa-stable
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016>
2025-04-07 19:44:22 +00:00
Marek Olšák
e4a30b7241 ac/surface: remove 64K_2D modifier with 64B max compressed blocks for gfx12
It has no use and is slower.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016>
2025-04-07 19:44:22 +00:00
Marek Olšák
27d5be13c6 ac/nir/cull: always do frustum culling, skip only small prim culling
Only small prim culling uses the viewport state, so only that must be
disabled when there are multiple viewports.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016>
2025-04-07 19:44:22 +00:00
Marek Olšák
0f97dc707d ac/nir/cull: rename skip_viewport_culling -> skip_viewport_state_culling
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016>
2025-04-07 19:44:22 +00:00
Marek Olšák
bc27ad8064 ac: define physical VGPRs for fake hw overrides
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016>
2025-04-07 19:44:22 +00:00
Alyssa Rosenzweig
e5097a7c9d glsl_to_nir: upcast array indices
array indices need to match the pointer size, otherwise we fail NIR assertions.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6075
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016>
2025-04-07 19:44:22 +00:00
Marek Olšák
1d5c42528b nir/opt_algebraic: lower 16-bit imul_high & umul_high
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34016>
2025-04-07 19:44:22 +00:00
Mike Blumenkrantz
b14c8128bf tu: check for valid descriptor set when binding descriptors
these pointers can be null, and they are checked as null in
pipeline layout creation, but here if the pointer is null it will crash

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34412>
2025-04-07 18:49:10 +00:00
Collabora's Gfx CI Team
fcf19bf335 Uprev ANGLE to 3818d37d5e94317f01810053b8f28c1f1e8b98e6
1b34d2a18a...3818d37d5e

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34378>
2025-04-07 18:16:00 +00:00
Ian Romanick
f33faa4648 brw/nir: Allow b2f(not(X)) optimization on Gfx12.5+
Since there are no type conversions, no restrictions are violated.

No shader-db or fossil-db changes on any Gfx12 or older Intel
platforms.

shader-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
total instructions in shared programs: 16956077 -> 16944933 (-0.07%)
instructions in affected programs: 1957573 -> 1946429 (-0.57%)
helped: 4629 / HURT: 35

total cycles in shared programs: 915668518 -> 915684808 (<.01%)
cycles in affected programs: 341925598 -> 341941888 (<.01%)
helped: 3040 / HURT: 1305
helped stats (abs) min: 2 max: 23034 x̄: 205.36 x̃: 16
helped stats (rel) min: <.01% max: 41.21% x̄: 1.28% x̃: 0.48%
HURT stats (abs)   min: 2 max: 68820 x̄: 490.88 x̃: 22
HURT stats (rel)   min: <.01% max: 103.69% x̄: 2.29% x̃: 0.37%
95% mean confidence interval for cycles value: -50.28 57.78
95% mean confidence interval for cycles %-change: -0.35% -0.07%
Inconclusive result (value mean confidence interval includes 0).

LOST:   40
GAINED: 42

fossil-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
Totals:
Instrs: 209828027 -> 209790349 (-0.02%); split: -0.03%, +0.01%
Cycle count: 30504938008 -> 30514045408 (+0.03%); split: -0.06%, +0.09%
Spill count: 512182 -> 512168 (-0.00%)
Fill count: 623432 -> 623426 (-0.00%); split: -0.00%, +0.00%
Max live registers: 65465029 -> 65464959 (-0.00%)

Totals from 57895 (8.19% of 706589) affected shaders:
Instrs: 50144907 -> 50107229 (-0.08%); split: -0.11%, +0.03%
Cycle count: 7549692606 -> 7558800006 (+0.12%); split: -0.25%, +0.37%
Spill count: 58834 -> 58820 (-0.02%)
Fill count: 102324 -> 102318 (-0.01%); split: -0.01%, +0.01%
Max live registers: 9129045 -> 9128975 (-0.00%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33931>
2025-04-07 17:42:05 +00:00
Ian Romanick
853ead2073 brw/nir: Optimize b2f(not(X)) using logical operations instead of arithmetic
Funny story... this is how regular b2f was implemented before Curro
implemented the `MOV dst:F -src:D` method 9 years ago (see
3ee2daf23d).

Eliminating the type conversion in the arithmetic operation enables the
next commit.

No shader-db or fossil-db changes on any Intel platform.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33931>
2025-04-07 17:42:05 +00:00
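The message of 853ead2073 does not spell out the logical form it uses. One well-known way to do b2f with a logical operation is to AND the boolean with the bit pattern of 1.0f (0x3F800000). The sketch below assumes that form and assumes 32-bit booleans stored as 0 or 0xFFFFFFFF; neither assumption is confirmed by the commit, and `b2f_not` is a hypothetical name.

```python
import struct

ONE_F_BITS = 0x3F800000  # IEEE 754 single-precision bit pattern of 1.0
MASK32 = 0xFFFFFFFF


def b2f_not(x: int) -> float:
    """b2f(not(x)) for a 32-bit boolean x (0 or 0xFFFFFFFF).

    Uses only logical operations: NOT the boolean, then AND with the
    bits of 1.0f. No integer-to-float type conversion is involved; the
    result bits are simply reinterpreted as a float.
    """
    bits = (~x & MASK32) & ONE_F_BITS
    return struct.unpack("<f", struct.pack("<I", bits))[0]


print(b2f_not(0))       # false input
print(b2f_not(MASK32))  # true input
```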
Ian Romanick
3d23496fd9 brw/copy: Copy prop -X into Y&1
This commit prevents code quality regressions in the next
commit. Without this, some fragment shaders in Batman: Arkham Origins
have code like:

    shr(8)          g51<1>UW        g1.28<1,8,0>UB  0x76543210V
    ...
    and(8)          g52<1>UD        ~g51<8,8,1>UW   0x0001UW
    ...
    add(8)          g56<1>D         -g52<8,8,1>D    1D

transformed to

    shr(8)          g51<1>UW        g1.28<1,8,0>UB  0x76543210V
    ...
    and(8)          g52<1>UD        ~g51<8,8,1>UW   0x0001UW
    ...
    mov(8)          g56<1>D         -g52<8,8,1>D
    ...
    and(8)          g57<1>UD        ~g56<8,8,1>D    0x00000001UD

Propagating through the negation allows the added MOV to be deleted.

shader-db:

All Intel platforms had similar results. (Lunar Lake shown)
total instructions in shared programs: 16968020 -> 16968019 (<.01%)
instructions in affected programs: 281 -> 280 (-0.36%)
helped: 1 / HURT: 0

total cycles in shared programs: 914598850 -> 914598832 (<.01%)
cycles in affected programs: 5398 -> 5380 (-0.33%)
helped: 1 / HURT: 0

A single Blender vertex shader was affected.

fossil-db:

Lunar Lake, Tiger Lake, Ice Lake, and Skylake had similar results. (Lunar Lake shown)
Totals:
Instrs: 209894650 -> 209894651 (+0.00%)
Cycle count: 30545958586 -> 30545952860 (-0.00%)

Totals from 2 (0.00% of 706657) affected shaders:
Instrs: 3582 -> 3583 (+0.03%)
Cycle count: 1875100 -> 1869374 (-0.31%)

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Subgroup size: 9906400 -> 9906416 (+0.00%)

Totals from 2 (0.00% of 805770) affected shaders:
Subgroup size: 16 -> 32 (+100.00%)

Two compute shaders in Hogwarts Legacy were affected.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33931>
2025-04-07 17:42:05 +00:00
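One reason propagating through the negation in 3d23496fd9 is safe for an AND with 1: arithmetic negation never changes the lowest bit, so (-X) & 1 equals X & 1. Whether this is exactly the condition the pass checks is an assumption. The snippet only verifies the identity for 32-bit two's-complement values, and the helper names are made up.

```python
MASK32 = 0xFFFFFFFF


def neg32(x: int) -> int:
    """Two's-complement negation of a 32-bit value."""
    return (-x) & MASK32


def low_bit_survives_negation(x: int) -> bool:
    """True if bit 0 of -x equals bit 0 of x."""
    return (neg32(x) & 1) == (x & 1)


# A few edge cases plus a small range of ordinary values.
samples = [0, 1, 2, 3, 0x7FFFFFFF, 0x80000000, MASK32] + list(range(100, 120))
print(all(low_bit_survives_negation(x) for x in samples))
```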
Ian Romanick
e82464e6e0 brw/copy: Refactor source modifier type checking
This simplifies the next commit.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33931>
2025-04-07 17:42:05 +00:00