Julia Zhang
ede29afcff
amd/ds: implement amdgpu_ctx_create
...
Implement amdgpu_ctx_create, which creates a context that handles the
main process of querying performance counters.
Change-Id: Ic6f741436886bdc3ee3023b52d0f582ce7d2b6b6
2025-10-11 17:18:46 +08:00
Julia Zhang
557fab121e
amd/ds: implement amdgpu_device_create
...
This initializes libdrm_amdgpu and creates an amdgpu_device from
the drm_fd passed in by the pps driver.
Signed-off-by: Julia Zhang <Julia.Zhang@amd.com>
2025-10-11 17:18:45 +08:00
Julia Zhang
a45b5c99cb
pps: Implement amdgpu pps driver
...
Create a derived pps driver for amdgpu and implement the basic interfaces.
With this, Perfetto can get default data from the amdgpu pps driver as
perfcounter values and display it in the web UI.
Signed-off-by: Julia Zhang <Julia.Zhang@amd.com>
2025-10-11 17:18:42 +08:00
Faith Ekstrand
99707271d5
nouveau: Import the Blackwell 3D class headers from NVIDIA
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36143 >
2025-07-16 01:18:28 +00:00
Lionel Landwerlin
440e2e9200
genxml: fix 3DSTATE_TE definition on Gfx12.[05]
...
Since Gfx12+ the instruction is 5 dwords.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36146 >
2025-07-16 01:01:11 +00:00
Lionel Landwerlin
ac78693b6a
intel/genxml: rename body field
...
So that the body field has the same name in COMPUTE_WALKER &
EXECUTE_INDIRECT_DISPATCH.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36146 >
2025-07-16 01:01:11 +00:00
Christian Gmeiner
ba0c1d6956
mesa: Include mask value in glStencilMask VERBOSE_API debug output
...
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36140 >
2025-07-16 00:32:13 +00:00
Faith Ekstrand
cff5ee0cf3
nvk: Kepler is now Vulkan 1.2 conformant
...
https://www.khronos.org/conformance/adopters/conformant-products#submission_932
https://www.khronos.org/conformance/adopters/conformant-products#submission_934
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35846 >
2025-07-16 00:17:33 +00:00
Lorenzo Rossi
f2e6bacafd
nak/kepler: Add texdepbar insertion pass
...
This commit adds a forward data-flow pass that inserts texdepbar
before any use of a register written by a texture fetch instruction.
The new algorithm started as a port of the old codegen pass, but ended up
as a complete rewrite that is substantially simpler and should generate
less conservative code in some edge cases.
Signed-off-by: Lorenzo Rossi <snowycoder@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35403 >
2025-07-16 00:02:47 +00:00
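The idea behind the texdepbar insertion pass in the commit above can be illustrated with a minimal sketch. This is not the NAK implementation (which is written in Rust and works across the control-flow graph); it is a hypothetical single-basic-block model in Python. It assumes that texture fetches complete in issue order and that `texdepbar N` waits until at most N fetches are still outstanding. The tuple encoding `(op, dst, srcs)` and the function name `insert_texdepbar` are made up for illustration.

```python
from collections import deque

def insert_texdepbar(instrs):
    """Forward pass over one basic block (hypothetical model).

    instrs: list of (op, dst, srcs) tuples; op == "tex" marks a texture fetch.
    Returns a new list with ("texdepbar", None, [n]) inserted where needed.
    """
    pending = deque()  # destination registers of in-flight fetches, oldest first
    out = []
    for op, dst, srcs in instrs:
        # Registers this instruction reads or writes.
        touched = set(srcs)
        if dst is not None:
            touched.add(dst)

        # Youngest in-flight fetch whose destination is touched, if any.
        youngest = max(
            (i for i, reg in enumerate(pending) if reg in touched),
            default=None,
        )
        if youngest is not None:
            # Assumed in-order completion: everything up to and including
            # `youngest` must finish, so this many fetches may remain.
            left = len(pending) - 1 - youngest
            out.append(("texdepbar", None, [left]))
            for _ in range(youngest + 1):
                pending.popleft()

        out.append((op, dst, srcs))
        if op == "tex":
            pending.append(dst)
    return out

prog = [
    ("tex", "r0", ["r8"]),
    ("tex", "r1", ["r9"]),
    ("add", "r2", ["r0", "r5"]),
    ("mul", "r3", ["r1", "r2"]),
]
for instr in insert_texdepbar(prog):
    print(instr)
```

Tracking the in-flight fetches as a queue lets the barrier count be as large as possible, so an instruction waits only for the fetch it depends on and not for younger ones. A full data-flow version would additionally merge the pending state at block joins.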
Faith Ekstrand
fbeb70cbbc
nak/sm20: TexDepBar::textures_left is 6 bits
...
Fixes: 309c48cbb7 ("nak/sm20: Add texture ops")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35403 >
2025-07-16 00:02:47 +00:00
Lorenzo Rossi
b932ae00e5
nak: Add forward dataflow algorithm
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35403 >
2025-07-16 00:02:47 +00:00
Karol Herbst
0d8b165f11
nvk: add support for 16x8x16 IMMA on Ampere+
...
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32777 >
2025-07-15 23:34:33 +00:00
Karol Herbst
070ac68619
nak: support faster back to back latencies for MMA
...
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32777 >
2025-07-15 23:34:33 +00:00
Mary Guillemard
669c8a5145
nvk: Advertise VK_KHR_cooperative_matrix
...
v2: advertise more int combinations (Karol)
enable saturatingAccumulation for integer matrices (Karol)
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32777 >
2025-07-15 23:34:33 +00:00
Karol Herbst
9c511f7301
nak: Add cooperative matrix lowering pass
...
Mary has written the initial code, I've documented my changes below.
v2: support cmat_convert (Karol)
fix cross matmul (Karol)
rework matrix layout classification (Karol)
add support for saturated cmat_muladd (Karol)
Co-authored-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32777 >
2025-07-15 23:34:33 +00:00
Mary Guillemard
f99db217a7
nak: Wire up coop matrix opcodes
...
v2: rebase and scheduling (Karol)
remove Ldsm and Movm (Karol)
add support for saturated cmat_muladd
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32777 >
2025-07-15 23:34:32 +00:00
Mary Guillemard
90438bae51
nir: Add NVIDIA-specific muladd intrinsics
...
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32777 >
2025-07-15 23:34:31 +00:00
Karol Herbst
053b975ca1
nak: fix MMA latencies for Ampere
...
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32777 >
2025-07-15 23:34:30 +00:00
Aleksi Sapon
e54fc3a23c
draw: remove unused prim_flags from run_linear_elts
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35096 >
2025-07-15 23:12:50 +00:00
Aleksi Sapon
c20bb020a6
draw: fix prim_info.start for linear_run_elts
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35096 >
2025-07-15 23:12:49 +00:00
Eric Engestrom
d4a9d62920
turnip+zink/ci: add piglit to the a750 job
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36125 >
2025-07-15 22:52:40 +00:00
Eric Engestrom
855c51fbe7
lavapipe/ci: document recent flakes
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36137 >
2025-07-15 22:27:40 +00:00
Eric Engestrom
b9d1db2092
llvmpipe/ci: document recent flakes
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36137 >
2025-07-15 22:27:40 +00:00
Eric Engestrom
2e13239278
zink+radv/ci: document recent flakes
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36137 >
2025-07-15 22:27:40 +00:00
Eric Engestrom
52f473ae11
broadcom/ci: document recent flakes
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36137 >
2025-07-15 22:27:40 +00:00
Eric Engestrom
1b8a073e4c
radv/ci: document recent flakes
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36137 >
2025-07-15 22:27:40 +00:00
Eric Engestrom
3fc6d51a03
radeonsi/ci: document recent flakes
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36137 >
2025-07-15 22:27:40 +00:00
Eric Engestrom
1840543c6e
lavapipe/ci: document one fixed and two new failures
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36137 >
2025-07-15 22:27:40 +00:00
Eric Engestrom
decbb17324
zink+radv/ci: document new failures
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36137 >
2025-07-15 22:27:40 +00:00
Eric Engestrom
e703847410
zink+nvk/ci: document crash->fail change from !36031
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36137 >
2025-07-15 22:27:40 +00:00
Eric Engestrom
561d73bf20
etnaviv/ci: document fixed tests
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36137 >
2025-07-15 22:27:40 +00:00
Natalie Vock
ac96594b86
aco/isel: Use vector-aligned operands for ds_stack_push8_pop1_rtn_b32
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269 >
2025-07-15 21:34:40 +00:00
Natalie Vock
b2a95d2133
aco/ra: Add affinities for DS vector-aligned operands
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269 >
2025-07-15 21:34:40 +00:00
Natalie Vock
df5495b934
aco/assembler: Support vector-aligned operands on DS instructions
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269 >
2025-07-15 21:34:40 +00:00
Natalie Vock
e978f6e247
radv/rt: Use ds_bvh_stack_push8_pop1_rtn_b32
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269 >
2025-07-15 21:34:40 +00:00
Natalie Vock
ea66a8d1c5
aco,nir: Add support for GFX12 ds_bvh_stack_push8_pop1_rtn_b32 instruction
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269 >
2025-07-15 21:34:40 +00:00
Natalie Vock
f0aa383e09
radv/rt: Use ds_bvh_stack_rtn
...
Improves Quake 2 RTX performance by 5% on RDNA3.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269 >
2025-07-15 21:34:40 +00:00
Natalie Vock
9707b30965
nir,aco: Add ds_bvh_stack_rtn
...
This is a ds instruction that also overwrites its first input, so
introduce a new ds format with two outputs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269 >
2025-07-15 21:34:39 +00:00
Natalie Vock
8815845271
radv/rt/gfx12: Always overwrite origin/dir
...
They're unchanged if we don't test against instance nodes. This makes
image_bvh8_intersect_ray kill its direction/origin operands, improving
RA.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269 >
2025-07-15 21:34:38 +00:00
Natalie Vock
c515f1fd58
aco: Use vector-aligned operands for image_bvh8_intersect_ray
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269 >
2025-07-15 21:34:38 +00:00
Natalie Vock
c279dd6e61
aco: Support vector-aligned ops fixed to defs
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269 >
2025-07-15 21:34:38 +00:00
Natalie Vock
f17fe05e32
aco/isel: Improve vector splits for image_bvh8_intersect_ray
...
Using split_vector to split everything into scalars allows copy-prop to
eliminate the final p_create_vector. Considerably reduces copies and
register thrashing.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269 >
2025-07-15 21:34:38 +00:00
Eric Engestrom
76f1e08222
docs: improve "backport MR" instructions
...
Mainly, the MR should target `YY.N` instead of `staging/YY.N` to avoid
conflicts when the release manager works on the staging branch.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36136 >
2025-07-15 21:29:51 +00:00
Renato Pereyra
b74e641c04
pps: Generate libgpudataprofiling.so from pps-producer sources for Android CTS
...
Android CTS expects GPU counters to be provided by a DSO named
libgpudataprofiling.so. This change leaves pps-producer unchanged and builds
this DSO from the same sources.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36052 >
2025-07-15 20:29:57 +00:00
Renato Pereyra
a739889789
pps: Report available counters when gpu.counters* data source is registered
...
This is required for Android CTS but good to have in general.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36052 >
2025-07-15 20:29:57 +00:00
Ian Romanick
b57bad1fd7
brw/reg_allocate: Check source / destination hazard for all larger SIMD
...
All platforms need this check for SIMD32. Xe2+ does not need it for
SIMD16.
Also... delete some really stale comments about Gfx4/Gfx5. This compiler
doesn't even support those platforms.
No shader-db changes on any pre-Xe2 Intel platforms:
shader-db:
Lunar Lake
total instructions in shared programs: 17108867 -> 17108855 (<.01%)
instructions in affected programs: 35211 -> 35199 (-0.03%)
helped: 19 / HURT: 6
total cycles in shared programs: 885026794 -> 885805580 (0.09%)
cycles in affected programs: 140449880 -> 141228666 (0.55%)
helped: 903 / HURT: 1142
LOST: 0
GAINED: 25
fossil-db:
Lunar Lake
Totals:
Instrs: 208578317 -> 208574097 (-0.00%); split: -0.00%, +0.00%
Cycle count: 31268800798 -> 31259914590 (-0.03%); split: -0.10%, +0.07%
Spill count: 504472 -> 504102 (-0.07%); split: -0.09%, +0.02%
Fill count: 606581 -> 606079 (-0.08%); split: -0.13%, +0.05%
Scratch Memory Size: 35001344 -> 34957312 (-0.13%)
Totals from 60714 (8.59% of 706970) affected shaders:
Instrs: 48923370 -> 48919150 (-0.01%); split: -0.01%, +0.01%
Cycle count: 11830486210 -> 11821600002 (-0.08%); split: -0.27%, +0.20%
Spill count: 397150 -> 396780 (-0.09%); split: -0.12%, +0.02%
Fill count: 469651 -> 469149 (-0.11%); split: -0.17%, +0.06%
Scratch Memory Size: 25971712 -> 25927680 (-0.17%)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35903 >
2025-07-15 19:35:44 +00:00
Ian Romanick
7e98ca89f2
brw/reg_allocate: Adjust source / destination hazard conditions for broadcast
...
Broadcast selects one lane from the source to write to all the lanes
of the destination. This makes it possible for the first half to
overwrite the source used by the second half.
No shader-db changes on any Intel platform.
fossil-db:
Lunar Lake
Totals:
Instrs: 208705405 -> 208705374 (-0.00%); split: -0.00%, +0.00%
Cycle count: 31274597098 -> 31273711544 (-0.00%); split: -0.00%, +0.00%
Totals from 77 (0.01% of 707133) affected shaders:
Instrs: 220177 -> 220146 (-0.01%); split: -0.02%, +0.00%
Cycle count: 461694212 -> 460808658 (-0.19%); split: -0.33%, +0.14%
No fossil-db changes on any other Intel platforms.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35903 >
2025-07-15 19:35:44 +00:00
Ian Romanick
67dc02acc2
brw/reg_allocate: Only add interference for the source with the hazard
...
shader-db:
Lunar Lake
total instructions in shared programs: 17105892 -> 17105732 (<.01%)
instructions in affected programs: 55720 -> 55560 (-0.29%)
helped: 29 / HURT: 24
total cycles in shared programs: 884342344 -> 884663448 (0.04%)
cycles in affected programs: 154776382 -> 155097486 (0.21%)
helped: 719 / HURT: 761
total spills in shared programs: 3278 -> 3262 (-0.49%)
spills in affected programs: 320 -> 304 (-5.00%)
helped: 4 / HURT: 0
total fills in shared programs: 1632 -> 1616 (-0.98%)
fills in affected programs: 368 -> 352 (-4.35%)
helped: 4 / HURT: 0
LOST: 3
GAINED: 4
No shader-db changes on any other Intel platforms.
fossil-db:
Lunar Lake
Totals:
Instrs: 208696275 -> 208692511 (-0.00%); split: -0.00%, +0.00%
Cycle count: 31325252074 -> 31274118190 (-0.16%); split: -0.27%, +0.11%
Spill count: 504809 -> 504472 (-0.07%); split: -0.07%, +0.01%
Fill count: 607047 -> 606581 (-0.08%); split: -0.08%, +0.01%
Scratch Memory Size: 35037184 -> 35001344 (-0.10%); split: -0.11%, +0.01%
Totals from 44135 (6.24% of 707112) affected shaders:
Instrs: 39570465 -> 39566701 (-0.01%); split: -0.01%, +0.00%
Cycle count: 11140437886 -> 11089304002 (-0.46%); split: -0.76%, +0.30%
Spill count: 279756 -> 279419 (-0.12%); split: -0.13%, +0.01%
Fill count: 354706 -> 354240 (-0.13%); split: -0.14%, +0.01%
Scratch Memory Size: 18758656 -> 18722816 (-0.19%); split: -0.20%, +0.01%
Meteor Lake, DG2, Tiger Lake, Ice Lake, and Skylake had similar results. (Meteor Lake shown)
Totals:
Cycle count: 25377247343 -> 25377246251 (-0.00%); split: -0.00%, +0.00%
Totals from 11 (0.00% of 806166) affected shaders:
Cycle count: 899080 -> 897988 (-0.12%); split: -0.48%, +0.36%
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35903 >
2025-07-15 19:35:43 +00:00
Ian Romanick
4e05de7c3d
brw/reg_allocate: Require SIMD32 for destination / source interference on Xe2
...
No platforms other than Lunar Lake were affected in shader-db or
fossil-db for obvious reasons.
shader-db:
Lunar Lake
total instructions in shared programs: 17070074 -> 17069908 (<.01%)
instructions in affected programs: 151939 -> 151773 (-0.11%)
helped: 61 / HURT: 60
total cycles in shared programs: 891338314 -> 880188516 (-1.25%)
cycles in affected programs: 550482120 -> 539332322 (-2.03%)
helped: 8053 / HURT: 7183
total spills in shared programs: 3294 -> 3278 (-0.49%)
spills in affected programs: 138 -> 122 (-11.59%)
helped: 8 / HURT: 0
total fills in shared programs: 1653 -> 1632 (-1.27%)
fills in affected programs: 212 -> 191 (-9.91%)
helped: 8 / HURT: 0
LOST: 96
GAINED: 70
fossil-db:
Lunar Lake
Totals:
Instrs: 208555066 -> 208509387 (-0.02%); split: -0.03%, +0.00%
Cycle count: 31487691872 -> 31318442816 (-0.54%); split: -0.88%, +0.34%
Spill count: 508701 -> 504809 (-0.77%); split: -0.86%, +0.10%
Fill count: 612583 -> 607047 (-0.90%); split: -1.03%, +0.13%
Scratch Memory Size: 35311616 -> 35037184 (-0.78%); split: -0.81%, +0.04%
Totals from 214417 (30.33% of 706852) affected shaders:
Instrs: 123732970 -> 123687291 (-0.04%); split: -0.04%, +0.01%
Cycle count: 27410928904 -> 27241679848 (-0.62%); split: -1.01%, +0.39%
Spill count: 452458 -> 448566 (-0.86%); split: -0.97%, +0.11%
Fill count: 550991 -> 545455 (-1.00%); split: -1.15%, +0.14%
Scratch Memory Size: 31138816 -> 30864384 (-0.88%); split: -0.92%, +0.04%
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35903 >
2025-07-15 19:35:43 +00:00
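The percentages in the shader-db summary of the commit above appear to be plain relative changes between the before and after totals. A minimal sketch that reproduces a few of them, using the numbers quoted in that commit message (the helper name `relative_change` is hypothetical and is not part of shader-db):

```python
def relative_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

# Figures copied from the Lunar Lake shader-db results above.
samples = {
    "total cycles": (891338314, 880188516),
    "total spills": (3294, 3278),
    "total fills": (1653, 1632),
    "spills in affected programs": (138, 122),
}
for name, (before, after) in samples.items():
    print(f"{name}: {before} -> {after} ({relative_change(before, after):+.2f}%)")
```

A negative value means the metric went down, which is an improvement for cycles, spills, and fills.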
Ian Romanick
e9ae997ffc
brw: Only apply GRF 127 send workaround to Gfx9
...
The portion of the Bspec dedicated to Gfx6-Gfx11 says that this
workaround applies to "Pre-CNL" (with CNL being Gfx10). There is no
mention of this workaround in the sections for Xe or Xe2.
No shader-db or fossil-db changes on Skylake or older Intel platforms.
shader-db:
Lunar Lake, Meteor Lake, DG2, Tiger Lake, and Ice Lake (Lunar Lake shown)
total instructions in shared programs: 17107031 -> 17107027 (<.01%)
instructions in affected programs: 32182 -> 32178 (-0.01%)
helped: 16 / HURT: 14
total cycles in shared programs: 895016760 -> 894975410 (<.01%)
cycles in affected programs: 312774834 -> 312733484 (-0.01%)
helped: 9279 / HURT: 8091
LOST: 40
GAINED: 33
The pre-Xe2 platforms had a lot more lost / gained shaders. This appears
to be due to churn in the cycle counts and the SIMD32 heuristic.
fossil-db:
Lunar Lake
Totals:
Instrs: 208667436 -> 208671853 (+0.00%); split: -0.00%, +0.01%
Subgroup size: 14241168 -> 14241200 (+0.00%)
Cycle count: 31495149690 -> 31481397970 (-0.04%); split: -0.17%, +0.13%
Spill count: 508467 -> 508701 (+0.05%); split: -0.10%, +0.14%
Fill count: 611979 -> 612583 (+0.10%); split: -0.07%, +0.17%
Scratch Memory Size: 35288064 -> 35311616 (+0.07%); split: -0.07%, +0.14%
Totals from 205773 (29.10% of 707019) affected shaders:
Instrs: 103153541 -> 103157958 (+0.00%); split: -0.01%, +0.01%
Subgroup size: 4563584 -> 4563616 (+0.00%)
Cycle count: 12979963010 -> 12966211290 (-0.11%); split: -0.42%, +0.32%
Spill count: 494741 -> 494975 (+0.05%); split: -0.10%, +0.15%
Fill count: 597988 -> 598592 (+0.10%); split: -0.07%, +0.17%
Scratch Memory Size: 33351680 -> 33375232 (+0.07%); split: -0.08%, +0.15%
Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 233063764 -> 233057897 (-0.00%); split: -0.01%, +0.00%
Subgroup size: 9892840 -> 9892856 (+0.00%)
Cycle count: 25387597341 -> 25373885583 (-0.05%); split: -0.36%, +0.31%
Spill count: 518469 -> 517940 (-0.10%); split: -0.19%, +0.09%
Fill count: 559444 -> 558537 (-0.16%); split: -0.29%, +0.13%
Scratch Memory Size: 19694592 -> 19658752 (-0.18%); split: -0.21%, +0.03%
Max dispatch width: 7135248 -> 7131672 (-0.05%); split: +0.13%, -0.18%
Totals from 301996 (37.49% of 805603) affected shaders:
Instrs: 144535999 -> 144530132 (-0.00%); split: -0.01%, +0.01%
Subgroup size: 3768528 -> 3768544 (+0.00%)
Cycle count: 18687102311 -> 18673390553 (-0.07%); split: -0.50%, +0.42%
Spill count: 515687 -> 515158 (-0.10%); split: -0.20%, +0.09%
Fill count: 557638 -> 556731 (-0.16%); split: -0.29%, +0.13%
Scratch Memory Size: 18662400 -> 18626560 (-0.19%); split: -0.22%, +0.03%
Max dispatch width: 2029872 -> 2026296 (-0.18%); split: +0.44%, -0.62%
Tiger Lake
Totals:
Instrs: 238813279 -> 238766482 (-0.02%); split: -0.04%, +0.02%
Subgroup size: 9851320 -> 9851328 (+0.00%)
Cycle count: 23668877036 -> 23646286421 (-0.10%); split: -0.51%, +0.42%
Spill count: 559060 -> 554241 (-0.86%); split: -1.12%, +0.26%
Fill count: 595926 -> 591843 (-0.69%); split: -1.46%, +0.78%
Scratch Memory Size: 19929088 -> 19764224 (-0.83%); split: -1.19%, +0.36%
Max dispatch width: 7102184 -> 7101840 (-0.00%); split: +0.13%, -0.13%
Totals from 284125 (35.42% of 802235) affected shaders:
Instrs: 144695094 -> 144648297 (-0.03%); split: -0.06%, +0.03%
Subgroup size: 3567312 -> 3567320 (+0.00%)
Cycle count: 11303753658 -> 11281163043 (-0.20%); split: -1.07%, +0.87%
Spill count: 554624 -> 549805 (-0.87%); split: -1.13%, +0.26%
Fill count: 592252 -> 588169 (-0.69%); split: -1.47%, +0.78%
Scratch Memory Size: 19553280 -> 19388416 (-0.84%); split: -1.21%, +0.37%
Max dispatch width: 1895488 -> 1895144 (-0.02%); split: +0.48%, -0.50%
Ice Lake
Totals:
Instrs: 239034316 -> 239049108 (+0.01%); split: -0.03%, +0.04%
Subgroup size: 9926440 -> 9926448 (+0.00%)
Cycle count: 24944253156 -> 24919967386 (-0.10%); split: -0.25%, +0.15%
Spill count: 575498 -> 571612 (-0.68%); split: -1.18%, +0.51%
Fill count: 709760 -> 716665 (+0.97%); split: -1.31%, +2.28%
Scratch Memory Size: 20699136 -> 20599808 (-0.48%); split: -1.45%, +0.97%
Max dispatch width: 7140856 -> 7143568 (+0.04%); split: +0.15%, -0.12%
Totals from 233451 (29.01% of 804669) affected shaders:
Instrs: 127440610 -> 127455402 (+0.01%); split: -0.07%, +0.08%
Subgroup size: 2835784 -> 2835792 (+0.00%)
Cycle count: 11818511030 -> 11794225260 (-0.21%); split: -0.53%, +0.32%
Spill count: 559557 -> 555671 (-0.69%); split: -1.22%, +0.52%
Fill count: 694460 -> 701365 (+0.99%); split: -1.34%, +2.33%
Scratch Memory Size: 19774464 -> 19675136 (-0.50%); split: -1.52%, +1.02%
Max dispatch width: 1602736 -> 1605448 (+0.17%); split: +0.69%, -0.52%
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35903 >
2025-07-15 19:35:42 +00:00