mesa/src/intel at 24c66d387110467b3525712b836c2a624d52a934 - fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 17:58:09 +02:00

Kenneth Graunke 24c66d3871 brw: Vectorize URB intrinsics using nir_opt_load_store_vectorize	2026-01-27 16:08:36 +00:00

This helps cut down URB messages on tessellation and mesh shaders
significantly.

fossil-db results on Battlemage:

Instrs: 505172392 -> 505207187 (+0.01%); split: -0.00%, +0.01%
Send messages: 23678197 -> 23656126 (-0.09%); split: -0.09%, +0.00%
Cycle count: 63150470088 -> 63147482640 (-0.00%); split: -0.01%, +0.00%
Spill count: 576554 -> 576616 (+0.01%)
Fill count: 545304 -> 545413 (+0.02%)
Max live registers: 141099192 -> 141150675 (+0.04%); split: -0.00%, +0.04%
Max dispatch width: 39856192 -> 39856208 (+0.00%)

Totals from 4231 (0.27% of 1583648) affected shaders:

Instrs: 1620161 -> 1654956 (+2.15%); split: -0.25%, +2.40%
Send messages: 128652 -> 106581 (-17.16%); split: -17.18%, +0.03%
Cycle count: 24650700 -> 21663252 (-12.12%); split: -12.82%, +0.70%
Spill count: 378 -> 440 (+16.40%)
Fill count: 1308 -> 1417 (+8.33%)
Max live registers: 364676 -> 416159 (+14.12%); split: -0.24%, +14.36%
Max dispatch width: 67952 -> 67968 (+0.02%)

There are several reasons we didn't go with nir_opt_vectorize_io:

1. nir_opt_vectorize_io appears to work on the slot location level.
   We want to be able to vectorize based on the URB offsets, especially
   for cases like point size, layer, and viewport which have different
   VARYING_SLOT_* values but live in the same vec4 in a URB entry.

2. We want vec8 stores, and nir_opt_vectorize_io only seems to
   vectorize within a single 32-bit vec4. It does handle 8 components,
   but that's only for packing 16-bit values into a 32-bit vec4.

Improves performance of Sascha Willems' tessellation demo by around 4%
on Meteorlake.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39250>
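
The commit message argues for merging URB accesses by offset rather than by varying slot, and for allowing stores wider than a vec4. A toy model of that idea, written in Python for brevity, might look like the sketch below. It is not the Mesa implementation and does not use any NIR API: `UrbStore`, `vectorize_stores`, the dword-based offsets, and the example layout are all hypothetical names and values chosen for illustration, and the sketch ignores the alignment and hardware constraints a real pass has to respect.

```python
from dataclasses import dataclass

# Assumption: the widest merged store we allow, matching the "vec8" goal
# described in the commit message.
MAX_COMPONENTS = 8


@dataclass
class UrbStore:
    """Hypothetical stand-in for one URB store (not a Mesa type)."""
    offset: int   # start position in 32-bit dwords within the URB entry
    values: list  # one entry per 32-bit component written


def vectorize_stores(stores, max_components=MAX_COMPONENTS):
    """Merge stores whose dword ranges are back-to-back.

    Stores are combined purely by offset, so values that come from
    different varying slots can still end up in one message.
    """
    merged = []
    for store in sorted(stores, key=lambda s: s.offset):
        last = merged[-1] if merged else None
        contiguous = (last is not None
                      and last.offset + len(last.values) == store.offset)
        if contiguous and len(last.values) + len(store.values) <= max_components:
            # Extend the previous store instead of emitting a new one.
            last.values = last.values + store.values
        else:
            # Copy so the caller's input objects are never mutated.
            merged.append(UrbStore(store.offset, list(store.values)))
    return merged


# Hypothetical layout: three scalar values at dwords 1..3, a vec4 at
# dwords 4..7, and another vec4 at dwords 8..11.
stores = [
    UrbStore(1, ["psiz"]),
    UrbStore(2, ["layer"]),
    UrbStore(3, ["viewport"]),
    UrbStore(4, ["x", "y", "z", "w"]),
    UrbStore(8, ["a", "b", "c", "d"]),
]
merged = vectorize_stores(stores)
print([(s.offset, len(s.values)) for s in merged])  # [(1, 7), (8, 4)]
```

In this model the five original stores collapse into two: the three scalars and the first vec4 become one seven-component store, while the second vec4 stays separate because merging it would exceed eight components.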
blorp	blorp: fix asserts hit with msaa blorp blits on xe3	2026-01-27 15:28:55 +00:00
ci	ci: update trace checksums	2026-01-19 16:11:29 +00:00
common	intel/measure: Define snapshot type for HiZ partial resolves.	2026-01-27 08:52:16 +00:00
compiler	brw: Vectorize URB intrinsics using nir_opt_load_store_vectorize	2026-01-27 16:08:36 +00:00
decoder	intel/decoder: make libvulkan_intel to depend on stub decoder when buildtyle=release.	2025-11-24 16:40:02 +08:00
dev	intel/dev: Add INTEL_DEVICE_INFO_MMAP_MODE_INVALID	2026-01-26 15:24:55 +00:00
ds	anv: instrument resource barriers instruction in u_trace	2025-12-15 08:25:42 +00:00
executor	meson: make dep_lua a disabler	2025-11-21 21:48:57 +00:00
genxml	intel/blorp: Add support for partial resolves of HiZ-CCS surfaces.	2026-01-27 08:52:17 +00:00
isl	intel/isl: Add unit tests for ISL_AUX_STATE_COMPRESSED_HIER_DEPTH.	2026-01-27 08:52:18 +00:00
mda	intel/mda: Handle better processing a lot of archives	2025-12-13 01:21:08 +00:00
nullhw-layer	build: avoid redefining unreachable() which is standard in C23	2025-07-31 17:49:42 +00:00
perf	intel/perf: Add Gfx 12.5 mdap_metrics struct and set it	2026-01-19 19:24:16 +00:00
shaders	util/glsl2spirv: Use better glslang flag for -Olib	2025-11-20 02:14:50 +00:00
tools	intel/hang_replay: add option to dump VM state as part of the dump	2026-01-07 19:16:25 +00:00
vulkan	driconf: LTO disable	2026-01-27 14:57:20 +00:00
vulkan_hasvk	intel/isl: Define ISL_AUX_STATE_COMPRESSED_HIER_DEPTH aux state.	2026-01-27 08:52:12 +00:00
meson.build	brw: Move into a new src/intel/compiler/brw subdirectory	2025-10-09 07:01:47 +00:00