fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-21 06:48:09 +02:00

Author	SHA1	Message	Date
Lionel Landwerlin	983b62ea50	anv: fix query clearing with blorp compute operations If we did clear a query buffer in compute mode, the flushing needs to match the engine used for clearing. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `6823ffe70e` ("anv: try to keep the pipeline in GPGPU mode when buffer transfer ops") Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28285>	2024-03-26 14:29:38 +00:00
Lionel Landwerlin	601d219257	anv: fix bitfield checks in gfx runtime flushing s/SET/TEST/ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `50f6903bd9` ("anv: add new low level emission & dirty state tracking") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28387>	2024-03-26 12:59:37 +00:00
Lionel Landwerlin	341a9e9194	anv: fix temporary state pool allocation failures Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `82d772fa9b` ("anv: create new helper for small allocations") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28387>	2024-03-26 12:59:37 +00:00
Lionel Landwerlin	0264fc688f	anv: fix block pool allocation failure Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28387>	2024-03-26 12:59:37 +00:00
Lionel Landwerlin	58a91f6a8c	anv: fix invalid border color free The right one is a few lines below. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `44bf552704` ("anv: allocate border colors for descriptor buffers") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28387>	2024-03-26 12:59:37 +00:00
Lionel Landwerlin	1d7c38a5de	blorp: handle a few allocation failure cases Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28387>	2024-03-26 12:59:37 +00:00
José Roberto de Souza	0113a2d4b3	intel/decoder: Fix binding table pointer entry being marked as invalid If entry goes until the last byte of the bo it was being marked as not valid while it is valid. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28376>	2024-03-25 20:27:06 +00:00
Eric Engestrom	14279087fb	ci/deqp-runner: split gl & gles groups to use the correct binary Now that these can come from different releases, with different sets of patches backported to them, it matters that we use the correct one. Fixes: `78ea3bb43d` ("ci/deqp: use the proper gl/gles releases for deqp-gl, deqp-gles, deqp-egl") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28343>	2024-03-24 22:14:06 +00:00
Paulo Zanoni	460bacc223	anv: set shaderFloat64 to true when fp64_workaround_enabled According to 00-mesa-defaults.conf, the only game that seems to care about fp64_workaround_enabled right now is Doom Eternal. After some brief testing I couldn't spot any performance difference by setting shaderFloat64 to true. We want to set this to true so that DIRT 5 can work, as it looks at shaderFloat64 and then refuses to launch today. Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9882 Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28213>	2024-03-23 17:01:17 +00:00
Ian Romanick	b835784dde	intel/brw: Remove last vestiges of could_coissue Most of the obvious bits were removed by `7ac5696157` ("intel/brw: Remove Gfx8- code from backend passes"). No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28342>	2024-03-23 01:29:22 +00:00
Yonggang Luo	1ac1c0843f	treewide: Replace usage of macro DEBUG with MESA_DEBUG when possible This is achieved by the following steps: #ifndef DEBUG => #if !MESA_DEBUG defined(DEBUG) => MESA_DEBUG #ifdef DEBUG => #if MESA_DEBUG This is done by replace in vscode excludes docs,.rs,addrlib,src/imgui,.sh,src/intel/vulkan/grl/gpu These are safe because those files should keep DEBUG macro is already excluded; and not directly replace DEBUG, as we have some symbols around it. Use debug or NDEBUG instead of DEBUG in comments when proper This for reduce the usage of DEBUG, so it's easier migrating to MESA_DEBUG These are found when migrating DEBUG to MESA_DEBUG, these are all comment update, so it's safe Replace comment /* DEBUG */ and /* !DEBUG */ with proper /* MESA_DEBUG */ or /* !MESA_DEBUG */ manually DEBUG || !NDEBUG -> MESA_DEBUG || !NDEBUG !DEBUG && NDEBUG -> !(MESA_DEBUG || !NDEBUG) Replace the DEBUG present in comment with proper new MESA_DEBUG manually Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Acked-by: David Heidelberg <david.heidelberg@collabora.com> Reviewed-by: Eric Engestrom <eric@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28092>	2024-03-22 18:22:34 +00:00
Mark Janes	4acea392af	intel/compiler: drop unused ray-tracing fields from cache hash The compiler only references `intel_device_info->subslice_masks` for ray tracing workloads. Platforms which lack raytracing support can share a cache even if they differ on this field. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28311>	2024-03-22 00:01:28 +00:00
Kenneth Graunke	9a72116367	intel/brw: Unify DF and Q/UQ lowering for MOV Using the new unsupported_64bit_type helper. Fixes: `ea423aba1b` ("intel/brw: Split out 64-bit lowering from algebraic optimizations") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28328>	2024-03-21 23:25:56 +00:00
Kenneth Graunke	97c7d5113d	intel/brw: Use correct execution pipe for lowering SEL on DF This is a float operation, let's keep it on the float pipe. Fixes: `ea423aba1b` ("intel/brw: Split out 64-bit lowering from algebraic optimizations") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28328>	2024-03-21 23:25:56 +00:00
Kenneth Graunke	26d65e96dd	intel/brw: Assert that min/max are not happening in 64-bit SEL lowering These aren't handled, only pure selects. Fixes: `ea423aba1b` ("intel/brw: Split out 64-bit lowering from algebraic optimizations") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28328>	2024-03-21 23:25:56 +00:00
Kenneth Graunke	a2c2a7bc00	intel/brw: Fix check for 64-bit SEL lowering types The 64-bit type lowering for SEL in opt_algebraic had a pre-existing bug where it only triggered when 64-bit float _and_ integer types were unsupported. Meteorlake supports 64-bit float but not integer, so we need to lower Q/UQ in that case still. When I moved this to a later pass, opt_peephole_sel started generating Q/UQ SEL instructions which were failing to be lowered. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10867 Fixes: `ea423aba1b` ("intel/brw: Split out 64-bit lowering from algebraic optimizations") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28328>	2024-03-21 23:25:56 +00:00
Dylan Baker	75ede9d9bc	intel/brw: track last successful pass and leave the loop early This is similar to what RADV implements using the NIR_LOOP_PASS helpers. I have not used those helpers for a couple of reasons: 1. They use the pointer to the optimization function, which doesn't work if the same function is called multiple times in one invocation of the loop (fixable) 2. After fixing them, due to Intel's use of sub-expressions, the amount of code added to wrap the shared macro becomes more than simply reimplementing them for the Intel compiler On most workloads the results are a wash, but on compile heavy workloads like Cyberpunk 2077 and Rise of the Tomb Raider, I saw fossil-db runtimes fall by 1-2% on my ICL, with no changes to the compiled shaders. Caio saw closer to 2.5% on TGL. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27510>	2024-03-21 23:02:32 +00:00
Caio Oliveira	b2ee98d2db	intel/brw: Handle Xe2 in brw_fs_opt_zero_samples The mlen tracking is in REG_SIZE units, but in Xe2 each GRF has doubled the size. The optimization can only elide full GRFs, so round down the amount of trailing zeros to ensure the optimization will remove only full GRFs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28279>	2024-03-21 22:38:54 +00:00
Ian Romanick	cd70e49394	intel/brw: Allow SIMD16 F and HF type conversion moves On DG2, the lowering generated for these MOV instructions is awful. The original SIMD16 MOV { 18} 67: mov(16) vgrf54+0.0:HF, vgrf46+0.0:F NoMask group0 is lowered to SIMD8 MOVs: { 18} 118: mov(8) vgrf54+0.0:HF, vgrf46+0.0:F NoMask group0 { 18} 119: mov(8) vgrf54+0.16:HF, vgrf46+1.0:F NoMask group8 These MOVs violate Gfx12.5 region restrictions, so these are further lowered: { 17} 119: mov(8) vgrf83<2>:HF, vgrf46+0.0:F NoMask group0 { 19} 120: mov(8) vgrf54+0.0:UW, vgrf83<2>:UW NoMask group0 { 19} 122: mov(8) vgrf84<2>:HF, vgrf46+1.0:F NoMask group8 { 19} 123: mov(8) vgrf54+0.16:UW, vgrf84<2>:UW NoMask group8 The shader-db and fossil-db results are nothing to get excited about. However, the effect on vk_cooperative_matrix_perf is substantial. In one subtest shader: shaders/shmemfp16.spv cooperativeMatrixProps = 8x8x16 A = float16_t B = float16_t C = float16_t D = float16_t scope = subgroup TILE_M=128 TILE_N=128, TILE_K=32 BLayout=0 performance on my DG2 improved by ~60% due to a MASSIVE reduction in spills and fills: -Native code for unnamed compute shader (null) (src_hash 0x00000000) (sha1 c6a41b1c4e7aa2da327a39a70ed36c822a4b172f) -SIMD32 shader: 32484 instructions. 1 loops. 1893868 cycles. 737:1820 spills:fills, 442 sends, scheduled with mode none. Promoted 1 constants. Compacted 519744 to 492224 bytes (5%) - START B0 (20782 cycles) +Native code for unnamed compute shader (null) (src_hash 0x00000000) (sha1 621e960daad5b5579b176717f24a315e7ea560a1) +SIMD32 shader: 23918 instructions. 1 loops. 1089894 cycles. 432:1166 spills:fills, 442 sends, scheduled with mode none. Promoted 1 constants. Compacted 382688 to 353232 bytes (8%) shader-db: All Gfx9 and later platforms had similar results. (Meteor Lake shown) total instructions in shared programs: 19656270 -> 19653981 (-0.01%) instructions in affected programs: 61810 -> 59521 (-3.70%) helped: 116 / HURT: 0 total cycles in shared programs: 823368888 -> 823375854 (<.01%) cycles in affected programs: 1165284 -> 1172250 (0.60%) helped: 51 / HURT: 57 fossil-db: DG2 and Meteor Lake had similar results. (Meteor Lake shown) * Shaders only in 'before' results are ignored: fossil-db/steam-dxvk/total_war_warhammer3/2a3ed2ca632a7cb7/fs.32, fossil-db/steam-dxvk/total_war_warhammer3/18b9d4a3b1961616/fs.32, fossil-db/steam-dxvk/total_war_warhammer3/04ac9f3146a6db19/fs.32, fossil-db/steam-dxvk/total_war_warhammer3/f37ebec6aa1b379a/fs.32, fossil-db/steam-dxvk/total_war_warhammer3/255c987feb0d4310/fs.32, and 25 more from 1 apps: fossil-db/steam-dxvk/total_war_warhammer3 Totals: Instrs: 160946537 -> 160928389 (-0.01%); split: -0.01%, +0.00% Cycles: 14125908620 -> 14125873958 (-0.00%); split: -0.00%, +0.00% Totals from 1002 (0.15% of 652134) affected shaders: Instrs: 411261 -> 393113 (-4.41%); split: -4.41%, +0.00% Cycles: 16676735 -> 16642073 (-0.21%); split: -0.48%, +0.27% Tiger Lake Totals: Instrs: 164511816 -> 164497202 (-0.01%); split: -0.01%, +0.00% Cycles: 13801675722 -> 13801629397 (-0.00%); split: -0.00%, +0.00% Subgroup size: 7955168 -> 7955152 (-0.00%) Send messages: 8544494 -> 8544486 (-0.00%) Totals from 997 (0.15% of 651454) affected shaders: Instrs: 460820 -> 446206 (-3.17%); split: -3.17%, +0.00% Cycles: 16265514 -> 16219189 (-0.28%); split: -0.84%, +0.56% Subgroup size: 17552 -> 17536 (-0.09%) Send messages: 26045 -> 26037 (-0.03%) Ice Lake Totals: Instrs: 165504747 -> 165489970 (-0.01%); split: -0.01%, +0.00% Cycles: 15145244554 -> 15145149627 (-0.00%); split: -0.00%, +0.00% Subgroup size: 8107032 -> 8107016 (-0.00%) Send messages: 8598680 -> 8598672 (-0.00%) Spill count: 45427 -> 45423 (-0.01%) Fill count: 74749 -> 74747 (-0.00%) Totals from 1125 (0.17% of 656115) affected shaders: Instrs: 521676 -> 506899 (-2.83%); split: -2.83%, +0.00% Cycles: 19555434 -> 19460507 (-0.49%); split: -0.59%, +0.10% Subgroup size: 21616 -> 21600 (-0.07%) Send messages: 28623 -> 28615 (-0.03%) Spill count: 603 -> 599 (-0.66%) Fill count: 1362 -> 1360 (-0.15%) Skylake * Shaders only in 'after' results are ignored: fossil-db/steam-native/red_dead_redemption2/cef460b80bad8485/fs.16, fossil-db/steam-native/red_dead_redemption2/cd5fe081e2e5529d/fs.16 from 1 apps: fossil-db/steam-native/red_dead_redemption2 Totals: Instrs: 141607617 -> 141593776 (-0.01%); split: -0.01%, +0.00% Cycles: 14257812441 -> 14257661671 (-0.00%); split: -0.00%, +0.00% Subgroup size: 7743752 -> 7743736 (-0.00%) Send messages: 7552728 -> 7552720 (-0.00%) Spill count: 43660 -> 43661 (+0.00%) Fill count: 71301 -> 71303 (+0.00%) Totals from 1017 (0.16% of 636964) affected shaders: Instrs: 392454 -> 378613 (-3.53%); split: -3.53%, +0.00% Cycles: 16622974 -> 16472204 (-0.91%); split: -1.04%, +0.13% Subgroup size: 19840 -> 19824 (-0.08%) Send messages: 23021 -> 23013 (-0.03%) Spill count: 484 -> 485 (+0.21%) Fill count: 1155 -> 1157 (+0.17%) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28281>	2024-03-21 15:12:58 -07:00
Ian Romanick	66dc6e07f5	intel/brw: Fix handling of accumulator register numbers Folks, there's more than one accumulator. In general, when the register file is ARF, the upper 4 bits of the register number specify which ARF, and the lower 4 bits specify which one of that ARF. This can be further partitioned by the subregister number. This is already mostly handled correctly for flags register, but lots of places wanted to check the register number for equality with BRW_ARF_ACCUMULATOR. If acc1 is ever specified, that won't work. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28281>	2024-03-21 15:12:54 -07:00
David Heidelberg	d8f53f698c	util: move gen_zipped_file into generic util and rename to gen_zipped_xml_file Make the filename more descriptive and since the file is used by multiple drivers, move it into appropriate util/ directory. Cosmetics: - use SPDX license tag - add newline before main function Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: David Heidelberg <david.heidelberg@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27804>	2024-03-21 20:48:41 +00:00
Rohan Garg	cc570dbada	isl: enable CCS for 3D surfaces on gen12.5 and above Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23632>	2024-03-21 18:28:27 +00:00
Rohan Garg	49ed35c08a	anv: 3D surfaces have fewer layers for higher miplevels Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23632>	2024-03-21 18:28:27 +00:00
Rohan Garg	9628723943	anv,blorp: implement restrictions from WA 1406738321 Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23632>	2024-03-21 18:28:27 +00:00
José Roberto de Souza	47bbd1c7ff	intel/tools/error_decode: Parse HW context in Xe decoder Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27888>	2024-03-21 16:59:09 +00:00
José Roberto de Souza	ec3a41960b	intel/tools/error_decode: Add function to print batch in Xe decoder This will be useful to decode HW context in the next patch. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27888>	2024-03-21 16:59:09 +00:00
José Roberto de Souza	171eb89b75	intel/tools/error_decode: Fix parsing in Xe decoder xe_topic can't be inside of the for loop otherwise it will be set to TOPIC_INVALID at every iteration. TOPIC_INVALID was added after it was reviewed by Lionel because CI complained that xe_topic may be not initialized, turns out leaving it not initialized was causing the xe_topic value to keep the value set in the previous iteration, making the parser work by luck. Fixes: `90e38bbb3b` ("intel/tools/error_decode: Parse Xe KMD error dump file") Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27888>	2024-03-21 16:59:09 +00:00
Dylan Baker	477943cc9d	meson: Allow building intel-clc for the host if it can be run In what is probably the most common case of cross compilation, x86_64 -> x86, it should be possible to build intel-clc for the host machine and run it. Doing so simplifies the build by not needing to be able to cross compile half of mesa, and should ease developer and distro strain for building Intel drivers for x86. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28222>	2024-03-21 16:31:35 +00:00
Lionel Landwerlin	098136e52a	anv: avoid partially compiled warning with GPL Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28318>	2024-03-21 16:09:54 +00:00
Ian Romanick	3556dbb97f	intel/brw/xe2: Correctly disassemble RT write subtypes The encoding changed when SIMD32 was added. Part of Wa_14011334914. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Francisco Jerez	c4325f426c	intel/brw/xe2+: Setup PS thread payload registers required for ALU-based pixel interpolation. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Francisco Jerez	6427f16074	intel/brw/gfx12: Setup PS thread payload registers required for ALU-based pixel interpolation. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Rohan Garg	2df6d208c8	intel/brw: Adjust src1 length bits for xe2+ Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Rohan Garg	83f2bdc116	intel/brw: Set the right cache control bits for xe2 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Rohan Garg	adb853ed10	intel/brw: Update written size depending on the LSC message Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Rohan Garg	48376ac3b8	intel/brw: Cleanup send generation Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Rohan Garg	65f66974a5	intel/brw: Use the dimensions supplied in the instruction Rework: * Francisco Jerez: Rebase on `07b9bfacc7` ("intel/compiler: Move logical-send lowering to a separate file") Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Francisco Jerez	644a0ede1e	intel/blorp/xe2+: Don't use replicated-data clears. They've been removed from the hardware. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Francisco Jerez	af8b9af700	intel/brw/xe2+: Allow dual-source blending in SIMD16 mode. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Francisco Jerez	762ec3fd59	intel/brw/xe2+: Allow FS stencil output in SIMD16 dispatch mode. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Francisco Jerez	efc0601ddf	intel/brw/xe2+: Double allowed SIMD width of FB write SEND messages. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Francisco Jerez	d96bfb160f	intel/brw/xe2+: Update encoding of FB write extended descriptor. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Francisco Jerez	189422de1b	intel/brw/xe2+: Update encoding of FB write descriptor message control. Ref: bspec: 65209, 63908 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Francisco Jerez	7b0fbc22dd	intel/brw/xe2: Render target reads have been removed from the hardware. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Paulo Zanoni	6ec1e322f0	anv: don't leak device->vma_samplers The vma_samplers vma heap is initialized unconditionally. Don't use device->physical->indirect_descriptors as a condition on whether to free it or not. From my TGL machine: ==373617== 32 bytes in 1 blocks are definitely lost in loss record 1 of 1 ==373617== at 0x48459F3: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) ==373617== by 0x6926DC0: util_vma_heap_free (vma.c:339) ==373617== by 0x6925ED3: util_vma_heap_init (vma.c:53) ==373617== by 0x5334EDA: anv_CreateDevice (anv_device.c:3404) ==373617== by 0x685593A: vk_tramp_CreateDevice (vk_dispatch_trampolines.c:78) ==373617== by 0x48A6D56: terminator_CreateDevice (loader.c:5833) ==373617== by 0x9C2293F: vulkan_layer_chassis::CreateDevice(VkPhysicalDevice_T, VkDeviceCreateInfo const, VkAllocationCallbacks const, VkDevice_T*) (chassis.cpp:497) ==373617== by 0x48B0690: loader_create_device_chain (loader.c:4937) ==373617== by 0x48B1327: loader_layer_create_device (loader.c:4317) ==373617== by 0x48B8D79: vkCreateDevice (trampoline.c:1004) ==373617== by 0x10CC7A: MyApp::MyApp(int, bool) (sparse.cpp:608) ==373617== by 0x1201E8: main (sparse.cpp:6025) Fixes: `7c76125db2` ("anv: use 2 different buffers for surfaces/samplers in descriptor sets") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28303>	2024-03-20 21:55:55 +00:00
Lionel Landwerlin	4fbdfdce9c	anv: allocate pipeline bindings tables dynamically on the heap Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28290>	2024-03-20 19:29:05 +00:00
Lionel Landwerlin	7730fa5683	anv: track embedded sampler counts in layouts Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28290>	2024-03-20 19:29:05 +00:00
Joshua Ashton	145ab5b853	anv: Enable EXT_swapchain_maintenance1 This was missing, this is implemented in common code. Signed-off-by: Joshua Ashton <joshua@froggi.es> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28275>	2024-03-20 18:24:25 +00:00
Kenneth Graunke	a075b44493	intel/brw: Eliminate top-level FIND_LIVE_CHANNEL & BROADCAST once brw_fs_opt_eliminate_find_live_channel eliminates FIND_LIVE_CHANNEL outside of control flow. None of our optimization passes generate additional cases of that instruction, so once it's gone, we shouldn't ever have to run the pass again. Moving it out of the loop should save a bit of CPU time. While we're at it, also clean adjacent BROADCAST instructions that consume the result of our FIND_LIVE_CHANNEL. Without this, we have to perform copy propagation to get the MOV 0 immediate into the BROADCAST, then algebraic to turn it into a MOV, which enables more copy propagation...not to mention CSE gets involved. Since this FIND_LIVE_CHANNEL + BROADCAST pattern from emit_uniformize() is really common, and it's trivial to clean up, we can do that. This lets the initial copy prop in the loop see MOV instead of BROADCAST. Zero impact on fossil-db, but less work in the optimization loop. Together with the previous patches, this cuts compile time in Borderlands 3 on Alchemist by -1.38539% +/- 0.1632% (n = 24). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28286>	2024-03-20 01:04:22 -07:00
Kenneth Graunke	5814534de5	intel/brw: Don't consider UNIFORM_PULL_CONSTANT_LOAD a send-from-GRF It's a logical opcode which is lowered to a send-from-GRF later. That lowering code is responsible for ensuring the sources are set up in a proper SEND payload. This was preventing copy propagation of surface handles which started out as scalars, were splatted out to full-SIMD values with NoMask, then actually consumed as only component 0 (scalar again), because we thought that scalar values were not allowed. fossil-db on Alchemist shows improvements in q2rtx but no other titles: Totals: Instrs: 161310436 -> 161310152 (-0.00%) Cycles: 14370605159 -> 14370601066 (-0.00%) Totals from 17 (0.00% of 652298) affected shaders: Instrs: 16097 -> 15813 (-1.76%) Cycles: 185508 -> 181415 (-2.21%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28286>	2024-03-20 01:04:22 -07:00
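
Note on 66dc6e07f5 ("intel/brw: Fix handling of accumulator register numbers"): the message says that when the register file is ARF, the upper 4 bits of the register number select which ARF and the lower 4 bits select which instance of it. The following is a minimal C sketch of why an equality test misses acc1. The constant value 0x20 and every name below are assumptions made for illustration; this is not the code from the commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed encoding of the accumulator ARF kind in the high nibble.
 * Illustrative value only; check the driver headers for the real one. */
#define ARF_ACCUMULATOR 0x20u

/* Which architecture register file this is (upper 4 bits). */
static unsigned arf_kind(unsigned nr)
{
    assert(nr <= 0xFFu);
    return nr & 0xF0u;
}

/* Which instance within that ARF (lower 4 bits), e.g. 1 for acc1. */
static unsigned arf_index(unsigned nr)
{
    assert(nr <= 0xFFu);
    return nr & 0x0Fu;
}

/* The pattern the commit message warns about: only matches acc0. */
static bool is_accumulator_by_equality(unsigned nr)
{
    return nr == ARF_ACCUMULATOR;
}

/* Masked comparison: matches acc0, acc1, and any other instance. */
static bool is_accumulator(unsigned nr)
{
    return arf_kind(nr) == ARF_ACCUMULATOR;
}
```

With these helpers, a register number of ARF_ACCUMULATOR + 1 (acc1) is accepted by is_accumulator() but rejected by is_accumulator_by_equality(), which is the failure mode the message describes.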

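Note on 75ede9d9bc ("intel/brw: track last successful pass and leave the loop early"): the idea described there can be sketched as follows. The loop remembers the slot of the last pass that made progress and stops as soon as it comes back around to that slot without any other pass having changed anything. Slots are tracked by index rather than by function pointer, so the same function may appear more than once in the list. This is a toy model under stated assumptions: the names run_passes, pass_fn and struct toy are hypothetical, and the sketch assumes each pass is idempotent (running it again on its own output makes no progress). It is not the Intel compiler's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A pass returns true when it changed the program. */
typedef bool (*pass_fn)(void *ctx);

/* Run the pass list to a fixed point and return the number of pass
 * invocations. Assumes every pass is idempotent. */
static unsigned run_passes(const pass_fn *passes, size_t n, void *ctx)
{
    assert(n > 0);
    unsigned calls = 0;
    size_t last_progress = n; /* sentinel: no pass has made progress yet */

    for (;;) {
        bool progress = false;
        for (size_t i = 0; i < n; i++) {
            /* Every pass since slot i last made progress was a no-op,
             * so running slot i again cannot change anything. */
            if (i == last_progress)
                return calls;
            calls++;
            if (passes[i](ctx)) {
                last_progress = i;
                progress = true;
            }
        }
        if (!progress)
            return calls;
    }
}

/* Toy "shader": folding creates dead code that DCE then removes. */
struct toy {
    bool needs_fold;
    bool needs_dce;
};

static bool pass_fold(void *ctx)
{
    struct toy *t = ctx;
    if (!t->needs_fold)
        return false;
    t->needs_fold = false;
    t->needs_dce = true;
    return true;
}

static bool pass_dce(void *ctx)
{
    struct toy *t = ctx;
    if (!t->needs_dce)
        return false;
    t->needs_dce = false;
    return true;
}
```

For the list { pass_dce, pass_fold } starting with needs_fold set, the loop makes 4 pass invocations; a plain "repeat until a full iteration makes no progress" loop would make 6, because it needs one more complete iteration to confirm the fixed point.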

11652 commits