fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 02:48:07 +02:00

Author	SHA1	Message	Date
Lionel Landwerlin	9b779068c3	anv: prevent access to destroyed vk_sync objects post submission Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `36ea90a361` ("anv: Convert to the common sync and submit framework") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12145 Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32219>	2024-11-19 19:40:03 +00:00
Caio Oliveira	0b66cb1f82	intel/brw: Allow extra SWSB encodings for Xe2 There are new combinations of ordered and unordered dependencies available for the instructions to use, which among others include: - combining FLOAT and INT pipe deps in SENDs; - combining SRC mode deps in regular instructions for the inferred type. This patch enables a couple of tests checking for the first case. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31375>	2024-11-19 04:27:00 +00:00
Caio Oliveira	1b13eea642	intel/brw: Add test for combining SWSB dependencies in SENDs These are currently DISABLED_ since they fail. A later patch will enable them. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31375>	2024-11-19 04:27:00 +00:00
Lionel Landwerlin	8845255881	anv: fix missing push constant reallocation Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `62d96a6546` ("anv: add dirty tracking for push constant data") Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12151 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30007>	2024-11-18 16:31:33 +00:00
Nanley Chery	f1724b44d0	anv: Drop fast-clear value conversion check Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5622 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32187>	2024-11-18 14:57:46 +00:00
Nanley Chery	93e42f9700	anv: Store fast-clear colors with the view swizzle Prevents the next patch from failing CTS tests such as: dEQP-VK.api.image_clearing.core.clear_color_image..b4g4r4a4 Brings back the feature that was introduced in commit `46187bb54f` ("anv: Swizzle fast-clear values"), but went unused in commit `721d0c3e77` ("anv,hasvk: Always use BLORP_BATCH_NO_UPDATE_CLEAR_COLOR"). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32187>	2024-11-18 14:57:46 +00:00
Nanley Chery	2a9d7a3bd0	anv: Support non-0/1 sRGB fast-clear colors on gfx9 We're going to drop a generic restriction on clear color conversions in anv_can_fast_clear_color(). Without preparing for it, the following tests would fail: * piglit.spec.arb_framebuffer_srgb.blit texture srgb msaa disabled clear.gen9_zinkm64 * piglit.spec.arb_framebuffer_srgb.blit renderbuffer srgb msaa disabled clear.gen9_zinkm64 * piglit.spec.arb_framebuffer_srgb.blit texture srgb downsample enabled clear.gen9_zinkm64 * piglit.spec.arb_framebuffer_srgb.blit renderbuffer srgb downsample enabled clear.gen9_zinkm64 * piglit.spec.arb_framebuffer_srgb.blit renderbuffer srgb msaa enabled clear.gen9_zinkm64 * piglit.spec.arb_framebuffer_srgb.blit texture srgb msaa enabled clear.gen9_zinkm64 So, add support for sRGB sampling via BLORP transfer operations and drop the gfx9-specific restriction on sRGB fast-clears. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32187>	2024-11-18 14:57:46 +00:00
Kenneth Graunke	5848035443	brw: Fix try_rebuild_source's ult32/ushr handling to use unsigned types We were accidentally doing a signed integer comparison here for ult32, or a sign-extending shift for ushr. One notable bit of fallout was that load_global_uniform_block_intel address calculations broke on platforms that don't have native 64-bit integer support, as the iadd64 lowering for "do I need to carry?" was using ult32...and performing the wrong comparison. We spotted this in Borderlands 3 on Alchemist once we turned on other optimizations. Thanks to Lionel Landwerlin for helping spot the problem! Fixes: `c7b312ad45` ("brw: factor out source extraction for rematerialization") Fixes: `339630ab05` ("brw: enable A64 loads source rematerialization") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31995>	2024-11-18 12:55:47 +00:00
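The carry problem described in the commit above can be illustrated outside the compiler. This is a minimal sketch, not Mesa code: `u64_pair`, `add64_ult`, and `add64_ilt` are hypothetical names, and the assumption is that a 64-bit add is lowered to two 32-bit adds plus a "did the low word wrap?" comparison. The carry test is only correct when that comparison is unsigned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit value held as two 32-bit halves (illustration only). */
typedef struct { uint32_t lo, hi; } u64_pair;

/* Lowered 64-bit add using an unsigned compare (the ult32 behaviour). */
static u64_pair add64_ult(u64_pair a, u64_pair b)
{
   u64_pair r;
   r.lo = a.lo + b.lo;
   /* The low word wrapped if and only if the sum is below an operand. */
   uint32_t carry = r.lo < a.lo;
   r.hi = a.hi + b.hi + carry;
   return r;
}

/* Same lowering with the compare mistakenly performed as signed. */
static u64_pair add64_ilt(u64_pair a, u64_pair b)
{
   u64_pair r;
   r.lo = a.lo + b.lo;
   uint32_t carry = (int32_t)r.lo < (int32_t)a.lo;
   r.hi = a.hi + b.hi + carry;
   return r;
}

/* 0xffffffff + 1 must carry into the high word. */
static void carry_example(void)
{
   u64_pair a = { 0xffffffffu, 0 }, b = { 1u, 0 };
   assert(add64_ult(a, b).hi == 1); /* unsigned compare sees the wrap */
   assert(add64_ilt(a, b).hi == 0); /* signed compare misses it */
}
```

With the signed variant, any address calculation whose low word crosses the 32-bit boundary ends up with the wrong high word, which is consistent with the broken address calculations the commit message describes.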
Kenneth Graunke	0a376a672a	brw: Fix emit_a64_oword_block_header UNIFORM -> VGRF copies This was triggering an assertion in the fs_builder::MOV helper that the destination stride can't be 0 when dispatch_width > 1. What we want to do is copy the single 64-bit channel of data from the UNIFORM file to a VGRF. We can use a SIMD1 builder for that. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31995>	2024-11-18 12:55:47 +00:00
Lionel Landwerlin	431f353bfe	anv: fix incorrect aspect flag for depth/stencil formats We're asking if compression is supported and anv_formats_ccs_e_compatible() is assuming color aspect. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `0317c44872` ("anv: add VK_EXT_host_image_copy support") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12155 Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32153>	2024-11-18 07:01:28 +00:00
Sagar Ghuge	e5776bcb39	blorp: Use the calculated execution mask Instead of setting execution mask to 0xFFFFFFFF, use the previously calculated execution mask. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30474>	2024-11-18 04:42:52 +00:00
Jianxun Zhang	8db71c95e1	isl: Move a CCS restriction in GFX 12.x 3D+MSAA is not supported and depth-stencil formats are all 32bpp or less. Move this restriction into single-sample case. Suggested-by: Nanley Chery <nanley.g.chery@intel.com> Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31496>	2024-11-17 22:41:56 +00:00
Jianxun Zhang	ab56a9eecd	isl: Allow CCS in more cases (xe2) By limiting these restrictions to GFX 12 and earlier, CCS can be enabled in the cases where we think Xe2+ platforms should support compression. Notably, CCS is allowed on depth resources without HiZ, multi-sampled resources without CCS, and multi-sampled stencil resources. Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31496>	2024-11-17 22:41:56 +00:00
Jianxun Zhang	705555b6b0	isl: Refactor WA 22015614752 Using intel_needs_workaround() inside a GFX version check block requires extra care down the road because both of them specify a range of applicable platforms. The WA block can be unexpectedly skipped once the GFX version check gets updated later. Move the WA implementation out of the GFX block to decouple them, for more clarity and less chance of messing up next time. Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31496>	2024-11-17 22:41:55 +00:00
Francisco Jerez	0ad835a929	intel/fs/xe2: Fix up subdword integer region restriction with strided byte src and packed byte dst. This fixes a corner case of the LNL sub-dword integer restrictions that wasn't being detected by has_subdword_integer_region_restriction(), specifically: > if(Src.Type==Byte && Dst.Type==Byte && Dst.Stride==1 && W!=2) { > // ... > if(Src.Stride == 2) && (Src.UniformStride) && (Dst.SubReg%32 == Src.SubReg/2 ) { Allowed } > // ... > } All the other restrictions that require agreement between the SubReg number of source and destination only affect sources with a stride greater than a dword, which is why has_subdword_integer_region_restriction() was returning false except when "byte_stride(srcs[i]) >= 4" evaluated to true, but as implied by the pseudocode above, in the particular case of a packed byte destination, the restriction applies for source strides as narrow as 2B. The form of the equation that relates the subreg numbers is consistent with the existing calculations in brw_fs_lower_regioning (see required_src_byte_offset()), we just need to enable lowering for this corner case, and change lower_dst_region() to call lower_instruction() recursively, since some of the cases where we break this restriction are copy instructions introduced by brw_fs_lower_regioning() itself trying to lower other instructions with byte destinations. This fixes some Vulkan CTS test-cases that were hitting these restrictions with byte data types. Fixes: `217d412360` ("intel/fs/gfx20+: Implement sub-dword integer regioning restrictions.") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30630>	2024-11-15 07:39:33 +00:00
Tapani Pälli	50243892b4	isl: modify existing assert by allowing CCS_E aux usage Relax this assert based on x/y offsets for GFX_VERx10 >= 200. This is getting hit when running gfxbench5 on LNL/BMG. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32128>	2024-11-15 05:20:07 +00:00
Iván Briano	d32a26b3e6	anv: remove unused/misleading/wrong parameters from the RT trampoline Since the shader parameters are passed as inline data, push constants are no longer used and so, not actually set on dispatch. But the nr_params = 4 was still making the shader emit the code to load them, causing page faults on simulation, and would also on HW if we didn't always have a scratch page set. The uses_inline_data parameter will be set from brw_compile_cs(), called shortly after this point, so we don't need it here. The subgroup_size is misleading, as we don't actually require that size and the code that checks for it isn't even running for this shader. Fixes: `97b17aa0b1` ("brw/nir: rework inline_data_intel to work with compute") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12152 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32150>	2024-11-14 19:23:42 -08:00
Lionel Landwerlin	5cfd841dda	anv: fix descriptor asserts Lots of tests are hitting the assert, one in particular : dEQP-VK.binding_model.mutable_descriptor.single.switches.sampler_combined_image_sampler.update_copy.nonmutable_source.normal_source.pool_same_types.pre_update.no_array.comp Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `b6d11ba5b4` ("anv: Protect memcpy/memset/qsort calls against NULL arguments") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32133>	2024-11-14 13:45:19 +00:00
Lionel Landwerlin	a21cd8c5b6	brw: allocate physical register sizes for spilling All of the spilling code should work with physical register units because for example SEND messages will expect a physical register as destination. So always allocate a full physical register for the spilled/unspilled values and adjust the offsets of the registers to physical sizes too. Cc: mesa-stable Fixes: `aa494cba` ("brw: align spilling offsets to physical register sizes") Closes: mesa/mesa#11967 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Found-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32124>	2024-11-14 08:44:03 +00:00
Caio Oliveira	15ea28b835	intel/executor: Fix exec_size in @read macro for Xe2 Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32122>	2024-11-14 05:31:03 +00:00
Matt Turner	b3a14d7b91	intel: Avoid unaligned pointer access Avoids the sanitizer error: ``` ../src/intel/common/intel_debug_identifier.c:122:15: runtime error: member access within misaligned address 0x7f5ca8b32051 for type 'struct intel_debug_block_base', which requires 4 byte alignment 0x7f5ca8b32051: note: pointer points here 66 30 29 00 03 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 08 00 00 00 00 00 00 00 ^ ``` Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32098>	2024-11-14 01:05:02 +00:00
Matt Turner	1f3e24f4f3	anv: Avoid null ptr dereference Avoids the sanitizer error: ``` ../src/intel/vulkan/anv_instance.c:266:37: runtime error: member access within null pointer of type 'struct anv_instance' ``` Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32098>	2024-11-14 01:05:01 +00:00
Matt Turner	b6d11ba5b4	anv: Protect memcpy/memset/qsort calls against NULL arguments Avoids sanitizer errors like: ``` ../src/intel/vulkan/anv_pipeline_cache.c:409:4: runtime error: null pointer passed as argument 1, which is declared to never be null ../src/intel/vulkan/anv_descriptor_set.c:696:4: runtime error: null pointer passed as argument 1, which is declared to never be null ../src/intel/vulkan/anv_descriptor_set.c:2709:10: runtime error: null pointer passed as argument 1, which is declared to never be null ../src/intel/vulkan/anv_descriptor_set.c:2709:10: runtime error: null pointer passed as argument 2, which is declared to never be null ``` Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32098>	2024-11-14 01:05:01 +00:00
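The kind of guard that silences the sanitizer errors quoted above can be sketched as follows. This is an illustration under assumptions, not the actual change: `copy_if_nonempty` is a hypothetical wrapper, and the premise is that the flagged calls passed a NULL pointer together with a size of zero, which the sanitizer reports because the arguments are declared to never be null.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper (not a Mesa function): skip the call when there is
 * nothing to copy, so a NULL pointer paired with size 0 never reaches
 * memcpy(). */
static void *copy_if_nonempty(void *dst, const void *src, size_t size)
{
   if (size == 0)
      return dst;
   /* With a non-zero size, both pointers must be valid. */
   assert(dst != NULL && src != NULL);
   return memcpy(dst, src, size);
}
```

A call such as `copy_if_nonempty(NULL, NULL, 0)` then returns without touching `memcpy()`, while non-empty copies behave exactly as before.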
Rhys Perry	45c1280d2c	nir_lower_mem_access_bit_sizes: pass access to callback Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>	2024-11-13 12:59:26 +00:00
Rhys Perry	61752152f7	nir_lower_mem_access_bit_sizes: add nir_mem_access_shift_method Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>	2024-11-13 12:59:26 +00:00
Tapani Pälli	fbe5d41b58	anv: extend Wa_14017794102 with lineage Wa_14023061436 This workaround is applicable for Xe3 with new lineage. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31963>	2024-11-13 04:54:32 +00:00
Tapani Pälli	9429c0075b	anv: utilize ray query bo per queue for Wa_14022863161 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31963>	2024-11-13 04:54:32 +00:00
Tapani Pälli	1bd9e51a73	intel/dev: update mesa_defs.json from workaround database Brings in some PTL workarounds. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31963>	2024-11-13 04:54:32 +00:00
Iván Briano	f2f4206d49	intel/decoder: fix INTEL_DEBUG=bat Now that all genxml filenames are in verx10 format, we don't need to fix the number up when we look them up. Fixes: `8906816f49` ("anv,hasvk,genxml: Rename genxml files using verx10") Acked-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32099>	2024-11-13 00:45:40 +00:00
Lionel Landwerlin	08530462bd	anv: implement Wa_16011107343/22018402687 for generated draws Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32059>	2024-11-12 22:48:39 +00:00
Lionel Landwerlin	53eed61a90	intel: make sure intel_wa.h can be included by opencl code Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32059>	2024-11-12 22:48:39 +00:00
Lionel Landwerlin	672d41d22a	anv: split generated draw flags from mocs/dword-count We'll add more flags. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32059>	2024-11-12 22:48:39 +00:00
Lionel Landwerlin	d6acb56f11	anv: update shader descriptor resource limits Some limits got stuck to the old binding table limits. Those don't apply anymore since EXT_descriptor_indexing was implemented. Fixes: `6e230d7607` ("anv: Implement VK_EXT_descriptor_indexing") Fixes: `96c33fb027` ("anv: enable direct descriptors on platforms with extended bindless offset") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31999>	2024-11-12 22:01:52 +00:00
Sagar Ghuge	fef8490eb9	anv: Enable MCS_CCS compression on Gfx12+ Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32009>	2024-11-12 12:27:21 +00:00
Matt Turner	a2c4a34303	anv: Align anv_descriptor_pool::host_mem Otherwise anv_descriptor_set is accessed through an unaligned pointer, which is undefined behavior in C. ``` anv_descriptor_set.c:1620:17: runtime error: member access within misaligned address 0x61900002c2b5 for type 'struct anv_descriptor_set', which requires 8 byte alignment 0x61900002c2b5 ``` Fixes: `2570a58bcd` ("anv: Implement descriptor pools") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32070>	2024-11-11 19:45:14 +00:00
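The alignment requirement behind the commit above can be sketched with a toy pool. Everything here is hypothetical (`example_set`, `example_pool`, `example_pool_alloc`); it only assumes that structs are carved out of a byte array, so both the array base and every offset into it must satisfy the struct's alignment. The misaligned address in the sanitizer report ends in `5`, which is the kind of pointer this avoids.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an object stored inside the pool's byte array. */
struct example_set { uint64_t handle; uint32_t count; };

/* Hypothetical pool: the byte array is explicitly aligned for the struct. */
struct example_pool {
   size_t next;
   _Alignas(struct example_set) unsigned char host_mem[256];
};

/* Round offset up to a multiple of align (align must be a power of two). */
static size_t align_up(size_t offset, size_t align)
{
   return (offset + align - 1) & ~(align - 1);
}

/* Reserve size bytes at the next suitably aligned offset, or return NULL. */
static void *example_pool_alloc(struct example_pool *pool, size_t size)
{
   size_t offset = align_up(pool->next, _Alignof(struct example_set));
   if (offset > sizeof(pool->host_mem) ||
       size > sizeof(pool->host_mem) - offset)
      return NULL;
   pool->next = offset + size;
   return pool->host_mem + offset;
}

/* True if p is usable as a struct example_set pointer, alignment-wise. */
static int is_aligned_for_set(const void *p)
{
   return (uintptr_t)p % _Alignof(struct example_set) == 0;
}
```

Even after an odd-sized allocation, the next pointer handed out is rounded up, so a later member access does not go through a misaligned address.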
Jianxun Zhang	8906816f49	anv,hasvk,genxml: Rename genxml files using verx10 It can be confusing that a newer platform is named with a smaller number than a half-generation of an older platform, like 'gfx20' and 'gfx75' in the xml files. Down the road, it can get a little worse once we pass something like 'gfx40' when there is already a gfx45.xml for the oldest platform. Name the xml files uniformly with verx10 numbers to resolve the issue. Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31943>	2024-11-09 00:04:47 +00:00
Iván Briano	aee04bf4fb	intel/rt: fix ray_query stack address calculation While the documentation says to use NUM_SIMD_LANES_PER_DSS for the stack address calculation, what the HW actually uses is NUM_SYNC_STACKID_PER_DSS. The former may vary depending on the platform, while the latter is fixed to 2048 for all current platforms. Fixes: `6c84cbd8c9` ("intel/dev/xe: Set max_eus_per_subslice using topology query") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32049>	2024-11-08 18:31:52 +00:00
Ian Romanick	7aad19ccd2	brw/lower: Lower invalid source conversion to better code There are two fragment shaders from RDR2 that are hurt for spills and fills on Lunar Lake. Totals from 2 (0.00% of 551413) affected shaders: Spill count: 1252 -> 1317 (+5.19%) Fill count: 2518 -> 2642 (+4.92%) Those shaders... have a lot of room for improvement. There are some patterns in those shaders that we handle very, very poorly. Improving those patterns would likely improve the spills and fills in these shaders quite dramatically. Given how much other platforms are helped, I don't think this should block this commit. No shader-db or fossil-db changes on any pre-Gfx12.5 Intel platforms. v2: Add some comments and an additional assertion. Suggested by Ken. shader-db: Lunar Lake total instructions in shared programs: 18094517 -> 18094511 (<.01%) instructions in affected programs: 809 -> 803 (-0.74%) helped: 6 / HURT: 0 total cycles in shared programs: 921532158 -> 921532168 (<.01%) cycles in affected programs: 2266 -> 2276 (0.44%) helped: 0 / HURT: 3 Meteor Lake and DG2 had similar results. (Meteor Lake shown) total instructions in shared programs: 19820845 -> 19820839 (<.01%) instructions in affected programs: 803 -> 797 (-0.75%) helped: 6 / HURT: 0 total cycles in shared programs: 906372999 -> 906372949 (<.01%) cycles in affected programs: 3216 -> 3166 (-1.55%) helped: 6 / HURT: 0 fossil-db: Lunar Lake Totals: Instrs: 141887377 -> 141884465 (-0.00%); split: -0.00%, +0.00% Cycle count: 21990301498 -> 21990267232 (-0.00%); split: -0.00%, +0.00% Spill count: 69732 -> 69797 (+0.09%) Fill count: 128521 -> 128645 (+0.10%) Totals from 349 (0.06% of 551413) affected shaders: Instrs: 506117 -> 503205 (-0.58%); split: -0.79%, +0.21% Cycle count: 32362996 -> 32328730 (-0.11%); split: -0.52%, +0.41% Spill count: 1951 -> 2016 (+3.33%) Fill count: 4899 -> 5023 (+2.53%) Meteor Lake and DG2 had similar results. 
(Meteor Lake shown) Totals: Instrs: 152773732 -> 152761383 (-0.01%); split: -0.01%, +0.00% Cycle count: 17187529968 -> 17187450663 (-0.00%); split: -0.00%, +0.00% Spill count: 79279 -> 79003 (-0.35%) Fill count: 148803 -> 147942 (-0.58%) Scratch Memory Size: 3949568 -> 3946496 (-0.08%) Max live registers: 31879325 -> 31879230 (-0.00%) Totals from 366 (0.06% of 633185) affected shaders: Instrs: 557377 -> 545028 (-2.22%); split: -2.22%, +0.01% Cycle count: 26171205 -> 26091900 (-0.30%); split: -0.54%, +0.24% Spill count: 3238 -> 2962 (-8.52%) Fill count: 10018 -> 9157 (-8.59%) Scratch Memory Size: 257024 -> 253952 (-1.20%) Max live registers: 28187 -> 28092 (-0.34%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	2a57568ebd	brw/build: Add scalar_group() helper Some uses of the old pattern still exist. The use in brw_fs_nir.cpp is deleted by commits !29884. The use in brw_lower_logical_sends.cpp seems different, so I decided to keep it. The next commit wants to use this. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	5dfea87623	brw/opt: Always do both kinds of copy propagation before lower_load_payload shader-db: All Intel platforms except Skylake had similar results. (Lunar Lake shown) total instructions in shared programs: 18092932 -> 18092713 (<.01%) instructions in affected programs: 139290 -> 139071 (-0.16%) helped: 103 HURT: 18 helped stats (abs) min: 1 max: 8 x̄: 2.43 x̃: 2 helped stats (rel) min: 0.02% max: 9.09% x̄: 0.73% x̃: 0.29% HURT stats (abs) min: 1 max: 5 x̄: 1.72 x̃: 1 HURT stats (rel) min: 0.02% max: 0.55% x̄: 0.10% x̃: 0.08% 95% mean confidence interval for instructions value: -2.17 -1.45 95% mean confidence interval for instructions %-change: -0.83% -0.38% Instructions are helped. total cycles in shared programs: 922792268 -> 921495900 (-0.14%) cycles in affected programs: 400296984 -> 399000616 (-0.32%) helped: 765 HURT: 635 helped stats (abs) min: 2 max: 77018 x̄: 6739.33 x̃: 60 helped stats (rel) min: <.01% max: 35.59% x̄: 1.98% x̃: 0.32% HURT stats (abs) min: 2 max: 88658 x̄: 6077.51 x̃: 152 HURT stats (rel) min: <.01% max: 51.33% x̄: 2.75% x̃: 0.63% 95% mean confidence interval for cycles value: -1620.41 -231.54 95% mean confidence interval for cycles %-change: -0.10% 0.44% Inconclusive result (%-change mean confidence interval includes 0). LOST: 4 GAINED: 3 Skylake total instructions in shared programs: 18658324 -> 18579715 (-0.42%) instructions in affected programs: 2089957 -> 2011348 (-3.76%) helped: 9842 HURT: 23 helped stats (abs) min: 1 max: 24 x̄: 7.99 x̃: 8 helped stats (rel) min: 0.05% max: 40.00% x̄: 5.37% x̃: 4.52% HURT stats (abs) min: 1 max: 5 x̄: 1.57 x̃: 1 HURT stats (rel) min: 0.02% max: 1.28% x̄: 0.36% x̃: 0.24% 95% mean confidence interval for instructions value: -7.98 -7.95 95% mean confidence interval for instructions %-change: -5.43% -5.29% Instructions are helped. 
total cycles in shared programs: 860031654 -> 860237548 (0.02%) cycles in affected programs: 449175235 -> 449381129 (0.05%) helped: 7895 HURT: 4416 helped stats (abs) min: 1 max: 14129 x̄: 113.70 x̃: 22 helped stats (rel) min: <.01% max: 40.95% x̄: 1.31% x̃: 0.56% HURT stats (abs) min: 1 max: 33397 x̄: 249.89 x̃: 34 HURT stats (rel) min: <.01% max: 67.47% x̄: 2.65% x̃: 0.65% 95% mean confidence interval for cycles value: 1.46 31.98 95% mean confidence interval for cycles %-change: 0.02% 0.19% Cycles are HURT. LOST: 557 GAINED: 900 fossil-db: Lunar Lake Totals: Instrs: 141933621 -> 141884681 (-0.03%); split: -0.03%, +0.00% Cycle count: 21990657282 -> 21990200212 (-0.00%); split: -0.14%, +0.14% Spill count: 69754 -> 69732 (-0.03%); split: -0.05%, +0.02% Fill count: 128559 -> 128521 (-0.03%); split: -0.05%, +0.02% Scratch Memory Size: 5934080 -> 5925888 (-0.14%) Max live registers: 48021653 -> 48051253 (+0.06%); split: -0.00%, +0.06% Totals from 13510 (2.45% of 551410) affected shaders: Instrs: 19497180 -> 19448240 (-0.25%); split: -0.25%, +0.00% Cycle count: 2455370202 -> 2454913132 (-0.02%); split: -1.25%, +1.23% Spill count: 10975 -> 10953 (-0.20%); split: -0.32%, +0.12% Fill count: 21709 -> 21671 (-0.18%); split: -0.28%, +0.10% Scratch Memory Size: 674816 -> 666624 (-1.21%) Max live registers: 2502653 -> 2532253 (+1.18%); split: -0.01%, +1.19% Meteor Lake and DG2 had similar results. 
(Meteor Lake shown) Totals: Instrs: 152763523 -> 152772716 (+0.01%); split: -0.00%, +0.01% Cycle count: 17188701887 -> 17187510768 (-0.01%); split: -0.10%, +0.09% Spill count: 79280 -> 79279 (-0.00%); split: -0.00%, +0.00% Fill count: 148809 -> 148803 (-0.00%) Max live registers: 31879240 -> 31879093 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 5559984 -> 5559712 (-0.00%); split: +0.00%, -0.01% Totals from 20524 (3.24% of 633183) affected shaders: Instrs: 20366964 -> 20376157 (+0.05%); split: -0.01%, +0.05% Cycle count: 2406162382 -> 2404971263 (-0.05%); split: -0.68%, +0.63% Spill count: 19935 -> 19934 (-0.01%); split: -0.02%, +0.01% Fill count: 34487 -> 34481 (-0.02%) Max live registers: 1745598 -> 1745451 (-0.01%); split: -0.01%, +0.01% Max dispatch width: 117992 -> 117720 (-0.23%); split: +0.03%, -0.26% Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) Totals: Instrs: 150694108 -> 150683859 (-0.01%); split: -0.01%, +0.00% Cycle count: 15526754059 -> 15529031079 (+0.01%); split: -0.10%, +0.12% Max live registers: 31791599 -> 31791441 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 5569488 -> 5569296 (-0.00%); split: +0.00%, -0.01% Totals from 15000 (2.37% of 632406) affected shaders: Instrs: 10965577 -> 10955328 (-0.09%); split: -0.11%, +0.02% Cycle count: 2025347115 -> 2027624135 (+0.11%); split: -0.80%, +0.91% Max live registers: 983373 -> 983215 (-0.02%); split: -0.02%, +0.00% Max dispatch width: 83064 -> 82872 (-0.23%); split: +0.12%, -0.35% Skylake Totals: Instrs: 140588784 -> 140413758 (-0.12%); split: -0.13%, +0.00% Cycle count: 14724286265 -> 14723402393 (-0.01%); split: -0.04%, +0.04% Fill count: 100130 -> 100129 (-0.00%) Max live registers: 31418029 -> 31417146 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 5513400 -> 5535192 (+0.40%); split: +0.89%, -0.49% Totals from 39733 (6.35% of 625986) affected shaders: Instrs: 17240737 -> 17065711 (-1.02%); split: -1.02%, +0.01% Cycle count: 1994668203 -> 1993784331 (-0.04%); 
split: -0.31%, +0.27% Fill count: 44481 -> 44480 (-0.00%) Max live registers: 2766781 -> 2765898 (-0.03%); split: -0.03%, +0.00% Max dispatch width: 210600 -> 232392 (+10.35%); split: +23.23%, -12.89% Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	be26012f1d	brw/opt: Always do copy prop, DCE, and register coalesce after lower_regioning shader-db: Lunar Lake total instructions in shared programs: 18100289 -> 18083853 (-0.09%) instructions in affected programs: 790048 -> 773612 (-2.08%) helped: 3058 / HURT: 1 total cycles in shared programs: 921691992 -> 921293816 (-0.04%) cycles in affected programs: 37210762 -> 36812586 (-1.07%) helped: 2329 / HURT: 624 LOST: 27 GAINED: 26 Meteor Lake, DG2, Tiger Lake, and Ice Lake had similar results. (Meteor Lake shown) total instructions in shared programs: 19825635 -> 19821391 (-0.02%) instructions in affected programs: 138675 -> 134431 (-3.06%) helped: 877 / HURT: 0 total cycles in shared programs: 907900598 -> 907885713 (<.01%) cycles in affected programs: 7127161 -> 7112276 (-0.21%) helped: 318 / HURT: 242 total spills in shared programs: 5790 -> 5758 (-0.55%) spills in affected programs: 660 -> 628 (-4.85%) helped: 8 / HURT: 0 total fills in shared programs: 6744 -> 6712 (-0.47%) fills in affected programs: 708 -> 676 (-4.52%) helped: 8 / HURT: 0 LOST: 10 GAINED: 0 Skylake total instructions in shared programs: 18722197 -> 18637637 (-0.45%) instructions in affected programs: 2757553 -> 2672993 (-3.07%) helped: 12290 / HURT: 1 total cycles in shared programs: 859716039 -> 859432560 (-0.03%) cycles in affected programs: 113731837 -> 113448358 (-0.25%) helped: 9555 / HURT: 2422 LOST: 265 GAINED: 714 fossil-db: Lunar Lake, Meteor Lake, and DG2 had similar results. 
(Lunar Lake shown) Totals: Instrs: 142000618 -> 141928331 (-0.05%); split: -0.05%, +0.00% Subgroup size: 10995136 -> 10995072 (-0.00%) Cycle count: 21994723230 -> 21990481140 (-0.02%); split: -0.08%, +0.06% Spill count: 69911 -> 69754 (-0.22%); split: -0.23%, +0.00% Fill count: 128723 -> 128559 (-0.13%); split: -0.15%, +0.02% Scratch Memory Size: 5936128 -> 5934080 (-0.03%) Max live registers: 48006880 -> 48020936 (+0.03%); split: -0.01%, +0.04% Totals from 17450 (3.16% of 551410) affected shaders: Instrs: 14984149 -> 14911862 (-0.48%); split: -0.48%, +0.00% Subgroup size: 365744 -> 365680 (-0.02%) Cycle count: 2585095128 -> 2580853038 (-0.16%); split: -0.71%, +0.54% Spill count: 20893 -> 20736 (-0.75%); split: -0.76%, +0.00% Fill count: 44181 -> 44017 (-0.37%); split: -0.44%, +0.07% Scratch Memory Size: 995328 -> 993280 (-0.21%) Max live registers: 2378069 -> 2392125 (+0.59%); split: -0.20%, +0.79% Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown) Totals: Instrs: 150719758 -> 150676269 (-0.03%); split: -0.04%, +0.01% Subgroup size: 7764560 -> 7764632 (+0.00%) Cycle count: 15526689814 -> 15525687740 (-0.01%); split: -0.03%, +0.02% Spill count: 60120 -> 59472 (-1.08%); split: -1.17%, +0.10% Fill count: 105973 -> 104675 (-1.22%); split: -1.40%, +0.17% Scratch Memory Size: 2396160 -> 2381824 (-0.60%); split: -0.73%, +0.13% Max live registers: 31782879 -> 31788857 (+0.02%); split: -0.01%, +0.03% Max dispatch width: 5569200 -> 5569344 (+0.00%); split: +0.00%, -0.00% Totals from 10089 (1.60% of 632405) affected shaders: Instrs: 6389866 -> 6346377 (-0.68%); split: -0.87%, +0.19% Subgroup size: 102912 -> 102984 (+0.07%) Cycle count: 681310278 -> 680308204 (-0.15%); split: -0.65%, +0.51% Spill count: 19571 -> 18923 (-3.31%); split: -3.61%, +0.30% Fill count: 38229 -> 36931 (-3.40%); split: -3.88%, +0.48% Scratch Memory Size: 808960 -> 794624 (-1.77%); split: -2.15%, +0.38% Max live registers: 677473 -> 683451 (+0.88%); split: -0.45%, +1.33% Max 
dispatch width: 88672 -> 88816 (+0.16%); split: +0.27%, -0.11% Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	b2d7a823be	brw/lower: Don't emit spurious moves to or from NULL register Previously an instruction like cmp.l.f0.0(16) null:F, v359:F, 0f would get lowered to undef(16) v13703:UD cmp.l.f0.0(16) v13703:F, v359:F, 0f mov(16) null:UD, v13703:UD After copy propagation and dead-code elimination are run again, the original CMP gets turned back into its original form! Some cases can also emit MOVs from the original NULL register. It should be possible to not do any lowering here, but there are some interactions with source lowering passes for things like cmp.l.f0.0(16) null:HF, g89.1<16,16,1>:HF, 0hf What inspired this was... diff'ing step-by-step dumps from INTEL_DEBUG=optimizer had a lot of useless changes due to these MOVs and undefs. It was very annoying. This low-effort change gets the majority of the possible benefit. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	9aba731d03	brw/cse: Don't eliminate instructions that write flags With other changes in my tree, I observed this code from dEQP-VK.subgroups.vote.compute.subgroupallequal_float have the second cmp.z removed. undef(8) %69:UD cmp.z.f0.0(8) %69:F, %37:F, %57+0.0<0>:F mov(1) v58+0.0:D, 0d NoMask group0 (+f0.0) mov(1) v58+0.0:D, -1d NoMask group0 cmp.nz.f0.0(8) null:D, v58+0.0<0>:D, 0d ... undef(8) %72:UD cmp.z.f0.0(8) %72:F, %37:F, %57+0.0<0>:F mov(1) v63+0.0:D, 0d NoMask group0 (+f0.0) mov(1) v63+0.0:D, -1d NoMask group0 This was also fixed by running dead-code elimination before CSE. That seems more like avoiding the problem than fixing it, though. I believe this affects shader-db results because leaving the second CMP in the shader can give more opportunities for cmod propagation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `234c45c929` ("intel/brw: Write a new global CSE pass that works on defs") shader-db: All Intel platforms had similar results. (Lunar Lake shown) total cycles in shared programs: 922097690 -> 922260862 (0.02%) cycles in affected programs: 3178926 -> 3342098 (5.13%) helped: 130 HURT: 88 helped stats (abs) min: 2 max: 2194 x̄: 296.71 x̃: 16 helped stats (rel) min: <.01% max: 16.56% x̄: 1.86% x̃: 0.18% HURT stats (abs) min: 4 max: 11992 x̄: 2292.55 x̃: 47 HURT stats (rel) min: 0.04% max: 57.32% x̄: 11.82% x̃: 0.61% 95% mean confidence interval for cycles value: 320.36 1176.63 95% mean confidence interval for cycles %-change: 1.59% 5.73% Cycles are HURT. LOST: 2 GAINED: 1 fossil-db: Lunar Lake, Meteor Lake, Tiger Lake had similar results. 
(Lunar Lake shown) Totals: Instrs: 142022960 -> 142022928 (-0.00%); split: -0.00%, +0.00% Cycle count: 21995242782 -> 21995384040 (+0.00%); split: -0.00%, +0.00% Max live registers: 48013385 -> 48013343 (-0.00%) Totals from 507 (0.09% of 551441) affected shaders: Instrs: 886191 -> 886159 (-0.00%); split: -0.01%, +0.01% Cycle count: 69302492 -> 69443750 (+0.20%); split: -0.66%, +0.86% Max live registers: 94413 -> 94371 (-0.04%) DG2 Totals: Instrs: 152856370 -> 152856093 (-0.00%); split: -0.00%, +0.00% Cycle count: 17237159885 -> 17236804052 (-0.00%); split: -0.00%, +0.00% Fill count: 150673 -> 150631 (-0.03%) Max live registers: 31871520 -> 31871476 (-0.00%) Totals from 506 (0.08% of 633197) affected shaders: Instrs: 831795 -> 831518 (-0.03%); split: -0.04%, +0.01% Cycle count: 55578509 -> 55222676 (-0.64%); split: -1.38%, +0.74% Fill count: 2779 -> 2737 (-1.51%) Max live registers: 51383 -> 51339 (-0.09%) Ice Lake and Skylake had similar results. (Ice Lake shown) Totals: Instrs: 152017826 -> 152017793 (-0.00%); split: -0.00%, +0.00% Cycle count: 15180773451 -> 15180761166 (-0.00%); split: -0.00%, +0.00% Fill count: 106610 -> 106614 (+0.00%) Max live registers: 32195006 -> 32194966 (-0.00%) Totals from 411 (0.06% of 637268) affected shaders: Instrs: 705935 -> 705902 (-0.00%); split: -0.01%, +0.01% Cycle count: 47830019 -> 47817734 (-0.03%); split: -0.05%, +0.02% Fill count: 2865 -> 2869 (+0.14%) Max live registers: 42883 -> 42843 (-0.09%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	80a5d158ae	brw/copy: Don't copy propagate through smaller entry dest size Copy propagation would incorrectly occur in this code mov(16) v4+2.0:UW, u0<0>:UW NoMask ... mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0 to create mov(16) v4+2.0:UW, u0<0>:UW NoMask ... mov(8) v6+2.0:UD, u0<0>:UD NoMask group0 This has different behavior. I think I just made a mistake when I changed this condition in `e3f502e007`. It seems like this condition could be relaxed to cover cases like (note the change of destination stride) mov(16) v4+2.0<2>:UW, u0<0>:UW NoMask ... mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0 I'm not sure it's worth it. No shader-db or fossil-db changes on any Intel platform. Even the code for the test case mentioned in the original commit did not change. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `e3f502e007` ("intel/fs: Allow copy propagation between MOVs of mixed sizes") Closes: #12116 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32041>	2024-11-08 17:46:45 +00:00
Ian Romanick	c1c09e3c4a	brw/emit: Add correct 3-source instruction assertions for each platform Specifically, allow two immediate sources for BFE on Gfx12+. I stumbled on this while trying some stuff with !31852. v2: Don't be lazy. Add proper assertions for all the things on all the platforms. Based on a suggestion by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `7bed11fbde` ("intel/brw: Allow immediates in the BFE instruction on Gfx12+") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31858>	2024-11-08 16:48:57 +00:00
Lionel Landwerlin	3ecf2a0518	anv: fix extent computation in image->image host copies Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `0317c44872` ("anv: add VK_EXT_host_image_copy support") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32027>	2024-11-07 22:44:41 +00:00
Felix DeGrood	bf96702985	intel/measure: increase size of filename malloc to account for \0 Corrects regression caused by prior commit that created memory overwrite by not mallocing enough space for filename string. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32013>	2024-11-06 22:12:29 +00:00
Lionel Landwerlin	0ab2849597	anv: move pipe control debug to anv_util.c We're going to add more printing. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31928>	2024-11-06 12:20:23 +00:00
Lionel Landwerlin	b5403a4e40	anv: fix indentation Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31928>	2024-11-06 12:20:23 +00:00
Lionel Landwerlin	f9e76e8ca6	anv: add texture cache inval after binding pool update Cc: mesa-stable Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31928>	2024-11-06 12:20:22 +00:00

13026 commits