fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 02:48:07 +02:00

Author	SHA1	Message	Date
Jordan Justen	c5c349a690	intel/dev: Fix warning for max_threads_per_psd when devinfo->verx10 == 120 Although we don't want to rely on hwconfig for devinfo->verx10 == 120, due to the dependence on closed source software, we do check to see if hwconfig reports different values in the DEVINFO_HWCONFIG macro. Matt was seeing this warning on 8086:a7a0: > MESA: warning: INTEL_HWCONFIG_TOTAL_PS_THREADS (128) != devinfo->max_threads_per_psd (64) Reported-by: Matt Turner <mattst88@gmail.com> Fixes: `3e4f73b3a0` ("intel/dev: Update hwconfig => max_threads_per_psd for Xe2") Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31077>	2024-09-10 03:21:12 +00:00
Nanley Chery	c92e49e8f4	intel/isl: Always set EnableUnormPathInColorPipe The TGL PRM says, This bit should never be programmed to 0 So, set it to true. I chose not to use the MBO attribute in genxml because the field lacks the "Format: MBO" line in the PRM. We previously made this programming conditional with commit `2e1be771e4` because of tests failing in dEQP-GLES3.functional.texture.specification.texdepth. However, those failures were fixed when we started using gl_FragDepth for depth buffer copies in commit `6cec618e82`. Note: when bisecting this, I cherry-picked commit `7a68045b5d` in order to get past build failures related to a deprecated python function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31066>	2024-09-09 23:48:31 +00:00
Sviatoslav Peleshko	fa51595c7f	brw: Fix mov cmod propagation when there's int signedness mismatch If there's difference between scan_inst dest type and inst src type we should be more careful, because difference in signedness can cause incorrect results after the propagation. Updated ror-default.trace hash, as the change fixes misrendering there. Fixes: `b23432c5` ("intel/fs: Fix a cmod prop bug when the source type of a mov doesn't match the dest type of scan_inst") Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30998>	2024-09-09 22:13:08 +00:00
Lionel Landwerlin	05dc524c75	anv: selectively disable binding table usage on Gfx20 Workaround broken Gfx20 dynamic BTI. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `e9f63df2f2` ("intel/dev: Enable LNL PCI IDs without INTEL_FORCE_PROBE") Backport-to: 24.2 Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30931>	2024-09-09 20:33:25 +00:00
Rohan Garg	7f65035078	hasvk: enable VK_KHR_shader_relaxed_extended_instruction The extension only affects non semantic instructions that need no handling in the backend compiler. Signed-off-by: Rohan Garg <rohan.garg@intel.com> Acked-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31098>	2024-09-09 17:46:32 +00:00
Rohan Garg	5f3339e44a	anv: enable the VK_KHR_shader_relaxed_extended_instruction feature Fixes: 29a2e5 ('anv: enable KHR_shader_relaxed_extended_instruction') Signed-off-by: Rohan Garg <rohan.garg@intel.com> Acked-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31098>	2024-09-09 17:46:32 +00:00
Daniel Stone	a78539e704	intel/tests: Reduce load from anv_tests anv_tests tries to create a large number of threads, all of which wait to be able to execute simultaneously, then launch a reasonable-size workload. Under load, cloning each of the 16 threads takes 15ms serially, for a delay of 240ms before the tests start running; running the test 64 times gives us 15.36s for a single testcase in isolation, assuming that the bits which aren't forking are free. To give it the best shot at completing in time, mark it as a non-parallelisable test (since Meson will also try to parallelise it out), and also halve the number of runs it attempts. And then give it a longer timeout so it doesn't fail even in extremis. Signed-off-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31059>	2024-09-09 12:54:34 +00:00
Caio Oliveira	2a5a12cb71	intel/executor: Small fixes to the help message Add missing @eot to the example. Reword INTEL_DEBUG=color description. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31076>	2024-09-07 16:32:50 +00:00
Alyssa Rosenzweig	1753bf599c	ci: update traces 🤕 thanks Mike Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>	2024-09-07 00:54:35 +00:00
Tapani Pälli	39a1f53890	anv: initialize pixel struct to zero when setting clear color Otherwise we can end up with uninitialized values, this fixes following valgrind warning: ==31283== Uninitialised byte(s) found during client check request ==31283== at 0x503E4DE: anv_batch_bo_finish (anv_batch_chain.c:345) ==31283== by 0x504220A: anv_cmd_buffer_end_batch_buffer (anv_batch_chain.c:1103) ==31283== by 0x55A0E4F: end_command_buffer (genX_cmd_buffer.c:3455) ==31283== by 0x55A0E82: gfx11_EndCommandBuffer (genX_cmd_buffer.c:3466) ==31283== by 0x11233A: ??? (in /usr/bin/vkcube) ==31283== by 0x10BDEE: ??? (in /usr/bin/vkcube) ==31283== by 0x49B5149: (below main) (in /usr/lib64/libc.so.6) ==31283== Address 0xc10c4d8 is 1,240 bytes inside a block of size 8,192 client-defined ==31283== at 0x5036EF6: anv_bo_pool_alloc (anv_allocator.c:1284) ==31283== by 0x503E0E1: anv_batch_bo_create (anv_batch_chain.c:262) ==31283== by 0x5040D3F: anv_cmd_buffer_init_batch_bo_chain (anv_batch_chain.c:868) ==31283== by 0x504F9C1: anv_create_cmd_buffer (anv_cmd_buffer.c:147) ==31283== by 0x6B718C4: vk_common_AllocateCommandBuffers (vk_command_pool.c:206) ==31283== by 0x4FB06B2: vkAllocateCommandBuffers (trampoline.c:1996) ==31283== by 0x111E6B: ??? (in /usr/bin/vkcube) ==31283== by 0x10BDEE: ??? (in /usr/bin/vkcube) ==31283== by 0x49B5149: (below main) (in /usr/lib64/libc.so.6) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30990>	2024-09-06 13:19:04 +00:00
David Heidelberg	d16581652f	ci/iris: implement nightly CL testing using piglit on ADL Reviewed-by: Eric Engestrom <eric@igalia.com> Signed-off-by: David Heidelberg <david@ixit.cz> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29516>	2024-09-05 08:57:51 +00:00
Lionel Landwerlin	aa494cbacf	brw: align spilling offsets to physical register sizes In commit `fe3d90aedf` ("intel/fs/xe2+: Fix calculation of spill message width for Xe2 regs.") we aligned the width of scratch messages to physical register sizes (32B prior to Xe2, 64B for Xe2+). But our spilling offsets are computed using the register allocations sizes which are in units of 32B. That means on Xe2, you can end up spilling a virtual register allocated at 32B (which we use for surface state computations with exec_all) and then the spilling of that register will be emitted in SIMD16, having the upper 8 lanes overwriting the next spilled register. We could potentially limit spills to SIMD8 messages on Xe2 (only writing 32B of data), but we're also unlikely to have all 32B virtual register spilled next to one another. And if not tightly packed, we would have 64B registers stored on 2 different cachelines which sounds inefficient. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `fe3d90aedf` ("intel/fs/xe2+: Fix calculation of spill message width for Xe2 regs.") Backport-to: 24.2 Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30983>	2024-09-04 23:05:31 +00:00
Jordan Justen	f817870aa9	anv: Don't warn about unsupported devices if INTEL_FORCE_PROBE was used The user must have used INTEL_FORCE_PROBE to force the device to be loaded, so they specifically opted in to enabling unsupported device support. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31011>	2024-09-04 12:09:12 -07:00
Jordan Justen	ee727d7b66	intel/dev: Add devinfo::probe_forced based on INTEL_FORCE_PROBE Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31011>	2024-09-04 12:09:08 -07:00
Jordan Justen	aaaf9a3b87	anv: Do hasvk devices check first Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31011>	2024-09-04 12:09:05 -07:00
Jordan Justen	16a835ed3d	anv: Drop "not yet supported" warning for Xe2 Backport-to: 24.2 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31011>	2024-09-04 12:09:01 -07:00
José Roberto de Souza	ca13e35304	anv: Add anv_device_perf_close() Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31026>	2024-09-04 10:04:38 -07:00
José Roberto de Souza	2d216c12fa	anv: Drop useless '>= 0' check over a unsigned Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31026>	2024-09-04 10:04:38 -07:00
José Roberto de Souza	023120d1fc	intel/perf: Fix intel_gem.h include The intention here was to include the common intel_gem.h to get the intel_ioctl() signature. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31026>	2024-09-04 10:04:38 -07:00
José Roberto de Souza	5d4e319aec	anv: Nuke perf_metric This is not used. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31026>	2024-09-04 10:04:37 -07:00
Caio Oliveira	74be809237	compiler: Allow derivative_group to be used for all stages in shader_info These will now also be used by stages that have workgroups. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30950>	2024-09-03 20:03:18 +00:00
Vignesh Raman	1eb98bc457	ci: move mtl-fw.json to .gitlab-ci directory Placing mtl-fw.json in src/intel/ci/mtl-fw.json works for the mesa build, but it fails to fetch in drm-ci. Move it to the .gitlab-ci directory so it is included in the artifacts used for building the kernel/rootfs in drm-ci. Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30947>	2024-09-03 19:25:49 +00:00
Caio Oliveira	5be6f3b089	intel/executor: Fix SWSB for sync.nop Surfaced after recent improvements on SWSB handling, the previous assembly code was gracefully lowering the $1 into $1.dst. Fixes: 37674196221 ("intel: Add executor tool") Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30960>	2024-09-02 16:07:55 +00:00
Caio Oliveira	3f6b5ea27a	intel/brw: Use linear walk when shader requires DERIVATIVE_GROUP_LINEAR Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30955>	2024-08-30 20:24:42 +00:00
Sai Teja	05f6e9f11e	ci: Disable angle jobs for GL changes Mesa's GL stack changes don't affect angle in any way for now. Thus, drop angle jobs for GL changes from intel and amd CI. Signed-off-by: Sai Teja <saiteja13427@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30943>	2024-08-30 15:09:15 +00:00
Jordan Justen	3e4f73b3a0	intel/dev: Update hwconfig => max_threads_per_psd for Xe2 Backport-to: 24.2 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30887>	2024-08-30 01:53:55 -07:00
Caio Oliveira	e4f090d3a6	intel/brw: Remove special treatment for 2-src in emit() helper For Gfx9+ no 2-src instructions need sources to be fixed up. Special treatment remains for 3-src instructions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30911>	2024-08-30 04:33:47 +00:00
Ian Romanick	73f365e208	intel/brw: load_offset cannot be constant on this path Literally inside an if-statement (about 26 lines before this hunk) that checks for !nir_src_is_const(instr->src[1]). No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>	2024-08-30 03:39:31 +00:00
Ian Romanick	fef175de09	intel/brw: Enable constant propagation for a couple more logical sends This prevents some regressions later in the MR. Once load_const operations are marked as is_scalar, they will cease to get the automatic constant propagation that occurs in try_rebuild_source. No shader-db or fossil-db changes on any Intel platform. v2: Slightly relax source restrictions on SHADER_OPCODE_UNALIGNED_OWORD_BLOCK_READ_LOGICAL. Add a comment explaining the restriction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>	2024-08-30 03:39:31 +00:00
Ian Romanick	c6a8b382fd	intel/brw: Relax is_partial_write check in cmod propagation The is_partial_write check is too strict because it tests two separate things. It tests whether or not the instruction always writes a value (i.e., is it predicated), and it tests whether or not the instruction writes a complete register. This latter check is problematic as it prevents cmod propagation in SIMD1, and it prevents cmod propagation in SIMD8 when the destination size is 16 bits. This check is unnecessary. Cmod propagation already checks that the region written and region read overlap. It also already checks that the execution sizes of the instructions match. Further restriction based on the specific parts of the register written only generates false negatives. v2: Relax all of the calls to is_partial_write. Suggested by Caio. No shader-db changes on any Intel platform. fossil-db: Meteor Lake Totals: Instrs: 151505520 -> 151502923 (-0.00%); split: -0.00%, +0.00% Cycle count: 17201385104 -> 17194901423 (-0.04%); split: -0.06%, +0.02% Spill count: 80827 -> 80837 (+0.01%) Fill count: 152693 -> 152692 (-0.00%); split: -0.01%, +0.01% Totals from 346 (0.05% of 630198) affected shaders: Instrs: 1257205 -> 1254608 (-0.21%); split: -0.21%, +0.00% Cycle count: 5532845647 -> 5526361966 (-0.12%); split: -0.18%, +0.06% Spill count: 32903 -> 32913 (+0.03%) Fill count: 64338 -> 64337 (-0.00%); split: -0.03%, +0.03% DG2 Totals: Instrs: 151531440 -> 151528055 (-0.00%); split: -0.00%, +0.00% Cycle count: 17200238927 -> 17197996676 (-0.01%); split: -0.03%, +0.02% Spill count: 81003 -> 80971 (-0.04%); split: -0.04%, +0.00% Fill count: 152975 -> 152912 (-0.04%); split: -0.05%, +0.01% Totals from 346 (0.05% of 630198) affected shaders: Instrs: 1260363 -> 1256978 (-0.27%); split: -0.27%, +0.00% Cycle count: 5532019670 -> 5529777419 (-0.04%); split: -0.09%, +0.05% Spill count: 33046 -> 33014 (-0.10%); split: -0.11%, +0.01% Fill count: 64581 -> 64518 (-0.10%); split: -0.13%, +0.03% Tiger Lake 
and Ice Lake had similar results. (Tiger Lake shown) Totals: Instrs: 149972324 -> 149972289 (-0.00%) Cycle count: 15566495293 -> 15565151171 (-0.01%); split: -0.01%, +0.00% Totals from 16 (0.00% of 629912) affected shaders: Instrs: 351194 -> 351159 (-0.01%) Cycle count: 3922227030 -> 3920882908 (-0.03%); split: -0.04%, +0.00% Skylake Totals: Instrs: 140787999 -> 140787983 (-0.00%); split: -0.00%, +0.00% Cycle count: 14665614947 -> 14665515855 (-0.00%); split: -0.00%, +0.00% Spill count: 58500 -> 58501 (+0.00%) Fill count: 102097 -> 102100 (+0.00%) Totals from 16 (0.00% of 625685) affected shaders: Instrs: 343560 -> 343544 (-0.00%); split: -0.01%, +0.01% Cycle count: 3354997898 -> 3354898806 (-0.00%); split: -0.01%, +0.01% Spill count: 16864 -> 16865 (+0.01%) Fill count: 27479 -> 27482 (+0.01%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>	2024-08-30 03:39:31 +00:00
Ian Romanick	13332c236b	intel/brw: Unconditionally run optimizations after nir_opt_uniform_subgroup I observed some ray tracing shaders where a resource_intel inside a loop was non-uniform, and some code was lowered to account for that. Eventually the loop containing the resource_intel was unrolled, and the resource_intel became uniform. For example, nir_opt_uniform_subgroup can transform something like con loop { con block b5: // preds: b4 b8 con 32 %330 = @read_first_invocation (%329) con 1 %331 = ieq %330, %329 // succs: b6 b7 if %331 { con block b6: // preds: b5 con 32 %332 = iadd %120.b, %330 con 32 %333 = @resource_intel (%125 (0xdeaddeed), %332, %125 (0xdeaddeed), %3 (0x0)) (desc_set=1, binding=2, resource_intel=bindless\|non-uniform, resource_block_intel=-1) div 32x4 %334 = (float32)txl %333 (texture_handle), %130 (sampler_handle), %327 (coord), %275 (lod), 0 (texture), 0 (sampler) break // succs: b9 } else { con block b7: // preds: b5, succs: b8 } con block b8: // preds: b7, succs: b5 } into con loop { con block b5: // preds: b4 b8 con 1 %331 = ieq %329, %329 // succs: b6 b7 if %331 { con block b6: // preds: b5 con 32 %332 = iadd %120.b, %329 con 32 %333 = @resource_intel (%125 (0xdeaddeed), %332, %125 (0xdeaddeed), %3 (0x0)) (desc_set=1, binding=2, resource_intel=bindless\|non-uniform, resource_block_intel=-1) div 32x4 %334 = (float32)txl %333 (texture_handle), %130 (sampler_handle), %327 (coord), %275 (lod), 0 (texture), 0 (sampler) break // succs: b9 } else { con block b7: // preds: b5, succs: b8 } con block b8: // preds: b7, succs: b5 } Notice that %331 is now a tautology. Running brw_nir_optimize again eliminates the loop. v2: Add a comment in the code explaining the rationale. Suggested by Ken. Update the commit message. Suggested by Caio. shader-db: Meteor Lake, DG2, and Tiger Lake had similar results. 
(Meteor Lake shown) total instructions in shared programs: 19733448 -> 19733330 (<.01%) instructions in affected programs: 14120 -> 14002 (-0.84%) helped: 32 / HURT: 3 total cycles in shared programs: 916254496 -> 916226288 (<.01%) cycles in affected programs: 2035116 -> 2006908 (-1.39%) helped: 19 / HURT: 13 total spills in shared programs: 5807 -> 5807 (0.00%) spills in affected programs: 26 -> 26 (0.00%) helped: 1 / HURT: 1 total fills in shared programs: 6794 -> 6792 (-0.03%) fills in affected programs: 84 -> 82 (-2.38%) helped: 1 / HURT: 1 LOST: 1 GAINED: 1 Ice Lake and Skylake had similar results. (Ice Lake shown) total instructions in shared programs: 20393084 -> 20392971 (<.01%) instructions in affected programs: 21750 -> 21637 (-0.52%) helped: 31 / HURT: 4 total cycles in shared programs: 880273065 -> 880247818 (<.01%) cycles in affected programs: 2546748 -> 2521501 (-0.99%) helped: 18 / HURT: 9 total spills in shared programs: 4628 -> 4630 (0.04%) spills in affected programs: 287 -> 289 (0.70%) helped: 1 / HURT: 2 total fills in shared programs: 5381 -> 5376 (-0.09%) fills in affected programs: 711 -> 706 (-0.70%) helped: 2 / HURT: 2 LOST: 1 GAINED: 1 fossil-db: Meteor Lake and DG2 had similar results. 
(Meteor Lake shown) Totals: Instrs: 151513669 -> 151505520 (-0.01%); split: -0.01%, +0.00% Send messages: 7459339 -> 7459396 (+0.00%) Loop count: 49111 -> 47588 (-3.10%) Cycle count: 17208178205 -> 17201385104 (-0.04%); split: -0.05%, +0.01% Spill count: 80830 -> 80827 (-0.00%); split: -0.02%, +0.01% Fill count: 152754 -> 152693 (-0.04%); split: -0.04%, +0.00% Scratch Memory Size: 4136960 -> 4130816 (-0.15%) Max live registers: 32016493 -> 32015955 (-0.00%); split: -0.00%, +0.00% Totals from 672 (0.11% of 630198) affected shaders: Instrs: 1352428 -> 1344279 (-0.60%); split: -0.78%, +0.17% Send messages: 54302 -> 54359 (+0.10%) Loop count: 6124 -> 4601 (-24.87%) Cycle count: 1260266379 -> 1253473278 (-0.54%); split: -0.69%, +0.16% Spill count: 15967 -> 15964 (-0.02%); split: -0.09%, +0.08% Fill count: 36245 -> 36184 (-0.17%); split: -0.18%, +0.01% Scratch Memory Size: 740352 -> 734208 (-0.83%) Max live registers: 50699 -> 50161 (-1.06%); split: -1.45%, +0.39% Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown) Totals: Instrs: 149976046 -> 149971100 (-0.00%); split: -0.00%, +0.00% Subgroup size: 7685264 -> 7685256 (-0.00%) Cycle count: 15566401168 -> 15566405478 (+0.00%); split: -0.00%, +0.00% Spill count: 61238 -> 61240 (+0.00%) Fill count: 107301 -> 107289 (-0.01%) Max live registers: 31992969 -> 31993857 (+0.00%); split: -0.00%, +0.00% Totals from 553 (0.09% of 629912) affected shaders: Instrs: 557027 -> 552081 (-0.89%); split: -0.90%, +0.01% Subgroup size: 8648 -> 8640 (-0.09%) Cycle count: 150154496 -> 150158806 (+0.00%); split: -0.23%, +0.24% Spill count: 181 -> 183 (+1.10%) Fill count: 440 -> 428 (-2.73%) Max live registers: 33698 -> 34586 (+2.64%); split: -0.02%, +2.65% Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>	2024-08-30 03:39:31 +00:00
Ian Romanick	65eb7ed5fc	intel/brw: Run intel_nir_lower_conversions only after brw_nir_optimize Without this, the next commit triggers assertions. v2: Unconditionally do the lowering after brw_nir_optimize. Suggested by Caio. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1] Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>	2024-08-30 03:39:31 +00:00
Ian Romanick	572e00dd66	intel/brw: Copy prop from raw integer moves with mismatched types The specific pattern from the unit test was observed in ray tracing trampoline shaders. v2: Refactor the is_raw_move tests out to a utility function. Suggested by Ken. v3: Fix a regression caused by being too picky about source modifiers. This was introduced somewhere between when I did initial shader-db runs and v2. v4: Fix typo in comment. Noticed by Caio. shader-db: All Intel platforms had similar results. (Meteor Lake shown) total instructions in shared programs: 19734086 -> 19733997 (<.01%) instructions in affected programs: 135388 -> 135299 (-0.07%) helped: 76 / HURT: 2 total cycles in shared programs: 916290451 -> 916264968 (<.01%) cycles in affected programs: 41046002 -> 41020519 (-0.06%) helped: 32 / HURT: 29 fossil-db: Meteor Lake, DG2, and Skylake had similar results. (Meteor Lake shown) Totals: Instrs: 151531355 -> 151513669 (-0.01%); split: -0.01%, +0.00% Cycle count: 17209372399 -> 17208178205 (-0.01%); split: -0.01%, +0.00% Max live registers: 32016490 -> 32016493 (+0.00%) Totals from 17361 (2.75% of 630198) affected shaders: Instrs: 2642048 -> 2624362 (-0.67%); split: -0.67%, +0.00% Cycle count: 79803066 -> 78608872 (-1.50%); split: -1.75%, +0.25% Max live registers: 421668 -> 421671 (+0.00%) Tiger Lake and Ice Lake had similar results. 
(Tiger Lake shown) Totals: Instrs: 149995644 -> 149977326 (-0.01%); split: -0.01%, +0.00% Cycle count: 15567293770 -> 15566524840 (-0.00%); split: -0.02%, +0.01% Spill count: 61241 -> 61238 (-0.00%) Fill count: 107304 -> 107301 (-0.00%) Max live registers: 31993109 -> 31993112 (+0.00%) Totals from 17813 (2.83% of 629912) affected shaders: Instrs: 3738236 -> 3719918 (-0.49%); split: -0.49%, +0.00% Cycle count: 4251157049 -> 4250388119 (-0.02%); split: -0.06%, +0.04% Spill count: 28268 -> 28265 (-0.01%) Fill count: 50377 -> 50374 (-0.01%) Max live registers: 470648 -> 470651 (+0.00%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30251>	2024-08-30 03:39:31 +00:00
Lionel Landwerlin	14d772d678	anv: fix utrace compute timestamp reads on Gfx20 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30923>	2024-08-29 20:10:11 +00:00
Tapani Pälli	096acf8c0c	anv: change existing ICL workaround to depend on BLEND_STATE In commit `f900b763b1` we started to dirty MS as WM changes. However, later on things changed with `eebb6cd236`, and we need to dirty with BLEND_STATE now. Fixes: `eebb6cd236` ("anv: stop using 3DSTATE_WM::ForceThreadDispatchEnable") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30920>	2024-08-29 13:58:08 +00:00
Rohan Garg	51e05c2844	iris,anv: simplify and inline sampler count calculations Use the CLAMP macro to clamp the value and simplify the sampler count encoding. Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30922>	2024-08-29 11:49:56 +00:00
Rohan Garg	32f606486f	anv: prefetch samplers when dispatching compute shaders Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30922>	2024-08-29 11:49:56 +00:00
Tapani Pälli	44e1cf2748	anv: set correct miplevel for anv_image_hiz_op Fixes: `5efecc9782` ("anv: Enable HiZ on multi-LOD depth buffers.") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11787 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30892>	2024-08-29 04:50:44 +00:00
Faith Ekstrand	42114aa723	vulkan: Handle VIEW_INDEX_FROM_DEVICE_INDEX_BIT in the runtime Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30876>	2024-08-29 03:30:31 +00:00
Faith Ekstrand	8c60f1461b	vulkan: Take a VkPipelineCreateFlags2KHR in vk_pipeline_shader_stage() Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30876>	2024-08-29 03:30:31 +00:00
Jesse Natalie	03655dfda1	compiler, vk: Support subgroup size of 4 Relax the assert and assign it an enum value Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30876>	2024-08-29 03:30:31 +00:00
Kenneth Graunke	da395e6985	intel/brw: Fix extract_imm for subregion reads of 64-bit immediates We could be trying to extract a D/UD from a Q/UQ, for example. We were ignoring the top 32-bits, which is incorrect. Fixes: `580e1c592d` ("intel/brw: Introduce a new SSA-based copy propagation pass") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30884>	2024-08-28 12:33:26 -07:00
Kenneth Graunke	51c85e0363	intel/brw: Drop misguided sign extension attempts in extract_imm() This function never expands a type - it only narrows it. As such, we don't need to ever sign extend to fill additional new bits. I think this code was left over from earlier versions of my optimization pass that was buggy and trying to handle cases it should not have. Fixes: `580e1c592d` ("intel/brw: Introduce a new SSA-based copy propagation pass") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30884>	2024-08-28 12:33:26 -07:00
Deborah Brouwer	18f15da94d	ci/intel: add i915/MTL firmware to rootfs Add Meteor Lake firmware directly to rootfs since it is not available from debian package. Signed-off-by: Deborah Brouwer <deborah.brouwer@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30770>	2024-08-28 04:31:10 +00:00
Caio Oliveira	695f5314d6	intel/brw: Simplify fs_inst annotation When INTEL_DEBUG=ann is also set, the disassembler would annotate the output with either a string or the string version of a NIR instruction. This was done by keeping two pointers (but only using one at a time). Change the code to print the instruction into a string instead of keeping its pointer around (peg the string to the shader). That way, only one pointer is needed for annotations. Because that serialization is not free, only do that when the environment variable is set. Since we are here, move the annotation string field to the end, moving it to the least commonly used cacheline. Further packing might allow the entire fs_inst to fit in two cachelines. For release builds, don't even add the debug annotation to the struct. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30822>	2024-08-28 03:59:50 +00:00
Caio Oliveira	ec15cdfa2a	intel/brw: Pack brw_reg struct The alignment required for the second union (has 64-bit size) causes a hole between the first and second union. Move the remaining data there. In 64-bit build, shrinks brw_reg from 24 bytes to 16 bytes. And by consequence, shrinks fs_inst from 200 bytes to 160 bytes, making it use one less cacheline. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30822>	2024-08-28 03:59:50 +00:00
Iván Briano	2261b298d1	anv: fix adding to wa_addr Fixes: `6336e0fe7f` ("anv: order data in wa_bo to leave wa_addr last") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30881>	2024-08-27 18:10:58 -07:00
Sagar Ghuge	063715ed45	anv: Reduce clear color state alignment to 64B Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26793>	2024-08-27 21:13:30 +00:00
Lionel Landwerlin	e97b968aeb	brw: add a comment what Gfx12.5 URB fences Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30849>	2024-08-27 13:38:14 +00:00
Lionel Landwerlin	93fba40389	brw: switch mesh/task URB fence prior to EOT to GPU Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30849>	2024-08-27 13:38:14 +00:00
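
Note on aa494cbacf ("brw: align spilling offsets to physical register sizes"): the overlap described in that commit message can be illustrated with a small standalone sketch. This is not Mesa's code. The helper names (align_up, assign_spill_offset, spill_writes_overlap) and the ALLOC_UNIT_BYTES constant are hypothetical, and the sizes are assumptions taken from the message itself: 32B allocation units, and 64B physical registers on Xe2+.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption from the commit message: register allocation sizes are in
 * units of 32B, independent of the physical register size. */
#define ALLOC_UNIT_BYTES 32u

/* Hypothetical helper: round offset up to a power-of-two alignment. */
static inline uint32_t
align_up(uint32_t offset, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (offset + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical allocator: pick a scratch offset for a virtual register of
 * size_units allocation units, aligned to the physical register size, and
 * advance *next_free past the slot that was handed out. */
static inline uint32_t
assign_spill_offset(uint32_t *next_free, uint32_t size_units,
                    uint32_t phys_reg_bytes)
{
   uint32_t offset = align_up(*next_free, phys_reg_bytes);
   uint32_t size = align_up(size_units * ALLOC_UNIT_BYTES, phys_reg_bytes);

   *next_free = offset + size;
   return offset;
}

/* True when two scratch writes of width bytes, starting at byte offsets
 * a and b, touch at least one common byte. */
static inline bool
spill_writes_overlap(uint32_t a, uint32_t b, uint32_t width)
{
   return a < b + width && b < a + width;
}
```

With offsets packed in 32B units (0 and 32), two 64B-wide spill writes overlap, which is the failure the commit describes. With offsets aligned to the 64B physical register size (0 and 64), they do not.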

12664 commits