fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-27 10:08:13 +02:00

Author	SHA1	Message	Date
Sagar Ghuge	cb423ee636	anv: Fix Wa_14021821874, Wa_14018813551, Wa_14026600921 WA states that we need to allocate maximum number of stackIDs per DSS from RT_DISPATCH_GLOBALS to 2048. We can still throttle/control the CFE_STATE::StackID to be in range specified by the field. This does impact performance having CFE_STATE::stackIDs capped to 2K by default. More the outstanding ray queries, larger the working set and have more impact on cache hit rate. This affect performance on Xe2+ onwards: * Boundary Benchmark: 36.2% * Solar Bay extreme: 9.8% * Hitman world of assassination: 3.9% Fixes: `c1a44e8d43` ("anv: force StackIDControl value for Wa_14021821874") Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40310>	2026-03-10 22:41:54 +00:00
Sagar Ghuge	3a62dc0218	anv: Set max outstanding ray queries to 1024 Set max outstanding ray queries to 1024. This value can be tuned later specific to apps. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40182>	2026-03-04 21:45:14 +00:00
Lionel Landwerlin	db964068bf	anv: add drirc option to workaround missing application barriers on typed/untyped data Enable it for Horizon Forbidden West (only seems to have untyped data issue). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14889 Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40187>	2026-03-04 20:40:59 +00:00
Jordan Justen	0b94b15a3c	anv: Add Xe3P (GFX_VERx10==350) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40208>	2026-03-04 11:10:34 -08:00
Caio Oliveira	df4042371f	anv: Set PIPELINE_SELECT systolic mode based on shader usage For Gfx125 workloads that use systolic mode, this might mean an extra PIPELINE_SELECT when flipping between a compute shader that use the mode and another that doesn't use the mode (or vice-versa). Reviewed-by: Iván Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40014>	2026-02-26 19:05:56 +00:00
Lionel Landwerlin	487586fefa	anv: implement inline parameter promotion from push constants Push constants on bindless stages of Gfx12.5+ don't get the data delivered in the registers automatically. Instead the shader needs to load the data with SEND messages. Those stages do get a single InlineParameter 32B block of data delivered into the EU. We can use that to promote some of the push constant data that has to be pulled otherwise. The driver will try to promote all push constant data (app + driver values) if it can, if it can't it'll try to promote only the driver values (usually a shader will only use a few driver values). If even the drivers values won't fit, give up and don't use the inline parameter at all. LNL internal fossil-db: Totals from 315738 (20.08% of 1572649) affected shaders: Instrs: 155053691 -> 154920901 (-0.09%); split: -0.09%, +0.00% CodeSize: 2578204272 -> 2574991568 (-0.12%); split: -0.15%, +0.02% Send messages: 8235628 -> 8184485 (-0.62%); split: -0.62%, +0.00% Cycle count: 43911938816 -> 43901857748 (-0.02%); split: -0.05%, +0.03% Spill count: 481329 -> 473185 (-1.69%); split: -1.82%, +0.13% Fill count: 405617 -> 399243 (-1.57%); split: -1.86%, +0.28% Max live registers: 34309395 -> 34309300 (-0.00%); split: -0.00%, +0.00% Max dispatch width: 8298224 -> 8299168 (+0.01%) Non SSA regs after NIR: 18492887 -> 17631285 (-4.66%); split: -4.73%, +0.08% Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>	2026-02-25 10:44:09 +00:00
Lionel Landwerlin	4fa1eddb4c	anv: optimize binding table flushing Split emission from pointers programming. That way we can switch back & forth between blorp & applications shaders and never emit binding tables, we just reprogram the pointers. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>	2026-02-25 10:44:05 +00:00
Lionel Landwerlin	79a56ef448	anv: add a debug printout for dirty descriptors Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39405>	2026-02-25 10:44:04 +00:00
Lionel Landwerlin	2ef29502ed	brw: enable ex_bso for LSC_SS Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35160>	2026-02-12 16:45:22 +00:00
Calder Young	895ff7fe92	Revert "anv,brw: Allow multiple ray queries without spilling to a shadow stack" This optimization doesn't work when the ray query index isn't uniform across the subgroup, which is something the spec allows. While there are some smart ways to fix this and still avoid unnecessary spilling, its not worth investing the time until we find a realtime raytracing workload that actually needs to use multiple live ray queries for something. Fixes: `1f1de7eb` ("anv,brw: Allow multiple ray queries without spilling to a shadow stack") Acked-by: Sagar Ghuge <sagar.ghuge@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39445>	2026-01-23 21:33:55 +00:00
Tapani Pälli	840e6e855b	anv: add handling for Wa_14026600921 This is the Xe3 version of the earlier workaround. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39404>	2026-01-23 11:10:07 +00:00
Calder Young	d69daf28d0	anv,brw: Add helper to get stack ids per dss for ray queries Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38778>	2026-01-16 09:21:50 +00:00
Calder Young	1f1de7ebd6	anv,brw: Allow multiple ray queries without spilling to a shadow stack Allows a shader to have multiple ray queries without spilling them to a shadow stack. Instead, the driver provides the shader with an array of multiple RTDispatchGlobals structs to give each query its own dedicated stack. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38778>	2026-01-16 09:21:50 +00:00
Lionel Landwerlin	6d19b898e7	anv/brw: prep work for SIMD32 ray queries Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36181>	2026-01-12 12:19:21 +00:00
Lionel Landwerlin	f2c571fabf	anv: add tracking of involved stages in pipe flushes Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38707>	2025-12-15 08:25:32 +00:00
Lionel Landwerlin	578d2f0daa	anv: move load_num_workgroups tracking to driver Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38735>	2025-12-02 22:44:04 +00:00
Lionel Landwerlin	c478b6355a	anv/blorp/iris: rework Wa_14025112257 Drivers already have to track this workaround, so remove the logic from Blorp and let the driver manage this. Also in Anv don't accumulate this workaround, emit it directly in place right after COMPUTE_WALKER. Accumulating can be problematic when you want to dispatch concurrent compute shaders that do not need any cache flush interaction (typical example with the internal simple_shader framework). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `3e0ad0176b` ("anv: Emit state cache invalidation after every compute dispatch") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38306>	2025-11-10 08:57:06 +00:00
José Roberto de Souza	a21b925caa	anv: Rename anv_shader_bin to anv_shader_internal It is now only used by internal shaders to the rename make it more clear. Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37749>	2025-10-08 19:58:30 +00:00
Dylan Baker	1c930a505e	anv: don't attempt to memcpy if allocation fails Based on git history thhese appears to be a subset of `anv_batch_emit_batch`, so I've structured the code similarly, if `anv_batch_emit_dwords` returns `nullptr`, we just move on without copying the memory. CID: 1665339 CID: 1664814 Reviewed-by: Iván Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37534>	2025-09-24 15:29:48 +00:00
Lionel Landwerlin	e76ed91d3f	anv: switch over to runtime pipelines Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34872>	2025-09-05 07:46:20 +00:00
Sagar Ghuge	3e0ad0176b	anv: Emit state cache invalidation after every compute dispatch Implement HSD 16028171704/14025112257: LSC state cache livelock:- Once state cache entries are full, subsequent walker dispatches with two threads per thread group maybe gets stuck infinitely because of state cache live lock. One thread continuously stuck in loop doing UGM fence + evict and UGM read is waiting on UGM read to have certain value. while other thread supposed to update the value that first thread is waiting for. But since entries are full in state cache, there is second thread never make progress. Closes: #12352 Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37128>	2025-09-04 00:14:48 +00:00
Sagar Ghuge	cac3b4f404	anv: Mask off excessive invocations For unaligned invocations, don't launch two COMPUTE_WALKER, instead we can mask off excessive invocations in the shader itself at nir level and launch one additional workgroup. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36245>	2025-08-12 23:17:02 +00:00
Lionel Landwerlin	5a2fb0da32	anv: actually use the COMPUTE_WALKER_BODY prepacked field Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36711>	2025-08-11 11:14:52 +00:00
Lionel Landwerlin	e7aeed1f09	anv: pass active stages to push descriptor flushing Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36711>	2025-08-11 11:14:51 +00:00
Qiang Yu	196569b1a4	all: rename gl_shader_stage to mesa_shader_stage It's not only for GL, change to a generic name. Use command: find . -type f -not -path '/.git/' -exec sed -i 's/\bgl_shader_stage\b/mesa_shader_stage/g' {} + Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Yonggang Luo <luoyonggang@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>	2025-08-06 10:28:40 +08:00
Lionel Landwerlin	8966088cc5	anv: store gfx/compute bound shaders on command buffer state Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36512>	2025-08-01 11:35:08 +00:00
Lionel Landwerlin	094ddc35cc	anv: constify some helpers Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36512>	2025-08-01 11:35:08 +00:00
Lionel Landwerlin	18f234a8a2	anv: avoid looking at the pipeline to flush push descriptors We do this at the cost of recomputing some values that where available on the pipeline at vkCmdBindPipeline() time. We can look at the shaders on graphics/compute which will work nicely with the runtime. The runtime doesn't have support for ray tracing pipelines so we keep using them. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36512>	2025-08-01 11:35:07 +00:00
Lionel Landwerlin	99016a893a	anv: avoid storing L3 config on the pipeline On Gfx9 we only use 2 L3 config depending on SLM use or not. So it's the same config for all Gfx pipelines. On Gfx11+ there is only one config (since SLM is allocated from somewhere else). So avoid store this on the pipeline, pick the config when flushing the pipeline. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36512>	2025-08-01 11:35:05 +00:00
Antonio Ospite	ddf2aa3a4d	build: avoid redefining unreachable() which is standard in C23 In the C23 standard unreachable() is now a predefined function-like macro in <stddef.h> See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in And this causes build errors when building for C23: ----------------------------------------------------------------------- In file included from ../src/util/log.h:30, from ../src/util/log.c:30: ../src/util/macros.h:123:9: warning: "unreachable" redefined 123 | #define unreachable(str) \ | ^~~~~~~~~~~ In file included from ../src/util/macros.h:31: /usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition 456 | #define unreachable() (__builtin_unreachable ()) | ^~~~~~~~~~~ ----------------------------------------------------------------------- So don't redefine it with the same name, but use the name UNREACHABLE() to also signify it's a macro. Using a different name also makes sense because the behavior of the macro was extending the one of __builtin_unreachable() anyway, and it also had a different signature, accepting one argument, compared to the standard unreachable() with no arguments. This change improves the chances of building mesa with the C23 standard, which for instance is the default in recent AOSP versions. All the instances of the macro, including the definition, were updated with the following command line: git grep -l '[^_]unreachable(' -- "src/**" | sort | uniq | \ while read file; \ do \ sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \ done && \ sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>	2025-07-31 17:49:42 +00:00
Lionel Landwerlin	ac78693b6a	intel/genxml: rename body field So that the body field has the same name in COMPUTE_WALKER & EXECUTE_INDIRECT_DISPATCH. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36146>	2025-07-16 01:01:11 +00:00
Sagar Ghuge	e761c45390	anv: Set TG size based on number of threads Series shows improvement on TotalWarPharaoh-trace-dx11-1440p-ultra-n=2080 title by 0.96% (not a lot but still it's improvement, so will take that.) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35904>	2025-07-10 22:08:36 +00:00
José Roberto de Souza	59019a05f6	anv: Program DispatchWalkOrder and ThreadGroupBatchSize with optimized values for regular computer walkers It was only added to indirect compute walkers while HSD don't say anything about this optimization be specific to indirect compute walkers. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36058>	2025-07-10 20:54:30 +00:00
Sushma Venkatesh Reddy	29fc96cb80	anv: Add GPU breakpoint before/after specific compute dispatch call Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13089 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35353>	2025-07-07 17:43:41 +00:00
José Roberto de Souza	bdd20457ed	anv: Emit STATE_COMPUTE_MODE before COMPUTE_WALKER when new async compute limits are needed Cc: stable Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35563>	2025-06-23 18:57:25 +00:00
Sagar Ghuge	3696f85b63	anv: Drop unused helper cmd_buffer_dispatch_kernel Drop some more unused fields: (Lionel) - kernel_args_size, kernel_arg_count & kernel_args - anv_kernel_arg - anv_kernel - max_grl_scratch_size Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35530>	2025-06-16 15:22:09 +00:00
Iván Briano	99405647a4	anv: vkCmdTraceRays* are not covered by conditional rendering The spec says: Certain rendering commands can be executed conditionally based on a value in buffer memory. These rendering commands are limited to drawing commands, dispatching commands, and clearing attachments with vkCmdClearAttachments within a conditional rendering block which is defined by commands vkCmdBeginConditionalRenderingEXT and vkCmdEndConditionalRenderingEXT. Other rendering commands remain unaffected by conditional rendering. It would seem that vkCmdTraceRays* are not covered by that. Fixes new tests dEQP-VK.conditional_rendering.conditional_ignore.trace_rays* Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34864>	2025-05-08 21:08:06 +00:00
Sagar Ghuge	0463e14b94	anv: Enable 64bit memory structure mode for RT Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33047>	2025-04-21 20:10:45 +00:00
Sagar Ghuge	6deb1950a4	anv: Update RT dispatch globals to use 64bit data structure Rework (Kevin) - Fix Hit/Miss/Resume shader group table value Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33047>	2025-04-21 20:10:45 +00:00
Lionel Landwerlin	72bc74f0be	anv: add shader-hash debug option Emits a dummy MI_STORE_DATA_IMM with the shader hash in front of : - 3DSTATE_VS - 3DSTATE_HS - 3DSTATE_DS - 3DSTATE_HS - 3DSTATE_PS - COMPUTE_WALKER / GPGPU_WALKER Example : 0x00000000: 0x10000002: MI_STORE_DATA_IMM 0x00000000: 0x10000002 : Dword 0 DWord Length: 2 Force Write Completion Check : false Store Qword: 0 Use Global GTT: false 0x00000004: 0xffffe0c0 : Dword 1 Core Mode Enable: 0 0x00000008: 0x0000effe : Dword 2 Address: 0xeffeffffe0c0 0x0000000c: 0x126e815a : Dword 3 <------------ shader hash 0x00000010: 0x78100007 : Dword 4 Immediate Data: 309231962 0x00000000: 0x78100007: 3DSTATE_VS 0x00000000: 0x78100007 : Dword 0 DWord Length: 7 0x00000004: 0x00000000 : Dword 1 0x00000008: 0x00000000 : Dword 2 Kernel Start Pointer: 0x00000000 0x0000000c: 0x00040000 : Dword 3 Software Exception Enable: false Accesses UAV: false It'll correlate with the value emitted in the pipeline stats from fossil replay : $ grep -i 126e815a /tmp/stats.csv fossilize.aab93c5c3f965151.1.foz,GRAPHICS,de1b925dec8a8083,507378,498283,303434,vertex,8,50,4,0,1826,0,0,0,8,17,0,0x00000000126e815a,15 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34332>	2025-04-04 15:18:28 +00:00
Caleb Callaway	c37ece75ea	anv: add INTEL_DEBUG=rt_notrace Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34169>	2025-03-26 00:52:53 +00:00
Lionel Landwerlin	e4f31b8744	intel/ds: rework RT tracepoints That way we can identify single dispatch within each step. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Michael Cheng <michael.cheng@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33684>	2025-02-24 08:08:02 +00:00
Lionel Landwerlin	84f96a0199	anv: switch to use brw's prog_data source_hash Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Michael Cheng <michael.cheng@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33643>	2025-02-22 08:30:22 +00:00
Lionel Landwerlin	d75849aaea	anv: make compute state flush helper visible Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33550>	2025-02-15 18:38:24 +02:00
Lionel Landwerlin	9aef4ceb13	anv: hold a prepacked COMPUTE_WALKER instruction on CS pipelines Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33550>	2025-02-15 18:38:18 +02:00
Lionel Landwerlin	a8b84e1898	anv: use A64 messages for push constants loads on Gfx12.5+ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32895>	2025-02-05 09:56:04 +00:00
Tapani Pälli	4e80045ae0	intel/genxml/anv: fix the layout of call stack handler struct Patch adds new CALL_STACK_HANDLER struct which has offset to start and end of RegistersPerThread field, this spec changes is described in Wa_22019854901 (see HSD 22019967134). Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33342>	2025-02-04 08:44:04 +00:00
Francisco Jerez	b25d0f899b	anv/xe3+: Set RegistersPerThread during shader state setup based on prog_data. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32664>	2025-01-29 23:39:32 +00:00
Michael Cheng	c3c05ffb5f	intel : Expose Shader hashes for utrace and Perfetto This patch exposes shader hashes (computes and draws) to Perfetto and utrace. By including these hashes in traces, developers can correlate compute and draw calls with their assoicated ASM dumps when analyzing the traces. To achieve this, intel_tracepoint.py has been reworked to preprocess tracepoint arguments dynamically. Any argument containing "hash" in its variable name is now forrmated as hexadecimal before being passed to the tracepoint definition. Signed-off-by: Michael <michael.cheng@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32708>	2025-01-10 17:38:16 +00:00
Sagar Ghuge	d3f9139e49	intel: Use Morton compute walk order According to HSD 14016252163 if compute shader uses the sample operation, morton walk order and set the thread group batch size to 4 is expected to increase sampler cache hit rates by increasing sample address locality within a subslice. Rework: * Caio: "||" => "&&" for type checking in instr_uses_sampler() * Jordan: Use nir's foreach macros rather than nir_shader_lower_instructions() Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32430>	2024-12-12 19:56:47 -08:00
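
The fallback order described in commit 487586fefa (promote all push constant data into the 32B InlineParameter block if it fits, else only the driver values, else nothing) can be sketched roughly as below. This is an illustrative sketch only, not the driver's code: the function name `choose_promotion`, its byte-count arguments, and the returned labels are hypothetical, and the real driver works on push constant ranges rather than plain sizes.

```python
# Size of the InlineParameter block, as stated in the commit message (32B).
INLINE_PARAM_BYTES = 32


def choose_promotion(app_bytes: int, driver_bytes: int) -> str:
    """Pick what to promote into the inline parameter (hypothetical helper).

    app_bytes    -- bytes of application push constants the shader uses
    driver_bytes -- bytes of driver-internal values the shader uses
    """
    # First choice: everything (app + driver values) fits in the block.
    if app_bytes + driver_bytes <= INLINE_PARAM_BYTES:
        return "all"
    # Second choice: only the driver values, which are usually few.
    if driver_bytes <= INLINE_PARAM_BYTES:
        return "driver-only"
    # Otherwise give up and do not use the inline parameter at all.
    return "none"
```

For example, under these assumptions a shader using 16 bytes of application data and 8 bytes of driver values would have everything promoted, while one using 64 bytes of application data and 8 bytes of driver values would only have the driver values promoted.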


90 commits