fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-20 22:30:12 +01:00

Author	SHA1	Message	Date
Sagar Ghuge	17096f87c1	intel: Switch to COMPUTE_WALKER_BODY Stuff COMPUTE_WALKER_BODY in COMPUTE_WALKER in both iris and anv. This also fixes the tracepoint for ray dispatches. Stuffing COMPUTE_WALKER_BODY allows us to set the cmd_buffer->state.last_compute_walker. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31822>	2024-10-29 15:54:43 +00:00
Lionel Landwerlin	68a372f6ce	anv: use UINT32_MAX to be consistent Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31799>	2024-10-23 18:54:39 +00:00
Lionel Landwerlin	b4ae8cf381	anv: reemit push constants on pipeline changes Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `02294961ee` ("anv: stop using a binding table entry for gl_NumWorkgroups") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12058 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31799>	2024-10-23 18:54:39 +00:00
Lionel Landwerlin	7d9449c873	anv: fix missing inline parameter emission Should only impact Xe2+ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `02294961ee` ("anv: stop using a binding table entry for gl_NumWorkgroups") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31799>	2024-10-23 18:54:39 +00:00
Lionel Landwerlin	3a5b9ee59e	anv: fix binding table entry count for compute shaders We're not using a binding table entry anymore for num_workgroups. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `02294961ee` ("anv: stop using a binding table entry for gl_NumWorkgroups") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31799>	2024-10-23 18:54:39 +00:00
Lionel Landwerlin	02294961ee	anv: stop using a binding table entry for gl_NumWorkgroups This will make things easier in situations where we don't want to use the binding table at all (indirect draws/dispatches). The mechanism is simple, upload a vec3 either through push constants (<= Gfx12.0) or through the inline parameter register (>= Gfx12.5). In the shader, do this : if vec.x == 0xffffffff: addr = pack64_2x32 vec.y, vec.z vec = load_global addr This works because we limit the maximum number of workgroup size to 0xffff in all dimension : maxComputeWorkGroupCount = { 65535, 65535, 65535 }, So we can use the large values to signal the need for indirect loading. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31508>	2024-10-17 19:35:59 +00:00
Tapani Pälli	e4fcbe8d6f	anv: set StackIDControlOverride_RTGlobals for 2 workarounds GFX_VER block matches both workarounds and while these workarounds are almost about the same cause, other one applies only for LNL and other one for BMG, need to check for both. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31571>	2024-10-10 10:20:56 +00:00
Tapani Pälli	c1a44e8d43	anv: force StackIDControl value for Wa_14021821874 This is also encouraged by another wa, Wa_14018813551. Both workarounds state that StackIDControlOverride_RTGlobals should always be set to 0 (i.e. 2k). Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30937>	2024-09-30 07:33:37 +03:00
Rohan Garg	32f606486f	anv: prefetch samplers when dispatching compute shaders Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30922>	2024-08-29 11:49:56 +00:00
Lionel Landwerlin	7a55a930f6	anv: reuse common pipeline state for compute push allocations Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30713>	2024-08-22 19:44:39 +00:00
Nanley Chery	54631ebc68	anv: Batch MCS and CCS aux-op flushes The PRMs suggest that certain classes of auxiliary surface operations will automatically synchronize when performed back-to-back: Any transition from any value in {Clear, Render, Resolve} to a different value in {Clear, Render, Resolve} requires end of pipe synchronization. Make use of this functionality by batching CCS and MCS flushes when compatible auxiliary surface operations are performed within a command buffer. Ref: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11325 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29922>	2024-08-07 15:25:37 +00:00
Lionel Landwerlin	f4a812a229	anv: remove some unused includes Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30539>	2024-08-06 17:55:18 +00:00
Lionel Landwerlin	78ae7ab856	anv/hasvk: add indirect tracepoint arguments Gives visibility on some indirect parameter dispatches : - draw count - compute dispatch size Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29944>	2024-08-03 16:03:04 +03:00
José Roberto de Souza	69ee1c4b46	anv: Drop useless 'if (total_scratch > 0) {' block in cmd_buffer_ensure_cfe_state() cmd_buffer_ensure_cfe_state() returns earlier if total_scratch == 0 here: if (total_scratch <= comp_state->scratch_size) return; Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30271>	2024-07-22 18:17:38 +00:00
Lionel Landwerlin	57e74d7b56	anv: allocate compute scratch using the right scratch pool Cc: mesa-stable Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29778>	2024-07-01 06:48:06 +00:00
Francisco Jerez	8bbad903a2	anv/xe2+: Fix format of scratch space surface address in various 3DSTATE packets. This field encodes bits [27:6] of the scratch surface state offset according to the hardware spec, already on XeHP platforms. However, on previous platforms we were passing bits [25:4] instead, which was apparently okay for two reasons: 1/ We never used more than 8 MB of scratch surface states apparently. 2/ A shift right by 2 was implicitly happening while copying the value of r0.5 into the address register holding the extended descriptor, which with the ExBSO addressing mode disabled considered bits [31:12] as the surface state index within the pool. However on Xe2 ExBSO addressing mode is always enabled for the UGM shared function, so we have to add an extra SHR instruction to format the extended descriptor regardless, and there is no point in disobeying the hardware spec passing a left-shifted offset. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29543>	2024-06-21 01:49:43 +00:00
Lionel Landwerlin	5b4278ccd8	anv: use new mi-builder write check API to avoid stalls Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29571>	2024-06-13 11:04:31 +00:00
Lionel Landwerlin	3e4f6def87	anv: centralize mi_builder setup Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29571>	2024-06-13 11:04:31 +00:00
Lionel Landwerlin	e6efe2e3fe	anv: support setting CFE_STATE::StackIDControl per application This is a performance tuning value, recommended value is 512 on DG2. On DG2 this was in the privileged register RT_CTRL. Minor CFE_STATE definition fixes from Jose. Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29616>	2024-06-10 14:08:03 +00:00
José Roberto de Souza	a472d415bc	anv/xe2: Enable compute walker and BTD thread preemption GFX versions older than GFX 20 have 'Thread Preemption disable' while GFX 20 has 'Thread Preemption' with value flipped in compute walker instruction. So here by default enabling thread preemption, only disabling it when BTD mode is enabled as instructed in Wa_14017794102. Similar for 3DSTATE_BTD, enabling preemption by default and only disabling when platform is affected by Wa_14017794102. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29616>	2024-06-10 14:08:02 +00:00
Lionel Landwerlin	a1ea0956b4	intel: fix HW generated local-id with indirect compute walker Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `5e7f4ff97f` ("intel: Add driver support for hardware generated local invocation IDs") Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29473>	2024-05-31 08:44:22 +00:00
José Roberto de Souza	07855b0431	intel: Compute the optimal preferred SLM size per subslice Up to now preferred SLM size was being set to maximum preferred SLM size for GFX 12.5 platforms and to workgroup SLM size for Xe2 but neither of those values are the optimal. The optimal value is: <number of workgroups that can run per subslice> * <workgroup SLM size> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28910>	2024-05-30 16:46:16 +00:00
José Roberto de Souza	ddda68bbf5	intel: Set preferred SLM allocation size >= than SLM size for Xe2 Xe2 has 2 requirements for preferred SLM size: - this value needs to be >= the SLM size - this value must be less than shared SLM/L1$ RAM in the sub-slice of platform Also Xe2 doesn't have the special '0' encoding that sets preferred SLM allocation size to the maximum supported. So here setting a value that is equal or larger than SLM size. It was always setting SLM_ENCODES_128K for LNL A0 stepping probably because of Wa_16018610683 but this restriction applies to all Xe2 platforms, also because of the first restriction mentioned here this workaround is not being properly implemented, will fix that in the next patch. We should have a formula to calculate a preferred SLM allocation size for gfx125 and Xe2 platforms but until then this is enough to fix at least the applications and tests below on LNL: - GFXBench Aztec Ruins VK - GravityMark VK - Wildlife Extreme VK - 5 crucible tests Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28910>	2024-05-30 16:46:16 +00:00
José Roberto de Souza	f5f71bae02	intel: Move slm functions from brw_compiler.h to intel_compute_slm.c/h These functions were inlined in a header and duplicated between brw and elk. That would be enough reasons to move to a C file but next patches will add more code to support Xe2 platforms, which would cause more code to be inlined, duplicating even more code and increasing lib size. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28910>	2024-05-30 16:46:16 +00:00
Lionel Landwerlin	265b2b1255	anv: move last compute command pointers to the state structure Makes it easier to clear. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29438>	2024-05-30 06:38:04 +00:00
Tapani Pälli	62d96a6546	anv: add dirty tracking for push constant data This allows us to skip allocating state if it exists already. There are different scenarios where this can help: when updating only descriptors (not push constant data) and after blorp or simple shader run. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10898 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28689>	2024-04-16 07:23:52 +03:00
José Roberto de Souza	cccb5e36f1	anv: Call flush_pipeline_select_gpgpu() for compute engines in compute code paths These 2 compute code paths were checking for anv_cmd_buffer_is_render_queue() before calling flush_pipeline_select_gpgpu() causing cmd_buffer->state.current_pipeline to never be set to GPGPU, triggering assert(cmd_buffer->state.current_pipeline == GPGPU) when running in the compute engine. So here just dropping the anv_cmd_buffer_is_render_queue() check. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28053>	2024-03-08 14:39:09 +00:00
Lionel Landwerlin	ab7641b8dc	anv: implement descriptor buffer binding And barriers for them. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22151>	2024-02-29 07:05:06 +00:00
Lionel Landwerlin	82d772fa9b	anv: create new helper for small allocations A number of allocations during command buffer building are sourced from the dynamic state heap. They're not actually accessed using an offset in the dynamic state heap, it just happens to be a convenient place. Use different helpers for those so we can dynamically change the dynamic state heap location in the next commits. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22151>	2024-02-29 07:05:06 +00:00
Caio Oliveira	255a411450	intel: Use _brw suffix for genX headers that rely on brw Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27563>	2024-02-24 00:24:32 +00:00
Caio Oliveira	8ae528331c	intel/compiler: Use "intel" prefix for walk_order enum Will be used later in non-brw specific code in Iris. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27646>	2024-02-21 00:38:35 +00:00
Lionel Landwerlin	dbee85713f	anv: factor out descriptor buffer flushing Take the opportunity to fix the flush of the descriptor buffer surface when needed. Previously we would only flush it if the shader used one of the push descriptor. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27504>	2024-02-19 11:10:29 +00:00
Caio Oliveira	5732c9d269	intel/compiler: Rename brw_cs_dispatch_info to intel_cs_dispatch_info And move to the intel_shader_enums.h file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27475>	2024-02-14 22:31:23 -08:00
Kenneth Graunke	5e7f4ff97f	intel: Add driver support for hardware generated local invocation IDs This adds a few new fields in the brw_cs_prog_data struct and then uses them to fill in the relevant COMPUTE_WALKER fields. Although the Tile Layout field theoretically has different settings for 32/64/128bpe, it appears that the recommended programming is to always pick either TileY 32bpe or Linear. It's not very practical to look at the surface formats involved, anyway. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27167>	2024-01-25 08:43:04 +00:00
Lionel Landwerlin	58c9f817cb	anv: fix pipeline executable properties with graphics libraries We're missing the ISA code in renderdoc. You can reproduce with the Sascha Willems graphics pipeline demo. The change is large here because we have to fix a confusion between anv_shader_bin & anv_pipeline_executable. anv_pipeline_executable is there as a representation for the user and multiple anv_pipeline_executable can point to a single anv_shader_bin. In this change we split the anv_shader_bin related logic that was added in anv_pipeline_add_executable*() and move it to a new anv_pipeline_account_shader() function. When importing RT libraries, we add all the anv_pipeline_executable from the libraries. When importing Gfx libraries, we add the anv_pipeline_executable only if not doing link time optimization. anv_shader_bin related properties are added whenever we're importing a shader from a library, compiling or finding in the cache. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `3d49cdb71e` ("anv: implement VK_EXT_graphics_pipeline_library") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26594>	2024-01-23 07:38:02 +00:00
Lionel Landwerlin	51d63f2236	anv: move compute/ray-tracing commands to their own file Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26989>	2024-01-15 12:28:50 +00:00
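
The preferred SLM size rule stated in commits 07855b0431 and ddda68bbf5 can be sketched as below. This is only an illustration under assumptions: the function and parameter names are hypothetical, sizes are plain byte counts rather than the hardware's encoded values, and clamping to a platform maximum is one guess at how the two Xe2 constraints might be combined with the formula.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the Mesa implementation.
 * optimal = <workgroups that can run per subslice> * <workgroup SLM size>,
 * kept >= the workgroup SLM size and <= an assumed platform maximum. */
uint32_t
preferred_slm_size(uint32_t workgroups_per_subslice,
                   uint32_t workgroup_slm_size,
                   uint32_t max_preferred_slm_size)
{
   /* Assumption: a single workgroup's SLM always fits in the maximum. */
   assert(workgroup_slm_size <= max_preferred_slm_size);

   /* 64-bit intermediate so the product cannot wrap. */
   uint64_t optimal = (uint64_t)workgroups_per_subslice * workgroup_slm_size;

   /* First Xe2 requirement: never below the workgroup SLM size. */
   if (optimal < workgroup_slm_size)
      optimal = workgroup_slm_size;

   /* Second Xe2 requirement, approximated as a clamp to the maximum. */
   if (optimal > max_preferred_slm_size)
      optimal = max_preferred_slm_size;

   return (uint32_t)optimal;
}
```

With four 16 KiB workgroups per subslice and a 128 KiB ceiling, the sketch yields 64 KiB; with sixteen such workgroups it saturates at the ceiling.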

36 commits
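
The sentinel scheme described in commit 02294961ee can be emulated on the host roughly as follows. This is a minimal sketch, not the driver's or the shader's code: `resolve_num_workgroups` is a hypothetical name, `load_global` is stood in for by dereferencing a host pointer, and the address is rebuilt assuming `pack64_2x32` takes the low 32 bits first and the high 32 bits second.

```c
#include <assert.h>
#include <stdint.h>

/* Per-dimension limit from maxComputeWorkGroupCount in the commit message. */
#define MAX_WORKGROUP_COUNT 65535u

/* Hypothetical helper: in[] is the uploaded vec3, out[] receives the
 * workgroup counts the shader would see as gl_NumWorkgroups. */
void
resolve_num_workgroups(const uint32_t in[3], uint32_t out[3])
{
   if (in[0] == UINT32_MAX) {
      /* Sentinel in x: y and z carry the low and high halves of an address. */
      uint64_t addr = (uint64_t)in[1] | ((uint64_t)in[2] << 32);

      /* Stand-in for load_global: read three dwords from that address. */
      const uint32_t *src = (const uint32_t *)(uintptr_t)addr;
      for (int i = 0; i < 3; i++)
         out[i] = src[i];
   } else {
      /* Direct dispatch: the vec3 already holds the counts, each of which
       * stays within the limit, so it can never collide with the sentinel. */
      for (int i = 0; i < 3; i++) {
         assert(in[i] <= MAX_WORKGROUP_COUNT);
         out[i] = in[i];
      }
   }
}
```

Because valid direct counts never exceed 65535, 0xffffffff in the first component is free to mark the indirect case.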