fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 22:10:10 +01:00

Author	SHA1	Message	Date
Caio Oliveira	06b553f02c	intel/elk: Remove compiler specific devinfo hash This more coarse-grained hash information for the compiler (vs. full devinfo) is used only by Iris and Anv and is relevant for more recent platforms. Remove it from elk. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27563>	2024-02-24 00:24:30 +00:00
Caio Oliveira	0083585fc5	intel/elk: Compile ELK library, tests and tools For now it is not linked to any driver. The tools were renamed to use the elk prefix to avoid conflicting with the brw ones. The run-test.py script was also updated due to that change. Before the new compiler can be linked together with the old one (going to be done for Iris and other tools), the symbol conflicts need to be fixed first. This will happen in a later commit. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27563>	2024-02-24 00:24:30 +00:00
Caio Oliveira	d44462c08d	intel/elk: Fork Gfx8- compiler by copying existing code Based on code from commit `c3ceec6cd8`. Acked-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27563>	2024-02-24 00:24:30 +00:00
Kenneth Graunke	c12300844d	intel/fs: Don't rely on CSE for VARYING_PULL_CONSTANT_LOAD In the past, we didn't have a good solution for combining scalar loads with a variable index plus a constant offset. To handle that, we took our load offset and rounded it down to the nearest vec4, loaded an entire vec4, and trusted in the backend CSE pass to detect loads from the same address and remove redundant ones. These days, nir_opt_load_store_vectorize() does a good job of taking those scalar loads and combining them into vector loads for us, so we no longer need to do this trick. In fact, it can be better not to: our offset need only be 4 byte (scalar) aligned, but we were making it 16 byte (vec4) aligned. So if you wanted to load an unaligned vec2, we might actually load two vec4's (___X | Y___) instead of doing a single load at the starting offset. This should also reduce the work the backend CSE pass has to do, since we just emit a single VARYING_PULL_CONSTANT_LOAD instead of 4. shader-db results on Alchemist: - No changes in SEND count or spills/fills - Instructions: helped 95, hurt 100, +/- 1-3 instructions - Cycles: helped 3411 hurt 1868, -0.01% (-0.28% in affected) - SIMD32: gained 5, lost 3 fossil-db results on Alchemist: - Instrs: 161381427 -> 161384130 (+0.00%); split: -0.00%, +0.00% - Cycles: 14258305873 -> 14145884365 (-0.79%); split: -0.95%, +0.16% - SIMD32: Gained 42, lost 26 - Totals from 56285 (8.63% of 652236) affected shaders: - Instrs: 13318308 -> 13321011 (+0.02%); split: -0.01%, +0.03% - Cycles: 7464985282 -> 7352563774 (-1.51%); split: -1.82%, +0.31% From this we can see that we aren't doing more loads than before and the change is pretty inconsequential, but it requires less optimizing to produce similar results. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27568>	2024-02-20 23:16:27 -08:00
Caio Oliveira	8ae528331c	intel/compiler: Use "intel" prefix for walk_order enum Will be used later in non-brw specific code in Iris. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27646>	2024-02-21 00:38:35 +00:00
Kenneth Graunke	1497f4e0c2	intel/fs: Don't include sync.nop in instruction count statistics With the advent of software scoreboarding, we emit sync instructions in various places to synchronize the execution pipelines. This results in assembly being littered with a bunch of sync.nop instructions. That means that when you reorder anything in the program, the scoreboarding changes, and the number of sync.nops can vary wildly - even if the code isn't really materially better or worse. This makes it hard to use tools like shader-db or fossil-db on post-Icelake platforms. For now, exclude sync.nops from the instruction count statistic. One day we may want to consider improving the software scoreboarding pass to emit fewer redundant sync.nop instructions, at which point tracking this as a separate stat might be useful. For now though, it's simply cluttering and confusing our results. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27701>	2024-02-20 22:26:09 +00:00
Lionel Landwerlin	0eb3c850c6	intel/clc: workaround LLVM17 opaque pointers Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Caio Oliveira <caio.oliveira@intel.com> Fixes: `b52e25d3a8` ("anv: rewrite internal shaders using OpenCL") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27637>	2024-02-20 14:41:43 +00:00
Lionel Landwerlin	62baa4df5f	intel/clc: lower temp function/shader variables together Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Caio Oliveira <caio.oliveira@intel.com> Fixes: `4fd7495c69` ("intel/clc: add ability to output NIR") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27637>	2024-02-20 14:41:43 +00:00
Lionel Landwerlin	cf193af762	anv: fixup push descriptor shader analysis There are a couple mistakes here : - using a bitfield as an index to generate a bitfield... - in anv_nir_push_desc_ubo_fully_promoted(), confusing binding table access of the descriptor buffer with actual descriptors Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `ff91c5ca42` ("anv: add analysis for push descriptor uses and store it in shader cache") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27504>	2024-02-19 11:10:29 +00:00
Caio Oliveira	ae50ac46d1	intel: Remove brw_ prefix from process debug function Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27644>	2024-02-16 22:35:05 +00:00
Caio Oliveira	c773898f39	intel/compiler: Rename brw_gfx_ver_enum.h to intel_gfx_ver_enum.h Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27644>	2024-02-16 22:35:05 +00:00
Caio Oliveira	d8f9a05f32	intel/compiler: Rename the passes and files related to intel_nir.h Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27644>	2024-02-16 22:35:05 +00:00
Caio Oliveira	dc76cfc781	intel/compiler: Collect NIR-only passes in intel_nir.h Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27644>	2024-02-16 22:35:05 +00:00
Mark Janes	c4ce1ca847	intel/compiler: generate a hash function to use with the shader cache Currently, Intel's shader cache incorporates PCI ID into shader cache keys. Many devices with different PCI IDs have identical shader compilation functionality. Using PCI ID as a component of the shader cache hash means that a multi-platform shader cache will have redundant, identical entries for similar platforms. All Intel compiler functionality is selected based on device configuration in `struct intel_device_info`. intel_device_info.py flags all fields accessed by intel/compiler. This commit generates a hash function incorporating intel/compiler device info fields. Using this hash function in place of PCI ID will produce a multiplatform cache with no duplicated content. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26844>	2024-02-15 16:58:15 -08:00
Sagar Ghuge	f55f9272e4	intel/compiler: Fix disassembly of URB message descriptor on Xe2+ URB messages follow the LSC message descriptor so we are already disassembling the descriptor/extended descriptor, we don't have to duplicate it. Without this change: urb MsgDesc: ( store, a32, d32, V4, L1UC_L3WB dst_len = 0, src0_len = 2, src1_len = 8 flat ) mlen 2 ex_mlen 8 rlen 0 { align1 1H $1 }; with this change: urb MsgDesc: ( store, a32, d32, V4, L1UC_L3WB dst_len = 0, src0_len = 2, src1_len = 8 flat ) base_offset 0 { align1 1H $1 }; Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Acked-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27498>	2024-02-15 19:46:55 +00:00
Caio Oliveira	0b751a2134	intel: Rename i965_{asm,disasm} tools to brw_{asm,disasm} And move them inside the compiler since they (especially asm) rely on a bunch of internal types. Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27579>	2024-02-15 09:26:46 +00:00
Caio Oliveira	5992185c8d	intel/compiler: Merge intel_disasm.[ch] into corresponding brw files Rename the functions to match the existing ones. Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27579>	2024-02-15 09:26:46 +00:00
Caio Oliveira	468a0ffe9c	intel/compiler: Include brw_disasm_info.h where its used Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27579>	2024-02-15 09:26:46 +00:00
Caio Oliveira	ff95f00883	intel/compiler: Move disassemble functions to own header file Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27579>	2024-02-15 09:26:46 +00:00
Caio Oliveira	5732c9d269	intel/compiler: Rename brw_cs_dispatch_info to intel_cs_dispatch_info And move to the intel_shader_enums.h file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27475>	2024-02-14 22:31:23 -08:00
Caio Oliveira	c5b80de583	intel/compiler: Rename brw_vue_map to intel_vue_map And move to the intel_shader_enums.h file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27475>	2024-02-14 22:31:23 -08:00
Caio Oliveira	7d85d2c7fd	intel/compiler: Rename DISPATCH_MODE_* enums to INTEL_DISPATCH_MODE_* And move to the intel_shader_enums.h file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27475>	2024-02-14 22:31:23 -08:00
Caio Oliveira	aeda865b6d	intel/compiler: Rename BRW_TESS_* enums to INTEL_TESS_* And move to the intel_shader_enums.h file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27475>	2024-02-14 22:31:23 -08:00
Caio Oliveira	26dd1f0bba	intel/compiler: Rename BRW_WM_MSAA_* enums to INTEL_MSAA_* And move to the intel_shader_enums.h file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27475>	2024-02-14 22:31:23 -08:00
Caio Oliveira	a88084f8be	intel/compiler: Rename brw_image_param to isl_image_param And move them to ISL. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27475>	2024-02-14 22:31:23 -08:00
Jordan Justen	c6e855b64b	intel/compiler: Verify SIMD16 is used for xe2 BTD/RT dispatch Ref: HSD 14011192593 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27529>	2024-02-14 20:07:13 +00:00
Jordan Justen	820e04ead4	intel/compiler: Implement nir_intrinsic_load_topology_id_intel for xe2 Rework: * Sagar: Rework BRW_TOPOLOGY_ID_DSS, BRW_TOPOLOGY_ID_EU_THREAD_SIMD calculations Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27529>	2024-02-14 20:07:13 +00:00
Jordan Justen	b533bf7361	intel/compiler: Set branch shader required-width as 16 for xe2 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27529>	2024-02-14 20:07:13 +00:00
Lionel Landwerlin	1e31fd5f42	meson: add option to install intel-clc Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:45 +00:00
Lionel Landwerlin	e6b5196079	intel-clc: print text input Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:45 +00:00
Lionel Landwerlin	4fd7495c69	intel/clc: add ability to output NIR This will be used to generate a serialized NIR of functions for internal shaders. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:45 +00:00
Lionel Landwerlin	2bae1b6b66	intel-clc: move ISA generation to its own function Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:45 +00:00
Lionel Landwerlin	2a1ff08376	intel/compiler: make default NIR compiler options visible Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:45 +00:00
Lionel Landwerlin	2437556d83	intel/fs: rerun divergence prior to lowering non-uniform interpolate at sample Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `74a40cc4b6` ("intel/fs: move lower of non-uniform at_sample barycentric to NIR") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:44 +00:00
Lionel Landwerlin	8f5a7f57df	intel/fs: indent lowering code to make it more readable Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26797>	2024-02-13 00:06:44 +00:00
Sagar Ghuge	98b62434bd	intel/compiler: Lower texture operation to combine LOD and AI We have to push the lowering of texture operations a bit further in pipeline since nir_lower_tex gets invoked twice and if there is no LOD source present, nir_lower_tex adds that as a source. Once that's all done we can easily combine the LOD and array index into a single 32-bit value. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27458>	2024-02-12 21:25:48 +00:00
Sagar Ghuge	15129c7634	intel/compiler: Use nir_tex_src_backend1 to pack LOD and array index Since this lowering is totally Intel specific, we don't have to introduce the new texture source. We can use the nir_tex_src_backend1 source to pack LOD/LOD Bias and array index into 32 bit single value. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27458>	2024-02-12 21:25:48 +00:00
Sagar Ghuge	73a3257968	intel/compiler: Add texture operation lowering pass This pass combines the LOD or LOD bias and array index into a single 32-bit value since Xe2+ sampler messages require us to do that. v2: (Alyssa) - Use nir_iand_imm instead of nir_iand and nir_imm_int - Use nir_trim_vector instead of nir_swizzle Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27458>	2024-02-12 21:25:48 +00:00
Jordan Justen	c3a0483f5b	intel/compiler: Lower DPAS instructions on ARL except ARL-H Ref: bspec 55414 Ref: `951e08fc18` ("intel/compiler: Disable DPAS instructions on MTL") Suggested-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27352>	2024-02-06 21:23:19 +00:00
Ian Romanick	68da9e4dff	intel/compiler/xe2: Set SIMD mode for sampler messages Since SIMD8 no longer exists, the SIMD modes enums have different names and different values. v2 (Francisco Jerez): Rebase on `07b9bfacc7` ("intel/compiler: Move logical-send lowering to a separate file"). v3: Update brw_disasm.c with SIMD descriptions. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>	2024-02-02 02:39:10 +00:00
Ian Romanick	84de7a88d3	intel/compiler/xe2: Emit texture instructions w/ combined LOD and array index The extra assertions are just there to help validate pack_lod_and_array_index (in nir_lower_tex.c). v2: Split got_lod_or_bias into two variables. This simplifies some changes that Sagar is working on. Suggested by Sagar. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>	2024-02-02 02:39:10 +00:00
Ian Romanick	78e7f7b377	intel/compiler/xe2: Use new sample_*_mlod messages Note: a future commit will expand the sampler message type to the 6 bits used on Xe2. v2 (Francisco Jerez): Rebase on `07b9bfacc7` ("intel/compiler: Move logical-send lowering to a separate file"). v3: Drop XE2_SAMPLER_MESSAGE_SAMPLE_BIAS_MLOD as it does not actually exist. This resulted in some bigger changes in brw_disasm.c. Noticed by Sagar. v4: Now that XE2_SAMPLER_MESSAGE_SAMPLE_MLODc conflicts with GFX7_SAMPLER_MESSAGE_SAMPLE_GATHER4_PO_C, the determination of min_lod_is_first must include devinfo->ver or previous platforms will break. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>	2024-02-02 02:39:09 +00:00
Sagar Ghuge	8690a6b546	intel/compiler/xe2: Handle 6-bit message type for Gfx20+ Message types are expanded to 6-bit encoding now. 5 bits are still the same field from the Sampler Message Descriptor. The most significant bit is now bit 31 of the Sampler Message Descriptor. The messages that have '1 in bit 6 are only to support programmable offsets and those would require message header. If a sampler type shows only 5 bits encoding, it is implied bit 6 equal to 0 and there is no requirement for header. v2 (idr): Trivial formatting changes. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>	2024-02-02 02:39:09 +00:00
Ian Romanick	a9ed9cf88b	intel/fs: Move opcode modification before the switch that emits srcs This small refactor simplifies a later commit that will optionally emit some opcodes before the switch (as is already done with the shadow comparator). v2 (Francisco Jerez): Rebase on `07b9bfacc7` ("intel/compiler: Move logical-send lowering to a separate file"). v3 (Jordan): SHADER_OPCODE_TXL => SHADER_OPCODE_TXL_LZ (was SHADER_OPCODE_TXF_LZ). Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>	2024-02-02 02:39:09 +00:00
Ian Romanick	7441af803f	intel/compiler/xe2: Update get_sampler_lowered_simd_width The Bspec also says, "The table below describes the SIMD modes which are supported. SIMD32 and SIMD64 are used for media-type operations only." Perhaps this commit should just add if (devinfo->ver >= 20) return 16; instead. v2: Use reg_unit in get_sampler_lowered_simd_width. Suggested by Sagar. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27305>	2024-02-02 02:39:09 +00:00
Ian Romanick	118e0bdc1f	intel/rt: Don't directly generate umul_32x16 The optimization pass will (eventually) turn the imul into a umul_32x16. In many cases, the multiply will be converted to something else. I also tried cloning a bunch of existing imul algebraic patterns for [iu]mul_32x16. This produced the same result, but it was a lot more churn. All of the shaders affected were ray tracing shaders in Q2RTX. This is the only ray tracing workload in my fossil-db. DG2 Totals: Instrs: 191995626 -> 191995079 (-0.00%); split: -0.00%, +0.00% Cycles: 14003803561 -> 14003798040 (-0.00%); split: -0.00%, +0.00% Spill count: 108320 -> 108288 (-0.03%) Fill count: 200695 -> 200663 (-0.02%) Scratch Memory Size: 8755200 -> 8754176 (-0.01%) Totals from 7 (0.00% of 652118) affected shaders: Instrs: 14998 -> 14451 (-3.65%); split: -3.94%, +0.29% Cycles: 137222 -> 131701 (-4.02%); split: -4.10%, +0.07% Spill count: 32 -> 0 (-inf%) Fill count: 32 -> 0 (-inf%) Scratch Memory Size: 19456 -> 18432 (-5.26%) Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27161>	2024-02-02 00:02:05 +00:00
Eric Engestrom	92c24191d4	tree-wide: use __normal_user() everywhere instead of writing the check manually Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27346>	2024-01-30 12:45:54 +00:00
Caio Oliveira	4af079960d	intel/compiler: Enable lower_rotate_to_shuffle in subgroup lowering Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27272>	2024-01-25 19:07:42 +00:00
Kenneth Graunke	2e38024fd8	intel: Use hardware generated compute shader local invocation IDs Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27167>	2024-01-25 08:43:04 +00:00
Kenneth Graunke	5e7f4ff97f	intel: Add driver support for hardware generated local invocation IDs This adds a few new fields in the brw_cs_prog_data struct and then uses them to fill in the relevant COMPUTE_WALKER fields. Although the Tile Layout field theoretically has different settings for 32/64/128bpe, it appears that the recommended programming is to always pick either TileY 32bpe or Linear. It's not very practical to look at the surface formats involved, anyway. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27167>	2024-01-25 08:43:04 +00:00
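
Note on c12300844d (intel/fs: Don't rely on CSE for VARYING_PULL_CONSTANT_LOAD): the alignment argument in that message can be checked with a little arithmetic. The sketch below is only an illustrative model and is not Mesa code. The function names (vec4_blocks_touched, loads_old, loads_new) are hypothetical, and it assumes a vec4 is 16 bytes and a scalar is 4 bytes, as the commit message states.

```python
VEC4_BYTES = 16   # one vec4 of 32-bit components
SCALAR_BYTES = 4  # one 32-bit component


def vec4_blocks_touched(offset, size):
    """Return the vec4-aligned block offsets covering [offset, offset + size)."""
    first = offset // VEC4_BYTES * VEC4_BYTES
    last = (offset + size - 1) // VEC4_BYTES * VEC4_BYTES
    return list(range(first, last + VEC4_BYTES, VEC4_BYTES))


def loads_old(offset, size):
    """Model of the old scheme: round down to a vec4 and load whole vec4 blocks."""
    return [(block, VEC4_BYTES) for block in vec4_blocks_touched(offset, size)]


def loads_new(offset, size):
    """Model of the new scheme: one load at the scalar-aligned starting offset."""
    assert offset % SCALAR_BYTES == 0, "offset must be 4-byte aligned"
    return [(offset, size)]


if __name__ == "__main__":
    # A vec2 (8 bytes) starting at byte 12 straddles two vec4 blocks,
    # which is the (___X | Y___) case from the commit message.
    print("old:", loads_old(12, 8))
    print("new:", loads_new(12, 8))
```

In this model an unaligned vec2 at byte offset 12 costs two 16-byte loads under the old rounding scheme and a single 8-byte load under the new one, while an aligned vec2 at offset 0 costs one load either way.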

3464 commits