fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-23 08:38:08 +02:00

Author	SHA1	Message	Date
Lionel Landwerlin	32450d0901	isl: further restrict alignment constraints We can limit the AUX-TT requirements to formats supporting CCS. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26890>	2024-01-08 08:21:14 +00:00
Mark Janes	2236dc3481	intel/dev: update workaround definitions to latest defect status Acked-by: Caio Oliveira <caio.oliveira@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26898>	2024-01-05 22:51:46 +00:00
Mark Janes	590fe58ef6	intel: remove MTL a0 workarounds Meteorlake shipped with the b0 stepping. Remove fixes for hardware bugs that were corrected prior to the platform release. Acked-by: Caio Oliveira <caio.oliveira@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26898>	2024-01-05 22:51:46 +00:00
Mark Janes	a6a95591aa	intel/dev: poison macros for workarounds fixed at a stepping INTEL_NEEDS_WA macros are valid when a workaround applies to all platforms which have the GFX_VERx10 versions for the workaround. Some workarounds were fixed at a stepping after the platform release. If a workaround applies partially to any platform, then GFX_VERx10 cannot be used to correctly apply the workaround. This change invalidates INTEL_NEEDS_WA_16014538804 and INTEL_NEEDS_WA_22014412737, which were fixed for MTL platforms at stepping b0. The run-time checks were already present for all uses of these macros. Updating the poisoned macros to INTEL_WA_{num}_GFX_VER compiles out the run-time checks on platforms where they cannot apply. Acked-by: Caio Oliveira <caio.oliveira@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26898>	2024-01-05 22:51:45 +00:00
Mark Janes	7354d3a947	intel/dev: improve descriptions of workaround macros. Instructions for INTEL_WA_{num}_GFX_VER macros were confusing and contradicted themselves. Acked-by: Caio Oliveira <caio.oliveira@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26898>	2024-01-05 22:51:45 +00:00
Yonggang Luo	d6c258d9ee	util: Add align_uintptr and use it treewide to replace ALIGN that works on size_t and uintptr_t Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26866>	2024-01-05 21:54:35 +00:00
Caio Oliveira	77f4f3112d	intel/fs: Use linear allocator in fs_live_variables Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25670>	2024-01-04 23:06:07 +00:00
Caio Oliveira	b5cd91501d	intel/fs: Use linear allocator in opt_copy_propagation Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25670>	2024-01-04 23:06:07 +00:00
Caio Oliveira	6d2503e935	intel/fs: Only allocate acp_entry if we are adding one In practice it seems we are always entering here, haven't looked in detail whether at this point we could just assert. But for now only allocate a new acp_entry if we are going to add it. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25670>	2024-01-04 23:06:07 +00:00
Sagar Ghuge	96e0d979a7	intel/fs: Check fs_visitor instance before using it On Xe2+, we don't build the SIMD8 shader so this check makes sure we don't execute the uninitialized invocations. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26886>	2024-01-04 22:24:07 +00:00
Dave Airlie	56a72e014f	intel/compiler: reemit boolean resolve for inverted if on gen5 Gen5 adds some boolean conversion instructions after nir emits, but that nir srcs don't line up with them, so reemit the boolean conversion if we reemit the inot. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `31b5f5a51f` ("nir/opt_if: Simplify if's with general conditions") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26782>	2024-01-04 21:27:23 +00:00
Dave Airlie	8f73cc802c	intel/compiler: revert part of "Move earlier scheduler code that is not mode-specific" This removed a bunch of calls from the vec4 code that aren't called anywhere else. Bring back the bits that were removed. Fixes glxgears on gen5 Fixes: `81594d0db1` ("intel/compiler: Move earlier scheduler code that is not mode-specific") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26862>	2024-01-04 00:38:38 +00:00
Dave Airlie	37366fef68	intel/compiler: fix release build unused variable. This is only used in an assert. Fixes: `158ac265df` ("intel/fs: Make helpers for saving/restoring instruction order") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26863>	2024-01-03 23:52:11 +00:00
Daniel Schürmann	a3ed36da1a	treewide: replace calls to nir_opt_trivial_continues() with nir_opt_loop() Totals from 850 (1.11% of 76636) affected shaders: (RADV, GFX11) MaxWaves: 18134 -> 18130 (-0.02%) Instrs: 3011298 -> 3008585 (-0.09%); split: -0.17%, +0.08% CodeSize: 15836804 -> 15841972 (+0.03%); split: -0.09%, +0.12% VGPRs: 63580 -> 63604 (+0.04%) SpillSGPRs: 966 -> 1148 (+18.84%); split: -0.83%, +19.67% Latency: 36102291 -> 30186144 (-16.39%); split: -16.41%, +0.02% InvThroughput: 9058100 -> 7011821 (-22.59%); split: -22.61%, +0.02% VClause: 65369 -> 65364 (-0.01%); split: -0.03%, +0.02% SClause: 100309 -> 100305 (-0.00%); split: -0.04%, +0.04% Copies: 335658 -> 336472 (+0.24%); split: -0.70%, +0.94% Branches: 110806 -> 108945 (-1.68%); split: -1.94%, +0.26% PreSGPRs: 73476 -> 73934 (+0.62%); split: -0.25%, +0.87% PreVGPRs: 58809 -> 58840 (+0.05%); split: -0.01%, +0.06% Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24940>	2024-01-03 20:48:04 +00:00
Yonggang Luo	472b6f5379	intel,crocus,iris: Use align64 instead of ALIGN for 64 bit value parameter Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26864>	2024-01-03 12:46:10 +00:00
Yonggang Luo	5a2aa3ff88	intel: Cleanup duplicate ALIGN macro defines Use ALIGN function instead Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26864>	2024-01-03 12:46:10 +00:00
Yonggang Luo	8665ce27bc	intel: Use ALIGN_POT instead of ALIGN inside macro define These macro definitions are computed from literals, so use ALIGN_POT instead of the ALIGN function so that they can be computed at compile time Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26864>	2024-01-03 12:46:10 +00:00
Yonggang Luo	3a9c569177	intel: Avoid use align as variable, replace it with other names align is a function, and when we want to use it, a variable named align will shadow it. So replace it with other names Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26864>	2024-01-03 12:46:10 +00:00
Mark Janes	188c349e51	intel: remove workaround for preproduction DG2 steppings DG2_G10 was released with stepping C0. DG2_G11 was released with stepping B1. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26845>	2024-01-02 16:06:37 -08:00
Iván Briano	56d556f821	anv: enable VK_KHR_maintenance6 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26842>	2024-01-02 22:12:02 +00:00
Iván Briano	b7c4fe54cb	anv: move astc_emu to use descriptors2 calls Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26842>	2024-01-02 22:12:02 +00:00
Iván Briano	ce6899d804	anv: add support for CmdDescriptorSet2KHR Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26842>	2024-01-02 22:12:02 +00:00
Iván Briano	40377eed91	anv: handle VkBindMemoryStatusKHR on buffer/image memory bind Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26842>	2024-01-02 22:12:02 +00:00
Iván Briano	abe0cc8aa4	anv: remove no longer valid assert Maintenance6 allows creating uncompressed views of compressed images with multiple layers. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26842>	2024-01-02 22:12:02 +00:00
Iván Briano	3b5615500a	anv: allow NULL index buffers Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26842>	2024-01-02 22:12:01 +00:00
Tapani Pälli	fe5c82e853	isl: implement Wa_14018471104 Set EnableSamplerRouteToLSC in case ResourceMinLOD is 0. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26801>	2024-01-02 21:14:42 +00:00
José Roberto de Souza	70382f7f06	intel/isl/xe2: Enable route of Sampler LD message to LSC Xe2 allows route of LD messages from Sampler to LSC to improve performance when some restrictions are met. BSpec: 57023 Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26801>	2024-01-02 21:14:42 +00:00
Zhang, Jianxun	e9b633619c	intel/genxml: Add RENDER_SURFACE_STATE for xe2 The indirect BO of clear color is also removed along with clear value address and its enabling. Other delta in struct RENDER_SURFACE_STATE are deferred to their functional enabling changes. Signed-off-by: Zhang, Jianxun <jianxun.zhang@intel.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26801>	2024-01-02 21:14:42 +00:00
Jordan Justen	db5be18862	intel/genxml/gfx125: Move STATE_SURFACE_TYPE to enum This will allow us to use it in Xe2 genxml. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26801>	2024-01-02 21:14:42 +00:00
Jordan Justen	772ce98a81	intel/genxml/gfx125: Move L1_CACHE_CONTROL to enum This will allow us to use it in Xe2 genxml. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26801>	2024-01-02 21:14:42 +00:00
Sagar Ghuge	9e97ce59a8	anv: No need to emit PIPELINE_SELECT on Xe2+ On Xe2+, PIPELINE_SELECT is getting deprecated (Bspec 55860), as a result we don't have to do the stalling flushes while switching between different pipelines. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26637>	2024-01-02 20:57:33 +00:00
Ian Romanick	2e75d71c1f	intel/cmat: Generate better code for nir_intrinsic_cmat_insert When the source destination index is a constant, we can avoid generating a lot of the intermediate code. At the very least, this makes initial NIR dumps much easier to read. v2: Simplify tracking of dst_index. Suggested by Caio. Suggested-by: Caio Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:28:54 -08:00
Ian Romanick	c6d44284aa	intel/dev: Enable VK_KHR_cooperative_matrix on all Gfx9+ GPUs Gfx12.5 (DG2) will use DPAS instructions to accelerate the implementation. Earlier platforms will use equivalent discrete instructions (basically subgroup operations). Gfx12 (Tigerlake) will use DP4A for 8-bit integer matrix multiplication. Older platforms, which lack DP4A, will use a suboptimal instruction sequence. There is plenty of room for improvement here. DG2 (Gfx12.5) gets the following results from the CTS: Test run totals: Passed: 1642/13982 (11.7%) Failed: 0/13982 (0.0%) Not supported: 12340/13982 (88.3%) Warnings: 0/13982 (0.0%) Waived: 0/13982 (0.0%) On DG2 (Gfx12.5) with forced lowering, Raptor Lake (Gfx12) and Ice Lake (Gfx11): Test run totals: Passed: 1662/13982 (11.9%) Failed: 0/13982 (0.0%) Not supported: 12320/13982 (88.1%) Warnings: 0/13982 (0.0%) Waived: 0/13982 (0.0%) The difference in the number of tests run is due to saturatingAccumulation not being set on DG2 when DPAS is used. There is a comment in "intel/dev: Advertise integer configs with saturatingAccumulation too" that explains how this could be added should the need arise. v2: Prefix type names with INTEL_CMAT_. Suggested by Lionel. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:28:54 -08:00
Ian Romanick	8ea032b78e	intel/dev: Advertise integer configs with saturatingAccumulation too VUID-RuntimeSpirv-saturatingAccumulation-08983 says: For OpCooperativeMatrixMulAddKHR, the SaturatingAccumulation cooperative matrix operand must be present if and only if VkCooperativeMatrixPropertiesKHR::saturatingAccumulation is VK_TRUE. As a result, we have to advertise integer configs both with and without this flag set. v2: Prefix type names with INTEL_CMAT_. Suggested by Lionel. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:28:54 -08:00
Ian Romanick	f952dd510e	anv: Select the SIMD mode very early when cooperative matrices are used The commit is a little ugly. The definition of anv_fixup_subgroup_size is moved before the added call site. In addition, the bit starting at the "Cooperative matrix extension requires..." comment is added. v2: Dramatic simplification of SIMD selection. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:28:54 -08:00
Ian Romanick	511f91e307	anv: Lower indirect derefs again after lowering cooperative matrices The cooperative matrix lowering can generate a lot of indirect array accesses, and these need to be eliminated. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:28:54 -08:00
Ian Romanick	b741a9a851	anv: Set PIPELINE_SELECT systolic mode enable flag Set the flag on compute shaders when the application has enabled the cooperative matrix feature. We might still want to enable this only when DPAS is actually used. The current method is based on many suggestions from Lionel. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:28:54 -08:00
Ian Romanick	7bfbeb79a7	anv: Set COMPUTE_WALKER systolic mode enable flag Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:28:54 -08:00
Ian Romanick	67739b02de	anv: Add anv_physical_device::has_cooperative_matrix This flag tracks whether or not cooperative matrices are fully enabled on the physical device (i.e., both the configs exist and the environment variable is set). This is mainly to support a later commit "anv: Set PIPELINE_SELECT systolic mode enable flag." This could be squashed into "anv: Implement VK_KHR_cooperative_matrix." I left it separate because we might go back to the previous method. v3: Don't hide the extension behind an environment variable (ANV_COOPERATIVE_MATRIX) now that we have a better solution for setting PIPELINE_SELECT. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:28:54 -08:00
Caio Oliveira	0a6f8b40bf	anv: Implement VK_KHR_cooperative_matrix v2: Rebase on moving lowering pass to src/intel/compiler. v3: Don't hide the extension behind an environment variable (ANV_COOPERATIVE_MATRIX) now that we have a better solution for setting PIPELINE_SELECT. v4: Prefix type names with INTEL_CMAT_. Suggested by Lionel. Also rebase on `f99e43d606` ("anv: switch to use runtime physical device properties infrastructure"). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:28:54 -08:00
Caio Oliveira	ff16458478	intel/dev: Add cooperative matrix configuration information v2: Prefix type names with INTEL_CMAT_. Suggested by Lionel. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:28:54 -08:00
Ian Romanick	6b14da33ad	intel/fs: nir: Add nir_intrinsic_dpas_intel v2: Fix parameter order in nir_intrinsic_dpas_intel to DPAS conversion. v3: Fix float16 destination DPAS on DG2. v4: Use nir_component_mask(...) instead of 0xffff. Suggested by Caio. v5: Rebase on !26323. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:28:43 -08:00
Ian Romanick	3756f60558	intel/fs: DPAS lowering Implements integer dot product lowering both with and without DP4A. Implements half-float dot product lowering. There are a couple FINISHME comments describing future optimizations. v2: Add a brw_compiler::lower_dpas flag to track when the lowering should be applied. v3: Use is_null() instead of checking file != ARF. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:27:15 -08:00
Ian Romanick	3cb9625539	intel/fs: Fix scoreboarding for DPAS v2: Remove all mention of DPASW. Suggested by Curro and Caio. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:27:15 -08:00
Ian Romanick	eb1f19d7bf	intel/compiler: Validation for DPAS instructions v2: s/regiser/register/g in messages. Noticed by Caio. Add more context to the sub-byte precision error message. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:27:15 -08:00
Ian Romanick	1c92dad5cb	intel/disasm: Disassembly support for DPAS v2: Fix regioning in src[012]_dpas_3src. Noticed by Caio. Treat DPAS as unordered. Suggested by Curro. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:27:13 -08:00
Ian Romanick	e666872c75	intel/compiler: Initial bits for DPAS instruction v2: Add brw_ir_performance.cpp and brw_fs_generator.cpp changes. Fix overlapping register allocation (via has_source_and_destination_hazard). Fix incorrect destination register file encoding. v3: Prevent lower_regioning from trying to "fix" DPAS sources. v4: Add instruction latency information for scheduling and perf estimates. v5: Remove all mention of DPASW. Suggested by Curro and Caio. Update the comment in fs_inst::has_source_and_destination_hazard. Suggested by Caio. v6: Add some comments near the src2 calculation in fs_inst::size_read. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:24:16 -08:00
Ian Romanick	3a35f8b29b	intel/cmat: Lower cmat_load and cmat_store v2: Add support for non-constant stride. v3: Explain B matrices (a little bit) in get_slice_type_from_desc. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:24:16 -08:00
Ian Romanick	502be565da	intel/cmat: Add lowering for cmat_bitcast v2: Use nir_component_mask(...) instead of 0xffff. Assert that source and destination are same size. Both suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:24:15 -08:00
Ian Romanick	7303315a8b	intel/cmat: Enable packed formats for scalar ops v2: Use nir_pack_bits and nir_unpack_bits to simplify coop_scalar handling. This saved 13 lines of code. v3: Allow packing factor 2 and packing factor 1 elements be stored in 16-bit integers. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:24:15 -08:00
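
Several of the commits listed above (d6c258d9ee, 472b6f5379, 8665ce27bc) concern alignment helpers: a macro form usable in compile-time constants, and typed functions for 64-bit and pointer-sized values. The following is a minimal sketch of that distinction, not Mesa's actual definitions. Every name prefixed with `example_` or `EXAMPLE_` is hypothetical and introduced only for illustration, and the sketch assumes power-of-two alignments throughout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration only; these are not Mesa's helpers. */

/* Macro form: with literal arguments this is a constant expression, so it
 * can appear in enum values, array sizes, or other macro definitions. */
#define EXAMPLE_ALIGN_POT(x, pot) (((x) + (pot) - 1) & ~((pot) - 1))

/* True when v is a non-zero power of two. */
static inline bool
example_is_pot(uint64_t v)
{
   return v != 0 && (v & (v - 1)) == 0;
}

/* Function form for 64-bit values: the parameter types prevent silent
 * truncation when the caller passes a 64-bit size or address. */
static inline uint64_t
example_align64(uint64_t value, uint64_t alignment)
{
   assert(example_is_pot(alignment));
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Function form for pointer-sized integers. */
static inline uintptr_t
example_align_uintptr(uintptr_t value, uintptr_t alignment)
{
   assert(example_is_pot(alignment));
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Computed at compile time because both arguments are literals. */
enum { EXAMPLE_HEADER_SIZE = EXAMPLE_ALIGN_POT(13, 8) };
```

In this sketch the macro is reserved for constants built from literals, while the typed functions are used for run-time values. This also illustrates why a local variable named `align` is awkward in code that calls a function of the same name: the variable would shadow the function within its scope.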

10990 commits