fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 15:50:11 +01:00

Author	SHA1	Message	Date
Kenneth Graunke	98bcf650f1	intel/compiler: Use nir_dest_bit_size() for ballot bit size check There's no guarantee that this is a SSA value. Use the helper to handle both SSA values and register correctly. Otherwise we read trash when we encounter a register and make bad decisions on types, possibly leading to our destination being UQ typed when the VGRF is only 32-bit. Fixes compilation with -Dintel-clc=enabled since `7f6491b76d` (nir: Combine if_uses with instruction uses) but the bug is much older than that, circa 2017. We were just getting lucky before. Fixes: `069bf7c907` ("i965/fs: Match destination type to size for ballot") Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22374>	2023-04-07 19:28:56 -07:00
Jordan Justen	eef7a117a1	intel/compiler: Support fmul_fsign opt for fp64 when int64 isn't supported MTL support fp64, but not int64. The fsign(double(x))*FOO optimization would try to use a 64-bit int xor operation to conditionally toggle the sign bit off the result. Since this only affects high bit of the result, we can do a 32-bit move of the low dword, and a 32-bit xor on the high dword. Fixes dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp64.input_args.modf_denorm_flush_to_zero on MTL. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22259>	2023-04-05 18:48:21 +00:00
Lionel Landwerlin	a358b97c58	intel/fs: optimize uniform SSBO & shared loads Using divergence analysis, figure out when SSBO & shared memory loads are uniform and carry the data only once in register space. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21853>	2023-04-05 12:32:56 +00:00
Sagar Ghuge	cece2aa2c1	intel/compiler: Add Wa_14014063774 for slm_fence Before SLM fence compiler needs to insert SYNC.ALLWR in order to avoid the SLM data race. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22050>	2023-03-25 00:45:04 +00:00
Mark Janes	33d03e57ad	intel/fs: use generated helpers for Wa_14013363432 / Wa_14012688258 Wa_14013363432 is a clone of Wa_14012688258. It does not apply to all gfx 12.5 platforms. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21745>	2023-03-23 19:13:09 +00:00
Lionel Landwerlin	56474fae93	intel/fs: fix subgroup invocation read bounds checking nir->info.subgroup_size can be set to an enum : SUBGROUP_SIZE_VARYING = 0 SUBGROUP_SIZE_UNIFORM = 1 SUBGROUP_SIZE_API_CONSTANT = 2 SUBGROUP_SIZE_FULL_SUBGROUPS = 3 So compute the API subgroup size value and compare it to the dispatch size to determine whether we need some bound checking. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `9ac192d79d` ("intel/fs: bound subgroup invocation read to dispatch size") Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21856>	2023-03-14 12:15:48 +00:00
Ian Romanick	28311f9d02	nir: intel/compiler: Move ufind_msb lowering to NIR Fossil-db results: All Intel platforms had similar results. (Ice Lake shown) Cycles in all programs: 9098346105 -> 9098333765 (-0.0%) Cycles helped: 6 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>	2023-03-10 15:27:17 +00:00
Ian Romanick	08ca862ef8	intel/compiler: Tighter src and dest size bounds checking for some opcodes Enforce the sizes listed in the Skylake PRM: BFREV: source types: D destination types: D CBIT: source types: UB, UW, UD destination types: UD FBH: source types: D, UD destination types: UD FBL: source types: UD destination types: UD LZD: source types: D, UD destination types: UD v2: Update BFREV commit message documentation. Suggested by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>	2023-03-10 15:27:17 +00:00
Ian Romanick	0cc7bf63b7	nir: intel/compiler: Move ifind_msb lowering to NIR Unlike ufind_msb, ifind_msb is only defined in NIR for 32-bit values, so no @32 annotation is required. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>	2023-03-10 15:27:17 +00:00
Ian Romanick	15c6c859cf	intel/compiler: Lower find_lsb in NIR No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>	2023-03-10 15:27:17 +00:00
Kenneth Graunke	f5e5705c91	intel/fs: Use F32TO16/F16TO32 helpers in fquantize16 handling I originally thought that we were intentionally emitting the legacy opcodes here to make them opaque to the optimizer, so that it wouldn't eliminate the explicit type conversions, as they're actually required to do the quantization. But...we don't actually optimize those away currently anyway. So...go ahead and use the helpers for consistency. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21783>	2023-03-09 23:26:17 +00:00
Kenneth Graunke	309ec3725a	intel/fs: Use new F16TO32 helpers for unpack_half_split_* opcodes This gets us a MOV at the IR level on Gfx8+ which should be more optimizable than F16TO32. It also removes confusion about which pipe which the instruction will run on. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21783>	2023-03-09 23:26:17 +00:00
Kenneth Graunke	78bf53904e	intel/fs: Delete a TODO about using brw_F32TO16. We can just use the new builder helpers to get the optimization advantages of a MOV on Gfx8+ while also getting the necessary F32TO16 on Gfx7.x and yet not worry too hard about it. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21783>	2023-03-09 23:26:17 +00:00
Caio Oliveira	c92d589597	intel/compiler: Drop non-scoped barrier handling Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21634>	2023-03-07 00:41:13 +00:00
Caio Oliveira	db0a09c9e2	intel/fs: Handle scoped barriers with execution scope Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21634>	2023-03-07 00:41:13 +00:00
Faith Ekstrand	83fd7a5ed1	intel: Use nir_lower_tex_options::lower_index_to_offset Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21546>	2023-03-06 21:38:32 +00:00
Caio Oliveira	c80268a20d	intel/compiler: Mark various memory barriers intrinsics unreachable Now that both SPIR-V and GLSL are using scoped barriers, we can stop handling the specialized ones. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3339>	2023-02-27 20:24:01 +00:00
Daniel Schürmann	2bb369dd8d	nir: add assertions that loops don't have a Continue Construct Hoping that I didn't miss any, this should add assertions to all functions and passes which explicitly handle 'nir_loop'. Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>	2023-02-21 10:41:11 +00:00
Lionel Landwerlin	9ac192d79d	intel/fs: bound subgroup invocation read to dispatch size This is to avoid out of bound register accesses (potentially leading to hangs) when the dispatch size is smaller than when is reported in the NIR subgroup_size. v2: Implement bounding with a mask (since workgroup sizes are powers of 2) (Faith) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `530de844ef` ("intel,anv,iris,crocus: Drop subgroup size from the shader key") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21282>	2023-02-14 21:29:42 +00:00
Jason Ekstrand	949b42c4dc	intel/compiler: Convert wm_prog_key::multisample_fbo to a tri-state This allows us to communicate to the back-end that we don't actually know if the framebuffer is multisampled or not. No drivers set anything but ALWAYS/NEVER and we still have a few ALWAYS/NEVER assumptions but those should be asserted. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>	2023-02-06 09:12:18 +00:00
Ian Romanick	ea413e826b	nir: Eliminate nir_op_f2b Builds on the work of !15121. This gets to delete even more code because many drivers shared a lot of code for i2b and f2b. No shader-db or fossil-db changes on any Intel platform. v2: Rebase on `1a35acd8d9`. v3: Update a comment in nir_opcodes_c.py. Suggested by Konstantin. v4: Another rebase. Remove f2b stuff from Midgard. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20509>	2023-02-03 22:39:57 +00:00
Sagar Ghuge	0c083d29a5	intel/fs: Always stall between the fences on Gen11+ Be conservative in Gfx11+ and always stall in a fence. Since there are two different fences, and shader might want to synchronize between them. This change also brings back the original code block for the stall between the fence and comment from the commit `b390ff3517`. v2: (Caio) - Re-arrange code block. - Adjust comment. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6958 Fixes: `f7262462` ("intel/fs: Rework fence handling in brw_fs_nir.cpp") Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Tested-by: Mark Janes <markjanes@swizzler.org> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20996>	2023-02-02 00:21:21 +00:00
Amber	ab4c2990ed	intel/compiler: use lower_image_samples_to_one Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewer-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Amber Amber <amber@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20813>	2023-02-01 19:52:49 +00:00
Lionel Landwerlin	13cca48920	intel/fs: drop FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GFX7 We can lower FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD into other more generic sends and drop this internal opcode. The idea behind this change is to allow bindless surfaces to be used for UBO pulls and why it's interesting to be able to reuse setup_surface_descriptors(). But that will come in a later change. No shader-db changes on TGL & DG2. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20416>	2023-01-26 11:26:53 +00:00
Marcin Ślusarz	9bb18a4f9e	intel/compiler: fix generation of vec8/vec16 alu instruction I stumbled on this when I inserted some suboptimal lowering code after all optimizations. Adding certain subset of optimizations after my lowering code actually avoided this bug, so I think it's not possible to hit this on upstream. Let's fix this for the next person generating suboptimal code... Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20857>	2023-01-24 13:15:58 +00:00
Kenneth Graunke	16b66ab659	intel/compiler: Drop dest checking in atomic code NIR atomic operation intrinsics all have destinations. This is just copy and pasted from other generic intrinsic handling where that may or may not be the case. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>	2023-01-19 08:42:22 +00:00
Kenneth Graunke	780f3e2e6b	intel/compiler: Delete all the A64 atomic variants for type sizes These are handled identically in almost all cases. There is one place in the legacy surface lowering that was obtaining the bitsize from the opcode, but the LSC-based lowering uses (type_sz(inst->dst.type) * 8) for that and works just fine. If we just do that in the legacy lowering too, then we don't need this plethora of opcodes. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>	2023-01-19 08:42:22 +00:00
Kenneth Graunke	03ddde1230	intel/compiler: Combine nir_emit_{ssbo,shared}_atomic into one helper These are basically identical save for: - shared has surface hardcoded to SLM rather than an SSBO index - shared has to handle adding the 'base' const_index (SSBO have none) - the NIR source index for data is shifted by one It's not worth copy and pasting the entire function for this. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>	2023-01-19 08:42:22 +00:00
Kenneth Graunke	b84939c678	intel/compiler: Delete fs_visitor::nir_emit_{ssbo,shared}_atomic_float() These are now basically identical to their non-float counterparts. The only thing that differed was the opcode checking to determine which operands existed. Now that we have a unified opcode enum and a helper for the number of data operands, we can just use that. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>	2023-01-19 08:42:22 +00:00
Kenneth Graunke	f7b29d7924	intel/compiler: Drop redundant 32-bit expansion for shared float atomics We already expanded data to 32-bit a few lines earlier, so this is just redundantly doing it a second time. Fixes: `43169dbbe5` ("intel/compiler: Support 16 bit float ops") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>	2023-01-19 08:42:22 +00:00
Kenneth Graunke	02129eee3a	intel/compiler: Eliminate SHADER_OPCODE_UNTYPED_ATOMIC_FLOAT The only reason for the separate opcode was because of the overlapping BRW_AOP_* enums, making it impossible to tell whether a particular AOP was the integer or float operation. Now that we use the lsc_opcode enums, we can just have the legacy lowering inspect the opcode and select the right descriptor. No need for a separate opcode. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>	2023-01-19 08:42:22 +00:00
Kenneth Graunke	90a2137cd5	intel/compiler: Use LSC opcode enum rather than legacy BRW_AOPs This gets our logical atomic messages using the lsc_opcode enum rather than the legacy BRW_AOP_* defines. We have to translate one way or another, and using the modern set makes sense going forward. One advantage is that the lsc_opcode encoding has opcodes for both integer and floating point atomics in the same enum, whereas the legacy encoding used overlapping values (BRW_AOP_AND == 1 == BRW_AOP_FMAX), which made it impossible to handle both sensibly in common code. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>	2023-01-19 08:42:22 +00:00
Kenneth Graunke	8d2dc52a14	intel/compiler: Move atomic op translation into emit_*_atomic() There's no need to pass both the intrinsic and an opcode computed from that same intrinsic. Just do it in the functions themselves. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>	2023-01-19 08:42:22 +00:00
Francisco Jerez	4a2e7306dd	intel/fs/gfx12: Ensure that prior reads have executed before barrier with acquire semantics. This avoids a violation of the Vulkan memory model that was leading to intermittent failures of at least 8k test-cases of the Vulkan CTS (within the group dEQP-VK.memory_model.) on TGL and DG2 platforms. In theory the issue may be reproducible on earlier platforms like IVB and ICL, but the SYNC.ALLWR instruction is not available on those platforms so a different (likely costlier) fix will be needed. The issue occurs within the sequence we emit for a NIR memory barrier with acquire semantics requiring the synchronization of multiple caches, e.g. in pseudocode for a barrier involving the TGM and UGM caches on DG2: x <- load.ugm // Atomic read sequenced-before the barrier y <- fence.ugm z <- fence.tgm wait(y, z) w <- load.tgm // Read sequenced-after the barrier In the example we must provide the guarantee that the memory load for x is completed before the one for w, however this ordering can be reversed with the intervention of a concurrent thread, since the UGM fence will block on the prior UGM load and potentially take a long time, while the TGM fence may complete and invalidate the TGM cache immediately, so a concurrent thread could pollute the TGM cache with stale contents for the w location before* the UGM load has completed, leading to an inversion of the expected memory ordering. v2: Apply the workaround regardless of whether the NIR barrier intrinsic specifies multiple storage classes or a single one, since an acquire barrier is required to order subsequent requests relative to previous atomic requests of unknown storage class not necessarily specified by the memory scope information of the intrinsic. Cc: mesa-stable Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20690>	2023-01-18 21:34:33 -08:00
Ian Romanick	eb76cee9f8	nir: Eliminate nir_op_i2b There are a lot of optimizations in opt_algebraic that match ('ine', a, 0), but there are almost none that match i2b. Instead of adding a huge pile of additional patterns (including variations that include both ine and i2b), always lower i2b to a != 0. At this point in the series, it should be impossible for anything to generate i2b, so there /should not/ be any changes. The failing test on d3d12 is a pre-existing bug that is triggered by this change. I talked to Jesse about it, and, after some analysis, he suggested just adding it to the list of known failures. v2: Don't rematerialize i2b instructions in dxil_nir_lower_x2b. v3: Don't rematerialize i2b instructions in zink_nir_algebraic.py. v4: Fix zink-on-TGL CI failures by calling nir_opt_algebraic after nir_lower_doubles makes progress. The latter can generate b2i instructions, but nir_lower_int64 can't handle them (anymore). v5: Add back most of the hunk at line 2125 of nir_opt_algebraic.py. I had accidentally removed the f2b(bf2(x)) optimization. v6: Just eliminate the i2b instruction. v7: Remove missed i2b32 in midgard_compile.c. Remove (now unused) emit_alu_i2orf2_b1 function from sfn_instr_alu.cpp. Previously this function was still used. 🤷 No shader-db changes on any Intel platform. All Intel platforms had similar results. (Ice Lake shown) Instructions in all programs: 141165875 -> 141165873 (-0.0%) Instructions helped: 2 Cycles in all programs: 9098956382 -> 9098956350 (-0.0%) Cycles helped: 2 The two Vulkan shaders are helped because of the "new" (('b2i32', ('ine', ('ubfe', a, b, 1), 0)), ('ubfe', a, b, 1)) algebraic pattern. Acked-by: Jesse Natalie <jenatali@microsoft.com> [earlier version] Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Daniel Schürmann <daniel@schuermann.dev> [earlier version] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:21 +00:00
Kenneth Graunke	88918baf5c	intel/compiler: Delete key->msaa_16 None of the drivers have used this since we dropped i965, and BLORP no longer uses it as of the previous commit. We can also drop the former compressed_multisample_tex_mask (now padding) field so that things remain 64-bit aligned. Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20223>	2022-12-09 10:18:25 +00:00
Kenneth Graunke	584e18863e	intel: Drop compressed_multisample_layout_mask from the compiler keys The compiler looks at this key field to determine whether to perform an MCS fetch for a txf_ms or samples_identical texture message, if a nir_tex_src_ms_mcs_intel source wasn't provided. If it isn't set, it instead uses constant 0 (nothing is compressed). All of the drivers (iris, crocus, anv, hasvk) unconditionally set this to ~0 because we don't want to pay for costly shader recompiles (which can cause nasty stuttering). Most textures are compressed anyway, and the hardware ignores the l2dms MCS parameter if MCS is disabled. The only user was BLORP, which sets the key field based on whether the texture's aux usage has MCS. But if it has MCS, it also does the MCS fetch itself and supplies it directly. Otherwise, it relies on the compiler to fill in the 0 value. But it could easily just provide the 0 value itself in that case and not rely on the compiler at all. With that fixed, we can just drop the key fields entirely. We leave them as padding for now to avoid repacking structures; we won't need to after the next commits anyway. Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20223>	2022-12-09 10:18:25 +00:00
Jason Ekstrand	7d2e3f660c	intel/fs: Support load_workgroup_id_zero_base Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20068>	2022-12-01 04:56:48 +00:00
Lionel Landwerlin	9c1c1888d9	intel/fs: put scratch surface in the surface state heap In `4ceaed7839` we made scratch surface state allocations part of the internal heap (mapped to STATE_BASE_ADDRESS::SurfaceStateBaseAddress) so that it doesn't uses slots in the application's expected 1M descriptors (especially with vkd3d-proton). But all our compiler code relies on BSS (STATE_BASE_ADDRESS::BindlessSurfaceStateBaseAddress). The additional issue is that there is only 26bits of surface offset available in CS instruction (CFE_STATE, 3DSTATE_VS, etc...) for scratch surfaces. So we need the drivers to put the scratch surfaces in the first chunk of STATE_BASE_ADDRESS::SurfaceStateBaseAddress (hence all the driver changes). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `4ceaed7839` ("anv: split internal surface states from descriptors") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7687 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19727>	2022-11-19 14:58:58 +00:00
Ian Romanick	351b8c6aec	intel/fs: Enable nir_op_imul_32x16 and nir_op_umul_32x16 on pre-Gfx7 Even though Intel's CI doesn't test these old platforms anymore, the validation added in "intel/eu/validate: Validate integer multiplication source size restrictions" combined with full shader-db runs gives me confidence in the changes. Sandy Bridge total instructions in shared programs: 13902341 -> 13902167 (<.01%) instructions in affected programs: 30771 -> 30597 (-0.57%) helped: 66 / HURT: 0 total cycles in shared programs: 741795500 -> 741791931 (<.01%) cycles in affected programs: 987602 -> 984033 (-0.36%) helped: 28 / HURT: 5 Iron Lake total instructions in shared programs: 8365806 -> 8365754 (<.01%) instructions in affected programs: 1766 -> 1714 (-2.94%) helped: 10 / HURT: 0 total cycles in shared programs: 248542694 -> 248542378 (<.01%) cycles in affected programs: 29836 -> 29520 (-1.06%) helped: 9 / HURT: 0 GM45 total instructions in shared programs: 5187127 -> 5187101 (<.01%) instructions in affected programs: 891 -> 865 (-2.92%) helped: 5 / HURT: 0 total cycles in shared programs: 163643914 -> 163643750 (<.01%) cycles in affected programs: 22206 -> 22042 (-0.74%) helped: 5 / HURT: 0 Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19602>	2022-11-09 21:34:26 +00:00
Ian Romanick	293ad13e3f	intel/fs: Slightly restructure emitting nir_op_imul_32x16 and nir_op_umul_32x16 There are no immediate values at this point, so all of this code was bunk. :face_palm: Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19602>	2022-11-09 21:34:26 +00:00
Caio Oliveira	22d8ed84b8	intel/compiler: Remove unused fs_visitor::emit_percomp() Since `7ef7738a61` ("i965: Write gl_FragCoord directly to the destination.") this is not used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19586>	2022-11-08 07:33:09 +00:00
Rohan Garg	43169dbbe5	intel/compiler: Support 16 bit float ops Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17988>	2022-10-17 15:56:28 +02:00
Lionel Landwerlin	9dba8d8aa1	intel/fs: take a builder arg for resolve_source_modifiers() There will be situations where we will want to use a local builder rather than the one associated with NIR->backend translation. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16970>	2022-09-28 05:38:36 +00:00
José Roberto de Souza	f4857591e1	intel/compiler/fs: Use DF to load constants when has_64bit_int is not supported This was already been done to gen7 platforms, so now extending to all platforms without has_64bit_int. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18577>	2022-09-14 19:32:43 +00:00
Caio Oliveira	0b6e613de8	intel/compiler: Create and use struct for CS thread payload Move subgroup_id, that's only used by CS for verx10 < 125, as part of the payload too -- even though is not, strictly speaking. Note the thread execution of Task/Mesh is similar enough, so we make their common struct inherit from cs_thread_payload. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	9de790760e	intel/compiler: Create and use struct for Bindless thread payload Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	a70378f292	intel/compiler: Store start of ICP handles in GS thread payload struct Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	5b6987daee	intel/compiler: Create and use struct for GS thread payload Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	19c6e1b447	intel/compiler: Create and use struct for TES thread payload Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00

1 2 3 4 5 ...

561 commits