fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-23 02:30:12 +01:00

Author	SHA1	Message	Date
Sagar Ghuge	7e48cbb029	intel: uncached L1 to fix memory barrier issue in RT shader In the RT shader, if there's a executeCallableEXT() in between, even though the called shader does nothing, the instructions before and after the executeCallableEXT() is not properly synced. Patch fixes: - dEQP-VK.ray_tracing_pipeline.memguarantee.inside.rgen - dEQP-VK.ray_tracing_pipeline.memguarantee.inside.chit - dEQP-VK.ray_tracing_pipeline.memguarantee.inside.miss - dEQP-VK.ray_tracing_pipeline.memguarantee.inside.call Thank to Kevin for finding out there is a load/store issue. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31201>	2024-09-24 14:33:11 +00:00
Lionel Landwerlin	b45ce7d43e	brw: move null_rt control up a layer We'll want to tune this setting based on other parameters. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Backport-to: 24.2 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31196>	2024-09-23 15:56:02 +00:00
Lionel Landwerlin	02b124846f	brw: fix TGM messages to use cmask lsc opcodes This is a restriction for TGM. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `b55f7716` ("intel/brw: Switch to emitting MEMORY_*_LOGICAL opcodes") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31199>	2024-09-17 09:28:58 +00:00
Rohan Garg	daea7e1651	intel/compiler: use the correct cache enum for loads and stores Fixes: `74efde7` ('intel/brw/xehp+: Drop redundant arguments of lsc_msg_desc*()') Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30742>	2024-09-16 15:18:31 +00:00
Rohan Garg	b99fd944e8	intel/compiler: version can never be above 11 due to the previous check Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30742>	2024-09-16 15:18:31 +00:00
Kenneth Graunke	02482604e5	intel/brw: Delete old-style surface and A64 message opcodes These have now been replaced by the MEMORY_*_LOGICAL opcodes. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Acked-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>	2024-09-12 20:54:36 +00:00
Kenneth Graunke	dc4770b005	intel/brw: Lower MEMORY_OPCODE__LOGICAL to HDC messages This is more complicated. We map the MEMORY__LOGICAL opcodes to the older HDC messages: typed and untyped surface read/write/atomic (whether float or integer), DWord and Byte scattered messages, OWord block, and both A64, BTI, and stateless messages. - MEMORY_MODE_* is used to select stateless-scratch, typed, or untyped. - MEMORY_FLAG_TRANSPOSE is used to select block access. - MEMORY_BINDING_TYPE = FLAT and 64-bit address size selects A64. - Alignment and data type size select between byte/dword scattered or surface messages. While we may not be able to handle the full generality of message possibilities, we can handle everything we generate currently. The plan here is to assert/validate that we don't generate MEMORY_*_LOGICAL ops on HDC-based platforms which can't support those particular messages. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>	2024-09-12 20:54:36 +00:00
Kenneth Graunke	3255c9cc49	intel/brw: Lower MEMORY_OPCODE__LOGICAL to LSC messages This is pretty straightforward, as the new MEMORY__LOGICAL opcodes are designed to match the new LSC's capabilities. The main part is constructing the message payload. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>	2024-09-12 20:54:36 +00:00
Kenneth Graunke	d5f38be713	intel/brw: Introduce new MEMORY_*_LOGICAL opcodes This is a new unified set of opcodes for memory access loosely patterned after the new LSC-style data port messages introduced on Alchemist GPUs. Rather than creating an opcode for every type of memory access, it has only three opcodes: load, store, and atomic. It has various sources to indicate the rest: - Binding type (raw pointer, pointer to surface state, or BT index) - Address size (A64, A32, A16) - Data size (bit size, number of components) - Opcode (atomic opcode, or LOAD/STORE vs. LOAD_CMASK/STORE_CMASK) - Mode (typed vs. untyped vs. shared-local vs. scratch) - Address (and its dimensionality) - Data (0 for loads, 1 for stores, 2 for atomics) - Whether we want block access Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30828>	2024-09-12 20:54:36 +00:00
Kenneth Graunke	437bda3013	intel/brw: Get rid of the lsc_msg_desc_wcmask helper The LOAD/STORE opcodes take a vector size, while the LOAD/STORE_CMASK opcodes take a channel mask. The two are mutually exclusive. So we can just have the lsc_msg_desc() helper take one or the other in the same parameter. This more closely matches the actual descriptor. We couldn't do this until the previous commit, since we were previously relying on the lsc_msg_desc() function to calculate a cmask out of the number of vector components. But now we don't need it to do that. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30632>	2024-08-27 09:25:59 +00:00
Kenneth Graunke	55f193a105	intel/brw: Switch from LSC CMASK opcodes to regular LOAD/STORE The LOAD/STORE opcodes take a vector size (number of components), while the LOAD/STORE_CMASK opcodes take a channel mask. For some reason, we were passing a number of channels to lsc_msg_desc(), then using it to construct a channel mask with all channels enabled, and always using the CMASK message variants. Considering we don't actually want to mask off any channels, we should probably just use the regular LOAD/STORE opcodes, as they're more flexible anyway. One exception is that typed messages on Xe2 apparently only support LOAD_CMASK/STORE_CMASK and not regular LOAD/STORE. So we keep using those there. (Thanks to Sagar Ghuge for catching this!) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30632>	2024-08-27 09:25:58 +00:00
Caio Oliveira	31dfb04fd3	intel/brw: Remove long register file names The long names were originally meant to map to the HW encoding but nowadays the actual encoding values depend on gfx version, whether instruction is 3src, etc. Suggested by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30704>	2024-08-25 22:08:14 +00:00
Francisco Jerez	71ca8529c5	intel/brw/gfx12.5+: Fix IR of sub-dword atomic LSC operations. We were currently emitting logical atomic instructions with a packed destination region for sub-dword LSC atomics, along the lines of: > untyped_atomic_logical(32) dst<1>:HF, ... However, these instructions use an LSC data size D16U32, which means that the 16b data on the return payload is expanded to 32b by the LSC shared function, so we were lying to the compiler about the location of the individual channels on the return payload, its execution masking, etc. This is why the hacks that manually set the 'inst->size_written' of the instruction were required. In some cases this worked, but any non-trivial manipulation of the instruction destination by lowering or optimization passes could have led to corruption, as has been reproduced in deqp-vk during lower_simd_width() for shaders that use 16-bit atomics in SIMD32 dispatch mode. Note that LSC sub-dword reads aren't affected by this because they use raw UD destinations and specify the actual bit size of the operation datatype as the immediate SURFACE_LOGICAL_SRC_IMM_ARG, which doesn't work for atomic operations since that immediate specifies the atomic opcode. Instead, have the logical operation implement the behavior of 16-bit destinations correctly instead of silently replacing the 16-bit region with an inconsistent 32-bit region -- This is done by emitting the MOV instructions used to pack the data from the UD temporary into the packed destination from the lower_logical_sends() pass instead of from the NIR translation pass. Fixes: `43169dbbe5` ("intel/compiler: Support 16 bit float ops") Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30683>	2024-08-21 02:33:12 +00:00
Sagar Ghuge	83c2524124	intel/compiler: Adjust trace ray control field on Xe2 Bspec 64643: Structure_TraceRayPayload::Trace Ray Control Bit field moved from 9-8 to 10-8 on Xe2. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30600>	2024-08-13 20:02:24 +00:00
Kenneth Graunke	9200fb966c	intel/brw: Record that SHADER_OPCODE_SCRATCH_HEADER uses g0 The generator code for emitting legacy scratch headers was implicitly using g0 as a source. But the IR wasn't indicating any usage of g0, which means the liveness isn't properly tracked at the IR level. It works because we reserve g0 as permanently live for the whole program. In order to stop doing that, we need to record it properly. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30319>	2024-08-01 16:37:31 -07:00
Sushma Venkatesh Reddy	0116430d39	intel/brw: Handle 16-bit sampler return payloads API requires samplers to return 32-bit even though hardware can handle 16-bit floating point, so we detect that case and make more efficient use of memory BW. This is helping improve performance of encode and decode tokens during LLM by at least 5% across multiple platforms. Thank you Kenneth Graunke for suggesting and guiding me throughout this implementation. Signed-off-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30447>	2024-07-31 21:26:46 +00:00
Caio Oliveira	8ba8e33c39	intel/brw: Simplify @file annotations Doxygen documentation says > If the file name is omitted (i.e. the line after \file is left > blank) then the documentation block that contains the \file command will > belong to the file it is located in. so we can omit the filename itself when using the annotation. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30168>	2024-07-22 22:48:03 +00:00
Caio Oliveira	3670c24740	intel/brw: Replace uses of fs_reg with brw_reg And remove the fs_reg alias. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29791>	2024-07-03 02:53:19 +00:00
Caio Oliveira	6b2405e1f5	intel/brw: Remove duplicated functions between fs_reg/brw_reg Update the brw_reg ones and use them. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29791>	2024-07-03 02:53:18 +00:00
Caio Oliveira	d00329e821	intel/brw: Replace some fs_reg constructors with functions Create three helper functions for ATTR, UNIFORM and VGRF creation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29791>	2024-07-03 02:53:18 +00:00
Sagar Ghuge	edcad250ed	intel/compiler: Don't use half float param for sample_b Looks like some of the tests uses the bias which does not fit into half float parameter, so it's better to use float param for sample_b. If we have cube arrays, we anyway combine BIAS and array index properly so we don't have to worry about the first parameter. This fixes: GTF-GL46.gtf21.GL3Tests.texture_lod_bias.texture_lod_bias_clamp_m_g_M Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29533>	2024-06-28 03:33:18 +00:00
Francisco Jerez	c1feccdd90	intel/fs/gfx20+: Fix surface state address on extended descriptors for NIR scratch intrinsics. The r0.5 thread payload register contains Surface State Offset bits [27:6] as bits [31:10], so we need to shift the register right by 4 in order to get the surface state offset expected in ExBSO mode, which is the only extended descriptor encoding supported by the UGM shared function for SS addressing on Xe2+. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29543>	2024-06-21 01:49:43 +00:00
Sviatoslav Peleshko	2358c997f3	intel/brw: Actually retype integer sources of sampler message payload According to PRMs: "All parameters are of type IEEE_Float, except those in the The ld*, resinfo, and the offu, offv of the gather4_po[_c] instruction message types, which are of type signed integer." Currently, we load parameters with the correct types, but use them as send sources with the default float type, which may confuse passes downstream. Fix this by actually storing the retyped sources. Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11118 Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29581>	2024-06-12 18:59:17 +00:00
Jordan Justen	1fa84d34ef	intel/compiler: Don't set size written in brw_lower_logical_sends.cpp Rework: (Sagar) - Drop unused variable Suggested-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29271>	2024-06-06 00:18:37 +00:00
Lionel Landwerlin	d8b78924c5	brw: use a single virtual opcode to read ARF registers In `2c65d90bc8` I forgot to add the new SHADER_OPCODE_READ_MASK_REG opcode to the list of barrier instruction in the scheduler. Let's just use a single opcode for all ARF registers that need special scoreboarding and put the register as source (nicer for the debug output). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `2c65d90bc8` ("intel/brw: ensure find_live_channel don't access arch register without sync") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29446>	2024-05-31 20:22:27 +00:00
Kenneth Graunke	fbe0f8d36d	intel/brw: Blockify convergent load_shared on Gfx11-12 as well Gfx11-12 can support SLM block loads via OWord Block Load messages (notably, the aligned version, not the unaligned version). A while back we deleted the SHADER_OPCODE_OWORD_BLOCK_READ opcode. Rather than bring it back, we continue using UNALIGNED_OWORD_BLOCK_READ for SLM block access (like we do for SSBOs) but switch it over to the aligned variant when lowering logical sends. We do ensure the alignment is at least 16B, however. This is ugly, but it's probably not worth bringing back a whole extra opcode for a legacy HDC block load quirk. References: BSpec 47652 and 1689 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9960 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29429>	2024-05-30 22:01:10 +00:00
Kenneth Graunke	545bb8fb6f	intel/brw: Replace type_sz and brw_reg_type_to_size with brw_type_size_* Both of these helpers do the same thing. We now have brw_type_size_bits and brw_type_size_bytes and can use whichever makes sense in that place. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>	2024-04-25 11:41:48 +00:00
Kenneth Graunke	c22f44ff07	intel/brw: Replace brw_reg_type_from_bit_size by brw_type_with_size Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>	2024-04-25 11:41:48 +00:00
Kenneth Graunke	f523bfcf90	intel/brw: Reindent after shortening BRW_REGISTER_TYPE_* to BRW_TYPE_* Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>	2024-04-25 11:41:48 +00:00
Kenneth Graunke	873fcdff38	intel/brw: Stop using long BRW_REGISTER_TYPE enum names s/BRW_REGISTER_TYPE/BRW_TYPE/g Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847>	2024-04-25 11:41:48 +00:00
Kenneth Graunke	12b0e03bd2	intel/brw: Use SHADER_OPCODE_SEND for coherent framebuffer reads We already have a logical opcode and lower to what is basically a send instruction. We just weren't using SHADER_OPCODE_SEND, instead having extra redundant infrastructure for no real gain. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28705>	2024-04-16 02:14:49 +00:00
Rohan Garg	a368d234c8	intel/brw: Lower DWORD scattered read writes to lsc Rework: * Francisco Jerez: Rebase on `07b9bfacc7` ("intel/compiler: Move logical-send lowering to a separate file") * Jordan: Move SHADER_OPCODE_DWORD_SCATTERED__LOGICAL from previous patch, as it seems to make more sense here. Jordan: Change `devinfo->has_lsc` ?: to if/else as suggested by idr Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>	2024-04-01 00:00:03 +00:00
Rohan Garg	b5040bfc3f	intel/brw: Handle typed surface and atomic messages for xe2+ Reworks: * Francisco: Rebase on `07b9bfacc7` ("intel/compiler: Move logical-send lowering to a separate file") * Jordan: Rebase on `952a523abb` ("intel: switch over to unified atomics") Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>	2024-04-01 00:00:03 +00:00
Francisco Jerez	74efde7663	intel/brw/xehp+: Drop redundant arguments of lsc_msg_desc*(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>	2024-04-01 00:00:03 +00:00
Francisco Jerez	fa96274a87	intel/brw/xehp+: Replace lsc_msg_desc_dest_len()/lsc_msg_desc_src0_len() with helpers to do the computation. We cannot rely on the immediate message descriptor having accurate values for mlen and rlen at the IR level, since they are updated at codegen time via 'inst->mlen' and 'inst->size_written', which could end up with values inconsistent with the message descriptor if e.g. the split sends optimization had an effect. Instead, define helpers that do the computation without relying on the message descriptor, and use the pre-existing brw_message_desc_mlen()/brw_message_desc_rlen() helpers (fully equivalent to the lsc helpers deleted here) during disassembly. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>	2024-04-01 00:00:03 +00:00
Rohan Garg	adb853ed10	intel/brw: Update written size depending on the LSC message Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Rohan Garg	65f66974a5	intel/brw: Use the dimensions supplied in the instruction Rework: * Francisco Jerez: Rebase on `07b9bfacc7` ("intel/compiler: Move logical-send lowering to a separate file") Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Francisco Jerez	d96bfb160f	intel/brw/xe2+: Update encoding of FB write extended descriptor. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Francisco Jerez	189422de1b	intel/brw/xe2+: Update encoding of FB write descriptor message control. Ref: bspec: 65209, 63908 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Francisco Jerez	7b0fbc22dd	intel/brw/xe2: Render target reads have been removed from the hardware. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28306>	2024-03-20 15:46:44 -07:00
Ian Romanick	f89d9cc53d	intel/brw: Silence "statement may fall through" warning src/intel/compiler/brw_lower_logical_sends.cpp: In member function ‘bool fs_visitor::lower_logical_sends()’: src/intel/compiler/brw_lower_logical_sends.cpp:3170:10: warning: this statement may fall through [-Wimplicit-fallthrough=] 3170 \| if (devinfo->has_lsc) { \| ^~ src/intel/compiler/brw_lower_logical_sends.cpp:3174:7: note: here 3174 \| case SHADER_OPCODE_DWORD_SCATTERED_READ_LOGICAL: \| ^~~~ Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27552>	2024-03-12 21:31:30 +00:00
Lionel Landwerlin	75c6ad9907	intel/fs: fixup sampler header message If you look at the sampler message header on Gfx9+, you'll see that we mostly only use 2 dwords (dw2 & dw3). DW2 has a bunch of sampler parameters, DW3 is the sampler handle. On Gfx9 we can micro optimize by copying r0 into the header because the HW mostly doesn't care about other DWs. We just have to clear dw2 on non VS/FS stages. On Gfx11+, we always have to do a careful copy of the r0.3 bits to mask out the bottom unrelated bits. So there, just clearing the entire header makes more sense. On Xe2+, the dw4 of the header references the sampler feedback surface handle and bit0 is a boolean to know whether to use that surface or not. So it REALLY matters to have that as 0. If we copy r0, we'll get random bits in dw4, leading to enable that surface. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28082>	2024-03-12 07:25:45 +00:00
Kenneth Graunke	ad37622a8f	intel/brw: Delete legacy texture opcodes We first generate the logical opcodes, and these days fully lower to SHADER_OPCODE_SEND. In the past, we lowered to a non-logical variant and handled that in the generator. These days, we were just using the non-logical opcodes as an awkward intermediate opcode change during the lowering...which isn't really necessary at all. This patch eliminates them by using the original logical opcodes. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27908>	2024-03-01 22:19:51 +00:00
Kenneth Graunke	45a5e4c0c4	intel/brw: Delete SHADER_OPCODE_TXF_UMS Nothing seems to generate this anymore. I guess we always use CMS. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27908>	2024-03-01 22:19:51 +00:00
Kenneth Graunke	601ef12467	intel/brw: Delete SHADER_OPCODE_TXF_CMS[_LOGICAL] We always use the wide variant (_W) on hardware this compiler supports. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27908>	2024-03-01 22:19:50 +00:00
Rohan Garg	73d98848fa	intel/compiler: Xe2+ can do URB load/store with a byte offset Thanks to Ken for suggesting this URB refactoring change and pointing out that the LSC can operate on the byte offset granularity. This should fix the geometry shader test cases where we have more than 32 vertices since previously we were failing to write the correct control data bits because of incorrect write mask. Shader-db results for Xe2: total instructions in shared programs: 153475 -> 153437 (-0.02%) instructions in affected programs: 1374 -> 1336 (-2.77%) helped: 11 HURT: 0 helped stats (abs) min: 3 max: 5 x̄: 3.45 x̃: 3 helped stats (rel) min: 1.67% max: 4.92% x̄: 3.23% x̃: 2.70% 95% mean confidence interval for instructions value: -3.92 -2.99 95% mean confidence interval for instructions %-change: -4.10% -2.36% Instructions are helped. total loops in shared programs: 140 -> 140 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 16002649 -> 16002329 (<.01%) cycles in affected programs: 9174 -> 8854 (-3.49%) helped: 11 HURT: 0 helped stats (abs) min: 22 max: 38 x̄: 29.09 x̃: 32 helped stats (rel) min: 2.62% max: 5.54% x̄: 3.78% x̃: 3.85% 95% mean confidence interval for cycles value: -33.56 -24.62 95% mean confidence interval for cycles %-change: -4.48% -3.08% Cycles are helped. total spills in shared programs: 52 -> 52 (0.00%) spills in affected programs: 0 -> 0 helped: 0 HURT: 0 total fills in shared programs: 94 -> 94 (0.00%) fills in affected programs: 0 -> 0 helped: 0 HURT: 0 total sends in shared programs: 4240 -> 4240 (0.00%) sends in affected programs: 0 -> 0 helped: 0 HURT: 0 LOST: 0 GAINED: 0 Rework: (Sagar) - Adjust offset/indirect offset calculation. - Add shader-db results - Always calculate dword index - Drop changes for indirect writes Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27602>	2024-03-01 16:11:30 +00:00
Caio Oliveira	8f3c52c1da	intel/brw: Remove MRF type Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>	2024-02-28 05:45:39 +00:00
Caio Oliveira	371468c013	intel/brw: Remove Gfx8- code from lower logical sends Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>	2024-02-28 05:45:38 +00:00
Sagar Ghuge	6f0ab5e4d5	intel/compiler: Add texture gather offset LOD/Bias message support v2: (Ian) - Space formatting on conditional statement Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27447>	2024-02-27 00:22:46 +00:00
Sagar Ghuge	79af0ac29a	intel/compiler: Add gather4_i/l/[_c]/b sampler message v2: (Ian) - Format comment Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27447>	2024-02-27 00:22:46 +00:00

123 commits