fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 07:00:12 +01:00

Author	SHA1	Message	Date
Caio Oliveira	ff89e83178	intel/brw: Lower VGRFs to FIXED_GRFs earlier Moves the lowering of VGRFs into FIXED_GRFs from the code generation to (almost) right after the register allocation. This will (1) let later passes not worry about VGRFs (and what they mean in a post reg alloc phase) and (2) make it easier to add certain types of validation post reg alloc phase using the backend IR. Note that a couple of passes still take advantage of seeing "allocated VGRFs", so perform lowering after they run. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28604>	2024-04-23 23:17:57 +00:00
Caio Oliveira	5b3d4c757d	intel/brw: Support FIXED_GRF when generating code for CLUSTER_BROADCAST Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28604>	2024-04-23 23:17:57 +00:00
Francisco Jerez	62aab1437e	intel/fs/gfx20+: Handle subdword integer regioning restrictions in copy propagation. This makes sure that copy propagation doesn't undo the lowering of restricted sub-dword integer regions done by brw_fs_lower_regioning(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28698>	2024-04-22 18:02:32 -07:00
Francisco Jerez	217d412360	intel/fs/gfx20+: Implement sub-dword integer regioning restrictions. This patch introduces code to enforce the pages-long regioning restrictions introduced by Xe2 that apply to sub-dword integer datatypes (see BSpec page 56640). They impose a number of restrictions on what the regioning parameters of a source can be depending on the source and destination datatypes as well as the alignment of the destination. The tricky cases are when the destination stride is smaller than 32 bits and the source stride is at least 32 bits, since such cases require the destination and source offsets to be in agreement based on an equation determined by the source and destination strides. The second source of instructions with multiple sources is even more restricted, and due to the existence of hardware bug HSDES#16012383669 it basically requires the source data to be packed in the GRF if the destination stride isn't dword-aligned. In order to address those restrictions this patch leverages the existing infrastructure from brw_fs_lower_regioning.cpp. The same general approach we were using to handle restrictions of the floating-point pipeline in previous generations can be used to handle this restriction: Unsupported source regions are lowered by emitting an additional copy before the instruction that shuffles the data in a way that allows using a valid region in the original instruction. The main difficulty that wasn't encountered in previous platforms is that it is non-trivial to come up with a copy instruction that doesn't break the regioning restrictions itself, since on previous platforms we could just bitcast floating-point data and use integer copies in order to implement arbitrary regioning, which is unfortunately no longer a choice lacking a magic third pipeline able to do the regioning modes the integer pipeline is no longer able to do. The required_src_byte_stride() and required_src_byte_offset() helpers introduced here try to calculate parameters for both regions that avoid that situation, but it isn't always possible, and actually in some cases that involve the second source of ALU instructions a chain of multiple copy instructions will be required, so the lower_instruction() routine needs to be applied recursively to the instructions emitted to lower the original instruction. XXX - Allow more flexible regioning for the second source of an instruction if bug HSDES#16012383669 is fixed in a future hardware platform. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28698>	2024-04-22 18:02:07 -07:00
Caio Oliveira	13093ceb3c	intel/brw: Move validate out of fs_visitor Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28534>	2024-04-22 13:38:41 -07:00
Caio Oliveira	671d216f39	intel/brw: Remove two duplicated validate calls in optimizer The OPT macro will call validate() after each pass, so both cases removed by this patch are just redundant calls. Will only affect Debug builds since in Release builds validation is a no-op. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28534>	2024-04-22 13:38:41 -07:00
Caio Oliveira	8a6fe54409	intel/brw: Refactor FS validation macros Use `a` and `b` (already identified as that in the output message) instead of `f` and `s` for the two values being compared, since in a later patch `s` will be used to hold the fs_visitor shader. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28534>	2024-04-22 13:38:41 -07:00
Ian Romanick	a5adbae6f6	nir: intel/brw: Remove cmat_signed_mask from dpas_intel intrinsic It is not used. The signedness is inferred from src_type and dest_type. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28822>	2024-04-19 09:53:29 -07:00
Ian Romanick	2ce558d928	intel/brw: Fix handling of cmat_signed_mask For integer types, the signedness is determined by flags on the muladd instruction. The types of the sources play no role. Previously we were using the signedness of the type and ignoring the mask. Adjust the types passed to the dpas_intel intrinsic to match. Fixes various dEQP-VK.compute..cooperative_matrix.khr_.matrixmuladd_cross.* tests on different Intel platforms. Some platforms had failing tests, and some platforms failed EU validation before the tests could fail. Fixes: `6b14da33ad` ("intel/fs: nir: Add nir_intrinsic_dpas_intel") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28822>	2024-04-19 09:53:27 -07:00
Mike Blumenkrantz	042b8a65d3	brw/lower_a2c: fix for scalarized fs outputs it's legal for a fs to write xyzw components separately, and this pass should handle such cases cc: mesa-stable Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28752>	2024-04-18 23:27:22 +00:00
Jordan Justen	4e5ed7ebd5	intel/brw: Avoid getting a stride of 0 for nir_intrinsic_exclusive_scan Ref: `671745b616` ("intel/fs: Don't allow 0 stride on MOV destination") Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28821>	2024-04-18 23:03:57 +00:00
Ian Romanick	90e12ed843	intel/brw/xe2+: Only apply Wa 22016140776 to math instructions The check in has_invalid_src_region incorrectly omitted inst->is_math() from the condition. Fixes: `0e817ba548` ("intel/brw/xe2+: Implement Wa 22016140776") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28821>	2024-04-18 23:03:57 +00:00
Kenneth Graunke	e637c63239	intel/brw: Make an fs_builder::SYNC helper We always want a null destination, so this saves some typing. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28705>	2024-04-16 02:14:49 +00:00
Kenneth Graunke	d5b8cec7a2	intel/brw: Replace FS_OPCODE_LINTERP with BRW_OPCODE_PLN We no longer support the old LINE+MAC lowering, and we already lower this to MAD in NIR on Gfx11+, so the LINTERP virtual opcode always corresponds to PLN. The only catch is that LINTERP's operands are reversed from PLN, so we have to switch them. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28705>	2024-04-16 02:14:49 +00:00
Kenneth Graunke	12b0e03bd2	intel/brw: Use SHADER_OPCODE_SEND for coherent framebuffer reads We already have a logical opcode and lower to what is basically a send instruction. We just weren't using SHADER_OPCODE_SEND, instead having extra redundant infrastructure for no real gain. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28705>	2024-04-16 02:14:49 +00:00
Kenneth Graunke	46a7ee772e	intel/brw: Drop default size of 1 from bld.vgrf() calls This isn't necessary as 1 is the default value for the parameter. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28705>	2024-04-16 02:14:49 +00:00
Kenneth Graunke	217d56e9b1	intel/brw: Delete fs_visitor::vgrf helper Just use fs_builder::vgrf instead of the older glsl_type-based one. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28705>	2024-04-16 02:14:49 +00:00
Kenneth Graunke	f29a56a4ac	intel/brw: Delete if_depth_in_loop This was only used prior to Sandybridge. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28705>	2024-04-16 02:14:49 +00:00
Kenneth Graunke	bd6a430c94	intel/brw: Drop gfx7 scratch message setup code Nothing uses this. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28705>	2024-04-16 02:14:49 +00:00
Mike Blumenkrantz	39b66f9c84	intel: set compact_arrays in compiler options Acked-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28601>	2024-04-12 18:43:48 +00:00
Tapani Pälli	0413729bc3	intel/compiler: add assert for Wa_22017182272 According to the workaround description: "For all Data Port messages, DP_FLUSH_TYPE should not be programmed to Discard." This issue happens only under certain circumstances, but as we are not using discard, add an assert and deal with it later if discard is taken into use. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24422>	2024-04-10 06:03:58 +00:00
Lionel Landwerlin	6a7e576017	intel/fs: fixup instruction scheduling last grf write tracking When I bumped the max size of VGRFs, I should have bumped the values in the scheduler too. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `d33aff783d` ("intel/fs: add support for sparse accesses") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28188>	2024-04-05 19:46:40 +00:00
Lionel Landwerlin	d59612f5e5	intel/fs: printout a couple of more late compile steps Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28188>	2024-04-05 19:46:40 +00:00
Ian Romanick	0e817ba548	intel/brw/xe2+: Implement Wa 22016140776 HF sources to math instructions cannot be scalar. This is very similar to an old Gfx6 restriction on POW, so let's fix it in a similar way. As an extra bit of safety, lower any occurrences that might slip through in brw_fs_lower_regioning. The primary change is to prevent copy propagation from violating the restriction. With that change, nothing should be able to generate these invalid source strides. The modification to fs_visitor::validate should detect potential problems sooner rather than later. Previous attempts to implement this Wa when emitting the math instruction (in brw_eu_emit.c gfx6_math) didn't work for several reasons. The lowering happens after the SWSB pass, so the scoreboarding was incorrect (thanks to Curro for finding that). In addition, the lowering happens after register allocation, so it's impossible to allocate a non-scalar register to expand the scalar value. Fixes 113 tests in the dEQP-VK.spirv_assembly.* group on LNL. v2: Add changes to brw_fs_lower_regioning. Suggested by Curro. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28480>	2024-04-04 21:04:09 -07:00
Ian Romanick	0b67d3d909	intel/elk: Delete stray nir_opt_dce No shader-db changes on any Intel platform. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28136>	2024-04-04 23:42:28 +00:00
Ian Romanick	24cdbbdaa2	intel/brw: Delete stray nir_opt_dce No shader-db or fossil-db changes on any Intel platform. Fixes: `f76f4be301` ("intel/compiler: move gen5 final pass to actually be final pass") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28136>	2024-04-04 23:42:27 +00:00
Ian Romanick	44fb57b827	intel/elk: Don't call nir_opt_remove_phis before nir_convert_from_ssa shader-db: All platforms had similar results. (Ivy Bridge shown) total instructions in shared programs: 15831424 -> 15831637 (<.01%) instructions in affected programs: 38880 -> 39093 (0.55%) helped: 0 / HURT: 179 total cycles in shared programs: 432140353 -> 432170199 (<.01%) cycles in affected programs: 11798080 -> 11827926 (0.25%) helped: 77 / HURT: 123 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28136>	2024-04-04 23:42:27 +00:00
Ian Romanick	6377e8fd29	intel/brw: Don't call nir_opt_remove_phis before nir_convert_from_ssa Per discussion in #10727, removing phis breaks LCSSA form which in turn invalidates divergence analysis. shader-db: All Skylake and newer platforms had similar results. (Ice Lake shown) total instructions in shared programs: 20299612 -> 20299695 (<.01%) instructions in affected programs: 20829 -> 20912 (0.40%) helped: 6 / HURT: 13 total cycles in shared programs: 842149085 -> 842148399 (<.01%) cycles in affected programs: 15146222 -> 15145536 (<.01%) helped: 40 / HURT: 45 fossil-db: All Intel platforms had similar results. (Ice Lake shown) Totals: Instrs: 165505077 -> 165505603 (+0.00%); split: -0.00%, +0.00% Cycles: 15144183575 -> 15144235695 (+0.00%); split: -0.00%, +0.00% Spill count: 45213 -> 45220 (+0.02%) Fill count: 74166 -> 74184 (+0.02%) Totals from 94 (0.01% of 656116) affected shaders: Instrs: 263079 -> 263605 (+0.20%); split: -0.00%, +0.20% Cycles: 28411487 -> 28463607 (+0.18%); split: -0.18%, +0.37% Spill count: 3474 -> 3481 (+0.20%) Fill count: 6713 -> 6731 (+0.27%) Fixes: `6dbb5f1e07` ("intel/fs: rerun divergence analysis prior to convert_from_ssa") Closes: #10727 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28136>	2024-04-04 23:42:27 +00:00
Ian Romanick	87101e7d83	intel/compiler: Ensure load_barycentric_at_sample and load_interpolated_input remain together This previously worked by luck because we were incorrectly calling nir_opt_remove_phis before calling nir_convert_from_ssa. See also #10727. No shader-db or fossil-db changes on any Intel platform. v2: Handle the load_interpolated_input and load_barycentric_at_sample as separate passes. Based on discussion with Ken starting at https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28136#note_2330424. Fixes: `74a40cc4b6` ("intel/fs: move lower of non-uniform at_sample barycentric to NIR") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28136>	2024-04-04 23:42:27 +00:00
Kenneth Graunke	9e0d0190ea	intel/brw: Drop align16 support in brw_broadcast() align16 support is only used on Gen9 for 3-source instructions, quad swizzling, and dPdy calculations. We don't need it for broadcast. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28458>	2024-04-02 00:00:59 +00:00
Kenneth Graunke	a520c976a5	intel/brw: Drop dead CHV checks. This compiler no longer supports Cherryview. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28458>	2024-04-02 00:00:59 +00:00
Kenneth Graunke	e3d12cf72f	intel/brw: Don't mention gfx7 limitations in shuffle comments We don't support gfx7 here anymore, so we needn't consider it. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28458>	2024-04-02 00:00:59 +00:00
Kenneth Graunke	1d9e2b761a	intel/brw: Update comments for indirect MOV splitting brw_broadcast and generate_mov_indirect both had similar comments, both with typos ("insead"). One still referred to IVB bugs, while the other dropped that during the compiler split. The one that dropped the comment mentioned "both of these" issues, while citing only one issue; there was in fact a third issue (no-Q/UQ) that wasn't mentioned in either comment. One also had some bad grammar in the comments. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28458>	2024-04-02 00:00:59 +00:00
Kenneth Graunke	7a24f29fbb	intel/brw: Fix lower_regioning for BROADCAST, MOV_INDIRECT on Q types For BROADCAST and MOV_INDIRECT, required_exec_type was returning brw_int_type(type_sz(t), false), which is an unsigned type. However, get_exec_type(inst) returns the original type for either Q or UQ. This meant that has_invalid_exec_type would detect a mismatch and trigger lowering. That lowering would insert new 64-bit MOVs, which would need to be lowered on platforms which don't support Q/UQ. Except, we already ran that lowering pass earlier. So, the unlowered Q/UQ MOVs would reach the software scoreboarding pass, and trigger failures in the inferred_exec_pipe() function, as no pipe is available to handle 64-bit integer operations. It turns out that we don't need the region lowering pass to do anything for these opcodes. The generator code for both BROADCAST and MOV_INDIRECT already handles decomposing Q/UQ operations into 32-bit MOVs when they're not supported. It also implicitly converts to integer types, even for floating point sources. The inferred_exec_pipe function already special cases them to note that they'll always be handled on the integer pipe, so that matches. Just drop the region lowering code for these opcodes. Cc: mesa-stable Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28458>	2024-04-02 00:00:59 +00:00
Kenneth Graunke	a90edad9f7	intel/brw: Fix generate_mov_indirect to check has_64bit_int not float We are overriding the type to Q/UQ, so we need to split to two MOVs if 64-bit integer math is not supported. For reference, Meteorlake does support 64-bit floats but would still not work correctly here. See also brw_broadcast(), which does similar indirects but correctly checks has_64bit_int instead of has_64bit_float. Cc: mesa-stable Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28458>	2024-04-02 00:00:59 +00:00
Rohan Garg	3d68dd78d0	intel/eu/validate: Allow SIMD16 for mixed mode float operations on xe2+ Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>	2024-04-01 00:00:03 +00:00
Rohan Garg	a368d234c8	intel/brw: Lower DWORD scattered read writes to lsc Rework: * Francisco Jerez: Rebase on `07b9bfacc7` ("intel/compiler: Move logical-send lowering to a separate file") * Jordan: Move SHADER_OPCODE_DWORD_SCATTERED__LOGICAL from previous patch, as it seems to make more sense here. Jordan: Change `devinfo->has_lsc` ?: to if/else as suggested by idr Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>	2024-04-01 00:00:03 +00:00
Rohan Garg	b5040bfc3f	intel/brw: Handle typed surface and atomic messages for xe2+ Reworks: * Francisco: Rebase on `07b9bfacc7` ("intel/compiler: Move logical-send lowering to a separate file") * Jordan: Rebase on `952a523abb` ("intel: switch over to unified atomics") Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>	2024-04-01 00:00:03 +00:00
Francisco Jerez	74efde7663	intel/brw/xehp+: Drop redundant arguments of lsc_msg_desc*(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>	2024-04-01 00:00:03 +00:00
Francisco Jerez	f1812437e8	intel/eu/xehp+: Don't initialize mlen and rlen descriptor fields from lsc_msg_desc*(). These fields are overlapping with the ones set by brw_message_desc(), so the latter should be used instead. This fixes corruption of the LSC message descriptors when inconsistent values are specified through both helpers, which can happen if the 'inst->mlen' field is modified during optimization (e.g. by opt_split_sends()). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>	2024-04-01 00:00:03 +00:00
Francisco Jerez	fa96274a87	intel/brw/xehp+: Replace lsc_msg_desc_dest_len()/lsc_msg_desc_src0_len() with helpers to do the computation. We cannot rely on the immediate message descriptor having accurate values for mlen and rlen at the IR level, since they are updated at codegen time via 'inst->mlen' and 'inst->size_written', which could end up with values inconsistent with the message descriptor if e.g. the split sends optimization had an effect. Instead, define helpers that do the computation without relying on the message descriptor, and use the pre-existing brw_message_desc_mlen()/brw_message_desc_rlen() helpers (fully equivalent to the lsc helpers deleted here) during disassembly. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>	2024-04-01 00:00:03 +00:00
Ian Romanick	5f9ab41457	intel/brw/xe2: Update uniform handling to account for 512b physical registers Rework: * Jordan: Drop FINISHME (s-b Caio) * Jordan: Use reg_unit() in asserts rather than a ver check (s-b Caio) * Ian: Make use of reg_unit() in round_components_to_whole_registers() Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>	2024-04-01 00:00:03 +00:00
Ian Romanick	8587ef172c	intel/brw/xe2: Update brw_nir_analyze_ubo_ranges to account for 512b physical registers Rework: * Jordan: Use `REG_SIZE * reg_unit` (Suggested by Caio) Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28484>	2024-04-01 00:00:03 +00:00
Caio Oliveira	d9e737212d	intel/brw: Add a src array for the common case in fs_inst In the common case, fs_inst will have up to 4 sources (the HW instructions have up to 3, and our representation of SENDs have 4). Embed such array into the fs_inst, and use it whenever applicable instead of allocating a new array. Also change the code to reuse the allocated src array when resizing to a smaller length. Between the changes above and the reduced amount of initializing fs_regs, this reduces fossil-db time by around 2% for Borderlands 3 and Rise of the Tomb Raider, and around 1.5% for Total War Warhammer 3. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28379>	2024-03-29 22:44:01 +00:00
Caio Oliveira	dae9795628	intel/brw: Remove vestiges of sources on IF opcode, only valid on Gfx6 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28379>	2024-03-29 22:44:01 +00:00
Kenneth Graunke	816a33849a	intel/brw: Rearrange fs_inst fields For better packing, and to make all the small fields easier to hash and compare en masse. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28379>	2024-03-29 22:44:01 +00:00
Ian Romanick	5e9c01dfe4	intel/brw/xe2+: Use phys_nr and phys_subnr in DPAS encoding Suggested-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>	2024-03-29 21:12:32 +00:00
Ian Romanick	6d85f7129a	intel/brw/xe2+: DPAS must be SIMD16 now Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>	2024-03-29 21:12:32 +00:00
Ian Romanick	a8115221e5	nir: intel/brw: Change the order of sources for nir_dpas_intel It was by pure luck that all sources (and the result) of nir_dpas_intel had the same number of components. It is possible to support matrix sizes where the accumulator matrix and the result matrix are larger (e.g., 16x8 * 8x16 = 16x16). This breaks all of the assumptions of NIR's infrastructure for code generating intrinsics. Fix this by making the accumulator matrix be the first source. The accumulator and the result will always have the same dimensions (due to rules of matrix multiplication) and the same type (due to restrictions of the cooperative matrix extension). This forces them to have the same number of components. This doesn't fix all the potential problems. NIR expects that all 0-sized sources will have the same number of components. This just ensures that the result has the correct number of components. Fixes: `6b14da33ad` ("intel/fs: nir: Add nir_intrinsic_dpas_intel") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>	2024-03-29 21:12:32 +00:00
Ian Romanick	c6bd6f2a41	intel/brw: Use enums for DPAS source regioning Was previously passing 1, 1, 0 as the regioning. This generated incorrect disassembly because the encoding for a width of 1 is 0. Use the enums to ensure the correct values are used. Fixes: `1c92dad5cb` ("intel/disasm: Disassembly support for DPAS") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>	2024-03-29 21:12:32 +00:00
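
The last entry above (c6bd6f2a41) turns on the difference between a region's logical value and its encoded field: passing the literal 1 where an encoded width is expected does not mean "width 1". The following is a minimal sketch of that pitfall, not code from the commit. The `SKETCH_*` enums and `sketch_decode_*` helpers are hypothetical names, and the numeric values assume the usual convention of width stored as log2 and non-zero strides stored as log2 + 1; only "the encoding for a width of 1 is 0" is stated by the commit message itself.

```c
#include <assert.h>

/* Hypothetical stand-ins for encoded region fields. The values are an
 * assumption for illustration: width is stored as log2(width), and a
 * non-zero stride is stored as log2(stride) + 1. */
enum sketch_width   { SKETCH_WIDTH_1 = 0, SKETCH_WIDTH_2 = 1, SKETCH_WIDTH_4 = 2 };
enum sketch_vstride { SKETCH_VSTRIDE_0 = 0, SKETCH_VSTRIDE_1 = 1, SKETCH_VSTRIDE_2 = 2 };
enum sketch_hstride { SKETCH_HSTRIDE_0 = 0, SKETCH_HSTRIDE_1 = 1, SKETCH_HSTRIDE_2 = 2 };

/* Decode an encoded width field back to an element count. */
static unsigned sketch_decode_width(unsigned enc)
{
    return 1u << enc;
}

/* Decode an encoded stride field back to an element stride. */
static unsigned sketch_decode_stride(unsigned enc)
{
    return enc == 0 ? 0 : 1u << (enc - 1);
}

/* Returns 1 if the encoded triple decodes to the region <1;1,0>. */
static int sketch_is_region_1_1_0(unsigned vstride, unsigned width, unsigned hstride)
{
    return sketch_decode_stride(vstride) == 1 &&
           sketch_decode_width(width) == 1 &&
           sketch_decode_stride(hstride) == 0;
}
```

Under these assumptions, the raw literals 1, 1, 0 decode to a width of 2, while the enum triple `SKETCH_VSTRIDE_1, SKETCH_WIDTH_1, SKETCH_HSTRIDE_0` decodes to the intended <1;1,0>. Using the enum names keeps the logical value and the encoded value from being confused.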

3556 commits