fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 18:00:13 +01:00

Author	SHA1	Message	Date
Francisco Jerez	0d332d0c49	intel/fs: Plumb shader instead of compiler to get_lowered_simd_width() and friends. This will allow making lowering decisions based on properties of the shader, like the multipolygon dispatch mode used. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>	2023-12-28 14:12:59 -08:00
Francisco Jerez	bd634bef12	intel/fs/xe2+: Implement layout of mesh shading per-primitive inputs in PS thread payloads. This is based on a previous patch by Marcin Ślusarz addressing the same issue, though it's largely rewritten, simplified and includes additional fixes. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>	2023-12-28 14:12:59 -08:00
Francisco Jerez	4cebfaadf7	intel/fs/xe2+: Implement support for multi-polygon vertex setup data in PS payload. This fixes a number of assumptions made by the multipolygon input attribute handling code from assign_urb_setup() so it also works on Xe2+, which has additional multipolygon dispatch modes (like SIMD4x8 and SIMD2x16) and uses a different more compact representation of the plane parameters. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>	2023-12-28 14:12:59 -08:00
Francisco Jerez	702eabaaae	intel/fs/xe2+: Update for new layout of vertex setup data in PS payload. The interpolation deltas of PS inputs now show up as a 12B vec3 (A0, A1-A0, A2-A0) in the ATTR file, instead of the previously used 16B format with an unused component. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>	2023-12-28 11:07:03 -08:00
Francisco Jerez	d622e19f00	intel/fs/xe2+: Enable new format of barycentrics in PS payload. The X and Y barycentric vectors are no longer interleaved in SIMD8 chunks (yay), so this is mostly a matter of disabling the lower_barycentrics() pass and switching to a simpler implementation of fetch_barycentric_reg() that simply calls fetch_payload_reg() instead of the SIMD8 shuffling we had to do in previous generations. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>	2023-12-28 11:07:03 -08:00
Francisco Jerez	49a867f67e	intel/fs: Add support for vector payload values to fetch_payload_reg(). This extends fetch_payload_reg() to support fetching vector registers like barycentrics stored on the payload as a contiguous sequence of SIMD-wide vectors. In the SIMD32 case, both halves of the SIMD16 vector registers specified as regs[0] and regs[1] are zipped to construct a single SIMD32-wide vector. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>	2023-12-28 11:07:03 -08:00
Francisco Jerez	a0ae3c0dba	intel/fs/xe2+: Update uses of pixel/sample mask from PS thread payload. Note from Caio: proper handling of brw_sample_mask_reg will appear in later patches. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26606>	2023-12-28 11:07:03 -08:00
Rohan Garg	3e46ee61d5	intel/fs/xe2+: Lift CPS dispatch width restrictions on Xe2+. These restrictions don't seem to be applicable anymore, and limiting to SIMD8 wouldn't work since we're no longer building shaders with that dispatch width. [ Francisco: This one-liner change was squashed by Rohan Garg into a previous version of my patch "Stop building SIMD8 programs", but it makes more sense as a separate commit -- Formatted as a separate patch. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26605>	2023-12-22 10:37:00 -08:00
Francisco Jerez	6877916155	intel/fs/xe2+: Stop building SIMD8 fragment shaders. They are no longer supported by the fixed-function hardware. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26605>	2023-12-22 10:37:00 -08:00
Francisco Jerez	7397ba61c2	intel/fs/xe2+: Stop building SIMD8 compute-like shaders (CS/BS/TS/MS). SIMD8 kernels are no longer able to utilize the ALUs efficiently, since they have twice the vector width as previous platforms. However even though there aren't many reasons to use it, SIMD8 is still supported by the instruction set technically, and it will still be used for some SIMD-lowering sequences. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26605>	2023-12-22 10:37:00 -08:00
Francisco Jerez	1f2c44dc21	intel/compiler: Attempt to build dual-SIMD8 variant of fragment shaders on gfx12+ platforms. Similar to other FS dispatch modes, attempt to build a dual-SIMD8 program if the regular SIMD8 program didn't spill and doubling the amount of space for varyings doesn't cause us to go over the thread payload limit. Dual-SIMD8 builds in combination with coarse pixel shading are currently not handled. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>	2023-12-22 18:05:31 +00:00
Francisco Jerez	09ea840987	intel/fs: No need to copy null destinations in lower_simd_width. The copy would be discarded immediately. Until now we were relying on DCE to eliminate these, but it seems like in some cases MOVs into the null register emitted by lower_simd_width() are never eliminated, likely because a lower_simd_width() call has been introduced close to the bottom of optimize() which isn't followed by any additional DCE passes. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>	2023-12-22 18:05:31 +00:00
Francisco Jerez	5e0760a993	intel/fs/gfx12: Don't consider multipolygon PS to have packed dispatch. This fixes a number of regressions and hangs in multipolygon fragment shaders that have FIND_LIVE_CHANNEL sequences which would otherwise lead to access of a dead channel. Note that the failures don't seem to be reproducible in simulation. Acked-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>	2023-12-22 18:05:31 +00:00
Francisco Jerez	6bf99e6a45	intel/compiler: Don't change types for copies from ATTR file. Since the <8;8,0> regions they use in multipolygon mode could violate regioning restrictions in some cases, depending on the execution type of the instruction. Note that the assertion is removed from try_copy_propagate() since a more accurate check is used within that function than what fs_inst::can_change_types() can do. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>	2023-12-22 18:05:31 +00:00
Francisco Jerez	b62ad4e028	intel/fs: Rework layout of FS vertex setup data in ATTR file to support multi-polygon dispatch. The updated layout includes one copy of each plane parameter per channel of the SIMD thread, in order to allow channels to process different polygons. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>	2023-12-22 18:05:31 +00:00
Francisco Jerez	a844c0b185	intel/fs: Fix fs_reg::component_size() to handle two-dimensional register regions. Add code to calculate the size in bytes of arbitrary two-dimensional regions for FIXED_GRF and ARF registers. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>	2023-12-22 18:05:31 +00:00
Francisco Jerez	2d26ed6688	intel/fs: Assert fs_reg::nr is always zero for ATTR registers in geometry stages. Instead of treating fs_reg::nr as an offset for ATTR registers simply consider different indices as denoting disjoint spaces that can never be accessed simultaneously by a single region. From now on geometry stages will just use ATTR #0 for everything and select specific attributes via offset() with the native dispatch width of the program, which should work on current platforms as well as on Xe2+. See "intel/fs: Map all GS input attributes to ATTR register number 0." for the rationale. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>	2023-12-22 18:05:30 +00:00
Francisco Jerez	e4aca2ebaa	intel/fs: Add separate constructor of fs_visitor for fragment shaders. To allow specifying the number of polygons that will be processed per SIMD thread. Rework: * Jordan: Add needs_register_pressure following `09cdb77a92` ("intel/fs: report max register pressure in shader stats") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>	2023-12-22 18:05:30 +00:00
Francisco Jerez	1eff2fcb62	intel/compiler: Add polygon count statistic to brw_compile_stats. And use it in ANV in order to return a "SIMDNxM" name from vkGetPipelineExecutablePropertiesKHR. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>	2023-12-22 18:05:30 +00:00
Francisco Jerez	ccf9174655	intel/compiler: Add multipolygon dispatch fields to brw_wm_prog_data. Add fields that track the number of polygons processed per PS SIMD thread (note that this might be lower than the value that was specified to the compiler via brw_compile_fs_params if compilation at the desired polygon count wasn't possible), and the dispatch width of the multi-polygon PS kernel. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26585>	2023-12-22 18:05:30 +00:00
Sviatoslav Peleshko	8f8cde4c60	intel/fs: Don't optimize DW1 MUL if it stores value to the accumulator Fixes: `a8b86459` ("i965/fs: Optimize a 1.0 -> a.") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9570 Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25710>	2023-12-19 13:32:23 +00:00
Kenneth Graunke	49b8ccbcdc	intel/fs: Drop opt_register_renaming() In the past, multiple writes to a single register were pretty common, but since we've transitioned to NIR, and leave the IR in SSA form for everything not captured in a phi-web, the pattern of generating new temporary registers at each step is a lot more common. This pass isn't nearly as useful now. Across fossil-db on Alchemist, this affects only 0.55% of shaders, which fall into two cases: - Coarse pixel shading pixel-X/Y setup. There are a few cases where we write a partial calculation into a register, then have a second instruction read that as a source and overwrite it as a destination. While we could use a temporary here, it doesn't actually help with register pressure at all, since there's the same amount of values live at both instructions regardless. So while this pass kicks in, it doesn't do anything useful. - Geometry shader control data bits (5 shaders total). We track masks for handling EndPrimitive in a single register across the program, and apparently in some cases can split the live range. However, it's a single register...only in geometry shaders...which use EndPrimitive. None of them appear to be in danger of spilling, either. So this tiny benefit doesn't seem to justify the cost of running the pass. So, just throw it out. It's not worth keeping. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26343>	2023-12-19 11:07:18 +00:00
Caio Oliveira	bfc953add7	intel/compiler: Use C helpers to access builtin types Remove usage of C++ static members as they are going to be removed. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26658>	2023-12-15 03:09:19 +00:00
Caio Oliveira	a8b2426419	intel/compiler: Use reference instead of pointer for fs_visitor Per Ian suggestion. Also clear up a few unnecessary casts around the code and use `s` for fs_visitor ("shader"). Note to include a reference in ntf we need to set it during initialization, so create an explicit mem_ctx for it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>	2023-12-12 19:36:14 +00:00
Caio Oliveira	4e5fcccd01	intel/compiler: Create and use nir_to_brw() function Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>	2023-12-12 19:36:14 +00:00
Caio Oliveira	38a42e5aa1	intel/compiler: Add ctor to fs_builder that just takes the shader Uses the dispatch_width from the shader (fs_visitor). This was not possible before because the dispatch_width was not part of backend_shader. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>	2023-12-12 19:36:14 +00:00
Caio Oliveira	cf730adc58	intel/compiler: Make fs_builder include fs_visitor and not the other way This will allow fs_builder have a reference to an fs_visitor (a "fs_shader" really), instead of a reference to a backend_shader. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>	2023-12-12 19:36:14 +00:00
Caio Oliveira	f5032c4d52	intel/compiler: Make fs_visitor not depend on fs_builder At this point this is more a header dependency due to inline functions, so shuffle them around. The end goal is to allow fs_builder have a reference to a fs_visitor (really a fs_shader). Note the header is still included, a later patch will move the includes to the call-sites. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>	2023-12-12 19:36:14 +00:00
Caio Oliveira	5b8ec015f2	intel/compiler: Don't use fs_visitor::bld in remaining places The remaining users can simply create a new builder at_end() if needed. In many places a new builder object is already being constructed, so just give more specific instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>	2023-12-12 19:36:14 +00:00
Caio Oliveira	c12460b01e	intel/compiler: Move NIR emission code to brw_fs_nir.cpp This is a preparation to reorganize NIR emission code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26323>	2023-12-12 19:36:13 +00:00
Daniel Schürmann	1179d83a89	nir: remove info.fs.needs_all_helper_invocations Use info.uses_wide_subgroup_intrinsics instead. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26026>	2023-11-22 11:31:11 +01:00
Lionel Landwerlin	295734bf88	intel/fs: fix residency handling on Xe2 We're missing a few reg_unit() scaling when dealing with residency data. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26208>	2023-11-15 20:06:12 +00:00
Caio Oliveira	a9f95bf687	intel/compiler: Reuse same scheduler for all pre-RA scheduling modes Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>	2023-11-13 23:05:47 +00:00
Caio Oliveira	fcd025c1ce	intel/compiler: Remove is_tex() The current name doesn't cover all the tex related instructions and in all usages, we already have a switch statement to dispatch per instruction type, so is more natural to list the instructions we care there. In fs::is_send_from_grf() we can simply ignore them since the instructions are either lowered directly to SEND (Gfx7+) or use MRF (Gfx6-). With this change, the fs_inst::size_read() generated code gets simplified (the "tex" entries get added to the switch jump table in gcc) and the default case loses the conditional handling tex. This reduces shader compilation time, as illustrated by replaying fossils (tested on my TGL laptop): ``` // Rise of the Tomb Raider (N=13) Difference at 95.0% confidence -1.32231 +/- 0.0170138 -4.37605% +/- 0.0563054% (Student's t, pooled s = 0.0210159) // Cyberpunk 2077 (N=7) Difference at 95.0% confidence -3.64 +/- 0.114993 -2.95188% +/- 0.0932544% (Student's t, pooled s = 0.09873) ``` Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25721>	2023-11-10 15:43:31 +00:00
Caio Oliveira	40416850f1	intel/compiler: Re-enable opt_zero_samples() in many cases for Gfx12.5 The workaround applies specifically to Cube and Cube Arrays, so we can still apply the optimization for the others. Ideally we would like to pull opt_zero_samples logic into the lowering sends -- to avoid adding a bit to communicate between passes. However the texture coordinates for the LOGICAL backend instructions, which are a common target for the optimization, are combined into offsets over a single VGRF, so we can't easily identify the constant cases. The copy-prop pass makes this more visible for opt_zero_samples. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25742>	2023-11-09 03:56:28 +00:00
Caio Oliveira	daeab51a62	intel/compiler: Re-enable opt_zero_samples() for Gfx7+ Inadvertently, because of a sequence of changes elsewhere, this pass ended up not having any effect: - Before Gfx5 the optimization is not applicable. - On Gfx5-6 it doesn't apply because sampler operations don't currently use LOAD_PAYLOAD, but write the MOVs directly. Not clear to me whether they ever did. - On Gfx7+ it doesn't apply anymore because the logical sampler operations are now lowered directly to SENDs, and the is_tex() check would skip them. Since the LOAD_PAYLOAD implementation applies for Gfx7+ only, rework the pass to work again by handling SEND instructions. To make the pass easier, the optimization will happen before opt_split_sends() so only one LOAD_PAYLOAD needs to be cared for. Update the code to accept BAD_FILE sources in addition to zeros; these are added in some cases as padding and are effectively don't-care values, so we can assume they are zeros. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25742>	2023-11-09 03:56:28 +00:00
Caio Oliveira	ef8553082e	intel/compiler: Rework opt_split_sends to not rely/modify LOAD_PAYLOAD This is a preparation to (re-)enable opt_zero_samples(), which will reduce a SEND mlen before we split it. When that happens, opt_split_sends() won't be able to rely on the fact that mlen covers the entire LOAD_PAYLOAD. Since we are changing that, take the opportunity to also not modify the existing LOAD_PAYLOAD, just create two new ones with the exact set of sources. This allows the pass to be further simplified by iterating forward and not requiring live_variables analysis. The helper function was added so it can be used later for opt_zero_samples(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25742>	2023-11-09 03:56:28 +00:00
Caio Oliveira	e017bcae59	intel/compiler: Clarify the asserts in nir_load_workgroup_id lowering For Task/Mesh WorkgroupID is now lowered to WorkgroupIndex by the generic NIR pass, so we shouldn't hit this. We can now simplify the asserting code in emit_work_group_id_setup(). Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25977>	2023-11-08 17:18:36 -08:00
Kenneth Graunke	48f60f4c4b	intel/compiler: Convert the repclear shader to use send-from-GRF Sandybridge uses this code and needs MRFs, but all other platforms send from GRFs. Do that directly rather than relying on the MRF hack. Ivybridge and later also use SHADER_OPCODE_SEND directly rather than a virtual opcode that's handled in the generator, so we follow suit. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20172>	2023-10-30 23:03:23 +00:00
Kenneth Graunke	ef7d1b5f44	intel/compiler: Drop unused saturate handling in repclear shader We never set key->clamp_fragment_color when compiling the BLORP fast clear shaders. Besides, we were setting saturate on an FB write opcode, which...isn't even a thing. We would need it on the MOV, and weren't setting it there. So it wouldn't have even worked. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20172>	2023-10-30 23:03:23 +00:00
Kenneth Graunke	e6d9267d4f	intel/compiler: Delete repclear shader's special case for 1 color target This is basically just once through the loop but copy and pasted. One difference is that the single render target case used a headerless message, and the multiple render target case always used headers. Now we use headerless messages for the first render target, even in the multiple render target case. While we already have it set up for the other RTs, it's still 2 fewer registers to send. Minor improvement. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20172>	2023-10-30 23:03:23 +00:00
Kenneth Graunke	e6460fe66b	intel/compiler: Delete unused repclear shader uniform handling A long time ago, we used a uniform for the clear color. Back in 2014, we added support for using a flat input instead, as this was easier for Vulkan, but we left the option of using a uniform for OpenGL. Eventually nobody used the uniform approach anymore, but the compiler code to handle it remained. Drop the dead code. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20172>	2023-10-30 23:03:23 +00:00
Kenneth Graunke	b35f1fc910	intel/compiler: Delete unused emit_dummy_fs() This code is compiled out, but has been left in place in case we wanted to use it for debugging something. In the olden days, we'd use it for platform enabling. I can't think of the last time we did that, though. I also used to use it for debugging. If something was misrendering, I'd iterate through shaders 0..N, replacing them with "draw hot pink" until whatever shader was drawing the bad stuff was brightly illuminated. Once it was identified, I'd start investigating that shader. These days, we have frameretrace and renderdoc which are like, actual tools that let you highlight draws and replace shaders. So we don't need to resort to iterative driver hacks anymore. Again, I can't think of the last time I actually did that. So, this code is basically just dead. And it's using legacy MRF paths, which we could update...or we could just delete it. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20172>	2023-10-30 23:03:23 +00:00
Lionel Landwerlin	3f973a4f45	Revert "intel/fs: limit register flag interaction of FIND_LIVE_CHANNEL" This reverts commit `c9739e8912`. We don't have a full understanding of what is going on but reverting definitely fixes a hang. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c9739e8912` ("intel/fs: limit register flag interaction of FIND_LIVE_CHANNEL") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9868 Tested-By: Valentin Geyer <trayshar@t-online.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25563>	2023-10-13 08:37:28 +03:00
Alyssa Rosenzweig	c39896b17b	nir: Use getters for nir_src::parent_* First, we need to give the parent_instr field a unique name to be able to replace with a helper. We have parent_instr fields for both nir_src and nir_def, so let's rename nir_src::parent_instr in preparation for rework. This was done with a combination of sed and manual fix-ups. Then we use semantic patches plus manual fixups: @@ expression s; @@ -s->renamed_parent_instr +nir_src_parent_instr(s) @@ expression s; @@ -s.renamed_parent_instr +nir_src_parent_instr(&s) @@ expression s; @@ -s->parent_if +nir_src_parent_if(s) @@ expression s; @@ -s.renamed_parent_if +nir_src_parent_if(&s) @@ expression s; @@ -s->is_if +nir_src_is_if(s) @@ expression s; @@ -s.is_if +nir_src_is_if(&s) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>	2023-10-10 04:58:05 -04:00
Ian Romanick	bac10ef4aa	intel/fs: Add DP4A to get_lowered_simd_width While working on cooperative matrix support, I noticed some invalid DP4A instructions being generated. dp4a(32) g33<1>UD g21<8,8,1>UD g1.0<0,1,0>UD g9<1,1,1>UD This violates the constraint that the destination or a source can only access two consecutive GRFs. I'm a little surprised that validation didn't catch this. Perhaps because it's a 3 source instruction? Either way, it seems like a bigger project to fix that. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Fixes: `0f809dbf40` ("intel/compiler: Basic support for DP4A instruction") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25554>	2023-10-07 02:27:53 +00:00
Caio Oliveira	81bc09bf97	intel/fs: Tweak default case of fs_inst::size_read() In the default case, there's a special case with a few conditions. Prefer the cheapest conditions first, so we can take advantage of short-circuiting. Effect is a small but still significant reduction in shader compilation times, as can be seen by: - Fossil replay for Rise of the Tomb Raider ``` Difference at 95.0% confidence -0.433333 +/- 0.028609 -1.42556% +/- 0.0941163% (Student's t, pooled s = 0.0337886) ``` - Fossil replay for Batman Arkham City ``` Difference at 95.0% confidence -8.84 +/- 0.146083 -1.65932% +/- 0.0274207% (Student's t, pooled s = 0.125423) ``` Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25549>	2023-10-06 09:16:56 +00:00
Sviatoslav Peleshko	8f23b45252	intel/fs: Fix "packed word exception" condition for register regioning Fixes: `a6bf5f88` ("i965/fs: Enforce common regioning restrictions by SIMD splitting.") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9432 Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25378>	2023-10-05 01:41:42 +00:00
Francisco Jerez	53d1d793cb	intel/fs: Delete manual 'inst->mlen' calculations from all uses of logical URB writes. Rework: * Marcin: update emit_urb_indirect_vec4_write Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25195>	2023-09-27 23:57:25 +00:00
Francisco Jerez	34a2c9ce35	intel/fs: Specify number of data components of logical URB writes via control immediate. This is what most logical SEND messages do when they take a variable number of components. 'inst->mlen' is expected to be zero for logical SEND opcodes, which are expected to behave like plain arithmetic operations, so certain automated transformations (like SIMD lowering) can manipulate them without opcode-specific special-casing. Guessing the number of components from 'inst->mlen' has other disadvantages, because it requires duplicating the logic that infers the message payload size in every use of the instruction -- Instead we can just do the computation once during logical send lowering. In addition on LNL platform this causes the 'inst->mlen' field of URB writes to have units inconsistent with every other SEND instruction, which is likely to lead to confusion and bugs down the road. Rework: * Marcin: update emit_urb_indirect_vec4_write Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25195>	2023-09-27 23:57:25 +00:00

883 commits