fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-23 04:40:09 +01:00

Author	SHA1	Message	Date
Sviatoslav Peleshko	aea7366613	intel/brw: List all instructions that have BranchCtrl bit Previously this bit was not clearly documented in PRMs, but gfx12 PRMs finally list all the instructions where it is present. Although it's unclear if it's functional for anything other than "if", "else", and "goto", we probably still should acknowledge its existence in other instructions. Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31747>	2024-11-02 18:01:19 +00:00
Paulo Zanoni	5ca883505e	brw: add a NOP in between WHILE instructions on LNL This is a workaround that is still in progress, see HSD 22020521218. If we don't have these NOPs we may see rendering corruption or even GPU hangs. While we still don't fully understand the issue from the hardware point of view, let's have this workaround so we can pass CTS and move things forward. If we need to change this later, we can. Besides, the impact is minimal. Shaderdb/fossilize report no changes for this patch. On our Blackops trace, the lack of this patch causes corruption in fog rendering (rectangles where fog was supposed to be shown don't show the fog). On dEQP-VK.graphicsfuzz.cov-array-copies-loops-with-limiters, without this patch we get a GPU hang. Backport-to: 24.2 Testcase: dEQP-VK.graphicsfuzz.cov-array-copies-loops-with-limiters Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11813 Reviewed-by: Ivan Briano <ivan.briano@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31331>	2024-10-31 23:57:10 +00:00
Iván Briano	13db5fad27	brw: fix task/mesh push constant loading The InlineData passed to the shader is a fixed size unrelated to the register size. It happens to match pre-Xe2, but by considering it the same in Xe2, we ended up reading pushed constants from the wrong place when they didn't fit in the InlineData. Fixes: `97b17aa0b1` ("brw/nir: rework inline_data_intel to work with compute") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31856>	2024-10-26 18:12:41 +00:00
Jordan Justen	35ace9d4e2	intel/compiler: Xe2 and Xe3 use the same compaction tables Ref: bspec 56709 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31838>	2024-10-26 07:39:30 +00:00
Jordan Justen	688a673c5a	intel/brw: Allow Xe3 in brw_stage_has_packed_dispatch() Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31838>	2024-10-26 07:39:30 +00:00
Jordan Justen	cd33b7766a	intel/compiler: Add compiler enum for Xe3 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31838>	2024-10-26 07:39:30 +00:00
Ian Romanick	04e1783278	brw: Call brw_fs_opt_algebraic less often No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31729>	2024-10-25 23:39:36 +00:00
Ian Romanick	ac64b78f1f	brw/copy: Perform constant folding with constant propagation No shader-db or fossil-db changes on any Intel platform. v2: Simplify the logic for when to try constant folding. Do commute_immediates before constant folding. Both suggested by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31729>	2024-10-25 23:39:36 +00:00
Ian Romanick	2cc1575a31	brw/algebraic: Refactor constant folding out of brw_fs_opt_algebraic Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31729>	2024-10-25 23:39:36 +00:00
Ian Romanick	5dcad54902	brw/sat: Convert nearly all tests to use new style builders v2: Use new style builder for second ADD in other_non_saturated_use too. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31834>	2024-10-25 20:31:45 +00:00
Ian Romanick	19ae7aceb5	brw/sat: Fix small typos, copy and paste, etc. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31834>	2024-10-25 20:31:45 +00:00
Ian Romanick	de45273307	brw/builder: Add new style ALU3 builder Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31834>	2024-10-25 20:31:45 +00:00
Ian Romanick	8329c04521	brw/copy: Don't remove instructions w/ conditional modifier Fixes: `9e750f00c3` ("intel/brw: Make opt_copy_propagation_defs clean up its own trash") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31834>	2024-10-25 20:31:44 +00:00
Kenneth Graunke	d949d47f09	brw/emit: Fix align16 3src subregister encodings for HF types Prior to Cherryview, align16 3src instruction sources had to have their subregister number be DWord-aligned. Cherryview added a discontiguous bit in the encoding to represent bit 1 of the subregister number. This allows us to use packed HF sources. Update the ISA encoding helpers to properly handle bit 1. While we're at it, make them take a full subregister number and adjust accordingly, rather than making the callers divide or multiply by some alignment. Note that the destination subregister must still be DWord aligned, so HF destinations must be strided. Thanks to Ian Romanick for discovering that we were botching this. BSpec: 12054, 12081 v2 (idr): Fix ordering of high and low bit parameters to brw_inst_bits. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31834>	2024-10-25 20:31:44 +00:00
Kenneth Graunke	33cd5a49f1	brw/validate: Return an error for Align16 access mode on Icelake+ Gfx11+ doesn't support Align16 instructions anymore - only Align1 mode. Bailing early for Align16 is important so that brw_hw_decode_inst doesn't try to read Align16 related instruction fields on generations where they no longer exist (which could trigger assertions). Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31834>	2024-10-25 20:31:44 +00:00
Francisco Jerez	e2eba3c7da	intel/brw/xe2+: Adjust performance analysis divergence weight due to EU fusion removal. This reduces the penalty the heuristic gives to SIMD32 shaders relative to SIMD16 in presence of discard control flow on Xe2+. The penalty was meant to account for the inefficient divergence behavior of SIMD32 shaders on Gfx12.x platforms, since Gfx12 hardware had EUs bundled in groups of two, and each pair shared control flow logic so both EUs could only execute instructions in lockstep, which meant that SIMD32 shaders had an effective warp size of 64 on Gfx12.x. This change switches back to more optimistic modelling of discard divergence. With it we gain about 6% performance in a Shadow of the Tomb Raider trace (tested on BMG). One may wonder if there are still workloads that would suffer materially from enabling SIMD32 for all pixel shaders on Xe2 instead of using this heuristic, since Xe2 EUs have twice the GRF space, twice the FPU throughput and better divergence behavior than Xe, but the answer seems to be yes unfortunately: E.g. Superposition has some pixel shaders where SIMD32 has substantially worse scheduling due to the increased number of false dependencies due to higher register pressure, and using SIMD32 for them reduces performance significantly. The heuristic seems to model this correctly so it doesn't look like we can do without it at least right now on Xe2. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31697>	2024-10-24 22:06:52 +00:00
Kenneth Graunke	7bed11fbde	intel/brw: Allow immediates in the BFE instruction on Gfx12+ We weren't allowing immediates in BFE at all. Gfx12+ supports immediates in src0 (value) and src2 (width), but not src1 (offset). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31437>	2024-10-24 21:31:28 +00:00
Daniel Schürmann	87cb42f953	treewide: don't lower to LCSSA before calling nir_divergence_analysis() Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>	2024-10-24 10:06:17 +00:00
Daniel Schürmann	c8348139fd	nir: change signature of nir_src_is_divergent() Now, it takes nir_src * instead of nir_src. Also move the implementation to nir_divergence_analysis.c. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>	2024-10-24 10:06:17 +00:00
Paulo Zanoni	c0bceaf057	brw: don't emit instruction to add zero in spilling code When the spill_offset is zero, don't emit an instruction that adds zero. Results on MTL: - Shaders: instructions helped: shaders/blender/581.shader_test FS SIMD8: 6760 -> 6759 (-0.01%) (scheduled: none) instructions helped: shaders/blender/1017.shader_test FS SIMD8: 6760 -> 6759 (-0.01%) (scheduled: none) instructions helped: shaders/blender/1045.shader_test FS SIMD8: 6474 -> 6473 (-0.02%) (scheduled: none) instructions helped: shaders/blender/723.shader_test FS SIMD8: 6458 -> 6457 (-0.02%) (scheduled: none) instructions helped: shaders/blender/1042.shader_test FS SIMD8: 6458 -> 6457 (-0.02%) (scheduled: none) instructions helped: shaders/blender/917.shader_test FS SIMD8: 4900 -> 4897 (-0.06%) (scheduled: none) instructions helped: shaders/blender/455.shader_test FS SIMD8: 4832 -> 4829 (-0.06%) (scheduled: none) cycles helped: shaders/blender/917.shader_test FS SIMD8: 891856 -> 891832 (<.01%) (scheduled: none) cycles helped: shaders/blender/455.shader_test FS SIMD8: 894692 -> 894660 (<.01%) (scheduled: none) total instructions in shared programs: 1596934 -> 1596923 (<.01%) instructions in affected programs: 42642 -> 42631 (-0.03%) helped: 7 HURT: 0 - Fossils: Instrs: 151744378 -> 151741213 (-0.00%) Cycle count: 16007811131 -> 16007643963 (-0.00%); split: -0.00%, +0.00% Totals from 1353 (0.21% of 632545) affected shaders: Instrs: 3925143 -> 3921978 (-0.08%) Cycle count: 2292838118 -> 2292670950 (-0.01%); split: -0.01%, +0.00% RELATIVE IMPROVEMENTS - Instrs Before After Delta Percentage mesa/benchmarks/gravity_mark/3e9c48cebaddf012/cs/0 1947 1941 -6 -0.31% mesa/steam-native/red_dead_redemption2/571534e21fb7bd2a/fs.8/0 3431 3421 -10 -0.29% mesa/steam-dxvk/batman_arkham_city_goty/d783eacc9ebe324d/fs.8/0 717 715 -2 -0.28% mesa/steam-dxvk/batman_arkham_city_goty/14e0878a6a9605c9/fs.8/0 724 722 -2 -0.28% mesa/steam-dxvk/batman_arkham_city_goty/d859c2ae858269dc/fs.8/0 744 742 -2 -0.27% 
mesa/steam-dxvk/total_war_warhammer3/18b9d4a3b1961616/vs/0 1539 1535 -4 -0.26% mesa/steam-dxvk/total_war_warhammer3/a21827ce57dc0e29/vs/0 1539 1535 -4 -0.26% (and a bunch of others where the delta is -2, -4 or -6) Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31694>	2024-10-23 20:19:48 +00:00
Sviatoslav Peleshko	ebd6738260	intel/elk/chv: Implement WaClearArfDependenciesBeforeEot Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31746>	2024-10-23 15:02:27 +00:00
Sviatoslav Peleshko	2a4efe21c5	intel/brw/gfx9: Implement WaClearArfDependenciesBeforeEot Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11928 Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31746>	2024-10-23 15:02:27 +00:00
Kenneth Graunke	834b919f6a	brw: Optimize 16-bit texture fetches later At the point we were calling this, we hadn't necessarily cleaned up derefs via nir_lower_vars_to_ssa, nor movs/vecs via copy propagation, so it wasn't necessarily easy for this pass to see the actual usage of the destination. Moving this later allows us to detect f2f32(txf(...)) and avoid converting it to a 16-bit txf (why convert with ALU instructions when the sampler could do it for us?). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31750>	2024-10-22 01:15:10 +00:00
Caio Oliveira	019770f026	intel/brw: Add SHADER_OPCODE_VOTE_* Add opcodes for VOTE_ALL, VOTE_ANY and VOTE_EQUAL. The first two are also used for the quad variants. Move their lowering from NIR conversion to brw_lower_subgroup_ops. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31029>	2024-10-19 02:44:20 +00:00
Caio Oliveira	f20df2984d	intel/brw: Ensure BROADCAST() value respects register alignment If we have a non-register-aligned source, MOV it to a new register so that the invariant expected when generating SHADER_OPCODE_BROADCAST is respected. Added to ensure a later patch won't hit the `src.subnr == 0` assertion in brw_broadcast() generation code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31029>	2024-10-19 02:44:20 +00:00
Caio Oliveira	d97381efd8	intel/brw: Add fs_builder::BROADCAST() helper The helper already takes care of using exec_all() and taking the first component of the result. Both are expected by SHADER_OPCODE_BROADCAST. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31029>	2024-10-19 02:44:20 +00:00
Lionel Landwerlin	608d521086	elk: Don't apply discard_if condition opt if it can change results Replicates the change from `57344052b6` ("intel/brw: Don't apply discard_if condition opt if it can change results") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `0ba9497e66` ("intel/fs: Improve discard_if code generation") Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31604>	2024-10-18 01:57:58 +00:00
Lionel Landwerlin	97b17aa0b1	brw/nir: rework inline_data_intel to work with compute This intrinsic was initially dedicated to mesh/task shaders, but the mechanism it exposes also exists in the compute shaders on Gfx12.5+. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31508>	2024-10-17 19:35:59 +00:00
Lionel Landwerlin	1dc125338e	brw: fix mesh fence emission In SIMD32, the fence instruction is currently going to read grf0-3, leading to assertions like this in the backend: ../src/intel/compiler/brw_fs_reg_allocate.cpp:206: void fs_visitor::calculate_payload_ranges(bool, unsigned int, int*) const: Assertion `j < payload_node_count' failed. The reason we haven't seen the problem yet is that there are always enough payload registers to accommodate this. But the following change is going to make the inline parameter register optional. Since SHADER_OPCODE_MEMORY_FENCE is emitted in the generator as SIMD1 NoMask (see brw_memory_fence), we can limit ourselves to SIMD1 exec_all() in the IR as well so that the IR accounts for grf0 as a source. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31508>	2024-10-17 19:35:59 +00:00
Lionel Landwerlin	b2c5ca0ade	brw: remove rebuild single element special case No shader-db difference on DG2. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31508>	2024-10-17 19:35:59 +00:00
Lionel Landwerlin	19eb601cfc	brw: avoid clashing nested loop indices Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31508>	2024-10-17 19:35:59 +00:00
Lionel Landwerlin	f5d123b977	brw: delay printf lowering Useful to insert debug traces a bit later in the lowering process (in particular after load/store vectorization). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31508>	2024-10-17 19:35:59 +00:00
Lionel Landwerlin	be3f62af15	brw: remove unused prototype Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31508>	2024-10-17 19:35:59 +00:00
Georg Lehmann	cba575f4df	nir: always emit ddx intrinsics Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>	2024-10-17 09:50:19 +00:00
Georg Lehmann	6cb6bc7133	elk: remove alu fddx/fddy check Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>	2024-10-17 09:50:19 +00:00
Kenneth Graunke	4cb67cb07a	intel/brw: Use whole 512-bit registers in constant combining on Xe2 Xe2 increased the register size from 256-bits to 512-bits. So we can store 32 16-bit values in a register, rather than 16 values. Prior to this patch, we hadn't updated the pass, so the second half of each of our registers was unused. Backport-to: 24.2 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31499>	2024-10-15 18:14:37 +00:00
Kenneth Graunke	d9e5022650	intel/brw: Delete more Gfx8 code from brw_fs_combine_constants These platforms are supported by elk, not brw. Backport-to: 24.2 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31499>	2024-10-15 18:14:37 +00:00
Kenneth Graunke	dea61b7399	intel/brw: Fix register and builder size in emit_barrier() for Xe2 We were manually allocating 1 REG_SIZE for the barrier payload, which is only half a register on Xe2. This should eventually get allocated to a whole register anyway, but it's awkward in the meantime. Also, we were zero-initializing the header using group(8, 0) which only initialized half the register. The rest of the fields are Reserved MBZ, so they're likely unused and unread anyway - but it's better to zero-initialize them so we don't get random undefined, miserable-to-debug behavior. Backport-to: 24.2 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31499>	2024-10-15 18:14:37 +00:00
Kenneth Graunke	7c9eb8b289	intel/brw: Make a ubld temporary in emit_barrier() Saves typing .exec_all() in a lot of places. Backport-to: 24.2 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31499>	2024-10-15 18:14:37 +00:00
Kenneth Graunke	a9d9488788	intel/brw: Delete Gfx7-8 code from emit_barrier() Those are supported by elk, not brw. Backport-to: 24.2 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31499>	2024-10-15 18:14:37 +00:00
Kenneth Graunke	c747c1e1f4	intel/brw: Fix spill/fill count for load/store_scratch in SIMD32 Honestly, I don't know what I was thinking - we are emitting a single spill/fill message here, but were counting it as 2 spill/fills in SIMD32 shaders. So our eventual shader stat reporting would subtract the number of spills and fills from send_count, and get a negative number, wrapping around to just shy of UINT32_MAX. That's way too many sends. This is especially noticeable on Xe2 which often uses SIMD32 shaders. Backport-to: 24.2 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31499>	2024-10-15 18:14:37 +00:00
Marek Olšák	65ace5649b	nir: reject unsupported component counts from all vectorize callbacks If you allow an unsupported component count in the callback for loads, nir_opt_load_store_vectorize will align num_components to the next supported vector size, essentially overfetching. This changes all callbacks to reject it. AMD will enable it in a later commit. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Marek Olšák	02923e237d	nir: add hole_size parameter into the vectorize callback It will be used to allow merging loads with a hole between them. Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Caio Oliveira	b9787fcc80	intel/brw: Move emit_scan/emit_scan_step near its usage Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30496>	2024-10-11 06:40:29 +00:00
Caio Oliveira	0ba1159b0a	intel/brw: Add SHADER_OPCODE_*_SCAN Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30496>	2024-10-11 06:40:29 +00:00
Caio Oliveira	9537b62759	intel/brw: Add SHADER_OPCODE_REDUCE Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30496>	2024-10-11 06:40:29 +00:00
Caio Oliveira	4361a08254	intel/brw: Reduce scope of has_source_and_destination_hazard This predicate at the moment is only relevant during register allocation, so move it there and the code can ignore virtual instructions that were already lowered previously. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30496>	2024-10-11 06:40:29 +00:00
Caio Oliveira	bf9456753d	intel/brw: Validate that some instructions exist only up until some phases Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30496>	2024-10-11 06:40:29 +00:00
Caio Oliveira	affa7567c2	intel/brw: Add phases to backend The general idea is to be able to validate that certain instructions were lowered and certain restrictions were already handled. Passes can now assert their expectations, i.e. whether a pass is meant to run after certain lowerings or not. The actual phases are an initial stab, and as we reorganize the passes, we may remove/add phases. This commit just adds some phase steps; later commits will make use of them. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30496>	2024-10-11 06:40:29 +00:00
Caio Oliveira	21f78454bf	intel/brw: Fix Gfx9 3-src validation to handle FIXED_GRF Note this validation path is not being used at the moment, but will in a later commit. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30496>	2024-10-11 06:40:29 +00:00

4222 commits