fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 15:50:11 +01:00

Author	SHA1	Message	Date
Ian Romanick	20cce95ce5	brw/opt: Don't call brw_opt_copy_propagation before brw_lower_load_reg On a 36c/72t Xeon system, performance of replaying hogwarts_legacy.dx12vk-ultra.foz was improved 1.3% +/- 0.77% (n=10). I picked MTL for the fossil-db results because it was the most negative. shader-db: All Intel platforms had fairly similar results. (Lunar Lake) total instructions in shared programs: 16964217 -> 16964216 (<.01%) instructions in affected programs: 51777 -> 51776 (<.01%) helped: 20 / HURT: 27 total cycles in shared programs: 892934916 -> 893041912 (0.01%) cycles in affected programs: 51245298 -> 51352294 (0.21%) helped: 96 /HURT: 78 fossil-db: All Intel platforms had similar results. (Meteor Lake shown) Totals: Instrs: 233678547 -> 233678944 (+0.00%); split: -0.00%, +0.00% Cycle count: 24398049850 -> 24400490877 (+0.01%); split: -0.01%, +0.02% Max live registers: 42145052 -> 42145038 (-0.00%); split: -0.00%, +0.00% Totals from 1141 (0.14% of 805934) affected shaders: Instrs: 1546001 -> 1546398 (+0.03%); split: -0.01%, +0.03% Cycle count: 1201746062 -> 1204187089 (+0.20%); split: -0.14%, +0.34% Max live registers: 84247 -> 84233 (-0.02%); split: -0.03%, +0.01% Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Ian Romanick	991a2f510b	brw/sat: Eliminate non-defs saturate propagation The intervening_saturating_copy test is removed. The defs version of the pass does not handle this case. It should not occur often in practice anyway. Copy propagation and brw_nir_opt_fsat should prevent this scenario from happening. No shader-db changes on any Intel platform. fossil-db: All Intel platforms had similar results. (Lunar Lake shown) Totals: Instrs: 212677275 -> 212677278 (+0.00%) Cycle count: 30466062848 -> 30466056040 (-0.00%) Totals from 1 (0.00% of 706300) affected shaders: Instrs: 1343 -> 1346 (+0.22%) Cycle count: 411664 -> 404856 (-1.65%) v2: Stop counting ip. The non-defs part of the pass was the only thing that used it. v3: Also delete "if (block != def->block) continue;" code. I noticed this while working on some other changes to this function. It's the last thing in the loop, so it's totally useless. Delete some other spurious continues too. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> [v2] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Ian Romanick	cc5a6a5ae8	brw/sat: Convert tests to use load_reg This is in preparation for a commit that removes the non-defs version of the pass. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Ian Romanick	2d13acf9d9	brw: Add passes to generate and lower load_reg v2: Add support for WE_all instructions... this already just worked, so I only had to delete the check and the FINISHME comment. v3: Use logic more like def_analysis::update_for_reads to determine when to not insert LOAD_REG instructions. Based on a suggestion by Ken. v4: Eliminate "store" from all the names since STORE_REG does not exist anymore. Fold insert_load_reg into brw_insert_load_reg. Eliminate extra call to s.def_analysis.require() after progress. Pull a loop-invariant check out of the inst->sources loop. Drop call to brw_opt_split_virtual_grfs after lowering load_reg. All suggested by Caio. v5: Assert that LOAD_REG doesn't already exist in brw_insert_load_reg. Update comment before fully_defines. Both suggested by Caio. v6: Don't explicitly special-case SHADER_OPCODE_MEMORY_STORE_LOGICAL. Move the inst->dst.file != VGRF check earlier to avoid the loop over sources. Both suggested by Ken. Move the call to brw_insert_load_reg a little bit later, and explain why it's at that location. Suggested by Caio. v7: Many changes to the for-each-source loop in brw_insert_load_reg. Removes incorrect multiplication of s.alloc.sizes with reg_unit. Adds checks for matching SIMD size and NoMask in the search for pre-existing LOAD_REG of same value. v8: Add some unit tests. Suggested by Caio. 
shader-db: Lunar Lake total instructions in shared programs: 16923237 -> 16921895 (<.01%) instructions in affected programs: 450565 -> 449223 (-0.30%) helped: 251 / HURT: 377 total cycles in shared programs: 910428418 -> 889920590 (-2.25%) cycles in affected programs: 719248184 -> 698740356 (-2.85%) helped: 9076 / HURT: 9082 total fills in shared programs: 2242 -> 2218 (-1.07%) fills in affected programs: 116 -> 92 (-20.69%) helped: 2 / HURT: 0 total sends in shared programs: 848635 -> 848421 (-0.03%) sends in affected programs: 810 -> 596 (-26.42%) helped: 10 / HURT: 0 LOST: 82 GAINED: 78 Meteor Lake and DG2 had similar results. (Meteor Lake shown) total instructions in shared programs: 19875784 -> 19871694 (-0.02%) instructions in affected programs: 1050091 -> 1046001 (-0.39%) helped: 251 / HURT: 2403 total cycles in shared programs: 905328238 -> 882446458 (-2.53%) cycles in affected programs: 682736344 -> 659854564 (-3.35%) helped: 7869 / HURT: 7911 total spills in shared programs: 5512 -> 5032 (-8.71%) spills in affected programs: 1830 -> 1350 (-26.23%) helped: 8 / HURT: 0 total fills in shared programs: 5648 -> 4782 (-15.33%) fills in affected programs: 3312 -> 2446 (-26.15%) helped: 8 / HURT: 0 total sends in shared programs: 1032942 -> 1032722 (-0.02%) sends in affected programs: 572 -> 352 (-38.46%) helped: 10 / HURT: 0 LOST: 138 GAINED: 53 Tiger Lake total instructions in shared programs: 19711930 -> 19715591 (0.02%) instructions in affected programs: 1040623 -> 1044284 (0.35%) helped: 317 / HURT: 2474 total cycles in shared programs: 862988990 -> 860573870 (-0.28%) cycles in affected programs: 612392461 -> 609977341 (-0.39%) helped: 7447 / HURT: 7686 total sends in shared programs: 1034763 -> 1034555 (-0.02%) sends in affected programs: 784 -> 576 (-26.53%) helped: 8 / HURT: 0 LOST: 56 GAINED: 143 Ice Lake and Skylake had similar results. 
(Ice Lake shown) total instructions in shared programs: 20545461 -> 20545220 (<.01%) instructions in affected programs: 422405 -> 422164 (-0.06%) helped: 180 / HURT: 459 total cycles in shared programs: 872697345 -> 866874523 (-0.67%) cycles in affected programs: 573117917 -> 567295095 (-1.02%) helped: 6783 / HURT: 6980 total spills in shared programs: 4335 -> 4336 (0.02%) spills in affected programs: 90 -> 91 (1.11%) helped: 1 / HURT: 2 total fills in shared programs: 4194 -> 4196 (0.05%) fills in affected programs: 463 -> 465 (0.43%) helped: 1 / HURT: 2 total sends in shared programs: 1079446 -> 1079238 (-0.02%) sends in affected programs: 784 -> 576 (-26.53%) helped: 8 / HURT: 0 LOST: 117 GAINED: 37 fossil-db: All Intel platforms had similar results. (Lunar Lake shown) Totals: Instrs: 209708136 -> 209695617 (-0.01%); split: -0.02%, +0.01% Send messages: 10927753 -> 10927640 (-0.00%) Cycle count: 30540172048 -> 30427084732 (-0.37%); split: -0.99%, +0.62% Spill count: 511621 -> 510932 (-0.13%); split: -0.22%, +0.08% Fill count: 621166 -> 618440 (-0.44%); split: -0.56%, +0.12% Scratch Memory Size: 35574784 -> 35648512 (+0.21%); split: -0.06%, +0.26% Max live registers: 65453860 -> 65453140 (-0.00%); split: -0.00%, +0.00% Non SSA regs after NIR: 75374990 -> 35195764 (-53.31%) Totals from 503284 (71.25% of 706391) affected shaders: Instrs: 180203778 -> 180191259 (-0.01%); split: -0.02%, +0.01% Send messages: 9699732 -> 9699619 (-0.00%) Cycle count: 30080349592 -> 29967262276 (-0.38%); split: -1.01%, +0.63% Spill count: 511584 -> 510895 (-0.13%); split: -0.22%, +0.08% Fill count: 621120 -> 618394 (-0.44%); split: -0.56%, +0.12% Scratch Memory Size: 35443712 -> 35517440 (+0.21%); split: -0.06%, +0.27% Max live registers: 52566092 -> 52565372 (-0.00%); split: -0.01%, +0.00% Non SSA regs after NIR: 70110949 -> 29931723 (-57.31%) Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Ian Romanick	8b2be206f3	brw/algebraic: Constant folding for BROADCAST and SHUFFLE This prevents assertion failures in brw_eu_emit in a later commit in this MR. Even though they have not been previously observed, these assertion failures could happen even without that commit. No shader-db or fossil-db changes on any Intel platform. Fixes: `04e1783278` ("brw: Call brw_fs_opt_algebraic less often") v2: Add SHUFFLE. Suggested by Ken. Fixed indentation. v3: Update BROADCAST exec_size after rebasing on "brw/build: Use SIMD8 temporaries in emit_uniformize". v4: Explain why munging the exec_size is correct. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Ian Romanick	1b997c7bcc	brw/coalesce: Prepare brw_opt_register_coalesce for load_reg v2: Explain the problematic situation a little better in the comment. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Ian Romanick	15637334ce	brw/copy: Prepare copy_propagation for load_reg The changes to try_copy_propagate will be removed later in the series. v2: Fix up some comments to note that offset != 0 is allowed only when stride == 0. Apply same offset=0 restriction in try_copy_propagate_def too. Allow copy propagation if the source is either a def or UNIFORM. Don't copy prop a load_reg through a non-def value. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Ian Romanick	cfc50390fb	brw: Add basic infrastructure for load_reg pseudo op load_reg is something like load_payload except it has a single source. It copies the entire source to the destination. Its purpose is to convert a non-SSA VGRF into an SSA value. This copy is marked as volatile so that it will act as a scheduling barrier. v2: Fix some typos in the commit message. Eliminate the brw_builder::LOAD_REG overload that returns a brw_inst*. This is unlikely to ever be used. Add some checks to brw_validate. All suggested by Caio. v3: Force the source and destination types of the LOAD_REG to be integer. This will (eventually) simplify the creation of unit tests for the pass that adds LOAD_REG instructions. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Ian Romanick	b9656d51c0	brw/opt: Move non-SSA register accounting after first brw_opt_split_virtual_grfs v2: Move to immediately before the main optimization loop. Most importantly, this is after the first call to DCE. fossil-db: All Intel platforms had similar results. (Lunar Lake shown) Totals: Non SSA regs after NIR: 237045283 -> 100183460 (-57.74%); split: -58.12%, +0.39% Totals from 701423 (99.26% of 706657) affected shaders: Non SSA regs after NIR: 236868848 -> 100007025 (-57.78%); split: -58.17%, +0.39% Suggested-by: Ken Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31497>	2025-04-04 06:45:02 +00:00
Caleb Callaway	5ad00bae8b	intel/compiler: fix lingering i965 references Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34351>	2025-04-03 03:17:25 +00:00
Ian Romanick	e210b79ce3	brw/nir: Lower fsign again after last call to brw_nir_optimize No shader-db or fossil-db changes on any Intel platform. Fixes: `13332c23` ("intel/brw: Unconditionally run optimizations after nir_opt_uniform_subgroup") Closes: #12888 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34251>	2025-04-02 01:59:49 +00:00
Ian Romanick	ca95cb8178	brw: Fix typo in comment Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34251>	2025-04-02 01:59:49 +00:00
irql-notlessorequal	255166a349	elk: always write the VUE header ELK equivalent of !34211, also required to avoid potential rendering errors with hasvk. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34298>	2025-03-31 16:56:13 +00:00
irql-notlessorequal	fe7e0fd4f1	elk: ensure VUE header writes in HS/DS/GS stages ELK equivalent of !34041, required to avoid potential rendering errors with VK_KHR_maintenance5 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34298>	2025-03-31 16:56:13 +00:00
Lionel Landwerlin	4346210ae6	brw: move texture offset packing to NIR That way we can deal with upcoming non constant values for VK_KHR_maintenance8. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33138>	2025-03-29 02:15:18 +00:00
Lionel Landwerlin	67ae49dede	intel: move lower_texture to brw Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33138>	2025-03-29 02:15:18 +00:00
Lionel Landwerlin	86773b2ba6	brw: don't lower tg4 offsets without LOD The problem this fixes is currently hidden because of the order in which we run nir_lower_tex & intel_nir_lower_texture. The issue is that nir_lower_tex removes the LOD source in some cases and the second run of nir_lower_tex can add it back. This is also only needed on Gfx12.5+ if the LOD is present. Finally move all of the texture lowering to the postprocess phase. No need to run this multiple times. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33138>	2025-03-29 02:15:18 +00:00
Lionel Landwerlin	b87dccc64c	elk: stop using intel_nir_lower_texture It's not doing anything. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33138>	2025-03-29 02:15:18 +00:00
Caio Oliveira	63224f64cc	brw: Remove adjust_block_ips and brw_inst::remove() with defer Now that the brw_ip_ranges analysis is being used, there's no need to track start_ip/end_ips in the blocks as they are mutated. And also no need to call adjust_block_ips at the end of some passes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>	2025-03-29 00:25:51 +00:00
Caio Oliveira	8057cfc49d	brw: Use brw_ip_ranges in liveness analysis Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>	2025-03-29 00:25:51 +00:00
Caio Oliveira	a6b0783375	brw: Use brw_ip_ranges in scheduling / regalloc Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>	2025-03-29 00:25:51 +00:00
Caio Oliveira	3659d36087	brw: Use brw_ip_ranges in passes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>	2025-03-29 00:25:50 +00:00
Caio Oliveira	10660f5adf	brw: Add analysis for block IP ranges Calculate the IP ranges of the shader as an analysis pass. This will later replace the existing tracking of start_ip/end_ip as the blocks are changed (and the defer/adjust scheme to avoid too much churn when that happens). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>	2025-03-29 00:25:50 +00:00
Caio Oliveira	fd6045cca9	brw: Track total_instructions in a shader Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>	2025-03-29 00:25:50 +00:00
Caio Oliveira	7224b653b5	brw: Use block's num_instructions in scoreboard tests Stop using the start_ip / end_ip, these are not really important for those tests. What the tests cared about was the number of instructions in the block to check for changes and ensure we can peek at them by index. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>	2025-03-29 00:25:50 +00:00
Caio Oliveira	1139ede508	brw: Track num_instructions in a block Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>	2025-03-29 00:25:50 +00:00
Caio Oliveira	abe8d35cb8	brw: Remove brw_cfg::dump() It was used by the pass tests to verify output with TEST_DEBUG=1, replace it with brw_print_instructions(). The output is slightly different (not printing IP, not reordering the blocks), we can add those features as we need, but given the usage was already very reduced, don't bother with that until need arises. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>	2025-03-29 00:25:50 +00:00
Kenneth Graunke	51c67ad7cf	brw: Avoid regioning restrictions for u2u16/i2i16 narrowing conversions Cuts 0.83% of instructions on Alchemist in affected fossil-db shaders (nearly all of which are in parallel-rdp). Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31833>	2025-03-28 13:40:07 +00:00
Kenneth Graunke	86f8b8860e	brw: Use a smaller type for masked sub-32-bit shift values Cuts 0.14% of instructions on Alchemist in affected fossil-db shaders (all of which are in parallel-rdp). Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31833>	2025-03-28 13:40:07 +00:00
Kenneth Graunke	2e108afb8c	brw: Skip unnecessary UNDEFs for comparisons For example, SIMD16 W/UW fills an entire REG_SIZE so UNDEF isn't needed. No change in fossil-db. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31833>	2025-03-28 13:40:07 +00:00
Kenneth Graunke	771e65b0db	brw: Emit UNDEF as needed in SSA-style builder helpers Should prevent regressions in a future commit. fossil-db does show small changes, but it ends up a wash at 0.0%. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31833>	2025-03-28 13:40:07 +00:00
Kenneth Graunke	b89e269a46	brw: Make a helper to emit UNDEF for temporaries containing small types Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31833>	2025-03-28 13:40:07 +00:00
Sagar Ghuge	191d1e7345	intel/compiler: Don't lower 64bit data memory access on LSC Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34189>	2025-03-28 03:07:56 +00:00
Lionel Landwerlin	4db4bd1d04	brw: always write the VUE header In `35df3925ca` ("brw: ensure VUE header writes in HS/DS/GS stages") I misread the PRMs and thought that the VF would initialize the header. What actually happens is that the VF does not write valid values in there and the PRMs explicitly say that the VS shader should overwrite whatever is in there. We could avoid writing the header in some cases when no HW is going to read back the header. For example with rendering disabled through 3DSTATE_STREAMOUT::RenderingDisable. But those cases are dynamic and the compiler is not able to tell. So just always write the header. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `35df3925ca` ("brw: ensure VUE header writes in HS/DS/GS stages") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12880 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34211>	2025-03-27 07:42:23 +00:00
Caio Oliveira	72aefea0a0	brw: Fix disassembler trying to decode 3src_hstride in Gfx9 This field is not encoded for Gfx9, so use the fixed value that makes sense for that platform. Fixes: `9dfff2cb14` ("brw: Allow generating destination with stride 2 in 3-src instructions") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12881 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34197>	2025-03-26 18:12:46 +00:00
Caio Oliveira	e384ccde28	brw: Expand EU validation for DPAS Allow BFloat16 types when supported and allow destination/accumulator to match the other source types in Gfx20+. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34035>	2025-03-25 07:38:08 +00:00
Caio Oliveira	6cec413a78	brw: Add EU assembler support for bfloat16 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33664>	2025-03-25 05:23:37 +00:00
Caio Oliveira	e37b707bd0	brw: Consider bfloat16 in scoreboard Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33664>	2025-03-25 05:23:37 +00:00
Caio Oliveira	62323a934b	brw: Add BRW_TYPE_BF validation Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33664>	2025-03-25 05:23:37 +00:00
Caio Oliveira	9916cc1050	brw: Add BRW_TYPE_BF for bfloat16 Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33664>	2025-03-25 05:23:37 +00:00
Caio Oliveira	d1f4fb8eee	brw: Make some integer check more explicit Use the positive ("is int?") check when applicable. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33664>	2025-03-25 05:23:37 +00:00
Caio Oliveira	c3d2ba6973	brw: Remove prefix gfx10 from enum types The values already use BRW, make it consistent. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33664>	2025-03-25 05:23:37 +00:00
Caio Oliveira	9dfff2cb14	brw: Allow generating destination with stride 2 in 3-src instructions Will be useful for testing BFloat16 in later patches. No change expected to the compiler itself. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33664>	2025-03-25 05:23:37 +00:00
Caio Oliveira	676b874ca9	brw: Fix decoding of 3-src destination stride in EU validation Fixes: `f1036da345` ("intel/brw: Add vstride/width/hstride to brw_hw_decoded_inst") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33664>	2025-03-25 05:23:37 +00:00
Caio Oliveira	89a87fab66	brw: Remove extra SHADER_OPCODE_FLOW emitted during NIR conversion The DO() helper already emits a FLOW. Fixes: `d2c39b1779` ("intel/brw: Always have a (non-DO) block after a DO in the CFG") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33954>	2025-03-25 02:05:26 +00:00
Caio Oliveira	c01655370d	brw: Add assembler support for DPAS Allow us to parse instructions in a form we currently generate ``` dpas.8x8(8) g55<1>F g47<1,1,0>F g31<1,1,0>HF g39<1,1,0>HF { align1 WE_all 1Q $4 }; ``` Regions are not really needed, but this will be handled in a later patch (that will also stop printing the regions). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34031>	2025-03-25 01:40:02 +00:00
Connor Abbott	7a55e13939	nir, compiler: Rename needs_quad_helper_invocations This currently treats coarse and fine derivatives the same, but Qualcomm needs to know whether just coarse derivatives are used or fine derivatives/quad ops are also used. Rename this to needs_coarse_quad_helper_invocations to make clear the difference from the new field, needs_full_quad_helper_invocations. Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com> Fixes: `264d8a6766` ("ir3: Set need_full_quad depending on info.fs.require_full_quads") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33862>	2025-03-14 21:55:57 +00:00
Matt Turner	ed42dc56f5	intel/compiler: Use correct enum type Fixes: `ce7208c3ee` ("brw: add support for texel address lowering") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34014>	2025-03-13 20:11:10 +00:00
Matt Turner	d5dcc6a5c4	intel/compiler: Add missing breaks Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34014>	2025-03-13 20:11:10 +00:00
Matt Turner	0a63d629fe	intel/compiler: Use unreachable instead of assert(!"...") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34014>	2025-03-13 20:11:10 +00:00

4222 commits