fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 00:30:13 +01:00

Author	SHA1	Message	Date
Caio Oliveira	c621f75e7b	intel/brw: Remove now unused vec4-only opcodes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>	2024-02-28 05:45:38 +00:00
Caio Oliveira	a641aa294e	intel/brw: Remove vec4 backend It still exists as part of ELK for older gfx versions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27691>	2024-02-28 05:45:37 +00:00
Ian Romanick	8fb37ef985	intel/fs: Add fast path for ballot(true) This doesn't help very much now. A later commit adds a NIR optimization pass, tentatively called nir_opt_uniform_subgroup, that converts many kinds of subgroup operations to things involving bitCount(ballot(true)). This commit makes a huge difference in the results of that later commit. No shader-db changes on any Intel platform. Fossil-db results: All Intel platforms had similar results. (Ice Lake shown) Totals: Instrs: 165558033 -> 165557519 (-0.00%) Cycles: 15156188362 -> 15156178922 (-0.00%); split: -0.00%, +0.00% Totals from 299 (0.05% of 656117) affected shaders: Instrs: 88293 -> 87779 (-0.58%) Cycles: 3709498 -> 3700058 (-0.25%); split: -0.28%, +0.03% v2: Rebase on splitting ELK from BRW. Remove devinfo->ver >= 8 check. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>	2024-02-27 08:37:46 -08:00
Dave Airlie	8f73cc802c	intel/compiler: revert part of "Move earlier scheduler code that is not mode-specific" This removed a bunch of calls from the vec4 code that aren't called anywhere else. Bring back the bits that were removed. Fixes glxgears on gen5 Fixes: `81594d0db1` ("intel/compiler: Move earlier scheduler code that is not mode-specific") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26862>	2024-01-04 00:38:38 +00:00
Ian Romanick	e666872c75	intel/compiler: Initial bits for DPAS instruction v2: Add brw_ir_performance.cpp and brw_fs_generator.cpp changes. Fix overlapping register allocation (via has_source_and_destination_hazard). Fix incorrect destination register file encoding. v3: Prevent lower_regioning from trying to "fix" DPAS sources. v4: Add instruction latency information for scheduling and perf estimates. v5: Remove all mention of DPASW. Suggested by Curro and Caio. Update the comment in fs_inst::has_source_and_destination_hazard. Suggested by Caio. v6: Add some comments near the src2 calculation in fs_inst::size_read. Suggested by Caio. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>	2023-12-29 20:24:16 -08:00
Caio Oliveira	dcb68de656	intel/compiler: Clear up block instructions before re-adding them Avoids fixing up list pointers that we don't care about anymore -- since all the instructions will be re-added in a different order anyway. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>	2023-11-13 23:05:47 +00:00
Caio Oliveira	a9f95bf687	intel/compiler: Reuse same scheduler for all pre-RA scheduling modes Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>	2023-11-13 23:05:47 +00:00
Caio Oliveira	0dd5378ffe	intel/compiler: Make scheduler classes take an external mem_ctx Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>	2023-11-13 23:05:47 +00:00
Caio Oliveira	04aa2df461	intel/compiler: Separate schedule_node temporary data Some fields in schedule_node will need to be reset each time they are used. The `cand_generation` needs to be back to zero, and both `unblocked_time` and `parent_count` need to be back to their initial values, which were pre-calculated. Rename the initial data fields and add new ones for the temporary data. Note the helper function is `per node` to allow it to "tag along" with existing loops. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>	2023-11-13 23:05:47 +00:00
Caio Oliveira	81594d0db1	intel/compiler: Move earlier scheduler code that is not mode-specific This will be useful later on when we reuse the same scheduler for multiple modes. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>	2023-11-13 23:05:47 +00:00
Caio Oliveira	73d4e4118a	intel/compiler: Tidy up code in scheduler related to reads_remaining - Just assert in functions we expect it to exist - Predicate usage with `!post_reg_alloc` to avoid suggesting there are more combinations. - Reuse an existing loop to call the count function. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>	2023-11-13 23:05:47 +00:00
Caio Oliveira	4f246cf4e7	intel/compiler: Merge child/latency arrays in schedule_node Values are used together, saves one pointer in schedule_node, reduces amount of reallocations when children count grows. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>	2023-11-13 23:05:47 +00:00
Caio Oliveira	e59a054203	intel/compiler: Move FS specific fields to fs_instruction_scheduler Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>	2023-11-13 23:05:47 +00:00
Caio Oliveira	a6297d05ca	intel/compiler: Remove virtual calls from scheduler Pull run() and schedule_instructions() for fs, and pull a very simplified version of those into a run() for vec4. Because of the previous patches the duplication is small. Since we are touching these, change run() implementations to use the cfg from the existing reference to the visitor/shader instead of taking one as argument. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>	2023-11-13 23:05:47 +00:00
Caio Oliveira	d76d58cf50	intel/compiler: Cache issue_time information Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>	2023-11-13 23:05:47 +00:00
Caio Oliveira	ecd7ffcf78	intel/compiler: Extract scheduling related basic functions Those will be used in multiple places later. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>	2023-11-13 23:05:47 +00:00
Caio Oliveira	8a8dd2db0c	intel/compiler: Add only available instructions to scheduling list The list was used for iterating through all instructions and then later also to track the available ones. Now that the array iteration is used, change how we fill it and rename it to reflect its only job. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>	2023-11-13 23:05:47 +00:00
Caio Oliveira	ddff6428c5	intel/compiler: Use array to iterate the scheduler nodes For all the preparation data collection before the scheduling actually happens, it is possible to walk the schedule nodes in order by iterating on the range of the array dedicated to a given block. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>	2023-11-13 23:05:47 +00:00
Caio Oliveira	fe6ac5a184	intel/compiler: Allocate all schedule_nodes at once Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>	2023-11-13 23:05:47 +00:00
Caio Oliveira	be012055da	intel/compiler: Remove reference to brw_isa_info from schedule_node It is always the same for all nodes, so use the one available in the scheduler itself. Also, per Matt's suggestion, collect is_haswell from devinfo instead of from a function argument. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>	2023-11-13 23:05:47 +00:00
Caio Oliveira	6987571737	intel/compiler: Use linear allocator in parts of brw_schedule_instructions Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25841>	2023-11-13 23:05:47 +00:00
Francisco Jerez	80e9031b44	intel/fs/xe2+: Fix grf_count in post-RA scheduling for updated register file size. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25020>	2023-09-20 17:19:36 -07:00
Kenneth Graunke	7eba19245d	intel/compiler: Move SCHEDULE_NONE handling into schedule_instructions() I'm going to introduce another call site for this function, and just handling SCHEDULE_NONE in the scheduler itself makes more sense than duplicating the logic. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24707>	2023-08-23 21:34:38 +00:00
Emma Anholt	10b94772d2	intel: Reduce cost of resetting last_grf_write. In zink-on-anv fs-mod-dvec3-dvec3.shader_test, we were memsetting 2MB of last_grf_write 2400 times, multiple times through the scheduler. Just resetting for the processed instructions reduces runtime from 21s to 16s. No change on steam shader-db runtime across several runs. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23635>	2023-06-14 16:16:56 +00:00
Emma Anholt	7d4769e802	intel: Allocate the last_grf_write once per scheduler. No need to re-calloc it per block when we're going to use it again. Also, this fixes the vec4 backend to avoid allocating giant grf_count-sized arrays on the stack. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23635>	2023-06-14 16:16:56 +00:00
Emma Anholt	2ad865b219	intel: Count reads_remaining across all blocks. We were zeroing it out per block, but it doesn't actually help to count per block, since the question is "will scheduling this instruction free the reg?". Saves some memsetting, which was showing up high in the profile (but not from this source). No change on iris SKL shader-db. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23635>	2023-06-14 16:16:55 +00:00
Lionel Landwerlin	a66944dfbc	intel/fs: reuse descriptor helper Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:36 +00:00
Lionel Landwerlin	9471ffa70a	intel/fs: fix scheduling of HALT instructions With the following test : dEQP-VK.spirv_assembly.instruction.terminate_invocation.terminate.no_out_of_bounds_load There is a : shader_start: ... <- no control flow g0 = some_alu g1 = fbl g2 = broadcast g3, g1 g4 = get_buffer_size g2 ... <- no control flow halt <- on some lanes g5 = send <surface>, g4 eliminate_find_live_channel will remove the fbl/broadcast because it assumes lane0 is active at get_buffer_size : shader_start: ... <- no control flow g0 = some_alu g4 = get_buffer_size g0 ... <- no control flow halt <- on some lanes g5 = send <surface>, g4 But then the instruction scheduler will move the get_buffer_size after the halt : shader_start: ... <- no control flow halt <- on some lanes g0 = some_alu g4 = get_buffer_size g0 g5 = send <surface>, g4 get_buffer_size pulls the surface index from lane0 in g0 which could have been turned off by the halt and we end up accessing an invalid surface handle. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20765>	2023-05-05 00:43:25 +03:00
Jason Ekstrand	714a291673	intel/compiler: Use SHADER_OPCODE_SEND for PI messages Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>	2023-02-06 09:12:17 +00:00
Lionel Landwerlin	13cca48920	intel/fs: drop FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GFX7 We can lower FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD into other more generic sends and drop this internal opcode. The idea behind this change is to allow bindless surfaces to be used for UBO pulls and why it's interesting to be able to reuse setup_surface_descriptors(). But that will come in a later change. No shader-db changes on TGL & DG2. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20416>	2023-01-26 11:26:53 +00:00
Ian Romanick	bdc7668008	intel/fs: Lower URB messages to SEND Before rebasing on top of Ken's split-SEND optimization (see !17018), this commit just caused some scheduling changes in various tessellation and geometry shaders. These changes were caused by the addition of real latency information for the URB messages. With the addition of the split-SEND optimization, the changes are... staggering. All of the shaders helped for spills and fills are vertex shaders from Batman Arkham Origins. What surprises me is that these shaders account for such a high percentage of the spills and fills in fossil-db. 85%?!? v2: Use FIXED_GRF instead of BRW_GENERAL_REGISTER_FILE in an assertion. Suggested by Ken. Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown) total instructions in shared programs: 20013625 -> 19954020 (-0.30%) instructions in affected programs: 4007157 -> 3947552 (-1.49%) helped: 31161 HURT: 0 helped stats (abs) min: 1 max: 400 x̄: 1.91 x̃: 2 helped stats (rel) min: 0.08% max: 59.70% x̄: 2.20% x̃: 1.83% 95% mean confidence interval for instructions value: -1.97 -1.86 95% mean confidence interval for instructions %-change: -2.22% -2.18% Instructions are helped. total cycles in shared programs: 859337569 -> 858636788 (-0.08%) cycles in affected programs: 74168298 -> 73467517 (-0.94%) helped: 13812 HURT: 16846 helped stats (abs) min: 1 max: 291078 x̄: 82.83 x̃: 4 helped stats (rel) min: <.01% max: 37.09% x̄: 3.47% x̃: 2.02% HURT stats (abs) min: 1 max: 1543 x̄: 26.31 x̃: 14 HURT stats (rel) min: <.01% max: 77.97% x̄: 4.11% x̃: 2.58% 95% mean confidence interval for cycles value: -55.10 9.39 95% mean confidence interval for cycles %-change: 0.62% 0.77% Inconclusive result (value mean confidence interval includes 0). 
Broadwell total cycles in shared programs: 904844939 -> 904832320 (<.01%) cycles in affected programs: 525360 -> 512741 (-2.40%) helped: 215 HURT: 4 helped stats (abs) min: 4 max: 1018 x̄: 60.16 x̃: 39 helped stats (rel) min: 0.14% max: 15.85% x̄: 2.16% x̃: 2.04% HURT stats (abs) min: 79 max: 79 x̄: 79.00 x̃: 79 HURT stats (rel) min: 1.31% max: 1.57% x̄: 1.43% x̃: 1.43% 95% mean confidence interval for cycles value: -75.02 -40.22 95% mean confidence interval for cycles %-change: -2.37% -1.81% Cycles are helped. No shader-db changes on any older Intel platforms. Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown) Instructions in all programs: 142622800 -> 141461114 (-0.8%) Instructions helped: 197186 Cycles in all programs: 9101223846 -> 9099440025 (-0.0%) Cycles helped: 37963 Cycles hurt: 151233 Spills in all programs: 98829 -> 13695 (-86.1%) Spills helped: 2159 Fills in all programs: 128142 -> 18400 (-85.6%) Fills helped: 2159 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17379>	2022-07-08 19:45:34 +00:00
Kenneth Graunke	72e9843991	intel/compiler: Introduce a new brw_isa_info structure This structure will contain the opcode mapping tables in the next commit. For now, this is the mechanical change to plumb it into all the necessary places, and it continues simply holding devinfo. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17309>	2022-06-30 23:46:35 +00:00
Lionel Landwerlin	361b3fee3c	intel: move away from booleans to identify platforms v2: Drop changes around GFX_VERx10 == 75 (Luis) v3: Replace (GFX_VERx10 < 75 && devinfo->platform != INTEL_PLATFORM_BYT) by (devinfo->platform == INTEL_PLATFORM_IVB) Replace (devinfo->ver >= 5 || devinfo->platform == INTEL_PLATFORM_G4X) by (devinfo->verx10 >= 45) Replace (devinfo->platform != INTEL_PLATFORM_G4X) by (devinfo->verx10 != 45) v4: Fix crocus typo v5: Rebase v6: Add GFX3, ILK & I965 platforms (Jordan) Move ifdef to code expressions (Jordan) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12981>	2021-11-08 16:48:06 +00:00
Dave Airlie	8a81d14271	intel/vec4: sel.cond writes the flags on Gfx4 and Gfx5 This is the equivalent of idr's intel/fs: sel.cond writes the flags on Gfx4 and Gfx5 except for the vec4 backend. This fixes buggy rendering seen with crocus on a qt trace. v2 (idr): Trivial whitespace change. Add unit tests. v3: Fix type in comment in unit tests. Noticed by Jason and Priit. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Iron Lake total instructions in shared programs: 8183077 -> 8184543 (0.02%) instructions in affected programs: 198990 -> 200456 (0.74%) helped: 0 HURT: 1355 HURT stats (abs) min: 1 max: 8 x̄: 1.08 x̃: 1 HURT stats (rel) min: 0.29% max: 6.00% x̄: 0.99% x̃: 0.70% 95% mean confidence interval for instructions value: 1.04 1.12 95% mean confidence interval for instructions %-change: 0.96% 1.03% Instructions are HURT. total cycles in shared programs: 238967672 -> 238962784 (<.01%) cycles in affected programs: 4666014 -> 4661126 (-0.10%) helped: 406 HURT: 314 helped stats (abs) min: 4 max: 54 x̄: 22.46 x̃: 18 helped stats (rel) min: <.01% max: 12.80% x̄: 1.82% x̃: 0.65% HURT stats (abs) min: 2 max: 112 x̄: 13.48 x̃: 12 HURT stats (rel) min: <.01% max: 7.82% x̄: 0.81% x̃: 0.16% 95% mean confidence interval for cycles value: -8.60 -4.98 95% mean confidence interval for cycles %-change: -0.87% -0.49% Cycles are helped. GM45 total instructions in shared programs: 4986888 -> 4988354 (0.03%) instructions in affected programs: 198990 -> 200456 (0.74%) helped: 0 HURT: 1355 HURT stats (abs) min: 1 max: 8 x̄: 1.08 x̃: 1 HURT stats (rel) min: 0.29% max: 6.00% x̄: 0.99% x̃: 0.70% 95% mean confidence interval for instructions value: 1.04 1.12 95% mean confidence interval for instructions %-change: 0.96% 1.03% Instructions are HURT. 
total cycles in shared programs: 153577826 -> 153572938 (<.01%) cycles in affected programs: 4666014 -> 4661126 (-0.10%) helped: 406 HURT: 314 helped stats (abs) min: 4 max: 54 x̄: 22.46 x̃: 18 helped stats (rel) min: <.01% max: 12.80% x̄: 1.82% x̃: 0.65% HURT stats (abs) min: 2 max: 112 x̄: 13.48 x̃: 12 HURT stats (rel) min: <.01% max: 7.82% x̄: 0.81% x̃: 0.16% 95% mean confidence interval for cycles value: -8.60 -4.98 95% mean confidence interval for cycles %-change: -0.87% -0.49% Cycles are helped. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12191>	2021-08-11 13:09:32 -07:00
Ian Romanick	38807ceeae	intel/fs: sel.cond writes the flags on Gfx4 and Gfx5 On Gfx4 and Gfx5, sel.l (for min) and sel.ge (for max) are implemented using a separate cmpn and sel instruction. This lowering occurs in fs_visitor::lower_minmax which is called very, very late... a long, long time after the first calls to opt_cmod_propagation. As a result, conditional modifiers can be incorrectly propagated across sel.cond on those platforms. No tests were affected by this change, and I find that quite shocking. After just changing flags_written(), all of the atan tests started failing on ILK. That required the change in cmod_propagation (and the addition of the prop_across_into_sel_gfx5 unit test). Shader-db results for ILK and GM45 are below. I looked at a couple before and after shaders... and every case that I looked at had experienced incorrect cmod propagation. This affected a LOT of apps! Euro Truck Simulator 2, The Talos Principle, Serious Sam 3, Sanctum 2, Gang Beasts, and on and on... :( I discovered this bug while working on a couple new optimization passes. One of the passes attempts to remove condition modifiers that are never used. The pass made no progress except on ILK and GM45. After investigating a couple of the affected shaders, I noticed that the code in those shaders looked wrong... investigation led to this cause. v2: Trivial changes in the unit tests. v3: Fix type in comment in unit tests. Noticed by Jason and Priit. v4: Tweak handling of BRW_OPCODE_SEL special case. Suggested by Jason. 
Fixes: `df1aec763e` ("i965/fs: Define methods to calculate the flag subset read or written by an fs_inst.") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Dave Airlie <airlied@redhat.com> Iron Lake total instructions in shared programs: 8180493 -> 8181781 (0.02%) instructions in affected programs: 541796 -> 543084 (0.24%) helped: 28 HURT: 1158 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.35% max: 0.86% x̄: 0.53% x̃: 0.50% HURT stats (abs) min: 1 max: 3 x̄: 1.14 x̃: 1 HURT stats (rel) min: 0.12% max: 4.00% x̄: 0.37% x̃: 0.23% 95% mean confidence interval for instructions value: 1.06 1.11 95% mean confidence interval for instructions %-change: 0.31% 0.38% Instructions are HURT. total cycles in shared programs: 239420470 -> 239421690 (<.01%) cycles in affected programs: 2925992 -> 2927212 (0.04%) helped: 49 HURT: 157 helped stats (abs) min: 2 max: 284 x̄: 62.69 x̃: 70 helped stats (rel) min: 0.04% max: 6.20% x̄: 1.68% x̃: 1.96% HURT stats (abs) min: 2 max: 48 x̄: 27.34 x̃: 24 HURT stats (rel) min: 0.02% max: 2.91% x̄: 0.31% x̃: 0.20% 95% mean confidence interval for cycles value: -0.80 12.64 95% mean confidence interval for cycles %-change: -0.31% <.01% Inconclusive result (value mean confidence interval includes 0). GM45 total instructions in shared programs: 4985517 -> 4986207 (0.01%) instructions in affected programs: 306935 -> 307625 (0.22%) helped: 14 HURT: 625 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.35% max: 0.82% x̄: 0.52% x̃: 0.49% HURT stats (abs) min: 1 max: 3 x̄: 1.13 x̃: 1 HURT stats (rel) min: 0.12% max: 3.90% x̄: 0.34% x̃: 0.22% 95% mean confidence interval for instructions value: 1.04 1.12 95% mean confidence interval for instructions %-change: 0.29% 0.36% Instructions are HURT. 
total cycles in shared programs: 153827268 -> 153828052 (<.01%) cycles in affected programs: 1669290 -> 1670074 (0.05%) helped: 24 HURT: 84 helped stats (abs) min: 2 max: 232 x̄: 64.33 x̃: 67 helped stats (rel) min: 0.04% max: 4.62% x̄: 1.60% x̃: 1.94% HURT stats (abs) min: 2 max: 48 x̄: 27.71 x̃: 24 HURT stats (rel) min: 0.02% max: 2.66% x̄: 0.34% x̃: 0.14% 95% mean confidence interval for cycles value: -1.94 16.46 95% mean confidence interval for cycles %-change: -0.29% 0.11% Inconclusive result (value mean confidence interval includes 0). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12191>	2021-08-11 13:09:20 -07:00
Lionel Landwerlin	91dcbf1f56	intel/compiler: Track latency/perf of LSC fences Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11759>	2021-07-12 11:39:03 +00:00
Sagar Ghuge	621cf9b1df	intel/fs: Lower Byte scattered r/w messages to LSC when available v2 (Jason Ekstrand): - Squash in brw_scheduler changes - Update brw_ir_performance Co-authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Sagar Ghuge	8f82c8aa1a	intel/fs: Lower untyped float atomic messages to LSC when available Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Mark Janes	bd40a1e8c9	intel/fs: Lower untyped atomic messages to LSC when available Bspec programming note mentions that "Atomic messages are always forced to "un-cacheable" in the L1 cache". We can make the L1 cache un-cacheable and L3 with write-back policy. v2: (Sagar Ghuge): - Fix caching policy for atomic messages - Fix simd exec size v3: (Sagar Ghuge): - Add atomic messages to brw_schedule_instructions v4: (Jason Ekstrand): - Rebase on lsc_msg_desc reworks Co-authored-by: Sagar Ghuge <sagar.ghuge@intel.com> Co-authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Mark Janes	4f86a70599	intel/fs: Lower DW untyped r/w messages to LSC when available This puts the basic infrastructure in place for lowering logical dataport messages to LSC messages. We start with the two most obvious opcodes and add more in later patches. v2 (Sagar Ghuge): - Pass required params to message desc - Remove duplicate mlen calculation - Change commit message. v3 (Jason Ekstrand): - Drop TGM support Co-authored-by: Jason Ekstrand <mark.a.janes@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Mark Janes	32ec0662fd	intel/compiler: Add LSC messages to brw_schedule_instructions v2 (Jason Ekstrand): - Use lsc_msg_desc_opcode() - Drop all opcodes for now and add them in later patches. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600>	2021-06-30 16:17:18 +00:00
Lionel Landwerlin	d665c2dcf0	intel/compiler: use existing helpers to pull bits of descriptors v2: Use new RT descriptor helper Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>	2021-05-02 20:20:06 +00:00
Jason Ekstrand	134af5ada2	intel/compiler: Don't insert barriers for NULL sources Normally, we never see NULL in a source. However, starting with `eab1c55590`, we can with a SHADER_OPCODE_SEND if it only has the first payload. We were inserting barriers which adds unnecessary scheduling dependencies and takes a lot of compile time because inserting a single barrier is an O(n) operation. All the extra O(n) can have a surprisingly large effect. This cuts the runtime of dEQP-VK.binding_model.buffer_device_address.set3.depth3. basessbo.convertcheckuv2.store.single.std140.frag by a factor of 20x for a debug build. Shader-db results on ICL: total instructions in shared programs: 19918983 -> 19921610 (0.01%) instructions in affected programs: 884074 -> 886701 (0.30%) helped: 1688 HURT: 817 helped stats (abs) min: 1 max: 163 x̄: 4.23 x̃: 1 helped stats (rel) min: 0.02% max: 12.50% x̄: 1.08% x̃: 0.61% HURT stats (abs) min: 1 max: 2674 x̄: 11.95 x̃: 2 HURT stats (rel) min: 0.11% max: 70.22% x̄: 1.71% x̃: 1.03% 95% mean confidence interval for instructions value: -1.97 4.06 95% mean confidence interval for instructions %-change: -0.28% -0.06% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 976503324 -> 975884809 (-0.06%) cycles in affected programs: 82581703 -> 81963188 (-0.75%) helped: 4144 HURT: 5010 helped stats (abs) min: 1 max: 79294 x̄: 311.31 x̃: 8 helped stats (rel) min: <.01% max: 53.69% x̄: 2.00% x̃: 0.51% HURT stats (abs) min: 1 max: 92266 x̄: 134.04 x̃: 8 HURT stats (rel) min: <.01% max: 218.09% x̄: 3.25% x̃: 0.53% 95% mean confidence interval for cycles value: -119.85 -15.29 95% mean confidence interval for cycles %-change: 0.68% 1.07% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). 
total spills in shared programs: 10659 -> 12014 (12.71%) spills in affected programs: 441 -> 1796 (307.26%) helped: 7 HURT: 12 total fills in shared programs: 11551 -> 14429 (24.92%) fills in affected programs: 993 -> 3871 (289.83%) helped: 8 HURT: 11 total sends in shared programs: 1025832 -> 1025353 (-0.05%) sends in affected programs: 2241 -> 1762 (-21.37%) helped: 105 HURT: 1 helped stats (abs) min: 1 max: 87 x̄: 4.57 x̃: 2 helped stats (rel) min: 5.56% max: 54.72% x̄: 11.37% x̃: 10.00% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for sends value: -7.39 -1.65 95% mean confidence interval for sends %-change: -12.95% -7.70% Sends are helped. LOST: 93 GAINED: 109 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4648 Fixes: `eab1c55590` "intel/fs: Support SENDS in SHADER_OPCODE_SEND" Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10412>	2021-04-22 18:00:16 +00:00
Anuj Phogat	61e8636557	intel: Rename gen_device prefix to intel_device export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "gen_device" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_device/intel_device/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>	2021-04-20 20:06:33 +00:00
Anuj Phogat	e7e55af4d6	intel: Rename GENx keyword to GFXx Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "GEN[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/GEN\([[:digit:]]\+\)/GFX\1/g" Exclude the changes to modifiers: grep -E "I915_.*GFX" -rIl $SEARCH_PATH | xargs sed -ie "s/\(I915_.*\)GFX/\1GEN/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	1d296484b4	intel: Rename Genx keyword to Gfxx Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "Gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/Gen\([[:digit:]]\+\)/Gfx\1/g" Exclude changes in src/intel/perf/oa-*.xml: find src/intel/perf -type f \( -name "*.xml" \) | xargs sed -ie "s/Gfx/Gen/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	b75f095bc7	intel: Rename genx keyword to gfxx in source files Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/gen\([[:digit:]]\+\)/gfx\1/g" Exclude pack.h and xml changes in this patch: grep -E "gfx[[:digit:]]+_pack\.h" -rIl $SEARCH_PATH | xargs sed -ie "s/gfx\([[:digit:]]\+_pack\.h\)/gen\1/g" grep -E "gfx[[:digit:]]+\.xml" -rIl $SEARCH_PATH | xargs sed -ie "s/gfx\([[:digit:]]\+\.xml\)/gen\1/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	c1f3a778de	intel: Rename GENx prefix in macros to GFXx in source files Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "GEN" -rIl src/intel/genxml | grep -E ".py" | xargs sed -ie "s/GEN\([%{]\)/GFX\1/g" grep -E "[^_]GEN[[:digit:]]+" -rIl $SEARCH_PATH | grep -E ".(\.c|\.h|\.y|\.l)" | xargs sed -ie "s/\([^_]\)GEN\([[:digit:]]\+\)/\1GFX\2/g" Leave out renaming GFX12_CCS_E macros. They fall under renaming pattern like "_GEN[[:digit:]]+": grep -E "GFX12_CCS_E" -rIl $SEARCH_PATH | xargs sed -ie "s/GFX12_CCS_E/GEN12_CCS_E/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	abe9a71a09	intel: Rename gen field in gen_device_info struct to ver Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "info\)*(.|->)gen" -rIl $SEARCH_PATH | xargs sed -ie "s/info\()*\)\(\.\|->\)gen/info\1\2ver/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Jason Ekstrand	91192696e6	intel/fs: Add support for 16-bit A64 float and integer atomics The messages for those 16-bit operations still use 32-bit sources and destinations, so expand them accordingly when building the payload. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8750>	2021-03-18 00:13:40 +00:00
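
Commit 10b94772d2 in the log above describes replacing a full memset of last_grf_write with a reset of only the entries touched by the processed instructions. The idea can be sketched roughly as follows. This is an illustrative sketch, not the Mesa code: the class LastWriteTable, its methods, and the "touched list" bookkeeping are all hypothetical names and choices made for this example (the commit itself resets entries by walking the processed instructions).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-register "last writer" table.
// An entry of -1 means no instruction has written that register yet,
// so instruction ids passed to record_write() must be >= 0.
class LastWriteTable {
public:
    explicit LastWriteTable(std::size_t reg_count)
        : last_write_(reg_count, -1) {}

    // Record that instruction `inst` wrote register `reg`.
    void record_write(std::size_t reg, int inst) {
        assert(inst >= 0);
        if (last_write_[reg] == -1)
            touched_.push_back(reg);  // remember each register only once
        last_write_[reg] = inst;
    }

    int last_write(std::size_t reg) const { return last_write_[reg]; }

    // Clear only the entries written since the previous reset,
    // instead of clearing the whole (potentially very large) table.
    void reset() {
        for (std::size_t reg : touched_)
            last_write_[reg] = -1;
        touched_.clear();
    }

    std::size_t touched_count() const { return touched_.size(); }

private:
    std::vector<int> last_write_;        // one slot per register
    std::vector<std::size_t> touched_;   // registers written since last reset
};
```

With a table of about a million entries and a block that writes two registers, reset() visits two slots rather than the whole array, which is the kind of saving the commit message attributes to avoiding repeated 2MB memsets.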

91 commits