fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 02:58:06 +02:00

Author	SHA1	Message	Date
Jason Ekstrand	aea88f16df	intel/fs: SEL_EXEC uses the integer pipe for 64-bit stuff Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16970>	2022-09-28 05:38:36 +00:00
Jason Ekstrand	c80c0ed943	intel/fs: Always use integer types for indirect MOVs There's a new Gen12.5 restriction which forbids using the VxH or Vx1 on the floating-point pipe. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16970>	2022-09-28 05:38:36 +00:00
Lionel Landwerlin	c6a7f4b34e	intel/devinfo: Rename & implement num_dual_subslices v2: Use the upper bound of dual subslices as the ID is not remapped with fused off parts and this is what we'll use for a bunch of computation in RT. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16970>	2022-09-28 05:38:36 +00:00
Kenneth Graunke	abba55382f	intel/compiler: Vectorize gl_TessLevelInner/Outer[] writes Setting the NIR options takes care of iris thanks to the common st/mesa linking code, and updating brw_nir_link_shaders should handle anv. The main effort here is updating remap_tess_levels, which needs to handle vector stores, writemasking, and swizzling. Unfortunately, we also need to continue handling the existing single-component access because it's used for TES inputs, which we don't vectorize. We could try to vectorize TES inputs too, but they're all pushed anyway, so it wouldn't buy us much other than deleting this code. Also, we do have opt_combine_stores, but not one for loads. One limitation of using nir_vectorize_tess_levels is that it works on variables, and so isn't able to combine outer/inner writes that happen to live in the same vec4 slot (for triangle domains). That said, it's still better than before. For writes, we allow the intrinsics to supply up to the full size of the variable (vec4 for outer, vec2 for inner) even if the domain only requires a subset of those components (i.e. triangles needs 3). shader-db results on Icelake: total instructions in shared programs: 19605070 -> 19602284 (-0.01%) instructions in affected programs: 65338 -> 62552 (-4.26%) helped: 271 / HURT: 0 helped stats (abs) min: 6 max: 24 x̄: 10.28 x̃: 12 helped stats (rel) min: 1.30% max: 18.18% x̄: 5.80% x̃: 7.59% 95% mean confidence interval for instructions value: -10.71 -9.85 95% mean confidence interval for instructions %-change: -6.17% -5.43% Instructions are helped. total cycles in shared programs: 851854659 -> 851820320 (<.01%) cycles in affected programs: 618749 -> 584410 (-5.55%) helped: 271 / HURT: 0 helped stats (abs) min: 69 max: 540 x̄: 126.71 x̃: 108 helped stats (rel) min: 2.57% max: 37.97% x̄: 6.17% x̃: 5.06% 95% mean confidence interval for cycles value: -135.89 -117.54 95% mean confidence interval for cycles %-change: -6.72% -5.63% Cycles are helped. 
total sends in shared programs: 1025285 -> 1024355 (-0.09%) sends in affected programs: 6454 -> 5524 (-14.41%) helped: 271 / HURT: 0 helped stats (abs) min: 2 max: 8 x̄: 3.43 x̃: 4 helped stats (rel) min: 5.71% max: 25.00% x̄: 14.98% x̃: 17.39% 95% mean confidence interval for sends value: -3.57 -3.29 95% mean confidence interval for sends %-change: -15.42% -14.54% Sends are helped. According to Felix DeGrood, this results in a 10% improvement in the draw call time for certain draw calls from Strange Brigade. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17944>	2022-09-27 18:17:56 -07:00
Kenneth Graunke	be21d54aca	intel/compiler: Use an existing URB write to end TCS threads when viable VS, TCS, TES, and GS threads must end with a URB write message with the EOT (end of thread) bit set. For VS and TES, we shadow output variables with temporaries and perform all stores at the end of the shader, giving us an existing message to do the EOT. In tessellation control shaders, we don't defer output stores until the end of the thread like we do for vertex or evaluation shaders. We just process store_output and store_per_vertex_output intrinsics where they occur, which may be in control flow. So we can't guarantee that there's a URB write at the end of the shader. Traditionally, we've just emitted a separate URB write to finish TCS threads, doing a writemasked write to a single patch header DWord. On Broadwell, we need to set a "TR DS Cache Disable" bit, so this is a convenient spot to do so. But on other platforms, there's no such field, and this write is purely wasteful. Instead of emitting a separate write, we can just look for an existing URB write at the end of the program and tag that with EOT, if possible. We already had code to do this for geometry shaders, so just lift it into a helper function and reuse it. No changes in shader-db. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17944>	2022-09-27 18:17:42 -07:00
Lionel Landwerlin	e76e3d9cea	intel/nir/rt: fixup alignment of memcpy iterations Not sure if this fixes anything because it's always 16 at least, but this is more correct. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17396>	2022-09-23 08:29:17 +00:00
Lionel Landwerlin	139e8f4635	intel/fs: fixup a64 messages And run algebraic when either int64 or float64 are not supported so those don't end up in the generated code. Cc: mesa-stable Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17396>	2022-09-23 08:29:17 +00:00
Lionel Landwerlin	838bbdcf2e	intel/nir/rt: store ray query state in scratch Initially I tried to store ray query state in the RT scratch space but got the offset wrong. In the end putting this in the scratch surface makes more sense, especially for non RT stages. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c78be5da30` ("intel/fs: lower ray query intrinsics") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17396>	2022-09-23 08:29:17 +00:00
Lionel Landwerlin	f7fab09a07	intel/nir/rt: change scratch check validation It's very unfortunate that we have the RT scratch being conflated with the usual scratch. In our implementation those are 2 different buffers. The usual scratch access are done through the scratch surface state (delivered through thread payload), while RT scratch (which outlives thread dispatch with shader calls) is its own buffer. So checking the NIR scratch size makes no sense as we can have normal scratch accesses completely unrelated to RT scratch accesses. This change switches the validation by looking at whether the scratch base pointer intrinsic is being used (which is what we use/abuse to implement RT scratch). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c78be5da30` ("intel/fs: lower ray query intrinsics") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17396>	2022-09-23 08:29:17 +00:00
Lionel Landwerlin	259b1647e6	intel/nir/rt: fix ray query proceed level Initially the level is world (top level), then it's whatever level the potential hit is. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c78be5da30` ("intel/fs: lower ray query intrinsics") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17396>	2022-09-23 08:29:17 +00:00
Lionel Landwerlin	3f01071c79	intel/nir/rt: remove ray query mem hit writes at initialization This will not even be read by HW. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17396>	2022-09-23 08:29:17 +00:00
Lionel Landwerlin	f843bec7de	intel/nir/rt: spill/fill the entire ray query data We need the traversal stack to be saved/restored along with mem hits. Total spill/fill is 256 bytes. We can potentially optimize this but we have to be very careful about what state the query is in. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c78be5da30` ("intel/fs: lower ray query intrinsics") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17396>	2022-09-23 08:29:17 +00:00
Lionel Landwerlin	a88f725eea	intel/nir/rt: fixup generate hit This function copies the potential hit from its memory location to the committed hit location. A couple of fields got their bit offset wrong. Fixes some CTS tests in dEQP-VK.ray_query.* v2: Copy primitive/instance leaf pointers Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `0465714790` ("intel/nir/rt: add more helpers for ray queries") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17396>	2022-09-23 08:29:17 +00:00
Marcin Ślusarz	ac8020ebfd	intel/compiler: add support for 8/16 bits task payload loads Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18501>	2022-09-21 09:16:20 +00:00
Marcin Ślusarz	ac581b30ec	intel/compiler: refactor brw_nir_lower_mem_access_bit_sizes Change dup_mem_intrinsic return type. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18501>	2022-09-21 09:16:20 +00:00
Marcin Ślusarz	a31b8fa38b	intel/compiler/task: use shared memory for small task payload loads & stores Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18501>	2022-09-21 09:16:20 +00:00
Marcin Ślusarz	37e78803d7	intel/compiler: use nir_lower_task_shader pass This implements task payload atomics in ANV. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16852>	2022-09-20 18:04:29 +00:00
Marcin Ślusarz	3c96959bbc	intel/compiler: print shader after successful brw_nir_lower_shading_rate_output Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18702>	2022-09-20 17:23:45 +00:00
Marcin Ślusarz	cfd1e5a91e	intel/compiler: remove second shading rate lowering for mesh It's already called in brw_postprocess_nir and calling it the second time actually breaks shading rate. Initially, when I added this call here in `9acb30c8c4`, I was testing it on an internal tree, which didn't have brw_nir_lower_shading_rate_output call in brw_postprocess_nir. Fixes: `9acb30c8c4` ("intel/compiler: implement primitive shading rate for mesh") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18702>	2022-09-20 17:23:45 +00:00
José Roberto de Souza	f4857591e1	intel/compiler/fs: Use DF to load constants when has_64bit_int is not supported This was already done for gen7 platforms, so now extending to all platforms without has_64bit_int. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18577>	2022-09-14 19:32:43 +00:00
José Roberto de Souza	daf0b67bc2	intel/compiler/fs: Fix compilation of shaders with SHADER_OPCODE_SHUFFLE of float64 type During the lower_regioning() optimization, required_exec_type() returns the BRW_REGISTER_TYPE_UQ type when processing SHADER_OPCODE_SHUFFLE instructions of type BRW_REGISTER_TYPE_DF, but MTL has float64 support and lacks int64 support, causing shader compilation to fail. To fix that we could make required_exec_type() return BRW_REGISTER_TYPE_DF in such a case, but the SHADER_OPCODE_SHUFFLE virtual instruction runs in the integer pipeline (inferred_exec_pipe()). So here we replace the has_64bit check with has_64bit_int; this properly handles older and newer cases, making this function return BRW_REGISTER_TYPE_UD. Then lower_exec_type() will take care of generating two 32-bit operations to accomplish the same. While at it, also drop the 'devinfo->verx10 == 70' check, as GFX7_FEATURES falls into the same category as MTL: it has float64 but no int64 support. Fixes at least these crucible tests: func.uniform-subgroup.exclusive.fadd64.q0 func.uniform-subgroup.exclusive.fmin64.q0 func.uniform-subgroup.exclusive.fmax64.q0 Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18577>	2022-09-14 19:32:43 +00:00
Caio Oliveira	e612f32e1a	intel/compiler: Use brw_ud* helpers in thread payload code Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	f019687d23	intel/compiler: Add a few more brw_ud* helpers Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	3272868218	intel/compiler: Make thread_payload struct abstract Each shader stage has its own struct and will instantiate it, so the base class doesn't need to be instantiated anymore. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	0b6e613de8	intel/compiler: Create and use struct for CS thread payload Move subgroup_id, which is only used by CS for verx10 < 125, into the payload struct too, even though it is not, strictly speaking, part of the payload. Note the thread execution of Task/Mesh is similar enough, so we make their common struct inherit from cs_thread_payload. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	d8461e975a	intel/compiler: Export brw_get_subgroup_id_param_index() Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	9de790760e	intel/compiler: Create and use struct for Bindless thread payload Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	a70378f292	intel/compiler: Store start of ICP handles in GS thread payload struct Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	5b6987daee	intel/compiler: Create and use struct for GS thread payload Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	7664c85b1d	intel/compiler: Create and use struct for TASK and MESH thread payloads Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	0ca65b3c4c	intel/compiler: Create and use struct for VS thread payload Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	19c6e1b447	intel/compiler: Create and use struct for TES thread payload Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	eb837dd23b	intel/compiler: Store start of ICP handles in TCS thread payload struct Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	2622fc3af1	intel/compiler: Store Primitive ID in TCS thread payload struct Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	9a9b1119b4	intel/compiler: Store Patch URB output in TCS thread payload struct Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	e21359ed0e	intel/compiler: Create struct for TCS thread payload Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	73920b7e2f	intel/compiler: Use FS thread payload only for FS Move the setup into the FS thread payload constructor. Consolidate payload setup for that in brw_fs_thread_payload.cpp file. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Caio Oliveira	dab66d20a7	intel/compiler: Make a type for Thread Payload and FS variant Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18176>	2022-09-13 01:44:24 +00:00
Tapani Pälli	40c2e0a317	intel/compiler: fix assert from ver to verx10 Fixes: `027b8b4249` ("intel/compiler: Add helper for barrier message payload setup for gfx >= 125") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18546>	2022-09-12 19:03:17 +00:00
Jordan Justen	af8ab4a889	intel/compiler: Use builder to allocate fs regs for gs control data bits Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18537>	2022-09-12 10:00:28 -07:00
Caio Oliveira	00b8f9a3a6	intel/compiler: Use builder to allocate fs regs for TCS store output Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18537>	2022-09-12 10:00:18 -07:00
Caio Oliveira	027b8b4249	intel/compiler: Add helper for barrier message payload setup for gfx >= 125 CS-like and TCS control barriers converged in gfx >= 125, so use a common helper for the message payload setup. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18362>	2022-09-09 09:35:08 -07:00
Caio Oliveira	55db3aaa3a	intel/compiler: Create fs_visitor::emit_tcs_barrier() Allow us to implement this in brw_fs_visitor.cpp, which then will let us deduplicate code between the CS-like barrier and the TCS barrier in a later patch. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18362>	2022-09-09 09:35:08 -07:00
Kenneth Graunke	19fc870ac6	intel/compiler: Use subgroup invocation for ICP handle loads When loading a TCS or GS input, we generate some code to read the URB handle for a particular input control point (ICP handle), which often involves indirect addressing due to a non-constant vertex. For example: mov(8) vgrf148+0.0:UW, 76543210V shl(8) vgrf149:UD, vgrf148+0.0:UW, 2u shl(8) vgrf150:UD, vgrf145:UD, 5u add(8) vgrf151:UD, vgrf150:UD, vgrf149:UD mov_indirect(8) vgrf147:UD, g2:UD, vgrf151:UD, 96u Unfortunately, the first load with 76543210V is considered a partial write because the 8 channels of 16-bit UW data doesn't fill an entire register, and we can't allocate VGRFs at sub-register granularity. This causes none of the above math to be CSE'd, even though the first two instructions are common to all input loads, and the rest may be reused sometimes as well. To work around this, we stop emitting 76543210V to a temporary, and instead use nir_system_values[SYSTEM_VALUE_SUBGROUP_INVOCATION], which already contains this value, and is unconditionally set up for us. With all input loads using the same register for the sequence, our CSE pass is able to eliminate the rest of the common math. shader-db results on Tigerlake: total instructions in shared programs: 20748243 -> 20744844 (-0.02%) instructions in affected programs: 73410 -> 70011 (-4.63%) helped: 242 / HURT: 21 helped stats (abs) min: 1 max: 37 x̄: 14.17 x̃: 15 helped stats (rel) min: 0.17% max: 19.58% x̄: 6.13% x̃: 6.32% HURT stats (abs) min: 1 max: 4 x̄: 1.38 x̃: 1 HURT stats (rel) min: 0.18% max: 1.31% x̄: 0.58% x̃: 0.58% 95% mean confidence interval for instructions value: -13.73 -12.12 95% mean confidence interval for instructions %-change: -6.00% -5.19% Instructions are helped. 
total cycles in shared programs: 785828951 -> 785788480 (<.01%) cycles in affected programs: 597593 -> 557122 (-6.77%) helped: 227 / HURT: 13 helped stats (abs) min: 6 max: 624 x̄: 182.19 x̃: 185 helped stats (rel) min: 0.24% max: 18.22% x̄: 7.85% x̃: 7.80% HURT stats (abs) min: 2 max: 153 x̄: 68.08 x̃: 36 HURT stats (rel) min: 0.03% max: 7.79% x̄: 2.97% x̃: 1.25% 95% mean confidence interval for cycles value: -182.55 -154.71 95% mean confidence interval for cycles %-change: -7.84% -6.69% Cycles are helped. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18455>	2022-09-08 15:12:41 +00:00
Tapani Pälli	d276ad4520	intel/compiler: implement Wa_14014595444 for DG2 According to the workaround, we should setup MLOD as parameter 4 and 5 for the sample_b message. v2: only SAMPLE_B, not SAMPLE_B_C (Lionel) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18408>	2022-09-07 05:44:56 +00:00
Marcin Ślusarz	2e1b96bb1b	intel/compiler: implement EXT_mesh_shader Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18371>	2022-09-02 17:40:47 +00:00
Emma Anholt	5f66a927ec	gallium,glsl: Delete PIPE_CAP_VERTEXID_NOBASE and lower_vertex_id. Every driver uses the nir_lower_system_values path now. Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18327>	2022-08-31 22:57:03 +00:00
Jason Ekstrand	f1768f5640	intel/compiler: Store the number of position slots in the VUE map Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17602>	2022-08-31 02:00:18 +00:00
Rhys Perry	aa2d6e020b	Revert "nir: Drop the unused instr arg for src/dest copy functions." This reverts commit `c3a0184118`. Acked-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12910>	2022-08-30 18:21:44 +00:00
Marcin Ślusarz	66bc9aec65	intel/compiler: add support for non-zero base in [load\|store]_shared intrins Acked-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17618>	2022-08-29 12:42:40 +00:00
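
The address math in the listing quoted by `19fc870ac6` ("Use subgroup invocation for ICP handle loads") can be modelled in a few lines. This is only an illustrative sketch, not Mesa code: the function names are hypothetical, and it assumes that `vgrf145` holds the per-channel vertex index, that each ICP handle is one 4-byte DWord (the `shl ... 2u`), that each vertex occupies one 32-byte register of handles (the `shl ... 5u`), and that the trailing `96u` is the size in bytes of the region `mov_indirect` may read.

```python
# Hypothetical model of the instruction sequence quoted in commit 19fc870ac6.
# All names below are illustrative; none of them exist in Mesa.

HANDLE_SHIFT = 2           # shl(8) ..., 2u  -> channel * 4 bytes (assumed DWord handles)
VERTEX_SHIFT = 5           # shl(8) ..., 5u  -> vertex * 32 bytes (assumed one register per vertex)
INDIRECT_RANGE_BYTES = 96  # last operand of mov_indirect (assumed readable region size)


def icp_handle_byte_offsets(vertex_indices):
    """Return the per-channel byte offset fed to mov_indirect.

    The channel number plays the role of the 76543210V immediate, which the
    commit replaces with the subgroup invocation system value.
    """
    offsets = []
    for channel, vertex in enumerate(vertex_indices):
        offset = (vertex << VERTEX_SHIFT) + (channel << HANDLE_SHIFT)
        if offset + 4 > INDIRECT_RANGE_BYTES:
            raise ValueError(f"channel {channel}: offset {offset} is outside the region")
        offsets.append(offset)
    return offsets


def mov_indirect(payload_dwords, byte_offsets):
    """Gather one DWord per channel from a flat payload (a stand-in for g2 onwards)."""
    return [payload_dwords[offset // 4] for offset in byte_offsets]


if __name__ == "__main__":
    payload = list(range(24))  # 24 DWords = 96 bytes of stand-in handle data
    offsets = icp_handle_byte_offsets([0, 1, 2, 0, 1, 2, 0, 1])
    print(offsets)
    print(mov_indirect(payload, offsets))
```

Under these assumptions only the vertex term varies from one input load to the next. The channel term is the same for every load, which is consistent with the commit's point that sourcing it from one shared register lets CSE remove the repeated math.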


2233 commits