fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 20:10:14 +01:00

Author	SHA1	Message	Date
Marcin Ślusarz	a252123363	intel/compiler/mesh: compactify MUE layout Instead of using 4 dwords for each output slot, use only the amount of memory actually needed by each variable. There are some complications from this "obvious" idea: - flat and non-flat variables can't be merged into the same vec4 slot, because flat inputs mask has vec4 stride - multi-slot variables can have different layout: float[N] requires N 1-dword slots, but i64vec3 requires 1 fully occupied 4-dword slot followed by 2-dword slot - some output variables occur both in single-channel/component split and combined variants - crossing vec4 boundary requires generating more writes, so avoiding them if possible is beneficial This patch fixes some issues with arrays in per-vertex and per-primitive data (func.mesh.ext.outputs.*.indirect_array.q0 in crucible) and by reduction in single MUE size it allows spawning more threads at the same time. Note: this patch doesn't improve vk_meshlet_cadscene performance because default layout is already optimal enough. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>	2023-07-24 07:55:29 +00:00
Lionel Landwerlin	3384f029be	intel/compiler: rework input parameters Use a struct for various common parameters rather than per stage structure or arguments to stage specific entrypoints. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>	2023-07-20 09:08:08 +00:00
Marcin Ślusarz	36ff6c0004	intel/compiler: remove NV_mesh_shader support Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24071>	2023-07-14 08:27:14 +00:00
Lionel Landwerlin	c26c0a36d3	intel/fs: disable coarse pixel shader with interpolator messages at sample Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9292 Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23962>	2023-07-06 12:48:52 +00:00
Yonggang Luo	68b8aa788d	intel/compiler: Switch to use nir_foreach_function_impl Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23920>	2023-06-29 11:29:54 +00:00
Ian Romanick	ed5d346868	intel/fs: Add missing newline Emacs will add a newline to the end of this file whether I've edited that line or not. It was driving me up the wall, so... yeah. Trivial. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23777>	2023-06-21 19:57:58 +00:00
Caio Oliveira	fde8bf7b7f	intel/compiler: Respect NIR_DEBUG_PRINT_INTERNAL flag If flag is not set, don't print debugging information for internal shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23756>	2023-06-21 00:01:10 +00:00
Lionel Landwerlin	ff3494fce3	intel/fs: print indentation for control flow INTEL_DEBUG=optimizer output changes from : { 10} 40: cmp.nz.f0.0(8) null:F, vgrf3470:F, 0f { 10} 41: (+f0.0) if(8) (null):UD, { 11} 42: txf_logical(8) vgrf3473:UD, vgrf250:D(null):UD, 0d(null):UD(null):UD(null):UD(null):UD, 31u, 0u(null):UD(null):UD(null):UD, 3d, 0d { 12} 43: and(8) vgrf262:UD, vgrf3473:UD, 2u { 11} 44: cmp.nz.f0.0(8) null:D, vgrf262:D, 0d { 10} 45: (+f0.0) if(8) (null):UD, { 11} 46: mov(8) vgrf270:D, -1082130432d { 12} 47: mov(8) vgrf271:D, 1082130432d { 14} 48: mov(8) vgrf274+0.0:D, 0d { 14} 49: mov(8) vgrf274+1.0:D, 0d to : { 10} 40: cmp.nz.f0.0(8) null:F, vgrf3470:F, 0f { 10} 41: (+f0.0) if(8) (null):UD, { 11} 42: txf_logical(8) vgrf3473:UD, vgrf250:D(null):UD, 0d(null):UD(null):UD(null):UD(null):UD, 31u, 0u(null):UD(null):UD(null):UD, 3d, 0d { 12} 43: and(8) vgrf262:UD, vgrf3473:UD, 2u { 11} 44: cmp.nz.f0.0(8) null:D, vgrf262:D, 0d { 10} 45: (+f0.0) if(8) (null):UD, { 11} 46: mov(8) vgrf270:D, -1082130432d { 12} 47: mov(8) vgrf271:D, 1082130432d { 14} 48: mov(8) vgrf274+0.0:D, 0d { 14} 49: mov(8) vgrf274+1.0:D, 0d Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23477>	2023-06-14 12:04:05 +00:00
Caio Oliveira	2bb26cc01d	intel/compiler: Refactor dump_instruction(s) Delete unnecessary virtual functions, we need just two. Refactor code so the 'default behavior' logic (stderr and/or creating file) is not duplicated. Rename the virtuals so overrides don't hide the common convenience functions. Finally, provide a variant of dump_instructions() with a `FILE *` parameter. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23457>	2023-06-08 22:00:21 +00:00
Ian Romanick	4cc3206218	intel/fs: Don't munge source order of 3-src instructions in opt_algebraic This only impacts ADD3, so at this point it should not have any effect. As soon as constants are propagated into ADD3 instructions, it will be a problem. The worst part is, the ADD3 instructions that are broken by the old code aren't even "progress" on this pass. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23262>	2023-06-06 06:10:53 +00:00
Kenneth Graunke	2d9a3bb093	intel/compiler: Fix a fallthrough in components_read() for atomics In commit `284f0c9a57` I refactored the handling of the data source to just call a helper rather than special casing opcodes with 0 or 2 sources. Unfortunately, I also dropped the "else return 1", creating a fallthrough for all sources other than SURFACE_LOGICAL_SRC_ADDRESS and SURFACE_LOGICAL_SRC_DATA. The case below happened to return the correct value for all cases except SURFACE_LOGICAL_SRC_SURFACE, which has been returning 2 instead of 1 since that commit. Restore the else case. Thanks to Marcin Ślusarz for catching this. Fixes: `284f0c9a57` ("intel/compiler: Add an lsc_op_num_data_values() helper") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23347>	2023-06-01 21:06:57 +00:00
Rohan Garg	ef2b763d9c	anv: fix incorrect asserts when combining CPS and per sample interpolation CPS is dynamically turned off when per sample interpolation is active. Update the asserts to reflect this. Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `5644011f06` ("intel/compiler: Convert wm_prog_key::persample_interp to a tri-state") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23103>	2023-05-31 19:26:59 +00:00
Lionel Landwerlin	ad9bc1ffb5	intel/fs: enable UBO accesses through bindless heap Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:37 +00:00
Lionel Landwerlin	e09cfda0de	intel/fs: lower get_buffer_size like other logical sends This will also enable the use of the bindless heap. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>	2023-05-30 06:36:36 +00:00
Lionel Landwerlin	21c7b55f6f	intel/fs: fix size_read() for LOAD_PAYLOAD With Anv/Zink, the piglit test : arb_shader_storage_buffer_object-max-ssbo-size -auto -fbo fsexceed is failing validation after copy propagation : load_payload(8) vgrf15:F, vgrf1+0.12<0>:F, vgrf1+0.0<0>:F, vgrf1+0.4<0>:F, vgrf1+0.8<0>:F, vgrf1+0.12<0>:F ../src/intel/compiler/brw_fs_validate.cpp:191: A <= B failed A = inst->src[i].offset / REG_SIZE + regs_read(inst, i) = 2 B = alloc.sizes[inst->src[i].nr] = 1 In most cases it works because src[0] would be at offset 0 and so reading a full reg passes validation, but Anv/Zink started emitting slightly different code that adds an offset, making the size read 2 GRFs. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23126>	2023-05-23 12:39:08 +00:00
Rohan Garg	a15cc833f9	intel: drop unused is_scalar function parameter in brw_nir_apply_key Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23098>	2023-05-18 15:46:06 +02:00
Rohan Garg	212810ac8a	intel: infer scalar'ness locally for brw_postprocess_nir Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23098>	2023-05-18 15:46:06 +02:00
Lionel Landwerlin	b4b17f8aaa	Revert "intel/compiler: make uses_pos_offset a tri-state" This reverts commit `5489033fa8`. The problem I was trying to address is that we were programming the 3DSTATE_PS::PositionXYOffsetSelect bit differently with GPL (CENTROID) than without (NONE). I failed to understand that this bit also impacts the thread payload layout. GPL fragment shaders don't know ahead of time if pos_offset is going to be used. It'll be chosen at runtime based on push constant bits. So we need to program this bit differently just to have a payload matching the compiled shader code. This fixes the freedoom replay with GPL FS shader in SIMD32. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22938>	2023-05-11 08:01:46 +00:00
Lionel Landwerlin	5489033fa8	intel/compiler: make uses_pos_offset a tri-state This value depends on the per-sample value which can be unknown at compile time with graphics pipeline libraries. So we need to have this dynamic as well and pick the right value when generating the 3DSTATE_PS/3DSTATE_WM packet. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `d8dfd153c5` ("intel/fs: Make per-sample and coarse dispatch tri-state") Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22728>	2023-05-03 10:03:57 +00:00
Tapani Pälli	ccf16693e1	intel/fs: use intel_needs_workaround for Wa_22013689345 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22437>	2023-04-13 07:33:50 +00:00
Alyssa Rosenzweig	7f6491b76d	nir: Combine if_uses with instruction uses Every nir_ssa_def is part of a chain of uses, implemented with doubly linked lists. That means each requires 2 * 64-bit = 16 bytes per def, which is memory intensive. Together they require 32 bytes per def. Not cool. To cut that memory use in half, we can combine the two linked lists into a single use list that contains both regular instruction uses and if-uses. To do this, we augment the nir_src with a boolean "is_if", and reimplement the abstract if-uses operations on top of that list. That boolean should fit into the padding already in nir_src so should not actually affect memory use, and in the future we sneak it into the bottom bit of a pointer. However, this creates a new inefficiency: now iterating over regular uses separate from if-uses is (nominally) more expensive. It turns out virtually every caller of nir_foreach_if_use(_safe) also calls nir_foreach_use(_safe) immediately before, so we rewrite most of the callers to instead call a new single `nir_foreach_use_including_if(_safe)` which predicates the logic based on `src->is_if`. This should mitigate the performance difference. There's a bit of churn, but this is largely a mechanical set of changes. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22343>	2023-04-07 23:48:03 +00:00
Ian Romanick	6dfb7061e0	intel/fs: Preserve meta data more often in brw_nir_move_interpolation_to_top This pass rarely makes any changes, so work a little harder to preserve more meta data. On my Ice Lake laptop (using a locked CPU speed and other measures to prevent thermal throttling, etc.) using a debugoptimized build, improves performance of Vulkan CTS "deqp-vk --deqp-case='dEQP-VK.spir'" by -0.2% ± 0.1% (n = 5, pooled s = 0.431885). v2: Add some parentheses. Suggested by Lionel. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22299>	2023-04-06 19:07:50 +00:00
Ian Romanick	3037603b70	intel/fs: Linked list micro optimizations in brw_nir_move_interpolation_to_top Two linked list management changes: - Use the list head sentinel as the initial cursor. It is, after all, a proper node in the list. - Iterate the list of blocks starting with the second block instead of skipping the first block in the loop. On my Ice Lake laptop (using a locked CPU speed and other measures to prevent thermal throttling, etc.) using a release build, improves performance of compiling shaders from batman_arkham_city_goty.foz by -0.24% ± 0.09% (n = 5, pooled s = 0.324106). v2: Use nir_cursor instead of direct list manipulation. Suggested by Lionel. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22299>	2023-04-06 19:07:50 +00:00
Ian Romanick	d47f521ee4	intel/compiler: Use NIR_PASS instead of NIR_PASS_V Reduce debug log spam by only logging the shader if a pass made some changes. This can also elide some nir_validate calls in debug builds. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22299>	2023-04-06 19:07:50 +00:00
Lionel Landwerlin	adb8c30436	intel/fs: UNDEF fixup_nomask_control_flow temp register Ensure that the register's liveness is not expanded to loops. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21853>	2023-04-05 12:32:56 +00:00
Lionel Landwerlin	362a07db3a	intel/fs: don't consider fixup_nomask_control_flow SENDs predicate Those SENDs are still doing a full register write. We just inserted some predication for a workaround. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21853>	2023-04-05 12:32:56 +00:00
Lionel Landwerlin	34d8bfe65f	intel/fs: run VGRF compaction just before max live register accounting There are a number of instances of the dead code elimination pass that could reduce the count. For some reason this also seems to affect register allocation itself. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21853>	2023-04-05 12:32:56 +00:00
Mark Janes	33d03e57ad	intel/fs: use generated helpers for Wa_14013363432 / Wa_14012688258 Wa_14013363432 is a clone of Wa_14012688258. It does not apply to all gfx 12.5 platforms. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21745>	2023-03-23 19:13:09 +00:00
Lionel Landwerlin	2acc2f18ea	intel/compiler: report max dispatch width statistic Most tools looking at shader stats assume that there is only a single resulting binary shader out of a single input. On Intel HW this is not always the case. So having a statistic on each variant that reports the maximum dispatch width helps show improvement on a single shader in terms of how large we manage to compile it. For shaders that can be compiled in multiple SIMD widths (like fragment shaders), this will report the maximum dispatch width in the statistics of each variant. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22014>	2023-03-21 11:53:04 +00:00
Iván Briano	4dd81b4e2f	intel/fs: handle interpolation modes for at_sample and at_offset too Fixes dEQP-VK.draw..linear_interpolation. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19647>	2023-03-18 10:18:15 +00:00
Lionel Landwerlin	ed3c2f73db	intel/fs: fixup sources number from opt_algebraic Fixes issues with register_coalesce : fossilize-replay: brw_fs_register_coalesce.cpp:297: bool fs_visitor::register_coalesce(): Assertion `mov[i]->sources == 1' failed. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21782>	2023-03-14 10:38:50 +00:00
Lionel Landwerlin	efde1917c9	intel/fs: don't SEND messages as partial writes For instance, to load uniform data with the LSC we usually rely on transpose messages which have to execute in SIMD1. Those end up being considered as partial writes, so within loops their live range spreads across the whole loop, increasing register pressure. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21867>	2023-03-14 10:10:32 +00:00
Lionel Landwerlin	09cdb77a92	intel/fs: report max register pressure in shader stats Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21756>	2023-03-08 13:37:07 +00:00
Mark Janes	08649e3673	intel/fs: use generated workaround helpers for Wa_14017989577 Wa_14017989577 is a clone of Wa_14015360517, which applies to several platforms beyond INTEL_PLATFORM_DG2_G10. Update references to Wa_14017989577, and use the generated workaround helper to ensure application to the proper platforms. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21744>	2023-03-07 21:43:11 +00:00
Lionel Landwerlin	fd7debc8bb	intel/fs: make alpha_to_coverage a tristate That way in some cases we can do this dynamically. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>	2023-02-06 09:12:18 +00:00
Jason Ekstrand	f3969e2413	intel/fs: Rework dynamic coarse handling Use 2 flags for PI & RT messages. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>	2023-02-06 09:12:18 +00:00
Jason Ekstrand	949b42c4dc	intel/compiler: Convert wm_prog_key::multisample_fbo to a tri-state This allows us to communicate to the back-end that we don't actually know if the framebuffer is multisampled or not. No drivers set anything but ALWAYS/NEVER and we still have a few ALWAYS/NEVER assumptions but those should be asserted. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>	2023-02-06 09:12:18 +00:00
Jason Ekstrand	5644011f06	intel/compiler: Convert wm_prog_key::persample_interp to a tri-state This allows for the possibility that we may not know at compile time if sample shading is enabled through the API. While we're here, also document exactly what this bit means so we don't confuse ourselves. v2: Fixup coarse pixel values (Lionel) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>	2023-02-06 09:12:18 +00:00
Jason Ekstrand	d8dfd153c5	intel/fs: Make per-sample and coarse dispatch tri-state Whenever one of them is BRW_SOMETIMES, we depend on a dynamic flag pushed in as a push constant. In this case, we often have to do the calculation both ways and SEL the result. It's a bit more code but decouples MSAA from the shader key. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>	2023-02-06 09:12:18 +00:00
Jason Ekstrand	5d1c538449	intel/fs: Return early in a couple builtin setup helpers Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21094>	2023-02-06 09:12:17 +00:00
Marcin Ślusarz	432e263284	intel/compiler: fine-grained control of dispatch widths Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> [v2] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20854>	2023-01-27 11:00:41 +00:00
Lionel Landwerlin	13cca48920	intel/fs: drop FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GFX7 We can lower FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD into other more generic sends and drop this internal opcode. The idea behind this change is to allow bindless surfaces to be used for UBO pulls, which is why it's interesting to be able to reuse setup_surface_descriptors(). But that will come in a later change. No shader-db changes on TGL & DG2. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20416>	2023-01-26 11:26:53 +00:00
Kenneth Graunke	7092c1218a	intel/compiler: Use more symbolic source names in components_read() Rather than hardcoding source 1, source 2, etc. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>	2023-01-19 08:42:22 +00:00
Kenneth Graunke	780f3e2e6b	intel/compiler: Delete all the A64 atomic variants for type sizes These are handled identically in almost all cases. There is one place in the legacy surface lowering that was obtaining the bitsize from the opcode, but the LSC-based lowering uses (type_sz(inst->dst.type) * 8) for that and works just fine. If we just do that in the legacy lowering too, then we don't need this plethora of opcodes. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>	2023-01-19 08:42:22 +00:00
Kenneth Graunke	02129eee3a	intel/compiler: Eliminate SHADER_OPCODE_UNTYPED_ATOMIC_FLOAT The only reason for the separate opcode was because of the overlapping BRW_AOP_* enums, making it impossible to tell whether a particular AOP was the integer or float operation. Now that we use the lsc_opcode enums, we can just have the legacy lowering inspect the opcode and select the right descriptor. No need for a separate opcode. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>	2023-01-19 08:42:22 +00:00
Kenneth Graunke	284f0c9a57	intel/compiler: Add an lsc_op_num_data_values() helper There are a number of places that need to know how many operands an LSC atomic takes (0 for inc/dec, 1 for most things, 2 for cmpxchg). We can add a helper for that and eliminate some code (with more to come). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>	2023-01-19 08:42:22 +00:00
Kenneth Graunke	90a2137cd5	intel/compiler: Use LSC opcode enum rather than legacy BRW_AOPs This gets our logical atomic messages using the lsc_opcode enum rather than the legacy BRW_AOP_* defines. We have to translate one way or another, and using the modern set makes sense going forward. One advantage is that the lsc_opcode encoding has opcodes for both integer and floating point atomics in the same enum, whereas the legacy encoding used overlapping values (BRW_AOP_AND == 1 == BRW_AOP_FMAX), which made it impossible to handle both sensibly in common code. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20604>	2023-01-19 08:42:22 +00:00
Nico Cortes	29adbb132f	Revert "intel/compiler: fine-grained control of dispatch widths" This reverts commit `bed18ab3e2`. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8063 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20654>	2023-01-12 00:33:25 +00:00
Marcin Ślusarz	bed18ab3e2	intel/compiler: fine-grained control of dispatch widths Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20535>	2023-01-11 08:17:12 +00:00
Lionel Landwerlin	6b494745be	intel/fs: only avoid SIMD32 if strictly inferior in throughput This enabled SIMD32 in blorp shaders and seems to give a small FPS bump when using a DG2 GPU as secondary (requires copies to linear buffers to exchange with main GPU). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19341>	2023-01-09 08:41:47 +00:00


596 commits