fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 17:58:09 +02:00

Author	SHA1	Message	Date
Marek Olšák	45b20c8249	nir/lower_clip: fixes for lowered IO without compact arrays Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32173>	2024-11-19 23:48:38 +00:00
Marek Olšák	878d23e171	nir/lower_pntc_ytransform: handle lowered IO Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32173>	2024-11-19 23:48:38 +00:00
Marek Olšák	18f3c92b87	nir/print: print fb_fetch_output for variables Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32173>	2024-11-19 23:48:38 +00:00
Rhys Perry	65a54b4ec4	nir/lcssa: fix premature exit of loop after rematerializing derefs If we have NIR such as: 32x4 %48 = @load_vulkan_descriptor (%47) (desc_type=SSBO) 32x4 %76 = deref_cast (tint_symbol_11 )%48 (ssbo tint_symbol_11) (ptr_stride=0, align_mul=4, align_offset=0) 32x4 %77 = deref_struct &%76->tint_symbol_10 (ssbo int) // &((tint_symbol_11 )%48)->tint_symbol_10 A single nir_rematerialize_deref_in_use_blocks() will rematerialize the deref_struct and then its deref_cast. However, nir_foreach_instr_reverse_safe is not safe if the next iteration's instruction is removed. This can result in the instruction loop exiting and the load_vulkan_descriptor never having an LCSSA phi. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Fixes: `439e8c42cc` ("nir/lcssa: Fix rematerializing derefs") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11770 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32225>	2024-11-19 18:59:05 +00:00
Rhys Perry	327e5465fc	nir/algebraic: check bit sizes in lowered unpack(pack()) optimization Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Fixes: `894f7f4387` ("nir_opt_algebraic: Add a couple optimizations for lowered unpack(pack())") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32157>	2024-11-19 18:17:18 +00:00
Rhys Perry	ecd6ae12fb	nir/algebraic: fix iabs(ishr(iabs(a), b)) optimization iabs(a) is not positive if "a" is the minimum signed value, so this is incorrect in that case for some values of "b". Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Fixes: `2b76de9b5d` ("nir/algebraic: Add a couple optimizations for iabs and ishr") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32157>	2024-11-19 18:17:17 +00:00
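The edge case behind the iabs(ishr(iabs(a), b)) fix can be sketched in plain C. This is only an illustration: it assumes 32-bit two's-complement wraparound for iabs and that the faulty rule dropped the outer iabs. The helpers `iabs32` and `ishr32` are hypothetical names, not NIR API, and they work on raw bit patterns so nothing depends on signed overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers: 32-bit values are kept as raw bit patterns. */
static bool is_negative32(uint32_t bits)
{
   return (bits >> 31) != 0;
}

/* Wrapping absolute value: the minimum signed value maps to itself. */
static uint32_t iabs32(uint32_t a)
{
   return is_negative32(a) ? 0u - a : a;
}

/* Arithmetic (sign-extending) shift right by b bits, with b in [0, 31]. */
static uint32_t ishr32(uint32_t a, unsigned b)
{
   assert(b < 32);
   uint32_t shifted = a >> b;
   if (is_negative32(a) && b > 0)
      shifted |= ~(UINT32_MAX >> b); /* fill the vacated high bits with ones */
   return shifted;
}
```

With a = 0x80000000 and b = 1, `ishr32(iabs32(a), b)` is 0xC0000000 (negative), while wrapping it in another `iabs32` gives 0x40000000, so the outer iabs is not redundant for that input.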
Matt Turner	ba5c65f10b	nir: Get correct number of components The code wants the number of components used by the variable in the current attribute slot, not the total number of components. For e.g. a 4x3 matrix, glsl_get_components() returns 12, leading to the following error reported by AddressSanitizer: ``` Test case 'dEQP-VK.tessellation.shader_input_output.cross_invocation_per_patch_mat4x3'.. ../src/compiler/nir/nir_lower_io_to_vector.c:265:16: runtime error: index 4 out of bounds for type 'nir_variable *[4]' ``` Fixes: `5ef2b8f1f2` ("nir: Add a pass for lowering IO back to vector when possible") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32193>	2024-11-19 16:35:17 +00:00
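The component-count mix-up described in this commit can be illustrated with a small C sketch. It is not the code from nir_lower_io_to_vector.c: `struct matrix_shape` and both helpers are hypothetical, and the sketch assumes each matrix column occupies one attribute slot of at most four components.

```c
#include <assert.h>

/* Hypothetical description of a GLSL matCxR type: C columns, R rows. */
struct matrix_shape {
   unsigned columns;
   unsigned rows;
};

/* Total scalar count across the whole variable (12 for a 4x3 matrix). */
static unsigned total_components(struct matrix_shape m)
{
   return m.columns * m.rows;
}

/* Scalars used in a single attribute slot, assuming one column per slot. */
static unsigned components_in_slot(struct matrix_shape m)
{
   assert(m.rows >= 1 && m.rows <= 4);
   return m.rows;
}
```

Indexing a four-entry per-slot array with values derived from `total_components` would run past the end for a 4x3 matrix, whereas `components_in_slot` stays within bounds.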
Caterina Shablia	a5bcf566a9	nir: lower INSTANCE_{ID,INDEX} to an offset load_instance_{index,id} respectively If the hardware does not support INSTANCE_INDEX natively, it will be lowered to load_instance_id + base_instance. Otherwise, INSTANCE_ID will be lowered to load_instance_index - base_instance. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32158>	2024-11-19 09:18:47 +00:00
Caterina Shablia	b9be1f1f20	nir: introduce instance_index system value The semantics of this newly introduced system value match Vulkan's InstanceIndex exactly, and are equivalent to instance_id + base_instance. Some hardware, such as Mali Valhall or later, only provides instance id offset by base_instance. Introducing a new system value to represent this, rather than handling the mismatch when lowering to BIR lets us use NIR to eliminate redundant arithmetic that would follow from mismatched semantics, e.g. instance_id could be lowered to instance_index - base_instance, so expressions such as instance_id + base_instance would be optimized to a simple instance_index. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32158>	2024-11-19 09:18:47 +00:00
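The relationship between the two system values in the instance_index commits can be written out as a short C sketch. The function names are hypothetical, not NIR builder calls; the only fact taken from the commit messages is that instance_index equals instance_id + base_instance.

```c
#include <stdint.h>

/* Hardware without a native instance index: derive it from the instance id. */
static uint32_t instance_index_from_id(uint32_t instance_id, uint32_t base_instance)
{
   return instance_id + base_instance;
}

/* Hardware that only provides the offset value: recover the instance id. */
static uint32_t instance_id_from_index(uint32_t instance_index, uint32_t base_instance)
{
   return instance_index - base_instance;
}
```

Because unsigned arithmetic wraps, the two conversions are exact inverses, which is what lets an expression such as instance_id + base_instance fold back to a plain instance_index.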
Dave Airlie	6714689613	nir/functions: force inlining for barriers. A recent algebraic opt made a function that used to inline with llvmpipe CL not inline anymore. However that function has a barrier in it. Handling barriers from inside a callstack is hard for llvmpipe coroutines, so just force functions with barriers to be inlined. Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32204>	2024-11-19 12:26:28 +10:00
Karol Herbst	fa379a9495	nir/lower_cl_images: lower scalar image_loads to vec4 This will be required for supporting depth images as the rest of mesa assumes those to always return vec4. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30831>	2024-11-18 17:57:28 +00:00
Marek Olšák	899bee4af8	nir/opt_varyings: don't count the cost of the same instruction multiple times Use pass_flags to indicate whether the instruction has already been added to the total cost of the expression. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32174>	2024-11-18 13:39:08 +00:00
Marek Olšák	405e9d9b74	nir/opt_varyings: implement compaction without flexible interpolation We have to honor drivers when they say that different interpolation qualifiers can't be mixed in the same vec4, indicated by nir_io_has_flexible_input_interpolation_except_flat not being set. This is a prerequisite for enabling nir_opt_varyings for all drivers. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32174>	2024-11-18 13:39:08 +00:00
Marek Olšák	a7c671efc6	nir/opt_varyings: fix packing color varyings BITSET_TEST_RANGE_INSIDE_WORD uses first_bit .. last_bit, same as BITSET_RANGE, not first_bit .. size like BITFIELD_RANGE. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32174>	2024-11-18 13:39:08 +00:00
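The range-convention mismatch behind the color varying packing fix can be shown with two stand-in helpers. These are hypothetical functions, not Mesa's macros; the sketch assumes a "first .. size" range means a start bit plus a bit count, and a "first .. last" range includes the last bit.

```c
#include <assert.h>
#include <stdint.h>

/* Mask of `count` bits starting at bit `first` (start + size convention). */
static uint32_t mask_first_count(unsigned first, unsigned count)
{
   assert(count >= 1 && first + count <= 32);
   uint32_t low = (count == 32) ? UINT32_MAX : (1u << count) - 1u;
   return low << first;
}

/* Mask of bits `first` through `last` inclusive (first + last convention). */
static uint32_t mask_first_last(unsigned first, unsigned last)
{
   assert(first <= last && last < 32);
   return mask_first_count(first, last - first + 1);
}
```

Passing the same pair (4, 4) to both yields 0xF0 under the size convention but only 0x10 under the last-bit convention, which is the kind of silent mismatch the commit corrects.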
Marek Olšák	f9b03cf405	nir/opt_varyings: add nir_io_compaction_rotates_color_channels This was enabled by default in nir_opt_varyings, but vc4 can't handle when shader outputs write Y but not X. Add an option for it and enable it only for the driver that benefits from it. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32174>	2024-11-18 13:39:08 +00:00
Marek Olšák	8518e1cfd7	nir/opt_varyings: add nir_io_always_interpolate_convergent_fs_inputs for Asahi Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32174>	2024-11-18 13:39:08 +00:00
Kenneth Graunke	95bc42af74	nir: Use load_global_constant for reorderable nir_var_mem_global access The main difference between load_global and load_global_constant is that the latter can be reordered arbitrarily. If the access being lowered is already tagged as being reorderable, then we can preserve that by using the load_global_constant intrinsics instead of load_global. This gives us more flexibility. On Intel, this lets us use the load_global_constant_uniform_block_intel intrinsic for doing convergent block loads in more cases. This nets us significant reductions in spill/fills: Borderlands 3 on Lunarlake sees spills/fills reduced by 53%. Alchemist sees a 13% reduction. Improves performance of Borderlands 3 DX12 on Intel Battlemage by around 44%. Improves Hogwarts Legacy by around 14%. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31995>	2024-11-18 12:55:47 +00:00
Danylo Piliaiev	b501cbf153	nir/nir_opt_offsets: Do not fold load/store with const offset > max When (off_const > max) there is a wrap around uint when calling try_extract_const_addition. Exit early since folding doesn't make sense in this case. Cc: mesa-stable Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32118>	2024-11-14 10:22:39 +00:00
Rhys Perry	d3ae1842a2	aco,ac/nir: flag loads to use smem in NIR This pass will be re-used later. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>	2024-11-13 12:59:26 +00:00
Rhys Perry	7fe4f4c14c	nir_lower_mem_access_bit_sizes: support load_constant Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>	2024-11-13 12:59:26 +00:00
Rhys Perry	45c1280d2c	nir_lower_mem_access_bit_sizes: pass access to callback Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>	2024-11-13 12:59:26 +00:00
Rhys Perry	61752152f7	nir_lower_mem_access_bit_sizes: add nir_mem_access_shift_method Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>	2024-11-13 12:59:26 +00:00
Rhys Perry	e2dd36c66e	nir_lower_mem_access_bit_sizes: support 64-bit offsets Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>	2024-11-13 12:59:26 +00:00
Rhys Perry	0619e4db63	nir,aco,ac/llvm: add nir_op_alignbyte_amd Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>	2024-11-13 12:59:26 +00:00
Rhys Perry	0c7830eb85	nir/algebraic: optimize ushr(a, ishl(iand(b, 3), 3)) nir_lower_mem_access_bit_sizes creates this. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>	2024-11-13 12:59:26 +00:00
Rhys Perry	e95a3364b8	nir/algebraic: optimize bcsel(ieq(b, 0), a, shift(a, b)) nir_lower_mem_access_bit_sizes can create this. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>	2024-11-13 12:59:26 +00:00
Rhys Perry	80b76ba692	nir: add more intrinsics to nir_intrinsic_can_reorder Including nir_intrinsic_load_global. fossil-db (navi21): Totals from 2725 (3.43% of 79395) affected shaders: MaxWaves: 71972 -> 71964 (-0.01%); split: +0.01%, -0.02% Instrs: 2831052 -> 2819902 (-0.39%); split: -0.45%, +0.06% CodeSize: 15047548 -> 14973072 (-0.49%); split: -0.57%, +0.08% VGPRs: 108864 -> 108856 (-0.01%); split: -0.02%, +0.01% SpillSGPRs: 906 -> 926 (+2.21%) SpillVGPRs: 196 -> 1092 (+457.14%) Scratch: 729088 -> 741376 (+1.69%) Latency: 16621317 -> 16586551 (-0.21%); split: -0.34%, +0.13% InvThroughput: 4169987 -> 4164876 (-0.12%); split: -0.23%, +0.11% VClause: 63247 -> 63471 (+0.35%); split: -0.21%, +0.56% SClause: 56978 -> 55276 (-2.99%); split: -3.50%, +0.51% Copies: 252545 -> 252495 (-0.02%); split: -0.98%, +0.96% Branches: 91378 -> 91388 (+0.01%); split: -0.03%, +0.04% PreSGPRs: 112753 -> 126850 (+12.50%); split: -0.48%, +12.98% PreVGPRs: 90617 -> 90708 (+0.10%) VALU: 1709034 -> 1709368 (+0.02%); split: -0.01%, +0.03% SALU: 463554 -> 462253 (-0.28%); split: -0.57%, +0.29% VMEM: 115952 -> 116272 (+0.28%); split: -0.21%, +0.49% SMEM: 129097 -> 120538 (-6.63%); split: -6.64%, +0.01% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31904>	2024-11-13 12:59:26 +00:00
Georg Lehmann	8f094a7762	nir: handle fmul(a,a)/ffma(a,a,b) in nir_def_all_uses_ignore_sign_bit Foz-DB Navi31: Totals from 436 (0.55% of 79395) affected shaders: Instrs: 808917 -> 805868 (-0.38%) CodeSize: 4269056 -> 4246512 (-0.53%) Latency: 5827077 -> 5819815 (-0.12%); split: -0.13%, +0.00% InvThroughput: 625482 -> 622959 (-0.40%); split: -0.41%, +0.00% SClause: 21797 -> 21756 (-0.19%); split: -0.23%, +0.04% Copies: 48502 -> 48505 (+0.01%); split: -0.04%, +0.05% VALU: 481686 -> 479074 (-0.54%); split: -0.54%, +0.00% SALU: 76699 -> 76700 (+0.00%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31844>	2024-11-12 18:03:57 +00:00
Georg Lehmann	34f41abe24	nir: add nir_def_all_uses_ignore_sign_bit Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31844>	2024-11-12 18:03:57 +00:00
Samuel Pitoiset	a85f0143e0	nir: add nir_intrinsic_debug_break instruction This instruction can be used as a breakpoint in shaders to enter a trap if supported by the driver. It will be used to handle NonSemantic.DebugBreak in SPIR-V. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32061>	2024-11-12 16:05:17 +00:00
Karmjit Mahil	2a7df331af	nir: Fix `no_lower_set` leak on early return Addresses: ``` Indirect leak of 256 byte(s) in 2 object(s) allocated from: #0 0x7faaf53ee0 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145 #1 0x7fa8cfe900 in ralloc_size ../src/util/ralloc.c:118 #2 0x7fa8cfeb20 in rzalloc_size ../src/util/ralloc.c:152 #3 0x7fa8cff004 in rzalloc_array_size ../src/util/ralloc.c:232 #4 0x7fa8d06a84 in _mesa_set_init ../src/util/set.c:133 #5 0x7fa8d06bcc in _mesa_set_create ../src/util/set.c:152 #6 0x7fa8d0939c in _mesa_pointer_set_create ../src/util/set.c:613 #7 0x7fa95e5790 in nir_lower_mediump_vars ../src/compiler/nir/nir_lower_mediump.c:574 #8 0x7fa862c1c8 in tu_spirv_to_nir(tu_device, void, unsigned long, VkPipelineShaderStageCreateInfo const, tu_shader_key const, pipe_shader_type) ../src/freedreno/vulkan/tu_shader.cc:116 #9 0x7fa8646f24 in tu_compile_shaders(tu_device, unsigned long, VkPipelineShaderStageCreateInfo const, nir_shader, tu_shader_key const, tu_pipeline_layout, unsigned char const, tu_shader, char, void, nir_shader, VkPipelineCreationFeedback) ../src/freedreno/vulkan/tu_shader.cc:2741 #10 0x7fa85a16a4 in tu_pipeline_builder_compile_shaders ../src/freedreno/vulkan/tu_pipeline.cc:1887 #11 0x7fa85eb844 in tu_pipeline_builder_build<(chip)7> ../src/freedreno/vulkan/tu_pipeline.cc:3923 #12 0x7fa85e6bd8 in tu_graphics_pipeline_create<(chip)7> ../src/freedreno/vulkan/tu_pipeline.cc:4203 #13 0x7fa85c2588 in VkResult tu_CreateGraphicsPipelines<(chip)7>(VkDevice_T, VkPipelineCache_T, unsigned int, VkGraphicsPipelineCreateInfo const, VkAllocationCallbacks const, VkPipeline_T**) ../src/freedreno/vulkan/tu_pipeline.cc:4234 ``` seen in: dEQP-VK.binding_model.mutable_descriptor.single.switches.uniform_texel_buffer_storage_image.update_write.no_source.no_source.pool_expand_types.pre_update.no_array.vert Fixes: `7e986e5f04` ("nir/lower_mediump_vars: Don't lower mediump shared vars with atomic access.") Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32057>	2024-11-12 11:48:11 +00:00
Georg Lehmann	ee74b090db	nir/opt_16bit_tex_image: optimize extract half sources I also tried extract_i16/u16, but that causes a lot of regressions. Foz-DB Navi21: Totals from 3 (0.00% of 79395) affected shaders: Instrs: 367 -> 355 (-3.27%) CodeSize: 2156 -> 2136 (-0.93%) VGPRs: 80 -> 72 (-10.00%) Latency: 3163 -> 3153 (-0.32%); split: -0.51%, +0.19% InvThroughput: 424 -> 404 (-4.72%) Copies: 31 -> 42 (+35.48%); split: -3.23%, +38.71% PreVGPRs: 27 -> 25 (-7.41%) VALU: 208 -> 196 (-5.77%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32058>	2024-11-12 10:19:40 +00:00
Konstantin Seurer	cf447c5da1	nir: Do not gather source locations for phis Phi instructions are expected to be the first instructions in a block. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>	2024-11-11 08:39:14 +00:00
Konstantin Seurer	f2c204daf0	nir: Add a first_line parameter to gather_debug_info Useful when the file contains multiple shaders. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>	2024-11-11 08:39:14 +00:00
Konstantin	4d09cd7fa5	nir/lower_non_uniform_access: Group accesses using the same resource Avoids emitting the waterfall loop for every access if they use the same resource: waterfall_loop { access } waterfall_loop { access } -> waterfall_loop { access access } Totals from 276 (0.33% of 84770) affected shaders: MaxWaves: 3360 -> 3356 (-0.12%) Instrs: 3759927 -> 3730650 (-0.78%) CodeSize: 21125784 -> 20899580 (-1.07%) VGPRs: 23096 -> 23104 (+0.03%) Latency: 35593716 -> 35315455 (-0.78%); split: -0.78%, +0.00% InvThroughput: 7353071 -> 7297309 (-0.76%); split: -0.76%, +0.00% VClause: 120983 -> 118579 (-1.99%) SClause: 113073 -> 110671 (-2.12%) Copies: 358272 -> 348686 (-2.68%) Branches: 166706 -> 159500 (-4.32%) PreSGPRs: 18598 -> 18596 (-0.01%) PreVGPRs: 21417 -> 21424 (+0.03%); split: -0.01%, +0.04% VALU: 2354862 -> 2350053 (-0.20%) SALU: 582291 -> 567638 (-2.52%) SMEM: 139875 -> 137473 (-1.72%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30509>	2024-11-11 07:53:13 +00:00
Konstantin Seurer	d44f74896e	nir: Add missing access flags to print_access Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30509>	2024-11-11 07:53:13 +00:00
Alyssa Rosenzweig	5c73a8af44	nir/lower_uniforms_to_ubo: use amul Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	fc460e7f20	nir/opt_algebraic: don't lower amul if requested Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	1f3c97547a	nir/builder: use amul over ishl on agx ishl can wrap, amul cannot. so we need amul in the backend, or otherwise we would need to introduce an ashl opcode instead. that doesn't seem better. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	9ab8d70fa6	nir: add ilea_agx/ulea_agx opcodes to facilitate address mode lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	23afe968ad	nir: add late_lower_int64 option Some drivers generally need int64 lowered, but prefer to do this lowering themselves late, to have a chance to optimize targeted int64 patterns before lowering the rest. This isn't currently possible since nir_lower_int64 takes no options except what's const* in the shader, and frontends call nir_lower_int64 before passing the shader off to the driver. Add an option to defer int64 lowering. This is a bit ugly but the alternative is replumbing nir_lower_int64's option handling cross-tree and no-thank-you-not-right-now. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	eaf75169ee	nir: add amul flag Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	227026b7ad	nir/opt_algebraic: add another 64-bit pattern clpeak Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	2a3f133fd0	nir/opt_algebraic: add more 64-bit patterns Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:41 -04:00
Alyssa Rosenzweig	a4a3487aae	nir/opt_algebraic: optimize patterns from Skia shaders/skia/1567.shader_test relies on algebraic + constant folding, subtle changes in the input compiling flow can cause it to balloon. these patterns fix that. annoying! shader-db results aren't amazing, but they avert a major stats regression for that one Skia shader. total instructions in shared programs: 2751399 -> 2751295 (<.01%) instructions in affected programs: 6509 -> 6405 (-1.60%) helped: 21 HURT: 1 helped stats (abs) min: 1 max: 14 x̄: 5.62 x̃: 6 helped stats (rel) min: 0.53% max: 13.73% x̄: 3.57% x̃: 1.62% HURT stats (abs) min: 14 max: 14 x̄: 14.00 x̃: 14 HURT stats (rel) min: 2.45% max: 2.45% x̄: 2.45% x̃: 2.45% 95% mean confidence interval for instructions value: -7.09 -2.36 95% mean confidence interval for instructions %-change: -5.14% -1.45% Instructions are helped. total alu in shared programs: 2274577 -> 2274468 (<.01%) alu in affected programs: 6178 -> 6069 (-1.76%) helped: 21 HURT: 1 helped stats (abs) min: 1 max: 14 x̄: 5.86 x̃: 7 helped stats (rel) min: 0.55% max: 16.47% x̄: 3.93% x̃: 1.72% HURT stats (abs) min: 14 max: 14 x̄: 14.00 x̃: 14 HURT stats (rel) min: 2.83% max: 2.83% x̄: 2.83% x̃: 2.83% 95% mean confidence interval for alu value: -7.35 -2.56 95% mean confidence interval for alu %-change: -5.67% -1.57% Alu are helped. total fscib in shared programs: 2272894 -> 2272785 (<.01%) fscib in affected programs: 6178 -> 6069 (-1.76%) helped: 21 HURT: 1 helped stats (abs) min: 1 max: 14 x̄: 5.86 x̃: 7 helped stats (rel) min: 0.55% max: 16.47% x̄: 3.93% x̃: 1.72% HURT stats (abs) min: 14 max: 14 x̄: 14.00 x̃: 14 HURT stats (rel) min: 2.83% max: 2.83% x̄: 2.83% x̃: 2.83% 95% mean confidence interval for fscib value: -7.35 -2.56 95% mean confidence interval for fscib %-change: -5.67% -1.57% Fscib are helped. total bytes in shared programs: 21489352 -> 21488668 (<.01%) bytes in affected programs: 53362 -> 52678 (-1.28%) helped: 21 HURT: 2 helped stats (abs) min: 6 max: 98 x̄: 35.52 x̃: 40 helped stats (rel) min: 0.39% max: 10.63% x̄: 2.27% x̃: 1.27% HURT stats (abs) min: 2 max: 60 x̄: 31.00 x̃: 31 HURT stats (rel) min: 0.08% max: 1.40% x̄: 0.74% x̃: 0.74% 95% mean confidence interval for bytes value: -42.73 -16.74 95% mean confidence interval for bytes %-change: -3.13% -0.89% Bytes are helped. total regs in shared programs: 865162 -> 865148 (<.01%) regs in affected programs: 509 -> 495 (-2.75%) helped: 4 HURT: 5 helped stats (abs) min: 2 max: 14 x̄: 6.00 x̃: 4 helped stats (rel) min: 3.17% max: 35.90% x̄: 14.01% x̃: 8.48% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 3.17% max: 3.17% x̄: 3.17% x̃: 3.17% 95% mean confidence interval for regs value: -5.75 2.64 95% mean confidence interval for regs %-change: -14.31% 5.39% Inconclusive result (value mean confidence interval includes 0). total uniforms in shared programs: 2120731 -> 2120735 (<.01%) uniforms in affected programs: 358 -> 362 (1.12%) helped: 1 HURT: 2 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 2.94% max: 2.94% x̄: 2.94% x̃: 2.94% HURT stats (abs) min: 2 max: 4 x̄: 3.00 x̃: 3 HURT stats (rel) min: 1.05% max: 4.00% x̄: 2.53% x̃: 2.53% Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:41 -04:00
Rhys Perry	da5c5a3edd	nir/algebraic: add bit-size check to extract_u8 pattern This only worked when "a" was 16-bit because a pattern above replaced the shift. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>	2024-11-06 19:31:20 +00:00
Marek Olšák	2352fcd5b4	nir/lower_clip_disable: handle non-scalar store intrinsics It only supported scalar intrinsics because it was written before nir_opt_vectorize_io existed. The introduction of nir_opt_vectorize_io exposes this issue. The direct path has been tested. The indirect path hasn't. That's fine because if we see a CLIP_DIST failure with indirect in the future, this pass is likely the cause. This is a prerequisite for enabling nir_opt_varyings for all gallium drivers. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31994>	2024-11-06 15:51:51 +00:00
Georg Lehmann	917f312873	nir/lower_fragcoord_wtrans: use intrinsics_pass Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31967>	2024-11-06 12:57:08 +00:00
Georg Lehmann	8104c89174	nir/lower_wpos_ytransform: remove reference to long removed TGSI code Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31951>	2024-11-05 21:42:37 +00:00
Georg Lehmann	e307f40ebe	nir/lower_wpos_ytransform: use more typical pass structure Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31951>	2024-11-05 21:42:37 +00:00

5721 commits