fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-03 03:28:09 +02:00

Author	SHA1	Message	Date
Connor Abbott	046c75e95c	tu: Use start offset for storage buffers This lets us expose a minStorageBufferOffsetAlignment of 4 which is what vkd3d-proton expects. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20105>	2022-12-14 16:19:47 +00:00
Connor Abbott	316ed8f965	tu: Expose *TexelBufferOffsetSingleTexelAlignment This exactly matches what the HW can do. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20105>	2022-12-14 16:19:47 +00:00
Connor Abbott	4d2aa9a9f7	freedreno/fdl: Support texel-aligned iova for buffer views Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20105>	2022-12-14 16:19:47 +00:00
Connor Abbott	3ca90405e8	freedreno/a6xx: Document buffer-specific tex const fields Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20105>	2022-12-14 16:19:47 +00:00
Connor Abbott	f94bd1d723	freedreno: Document various preemption-related registers/packets Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20229>	2022-12-14 15:52:22 +00:00
Hans-Kristian Arntzen	34010a50d4	wsi/x11: Rename the present progress objects. The lock and condition variable isn't just for present_id anymore, it's also for normal forward progress. Adds more detailed comments what the variables are supposed to accomplish. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19990>	2022-12-14 14:20:54 +00:00
Hans-Kristian Arntzen	9e55766f63	wsi/x11: Fix possible deadlock with wait_ready. With the introduction of locks around the XCB polling mechanism, a possible deadlock was introduced. If all 5 images were rapidly acquired and presented before the FIFO thread had the chance to submit a present, we would deadlock. Before the lock however, it was still buggy since the two threads would race to poll events and update internal state. The fix is to just ensure that there are pending presentation requests in flight, so that forward progress is guaranteed before we take the poll lock. Also, use a timedlock for acquire next image. Similar as WaitForPresentKHR. Also need to make the busy flag atomic to actually allow acquire thread and present threads to access the busy flag. Take advantage of busy flag being atomic so that we can gracefully handle timeout == 0 scenarios where there actually are images available. Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no> Fixes: `8fc7927787` ("wsi/x11: Implement VK_KHR_present_wait on X11.") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19990>	2022-12-14 14:20:54 +00:00
Timur Kristóf	657d1be153	radv: Don't lower subgroup shuffle on GFX11. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20293>	2022-12-14 13:54:04 +00:00
Timur Kristóf	db5c3f170f	aco: Emulate Wave64 bpermute on GFX11. Similar to emit_gfx10_wave64_bpermute, but uses the new v_permlane64_b32 instruction to swap data between wave halves. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20293>	2022-12-14 13:54:04 +00:00
Timur Kristóf	853e76f007	aco: Stylistic changes to emit_gfx10_wave64_bpermute. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20293>	2022-12-14 13:54:04 +00:00
Timur Kristóf	640e801651	aco: Split opcodes for GFX6 and GFX10 emulated bpermute. Different sequences are emitted for these, so it makes sense to have different opcodes too. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20293>	2022-12-14 13:54:04 +00:00
Timur Kristóf	614348f28b	aco: Don't accept constants on p_bpermute. The sequence emitted for this pseudo instruction is not ready to handle constants or literals at all. Cc: mesa-stable Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20293>	2022-12-14 13:54:04 +00:00
Martin Roukala (né Peres)	27b70f28d9	ci/venus: add a VKCTS mapping test to the flakes list Seen on https://gitlab.freedesktop.org/mesa/mesa/-/jobs/33483156. Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20312>	2022-12-14 13:36:38 +00:00
Kenneth Graunke	16a7e15d4f	iris: Enable compression for image load/store in more cases We were calling iris_resource_texture_aux_usage here, which disables auxiliary support if color happens to already be resolved. This makes sense for read only images, where if we know ahead of time that aux doesn't contain any useful information, we can just tell the hardware to not bother looking at it. However, it makes no sense for mutable images, as even if the aux currently has no useful data, we want to produce that data when doing our image writes. Import the bits of logic we need from there and shed the rest. We don't need to consider HiZ, MCS, or MC, nor do we need to do format-based CCS compatibility checks on Gfx12+, so it's actually very little code. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19060>	2022-12-14 13:01:27 +00:00
Kenneth Graunke	bf3d6ca94f	iris: Allow fast clears on compressed image load/store access While I haven't found documentation saying definitively that HDC supports fast clear blocks, it seems to work just fine, even on Tigerlake. I have found several issues (atomics and HDC support for linear compression) that both call out fast clears as an issue in those corner cases, which suggests that fast clears do actually work outside of those corners (which we already disable). The previous commit implemented actual aux state updates for image views. With ISL_AUX_USAGE_GFX12_CCS_E, this means that we update the aux state to COMPRESSED_CLEAR after writes. But because we weren't supporting fast clears, this meant that any such images would need partial resolves to remove the clear color on next use. Supporting fast clears allows us to drop all these resolves. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19060>	2022-12-14 13:01:27 +00:00
Kenneth Graunke	7b2a690a35	iris: Update aux state tracking for image views after draws/dispatches On Tigerlake and later, we enable compression for image views. However, we never actually added any code to update the aux state, which meant that if it ever changed, things would break, badly. We managed to avoid catastrophic effects in most cases because of two other issues which papered over the problem: if compression wasn't already enabled for an image, we'd leave it disabled. And, we avoided writing via the CPU to buffers with auxiliary. So in most cases, CCS remained disabled, or got enabled (say by glTexImage()) then stayed on permanently. There were still issues, but they managed to remain more hidden than one would expect given the severity of the bug. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19060>	2022-12-14 13:01:27 +00:00
Kenneth Graunke	a9652fe588	iris: Drop disable_rb_aux_buffer handling for image views The goal here is to support OpenGL 4.6 section 9.3, "Feedback Loops Between Textures and the Framebuffer" (from GL_ARB_texture_barrier) where you can bind an image as both a framebuffer attachment and a texture, and simultaneously sample-from and render-to it. I'm not aware of any spec language that requires us to handle simultaneously accessing something as a framebuffer attachment and an image load/store resource. GL_ARB_shader_image_load_store tends to make flushing and synchronization something the app has to handle explicitly rather than something the driver needs to do implicitly. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19060>	2022-12-14 13:01:27 +00:00
Kenneth Graunke	806082e96f	iris: Drop 'isl_' prefix from 'formats_are_fast_clear_compatible' Every time I see this function I think it's part of isl. But it's not, it's just a static function in an iris file. The point of the name was that the function checks two isl_format enums...but the prefix is just confusing. Just drop the prefix as it's a static function. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19060>	2022-12-14 13:01:27 +00:00
Kenneth Graunke	880fab60a7	iris: Pin the clear color BO in use_image() Images with the RC_CCS modifier store the clear color in a separate BO, which we also need to pin when using an image view. Most images store the clear color in the same BO so it works anyway. Thanks to Nanley Chery for catching this! Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19060>	2022-12-14 13:01:27 +00:00
Kenneth Graunke	699e60681a	iris: Drop batch parameter from iris_update_postdraw_resolve_tracking Eventually the resolve code started making everything take ice instead of batch, and at some point this ceased to be used. It's always render. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19060>	2022-12-14 13:01:27 +00:00
Emma Anholt	9dedbf66f6	zink: Fix reversed cap declarations for ImageBuffer Fixes validation fails on KHR-GLES31.core.texture_buffer.texture_buffer_texture_buffer_range. Fixes: `f55a4407ef` ("zink: more accurately set {Sampled,Image}Buffer caps") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20308>	2022-12-14 12:23:58 +00:00
Martin Roukala (né Peres)	bedb9b73db	radv/ci: bump most jobs to the kernel to 6.1 + latest firmwares Unfortunately, not all jobs can be using Linux 6.1 right now, as NAVI10 hits __vm_enough_memory errors then hangs in VKCTS. So for this job, we will keep Linux 5.17 until this gets fixed. Reference: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7888 Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16835>	2022-12-14 10:20:11 +00:00
Marcin Ślusarz	264a0cabd1	anv: assert when number of primitives is higher than max Such cases can lead to memory corruptions. Acked-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20279>	2022-12-14 09:55:11 +00:00
Marcin Ślusarz	d7a1916798	anv: handle mesh shaders with max primitives == 0 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20279>	2022-12-14 09:55:10 +00:00
Samuel Pitoiset	c26a053f2b	radv: disable more NIR opts in radv_postprocess_nir() with DISABLE_OPTIMIZATIONS To make fast-linking with GPL hopefully a bit faster. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20244>	2022-12-14 09:01:31 +00:00
Samuel Pitoiset	05d2ed7350	radv: move a conditional check to radv_remove_color_exports() Better to have all restrictions inside the function. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20244>	2022-12-14 09:01:31 +00:00
Samuel Pitoiset	a43482e8d6	radv: advertise VK_AMD_shader_early_and_late_fragment_tests Pass all dEQP-VK.early_and_late tests on GFX10.3. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19738>	2022-12-14 08:16:27 +00:00
Samuel Pitoiset	3ff58049b5	radv: implement AMD_shader_early_and_late_fragment_tests Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19738>	2022-12-14 08:16:27 +00:00
Samuel Pitoiset	877c10efd1	spirv: add support for AMD_shader_early_and_late_fragment_tests Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19738>	2022-12-14 08:16:27 +00:00
David Wu	ac8131b564	radeonsi/vcn: add support for 10bit input and enc 8bit output This change is to support 10bit YUV input in addition to original H264/HEVC 8bit output case. It adds rvcn_enc_input_format_t and rvcn_enc_output_format_t for picture input format and output format separately. Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20284>	2022-12-14 07:42:28 +00:00
Ian Romanick	eb76cee9f8	nir: Eliminate nir_op_i2b There are a lot of optimizations in opt_algebraic that match ('ine', a, 0), but there are almost none that match i2b. Instead of adding a huge pile of additional patterns (including variations that include both ine and i2b), always lower i2b to a != 0. At this point in the series, it should be impossible for anything to generate i2b, so there /should not/ be any changes. The failing test on d3d12 is a pre-existing bug that is triggered by this change. I talked to Jesse about it, and, after some analysis, he suggested just adding it to the list of known failures. v2: Don't rematerialize i2b instructions in dxil_nir_lower_x2b. v3: Don't rematerialize i2b instructions in zink_nir_algebraic.py. v4: Fix zink-on-TGL CI failures by calling nir_opt_algebraic after nir_lower_doubles makes progress. The latter can generate b2i instructions, but nir_lower_int64 can't handle them (anymore). v5: Add back most of the hunk at line 2125 of nir_opt_algebraic.py. I had accidentally removed the f2b(b2f(x)) optimization. v6: Just eliminate the i2b instruction. v7: Remove missed i2b32 in midgard_compile.c. Remove (now unused) emit_alu_i2orf2_b1 function from sfn_instr_alu.cpp. Previously this function was still used. 🤷 No shader-db changes on any Intel platform. All Intel platforms had similar results. (Ice Lake shown) Instructions in all programs: 141165875 -> 141165873 (-0.0%) Instructions helped: 2 Cycles in all programs: 9098956382 -> 9098956350 (-0.0%) Cycles helped: 2 The two Vulkan shaders are helped because of the "new" (('b2i32', ('ine', ('ubfe', a, b, 1), 0)), ('ubfe', a, b, 1)) algebraic pattern. Acked-by: Jesse Natalie <jenatali@microsoft.com> [earlier version] Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Daniel Schürmann <daniel@schuermann.dev> [earlier version] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:21 +00:00
Ian Romanick	8b37046765	nir/builder: Handle i2b conversions specially in nir_type_convert The shaders affected here are ones that were previously affected when i2b was unconditionally lowered in opt_algebraic. There are a few places where some transformations happen in a different order, so some algebraic patterns are missed. All Broadwell and newer Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 19914369 -> 19914566 (<.01%) instructions in affected programs: 92375 -> 92572 (0.21%) helped: 0 / HURT: 90 total cycles in shared programs: 853851470 -> 853867215 (<.01%) cycles in affected programs: 12400663 -> 12416408 (0.13%) helped: 28 / HURT: 69 Haswell and Ivy Bridge had similar results. (Haswell shown) total instructions in shared programs: 16710721 -> 16710700 (<.01%) instructions in affected programs: 108010 -> 107989 (-0.02%) helped: 57 / HURT: 103 total cycles in shared programs: 884299412 -> 884306546 (<.01%) cycles in affected programs: 12986423 -> 12993557 (0.05%) helped: 87 / HURT: 102 total spills in shared programs: 14937 -> 14925 (-0.08%) spills in affected programs: 12 -> 0 helped: 9 / HURT: 0 total fills in shared programs: 17569 -> 17557 (-0.07%) fills in affected programs: 12 -> 0 helped: 9 / HURT: 0 Sandy Bridge total instructions in shared programs: 13902341 -> 13902347 (<.01%) instructions in affected programs: 7311 -> 7317 (0.08%) helped: 3 / HURT: 8 total cycles in shared programs: 741795500 -> 741792266 (<.01%) cycles in affected programs: 273308 -> 270074 (-1.18%) helped: 9 / HURT: 2 No shader-db changes on any other Intel platform. No fossil-db changes on any Intel platform. Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:21 +00:00
Ian Romanick	edae161d98	intel/fs: Use nir_type_convert instead of nir_type_conversion_op In a future commit, nir_type_conversion_op won't be able to handle i2b (and in a much later commit f2b), so switch many users to the fully featured function. No shader-db or fossil-db changes on any Intel platform. Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:21 +00:00
Ian Romanick	e34b8866b4	microsoft/compiler: Use nir_type_convert instead of nir_type_conversion_op In a future commit, nir_type_conversion_op won't be able to handle i2b (and in a much later commit f2b), so switch many users to the fully featured function. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:21 +00:00
Ian Romanick	58164794f4	spirv: Use nir_type_convert instead of nir_type_conversion_op In a future commit, nir_type_conversion_op won't be able to handle i2b (and in a much later commit f2b), so switch many users to the fully featured function. No shader-db or fossil-db changes on any Intel platform. v2: Use the actual bit size of the source to determine the conversion op. With mediump, the "planned" bit size and the actual bit size might be different. Fixes many, many Vulkan CTS assertion failures on any platform that sets mediump_16bit_alu (e.g., Freedreno). Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> [v1] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:21 +00:00
Ian Romanick	ded3572947	nir: Use nir_type_convert instead of nir_type_conversion_op In a future commit, nir_type_conversion_op won't be able to handle i2b (and in a much later commit f2b), so switch many users to the fully featured function. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:21 +00:00
Ian Romanick	1197030727	glsl: Use nir_type_convert instead of nir_type_conversion_op In a future commit, nir_type_conversion_op won't be able to handle i2b (and in a much later commit f2b), so switch many users to the fully featured function. In gl_nir_lower_packed_varyings, all of the type conversions are between int32 and uint32 types. In NIR, those are just moves, so elide them. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:21 +00:00
Ian Romanick	9f86d18b2d	nir/builder: Add rounding mode parameter to nir_type_convert Later changes will use this. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:21 +00:00
Ian Romanick	43da822312	glsl_to_nir: Fix NIR bit-size of ir_triop_bitfield_extract and ir_quadop_bitfield_insert Previously these would return result->bit_size of 32 even though the type might have been int16_t or uint16_t. This prevents many assertion failures in "glsl: Use nir_type_convert instead of nir_type_conversion_op" on zink. Fixes: `5e922fbc16` ("glsl_to_nir: fix bitfield_extract with 16-bit operands") Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:21 +00:00
Ian Romanick	1fae751d49	microsoft/compiler: Simplify nir_intrinsic_load_front_face handling It is invalid to have Boolean variables as either shader inputs or outputs, so there is no point to try to lower them in general. The only use for this was some two-phase lowering of nir_intrinsic_load_front_face that could be done in a single phase. Create the SYSTEM_VALUE_FRONT_FACE as a uint and compare it with zero at the same time. No shader-db or fossil-db changes on any Intel platform. v2: Remove dxil_nir_lower_bool_input from dxil_nir.h and drop it from the other caller in the spirv_to_dxil codepath. Noticed by Jesse. Fix setting bit size when loading SYSTEM_VALUE_FRONT_FACE. Caught by CI. v3: Use nir_ine_imm. Change type of gl_FrontFacing GS output in d3d12_nir_passes from Boolean to integer. Both suggested by Jesse. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:21 +00:00
Ian Romanick	9342c14eeb	nir/builder: Emit x != 0 for nir_i2b There are a lot of optimizations in opt_algebraic that match ('ine', a, 0), but there are almost none that match i2b. Instead of adding a huge pile of additional patterns (including variation that include both ine and i2b), just emit a != 0 instead of i2b(a). I think that the changes to the unit tests weaken them slightly, but perhaps that's okay? No shader-db changes on any Intel platform. The GLSL paths use other means to generate i2b operations, but the SPIR-V paths use nir_i2b. Presumably since `4676b3d3dd` (nir: Use nir_test_mask instead of i2b(iand)), no fossil-db changes either. v2: Use nir_ine_imm. Suggested by Jesse. Acked-by: Jesse Natalie <jenatali@microsoft.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:21 +00:00
Ian Romanick	7a5e9df39d	nir: Use nir_i2b wrapper everywhere instead of using nir_i2b1 directly No shader-db or fossil-db changes on any Intel platform. v2: Add missed i2b1 in ir3_nir_opt_preamble.c. v3: Add missed i2b1 in ac_nir_lower_ngg.c. Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Acked-by: Jesse Natalie <jenatali@microsoft.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:21 +00:00
Ian Romanick	b60b2f2add	nir/algebraic: Optimize some b2i involved in masking operations v2: Remove the ineg from the b2i in the ior pattern. Suggested by Jason. All Ivy Bridge and newer Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 19914441 -> 19914369 (<.01%) instructions in affected programs: 63507 -> 63435 (-0.11%) helped: 24 / HURT: 0 total cycles in shared programs: 853869766 -> 853851470 (<.01%) cycles in affected programs: 10551542 -> 10533246 (-0.17%) helped: 24 / HURT: 0 All Intel platforms had similar results. (Ice Lake shown) Instructions in all programs: 141163061 -> 141092683 (-0.0%) Instructions helped: 14103 Instructions hurt: 55 Cycles in all programs: 9132376195 -> 9133183045 (+0.0%) Cycles helped: 13775 Cycles hurt: 380 Spills in all programs: 18286 -> 18284 (-0.0%) Spills helped: 1 Fills in all programs: 30647 -> 30643 (-0.0%) Fills helped: 1 Gained: 133 Lost: 130 Acked-by: Jesse Natalie <jenatali@microsoft.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:21 +00:00
Ian Romanick	ba0b248ac2	nir/algebraic: Eliminate unary op on src of integer comparison w/ zero This helps because it enables cmod propagation to do more. The removed patterns involving b2i will be handled by other existing patterns after the unary operations are removed. All Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 19914458 -> 19914441 (<.01%) instructions in affected programs: 5456 -> 5439 (-0.31%) helped: 17 / HURT: 0 total cycles in shared programs: 855302118 -> 853869766 (-0.17%) cycles in affected programs: 327354347 -> 325921995 (-0.44%) helped: 291 / HURT: 81 All Intel platforms had similar results. (Ice Lake shown) Instructions in all programs: 141205979 -> 141205961 (-0.0%) Instructions helped: 4 Instructions hurt: 3 SENDs in all programs: 7466919 -> 7466913 (-0.0%) SENDs helped: 1 Cycles in all programs: 9133387327 -> 9133384475 (-0.0%) Cycles helped: 3 Cycles hurt: 12 In the shader that was helped for sends, it appears that a NIR pass that moves code out of loops was able to move 3 send operations outside a loop after this change. I did not investigate further. Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Acked-by: Jesse Natalie <jenatali@microsoft.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:20 +00:00
Ian Romanick	ee15d89322	nir/algebraic: Simplify min and max of b2i This prevents ~400 shader-db regressions and a handful of fossil-db regressions after i2b is always lowered. All Ivy Bridge and newer Intel platforms had similar results. (Ice Lake shown) total cycles in shared programs: 855301494 -> 855302118 (<.01%) cycles in affected programs: 52787 -> 53411 (1.18%) helped: 4 / HURT: 5 All Intel platforms had similar results. (Ice Lake shown) Instructions in all programs: 141206055 -> 141205979 (-0.0%) Instructions helped: 14 Cycles in all programs: 9133376616 -> 9133387327 (+0.0%) Cycles helped: 13 Cycles hurt: 3 Acked-by: Jesse Natalie <jenatali@microsoft.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:20 +00:00
Ian Romanick	19222867e4	nir/algebraic: Reassociate some iand to eliminate an operation No shader-db changes on any Intel platform. All of the helped shaders were presumably regressed by `4676b3d3dd` (nir: Use nir_test_mask instead of i2b(iand)). v2: Add some comments explaining why specific replacements are used. In the umin pattern, only markup the first usage of 'b' in the source pattern. Tiger Lake, Ice Lake, and Skylake had similar results. (Ice Lake shown) Instructions in all programs: 141384970 -> 141200966 (-0.1%) Instructions helped: 45842 Cycles in all programs: 9133648977 -> 9133282672 (-0.0%) Cycles helped: 26812 Cycles hurt: 6025 Gained: 23 Lost: 135 Acked-by: Jesse Natalie <jenatali@microsoft.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:20 +00:00
Ian Romanick	d48ce1f47d	nir/algebraic: Remove redundant i2b(b2i(x)) patterns A loop below already adds all the permutations... including the 1-bit version that isn't included in this group. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Acked-by: Jesse Natalie <jenatali@microsoft.com> Tested-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:20 +00:00
Ian Romanick	14a9bb04e4	nir/algebraic: Remove redundant i2b(-x) pattern The exact same pattern appears later (around line 1323). No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Acked-by: Jesse Natalie <jenatali@microsoft.com> Tested-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:20 +00:00
Ian Romanick	8d90b13954	nir/algebraic: Catch some kinds of copy-and-paste bugs in algebraic patterns A later commit adds a pattern (('umin', ('iand', a, '#b(is_pos_power_of_two)'), ('iand', c, '#b(is_pos_power_of_two)')), ('iand', ('iand', a, b), ('iand', c, b))), When I originally made that pattern, I copied and pasted the search to the replacement as (('umin', ('iand', a, '#b(is_pos_power_of_two)'), ('iand', c, '#b(is_pos_power_of_two)')), ('iand', ('iand', a, '#b(is_pos_power_of_two)'), ('iand', c, '#b(is_pos_power_of_two)'))), This caused the variables in the replacement to be marked is_constant, and that resulted in an assertion failure deep inside nir_search. src/compiler/nir/nir_search.c:530: construct_value: Assertion `!var->is_constant' failed. These extra validation rules catch this kind of error at compile time rather than at run time. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Acked-by: Jesse Natalie <jenatali@microsoft.com> Tested-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:20 +00:00
Yonggang Luo	fa02fb5cca	gallium/pp: typedef and use pp_st_invalidate_state_func to avoid cast Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20042>	2022-12-14 05:47:52 +00:00
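
Note on 8d90b13954: the kind of check that commit describes can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in Mesa's nir_algebraic.py. The helper names (leaves, check_replacement) are hypothetical, and variables are written as plain strings ('a', 'b', 'c') so the snippet runs on its own. The idea is that a replacement expression should refer to variables by bare name, so any leaf that still carries a '#' constant marker or a '(condition)' suffix was most likely pasted from the search side.

```python
# Hypothetical sketch; the real validation lives in Mesa's nir_algebraic.py
# and may be structured quite differently.

def leaves(pattern):
    """Yield every variable string in a nested (opcode, src, ...) tuple."""
    if isinstance(pattern, tuple):
        # pattern[0] is the opcode name, the remaining entries are sources.
        for src in pattern[1:]:
            yield from leaves(src)
    elif isinstance(pattern, str):
        yield pattern
    # Anything else (e.g. an int literal) is a plain constant, not a variable.


def check_replacement(replace):
    """Reject replacement leaves that carry search-only markup."""
    bad = [v for v in leaves(replace) if v.startswith('#') or '(' in v]
    if bad:
        raise ValueError(f"replacement uses search-only markup: {bad}")


# The two replacement expressions quoted in the commit message.
GOOD_REPLACE = ('iand', ('iand', 'a', 'b'), ('iand', 'c', 'b'))
BAD_REPLACE = ('iand',
               ('iand', 'a', '#b(is_pos_power_of_two)'),
               ('iand', 'c', '#b(is_pos_power_of_two)'))
```

Calling check_replacement(GOOD_REPLACE) returns without error, while check_replacement(BAD_REPLACE) raises ValueError when the pattern table is built, which is the "compile time rather than run time" behaviour the commit message asks for.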

164304 commits