fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 02:38:07 +02:00

Author	SHA1	Message	Date
Georg Lehmann	bf0d1a42b4	nir: remove uses_fddx_fddy Unused and the code didn't even do what the comment said. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>	2024-10-17 09:50:19 +00:00
Georg Lehmann	cba575f4df	nir: always emit ddx intrinsics Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>	2024-10-17 09:50:19 +00:00
Georg Lehmann	41cce70584	spirv: remove alu fddx/fddy from comment Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>	2024-10-17 09:50:19 +00:00
Georg Lehmann	1371a8fe2b	nir/opt_move_discards_to_top: handle ddx/ddy intrinsics Fixes: `daa97bb41a` ("amd: switch to derivative intrinsics") Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>	2024-10-17 09:50:19 +00:00
Marek Olšák	948f94b8c5	nir/opt_varyings: pack TCS inputs with cross-invocation access together Unigine Heaven has a TCS that reads pos.xyz and tescoord.w from all invocations in every invocation. By putting those two in the same vec4, AMD hw can reduce the amount of shared memory that is allocated for those inputs from 2 vec4s to 1 vec4. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31670>	2024-10-17 03:30:07 +00:00
Marek Olšák	8e93907b7c	nir/opt_varyings: assign locations of no_varying IO for TCS outputs only Skip the code for other shader stages because it doesn't do anything there. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31670>	2024-10-17 03:30:07 +00:00
Daniel Almeida	279f38918f	nak: memstream: move into common code Move the memstream code into common code. Other Rust code interfacing with FILE pointers will find the memstream abstraction useful. Most notably, pinning is actually enforced this time with PhantomPinned. Add a .clang-format from a sibling dir (i.e.: compiler/nir) while we're at it to keep things tidy. Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30594>	2024-10-17 02:50:21 +00:00
Connor Abbott	65c0846537	nir/lower_input_attachments: Handle unscaled input attachments with no index With VK_KHR_dynamic_rendering_local_read we can have input attachments with no index, which normally correspond to depth/stencil attachments, and we have to handle this here when determining whether we need to emit an unscaled fragcoord for FDM. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>	2024-10-17 00:30:44 +00:00
Connor Abbott	4bd506a7f3	spirv: Make the default input attachment index ~0 This will let us know when an input attachment doesn't have an InputAttachmentIndex, which used to be illegal but is now allowed and meaningful with VK_KHR_dynamic_rendering_local_read. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>	2024-10-17 00:30:44 +00:00
Job Noorman	4556b18f51	nir: add shuffle_{xor,up,down}_uniform_ir3 intrinsics These are like shuffle_{xor,up,down} except they expect a dynamically uniform index. This is necessary since the ir3 shfl instruction does not work with a divergent index. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31501>	2024-10-16 22:05:10 +00:00
Danylo Piliaiev	7b09fc98fb	nir/opt_16b_tex_image: Sign extension should matter for texel buffer txf Texel buffers can be arbitrarily large, so the assumption being made in the following comment is wrong: "Zero-extension (u16) and sign-extension (i16) have the same behavior here - txf returns 0 if bit 15 is set because it's out of bounds and the higher bits don't matter." Sign extension should matter for GLSL_SAMPLER_DIM_BUF. This fixes the case of doing texelFetch with u16 offset: uniform itextureBuffer s1; uint16_t offset = some_ssbo.offset; value = texelFetch(s1, offset).x; If the offset exceeds the s16 range, the optimization incorrectly left it as 16b. In SPIR-V the above GLSL is translated into: %22 = OpLoad %ushort %21 %23 = OpUConvert %uint %22 %24 = OpBitcast %int %23 %26 = OpImageFetch %v4int %16 %24 Cc: mesa-stable Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31664>	2024-10-16 10:10:00 +00:00
Timothy Arceri	aa7c59e02c	nir/glsl: set deref cast mode for blocks during function inlining More cast fixes this time for UBO and SSBO. Which were missing testing previously. Fixes: `d681cf96fb` ("nir/glsl: set deref cast mode during function inlining") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11587 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31668>	2024-10-16 06:25:57 +00:00
Marek Olšák	0727634443	nir/opt_load_store_vectorize: vectorize load_smem_amd radeonsi+ACO with the new vectorization callback: TOTALS FROM AFFECTED SHADERS (19508/58918) VGPRs: 708672 -> 708864 (0.03 %) Code Size: 31458688 -> 31217160 (-0.77 %) bytes Max Waves: 305960 -> 305952 (-0.00 %) Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Marek Olšák	a44e5cfccf	nir/opt_load_store_vectorize: allow a 4-byte hole between 2 loads If there is a 4-byte hole between 2 loads, drivers can now optionally vectorize the loads by including the hole between them, e.g.: 4B load + 4B hole + 8B load --> 16B load All vectorize callbacks already reject all holes, but AMD will want to allow it. radeonsi+ACO with the new vectorization callback: TOTALS FROM AFFECTED SHADERS (25248/58918) VGPRs: 871116 -> 871872 (0.09 %) Spilled SGPRs: 397 -> 407 (2.52 %) Code Size: 43074536 -> 42496352 (-1.34 %) bytes Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Marek Olšák	80c156422d	nir/opt_load_store_vectorize: allow overfetching, merge overfetched loads New load merging transformations (first, second), examples: (vec4, vec3) ==> vec8(read=0x7f) (because NIR doesn't have vec7) (vec1, vec8(read=0x7f)) ==> vec8(read=0xff) - the unused component at the end of vec8 is dropped Not merged: vec8(read=0xfe) + vec1 - unused components at the beginning are kept Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Marek Olšák	65ace5649b	nir: reject unsupported component counts from all vectorize callbacks If you allow an unsupported component count in the callback for loads, nir_opt_load_store_vectorize will align num_components to the next supported vector size, essentially overfetching. This changes all callbacks to reject it. AMD will enable it in a later commit. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Marek Olšák	02923e237d	nir: add hole_size parameter into the vectorize callback It will be used to allow merging loads with a hole between them. Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Marek Olšák	8ce43b7765	nir/opt_load_store_vectorize: add entry::num_components We will represent vec6..vec7, vec9..vec15 loads with 8 and 16 components respectively, so we need to track how many components we really use. This is a prerequisite for optimal merging up to vec16. Example: Step 1: vec4 + vec3 ==> vec7as8 (last component unused) Step 2: vec1 + vec7as8 ==> vec8 (last unused component dropped) Without using the number of components read, the same example would end up doing: Step 1: vec4 + vec3 ==> vec8 Step 2: vec1 + vec8 ==> vec9 (fail) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Alyssa Rosenzweig	e9303c0952	nir: extract round component helper another nir pass will use this. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Marek Olšák	64c4d29e65	nir/opt_vectorize_io: fix stack buffer overflow with 16-bit output stores uncovered by unrelated work Fixes: `2514999c9c` - nir: add nir_opt_vectorize_io, vectorizing lowered IO Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31644>	2024-10-15 03:59:17 +00:00
Timothy Arceri	46facf9037	nir/glsl: set cast mode for image during function inlining Fixes: `d681cf96fb` ("nir/glsl: set deref cast mode during function inlining") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11980 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31554>	2024-10-15 03:13:24 +00:00
Konstantin Seurer	70a1453537	nir/print: Fix the alignment of 8-bit definitions Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31612>	2024-10-14 21:21:04 +00:00
Adam Jackson	605d6aaf13	vtn: Handle SPV_INTEL_optnone We don't advertise this in rusticl (and probably shouldn't, at least until we can honor the request) but DPC++ emits this regardless so we may as well ignore it. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31592>	2024-10-11 15:39:45 +00:00
Rhys Perry	67ad7359ff	nir/divergence_analysis: disable phi undef optimization by default If the backend does not implement this too, or some other future transform modifies the phi so that this isn't the case (replace the phi with a bcsel or replace undef with zero), then it will not actually be uniform. This keeps it enabled to some degree for RADV/ACO. fossil-db (navi31): Totals from 76 (0.10% of 79395) affected shaders: Instrs: 195008 -> 195282 (+0.14%) CodeSize: 1012592 -> 1015884 (+0.33%) Latency: 3892826 -> 3898843 (+0.15%); split: -0.00%, +0.15% InvThroughput: 460681 -> 460964 (+0.06%) Copies: 13508 -> 13516 (+0.06%) Branches: 5244 -> 5412 (+3.20%) PreVGPRs: 5092 -> 5096 (+0.08%) VALU: 116177 -> 116197 (+0.02%) SALU: 23449 -> 23785 (+1.43%) fossil-db (navi21): Totals from 76 (0.10% of 79395) affected shaders: Instrs: 164471 -> 164981 (+0.31%) CodeSize: 883988 -> 888420 (+0.50%) Latency: 4074287 -> 4082043 (+0.19%) InvThroughput: 783783 -> 784276 (+0.06%); split: -0.00%, +0.06% Branches: 5262 -> 5430 (+3.19%) PreVGPRs: 5100 -> 5104 (+0.08%) VALU: 116375 -> 116381 (+0.01%) SALU: 23589 -> 23925 (+1.42%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30211>	2024-10-10 14:59:26 +00:00
Caio Oliveira	c06a55fd39	spirv: Update SPIR-V grammar to use aliases For enumerants and instruction names, instead of duplicating the values now the grammar will use an aliases field to list the alternative names. Update the Python scripts for that. The new SPIR-V files correspond to d92cf88c371424591115a87499009dfad41b669c ("Add "aliases" fields to the grammar and remove duplicated (#447)") in https://github.com/KhronosGroup/SPIRV-Headers. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31369>	2024-10-10 02:48:00 +00:00
Alyssa Rosenzweig	6287c8251d	nir: add bounds_agx opcode used to facilitate bounds checking optimization Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31532>	2024-10-05 18:30:11 +00:00
Timothy Arceri	065b45e4dc	glsl: remove linker.cpp All functionality has now been converted to NIR or moved elsewhere. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31500>	2024-10-04 00:10:59 +00:00
Timothy Arceri	e4c3e7e0d8	glsl: rename link_shaders() -> link_shaders_init() And move it to the linker util file. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31500>	2024-10-04 00:10:59 +00:00
Timothy Arceri	37ac8f5e79	glsl: move shader cache lookup call to st The nir shader cache read call is just below this call now so the code is easier to follow. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31500>	2024-10-04 00:10:59 +00:00
Timothy Arceri	b663eb83fe	glsl: move error and warning helpers to util file These functions are already defined in linker_util.h so moving them here is logical. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31500>	2024-10-04 00:10:59 +00:00
Timothy Arceri	19c27c39b4	glsl/mesa: remove ir_uniform.h We moved its contents elsewhere in a previous patch. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31500>	2024-10-04 00:10:59 +00:00
Timothy Arceri	13301e2509	glsl: move resource_name_updated() to linker_util.cpp We want to remove the old linker.cpp file in following patches so move this util function to the util code file. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31500>	2024-10-04 00:10:58 +00:00
Timothy Arceri	08e25e091b	glsl/mesa: move uniform related shader structs to shader_types.h This is where all the other uniform related structs are and these were the only structs used by both the compiler and gl api that were defined in the compiler code so let's move them to where everything else is defined. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31500>	2024-10-04 00:10:58 +00:00
Iago Toral Quiroga	aac1c074cc	nir: make fclamp_pos_mali and fsat_signed_mali opcodes generic V3D can use these too. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31480>	2024-10-03 09:02:07 +00:00
Timothy Arceri	1c58f513c4	glsl: fix gl_{Clip,Cull}Distance error messages The error message in the linker that checked gl_MaxCombinedClipAndCullDistances would never be issued because the compiler was already doing the check. I think the compiler might have been done this way in the original commit `d656736b` as the linker only sets the size when the clip/cull outputs are written so the piglit test for this wouldn't have been triggered as it does not write to the outputs. Here we move the error to the compiler and fix things up so the correct messages are triggered. Fixes: `d656736bbf` ("glsl: Add arb_cull_distance support (v3)") Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31471>	2024-10-03 06:59:47 +00:00
Faith Ekstrand	62a4fe861a	nir: Add an option to lower quad vote Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31470>	2024-10-02 21:10:32 +00:00
Marek Olšák	f546df95a6	nir/opt_vectorize_io: fix skipped output vectorization if inputs were vectorized Fixes: `2514999c9c` - nir: add nir_opt_vectorize_io, vectorizing lowered IO Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31454>	2024-10-02 20:26:23 +00:00
Job Noorman	4d50504b26	nir/lower_int64: add nir_intrinsic_rotate Can simply be split into 32b ops. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31455>	2024-10-02 06:35:49 +00:00
Job Noorman	e6a5c342da	nir/lower_int64: add nir_intrinsic_read_invocation_cond_ir3 Can simply be split into 32b ops. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31455>	2024-10-02 06:35:49 +00:00
Job Noorman	584b63ecab	nir/load_store_vectorize: fix division by zero Don't use glsl_get_explicit_stride as it may return 0 for vector types, use nir_deref_instr_array_stride instead. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31460>	2024-10-02 05:53:57 +00:00
Rhys Perry	be64454710	nir/tests: test opt_loop_peel_initial_break with derefs in header block Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31324>	2024-10-01 12:24:22 +00:00
Rhys Perry	0484044b1a	nir/opt_loop: rematerialize header block derefs in their use blocks Otherwise, we could end up with phis of derefs. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Fixes: `6b4b044739` ("nir/opt_loop: add loop peeling optimization") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31324>	2024-10-01 12:24:22 +00:00
Christian Gmeiner	1421319dcf	compiler/rust: Copy MappedInstrs from NAK Rename it to SmallVec, make it more generic and switch NAK to it. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31409>	2024-10-01 11:33:35 +00:00
Gert Wollny	f19f1ec17b	nir/opt_algebraic: Allow two-step lowering of ftrunc@64 to use ffract@64 If ftrunc@64 is lowered by nir_lower_doubles it is turned into a comparably long series of 32 bit operations. If the hardware supports ffract@64 then nir_opt_algebraic can first lower ftrunc@64 to use some combinations with ffloor@64. They can then be turned into a combination of fsub@64 and ffract@64, resulting in fewer instructions overall. Fixes: `5218cff34b` nir/algebraic: avoid double lowering of some fp64 operations Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29281>	2024-09-30 23:51:02 +00:00
Kenneth Graunke	0b34a7aff0	nir: Don't generate single iteration loops to zero-initialize memory If the stride we're adding to our loop counter is larger than the total amount of shared local memory we're trying to initialize, we know the loop will run at most one time. So we can skip emitting a loop. Loop unrolling appears to be unable to detect this currently. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31312>	2024-09-30 05:27:17 +00:00
Georg Lehmann	bb7e8d51b6	nir: delete nir_opt_reuse_constants Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031>	2024-09-27 05:19:16 +00:00
Georg Lehmann	60776f87c3	nir/opt_remove_phis: rematerialize constants Foz-DB Navi31: Totals from 749 (0.94% of 79395) affected shaders: Instrs: 1224359 -> 1223722 (-0.05%); split: -0.07%, +0.02% CodeSize: 6468392 -> 6466296 (-0.03%); split: -0.06%, +0.03% Latency: 9764410 -> 9766457 (+0.02%); split: -0.01%, +0.03% InvThroughput: 1017401 -> 1017380 (-0.00%); split: -0.03%, +0.03% VClause: 19902 -> 19873 (-0.15%); split: -0.16%, +0.02% SClause: 38441 -> 38424 (-0.04%); split: -0.05%, +0.01% Copies: 86880 -> 86304 (-0.66%); split: -0.73%, +0.06% Branches: 34206 -> 34159 (-0.14%); split: -0.14%, +0.01% PreSGPRs: 45557 -> 45527 (-0.07%); split: -0.08%, +0.01% PreVGPRs: 32406 -> 32408 (+0.01%) VALU: 671633 -> 671533 (-0.01%); split: -0.02%, +0.01% SALU: 155284 -> 154675 (-0.39%); split: -0.40%, +0.00% VMEM: 27303 -> 27271 (-0.12%) SMEM: 67490 -> 67455 (-0.05%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031>	2024-09-27 05:19:16 +00:00
Georg Lehmann	40fc85c15b	nir: make nir_instr_clone usable with load_const and undef Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031>	2024-09-27 05:19:16 +00:00
Georg Lehmann	a9f8089240	nir: replace nir_opt_remove_phis_block with a single source version This is what callers actually want, and it simplifies nir_opt_remove_phis because we can assume dominance meta data is valid. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031>	2024-09-27 05:19:16 +00:00
Georg Lehmann	41e82b8b8e	nir: sink is_subgroup_invocation_lt_amd Having it closer to the branches means we can eliminate an exec copy. Foz-DB Navi31: Totals from 11615 (14.63% of 79395) affected shaders: Instrs: 6804372 -> 6804903 (+0.01%); split: -0.04%, +0.05% CodeSize: 33684672 -> 33680584 (-0.01%); split: -0.07%, +0.05% VGPRs: 578616 -> 578604 (-0.00%) SpillSGPRs: 1506 -> 1304 (-13.41%) Latency: 29817034 -> 29821320 (+0.01%); split: -0.03%, +0.05% InvThroughput: 3581587 -> 3581217 (-0.01%); split: -0.02%, +0.01% VClause: 124826 -> 124782 (-0.04%); split: -0.04%, +0.00% SClause: 187916 -> 187645 (-0.14%); split: -0.27%, +0.13% Copies: 520969 -> 510027 (-2.10%); split: -2.20%, +0.10% PreSGPRs: 442584 -> 421344 (-4.80%) VALU: 3810755 -> 3810267 (-0.01%); split: -0.01%, +0.00% SALU: 763402 -> 752650 (-1.41%); split: -1.48%, +0.07% Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>	2024-09-26 14:29:14 +00:00

9774 commits