fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 22:18:18 +02:00

Author	SHA1	Message	Date
Marek Olšák	8e93907b7c	nir/opt_varyings: assign locations of no_varying IO for TCS outputs only Skip the code for other shader stages because it doesn't do anything there. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31670>	2024-10-17 03:30:07 +00:00
Connor Abbott	65c0846537	nir/lower_input_attachments: Handle unscaled input attachments with no index With VK_KHR_dynamic_rendering_local_read we can have input attachments with no index, which normally correspond to depth/stencil attachments, and we have to handle this here when determining whether we need to emit an unscaled fragcoord for FDM. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>	2024-10-17 00:30:44 +00:00
Connor Abbott	4bd506a7f3	spirv: Make the default input attachment index ~0 This will let us know when an input attachment doesn't have an InputAttachmentIndex, which used to be illegal but is now allowed and meaningful with VK_KHR_dynamic_rendering_local_read. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31261>	2024-10-17 00:30:44 +00:00
Job Noorman	4556b18f51	nir: add shuffle_{xor,up,down}_uniform_ir3 intrinsics These are like shuffle_{xor,up,down} except they expect a dynamically uniform index. This is necessary since the ir3 shfl instruction does not work with a divergent index. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31501>	2024-10-16 22:05:10 +00:00
Danylo Piliaiev	7b09fc98fb	nir/opt_16b_tex_image: Sign extension should matter for texel buffer txf A texel buffer can be arbitrarily large, so the assumption being made in the following comment is wrong: "Zero-extension (u16) and sign-extension (i16) have the same behavior here - txf returns 0 if bit 15 is set because it's out of bounds and the higher bits don't matter." Sign extension should matter for GLSL_SAMPLER_DIM_BUF. This fixes the case of doing texelFetch with u16 offset: uniform itextureBuffer s1; uint16_t offset = some_ssbo.offset; value = texelFetch(s1, offset).x; If the offset is higher than the s16 maximum, the optimization incorrectly left it as 16b. In SPIR-V the above GLSL is translated into: %22 = OpLoad %ushort %21 %23 = OpUConvert %uint %22 %24 = OpBitcast %int %23 %26 = OpImageFetch %v4int %16 %24 Cc: mesa-stable Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31664>	2024-10-16 10:10:00 +00:00
Timothy Arceri	aa7c59e02c	nir/glsl: set deref cast mode for blocks during function inlining More cast fixes, this time for UBO and SSBO, which were missing testing previously. Fixes: `d681cf96fb` ("nir/glsl: set deref cast mode during function inlining") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11587 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31668>	2024-10-16 06:25:57 +00:00
Marek Olšák	0727634443	nir/opt_load_store_vectorize: vectorize load_smem_amd radeonsi+ACO with the new vectorization callback: TOTALS FROM AFFECTED SHADERS (19508/58918) VGPRs: 708672 -> 708864 (0.03 %) Code Size: 31458688 -> 31217160 (-0.77 %) bytes Max Waves: 305960 -> 305952 (-0.00 %) Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Marek Olšák	a44e5cfccf	nir/opt_load_store_vectorize: allow a 4-byte hole between 2 loads If there is a 4-byte hole between 2 loads, drivers can now optionally vectorize the loads by including the hole between them, e.g.: 4B load + 4B hole + 8B load --> 16B load All vectorize callbacks already reject all holes, but AMD will want to allow it. radeonsi+ACO with the new vectorization callback: TOTALS FROM AFFECTED SHADERS (25248/58918) VGPRs: 871116 -> 871872 (0.09 %) Spilled SGPRs: 397 -> 407 (2.52 %) Code Size: 43074536 -> 42496352 (-1.34 %) bytes Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Marek Olšák	80c156422d	nir/opt_load_store_vectorize: allow overfetching, merge overfetched loads New load merging transformations (first, second), examples: (vec4, vec3) ==> vec8(read=0x7f) (because NIR doesn't have vec7) (vec1, vec8(read=0x7f)) ==> vec8(read=0xff) - the unused component at the end of vec8 is dropped Not merged: vec8(read=0xfe) + vec1 - unused components at the beginning are kept Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Marek Olšák	65ace5649b	nir: reject unsupported component counts from all vectorize callbacks If you allow an unsupported component count in the callback for loads, nir_opt_load_store_vectorize will align num_components to the next supported vector size, essentially overfetching. This changes all callbacks to reject it. AMD will enable it in a later commit. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Marek Olšák	02923e237d	nir: add hole_size parameter into the vectorize callback It will be used to allow merging loads with a hole between them. Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Marek Olšák	8ce43b7765	nir/opt_load_store_vectorize: add entry::num_components We will represent vec6..vec7, vec9..vec15 loads with 8 and 16 components respectively, so we need to track how many components we really use. This is a prerequisite for optimal merging up to vec16. Example: Step 1: vec4 + vec3 ==> vec7as8 (last component unused) Step 2: vec1 + vec7as8 ==> vec8 (last unused component dropped) Without using the number of components read, the same example would end up doing: Step 1: vec4 + vec3 ==> vec8 Step 2: vec1 + vec8 ==> vec9 (fail) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Alyssa Rosenzweig	e9303c0952	nir: extract round component helper another nir pass will use this. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>	2024-10-15 05:50:24 +00:00
Marek Olšák	64c4d29e65	nir/opt_vectorize_io: fix stack buffer overflow with 16-bit output stores uncovered by unrelated work Fixes: `2514999c9c` - nir: add nir_opt_vectorize_io, vectorizing lowered IO Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31644>	2024-10-15 03:59:17 +00:00
Timothy Arceri	46facf9037	nir/glsl: set cast mode for image during function inlining Fixes: `d681cf96fb` ("nir/glsl: set deref cast mode during function inlining") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11980 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31554>	2024-10-15 03:13:24 +00:00
Konstantin Seurer	70a1453537	nir/print: Fix the alignment of 8-bit definitions Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31612>	2024-10-14 21:21:04 +00:00
Rhys Perry	67ad7359ff	nir/divergence_analysis: disable phi undef optimization by default If the backend does not implement this too, or some other future transform modifies the phi so that this isn't the case (replace the phi with a bcsel or replace undef with zero), then it will not actually be uniform. This keeps it enabled to some degree for RADV/ACO. fossil-db (navi31): Totals from 76 (0.10% of 79395) affected shaders: Instrs: 195008 -> 195282 (+0.14%) CodeSize: 1012592 -> 1015884 (+0.33%) Latency: 3892826 -> 3898843 (+0.15%); split: -0.00%, +0.15% InvThroughput: 460681 -> 460964 (+0.06%) Copies: 13508 -> 13516 (+0.06%) Branches: 5244 -> 5412 (+3.20%) PreVGPRs: 5092 -> 5096 (+0.08%) VALU: 116177 -> 116197 (+0.02%) SALU: 23449 -> 23785 (+1.43%) fossil-db (navi21): Totals from 76 (0.10% of 79395) affected shaders: Instrs: 164471 -> 164981 (+0.31%) CodeSize: 883988 -> 888420 (+0.50%) Latency: 4074287 -> 4082043 (+0.19%) InvThroughput: 783783 -> 784276 (+0.06%); split: -0.00%, +0.06% Branches: 5262 -> 5430 (+3.19%) PreVGPRs: 5100 -> 5104 (+0.08%) VALU: 116375 -> 116381 (+0.01%) SALU: 23589 -> 23925 (+1.42%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30211>	2024-10-10 14:59:26 +00:00
Alyssa Rosenzweig	6287c8251d	nir: add bounds_agx opcode used to facilitate bounds checking optimization Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31532>	2024-10-05 18:30:11 +00:00
Iago Toral Quiroga	aac1c074cc	nir: make fclamp_pos_mali and fsat_signed_mali opcodes generic V3D can use these too. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31480>	2024-10-03 09:02:07 +00:00
Faith Ekstrand	62a4fe861a	nir: Add an option to lower quad vote Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31470>	2024-10-02 21:10:32 +00:00
Marek Olšák	f546df95a6	nir/opt_vectorize_io: fix skipped output vectorization if inputs were vectorized Fixes: `2514999c9c` - nir: add nir_opt_vectorize_io, vectorizing lowered IO Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31454>	2024-10-02 20:26:23 +00:00
Job Noorman	4d50504b26	nir/lower_int64: add nir_intrinsic_rotate Can simply be split into 32b ops. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31455>	2024-10-02 06:35:49 +00:00
Job Noorman	e6a5c342da	nir/lower_int64: add nir_intrinsic_read_invocation_cond_ir3 Can simply be split into 32b ops. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31455>	2024-10-02 06:35:49 +00:00
Job Noorman	584b63ecab	nir/load_store_vectorize: fix division by zero Don't use glsl_get_explicit_stride as it may return 0 for vector types, use nir_deref_instr_array_stride instead. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31460>	2024-10-02 05:53:57 +00:00
Rhys Perry	be64454710	nir/tests: test opt_loop_peel_initial_break with derefs in header block Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31324>	2024-10-01 12:24:22 +00:00
Rhys Perry	0484044b1a	nir/opt_loop: rematerialize header block derefs in their use blocks Otherwise, we could end up with phis of derefs. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Fixes: `6b4b044739` ("nir/opt_loop: add loop peeling optimization") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31324>	2024-10-01 12:24:22 +00:00
Gert Wollny	f19f1ec17b	nir/opt_algebraic: Allow two-step lowering of ftrunc@64 to use ffract@64 If ftrunc@64 is lowered by nir_lower_doubles it is turned into a comparably long series of 32 bit operations. If the hardware supports ffract@64 then nir_opt_algebraic can first lower ftrunc@64 to use some combinations with ffloor@64. They can then be turned into a combination of fsub@64 and ffract@64, resulting in fewer instructions overall. Fixes: `5218cff34b` nir/algebraic: avoid double lowering of some fp64 operations Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29281>	2024-09-30 23:51:02 +00:00
Kenneth Graunke	0b34a7aff0	nir: Don't generate single iteration loops to zero-initialize memory If the stride we're adding to our loop counter is larger than the total amount of shared local memory we're trying to initialize, we know the loop will run at most one time. So we can skip emitting a loop. Loop unrolling appears to be unable to detect this currently. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31312>	2024-09-30 05:27:17 +00:00
Georg Lehmann	bb7e8d51b6	nir: delete nir_opt_reuse_constants Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031>	2024-09-27 05:19:16 +00:00
Georg Lehmann	60776f87c3	nir/opt_remove_phis: rematerialize constants Foz-DB Navi31: Totals from 749 (0.94% of 79395) affected shaders: Instrs: 1224359 -> 1223722 (-0.05%); split: -0.07%, +0.02% CodeSize: 6468392 -> 6466296 (-0.03%); split: -0.06%, +0.03% Latency: 9764410 -> 9766457 (+0.02%); split: -0.01%, +0.03% InvThroughput: 1017401 -> 1017380 (-0.00%); split: -0.03%, +0.03% VClause: 19902 -> 19873 (-0.15%); split: -0.16%, +0.02% SClause: 38441 -> 38424 (-0.04%); split: -0.05%, +0.01% Copies: 86880 -> 86304 (-0.66%); split: -0.73%, +0.06% Branches: 34206 -> 34159 (-0.14%); split: -0.14%, +0.01% PreSGPRs: 45557 -> 45527 (-0.07%); split: -0.08%, +0.01% PreVGPRs: 32406 -> 32408 (+0.01%) VALU: 671633 -> 671533 (-0.01%); split: -0.02%, +0.01% SALU: 155284 -> 154675 (-0.39%); split: -0.40%, +0.00% VMEM: 27303 -> 27271 (-0.12%) SMEM: 67490 -> 67455 (-0.05%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031>	2024-09-27 05:19:16 +00:00
Georg Lehmann	40fc85c15b	nir: make nir_instr_clone usable with load_const and undef Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031>	2024-09-27 05:19:16 +00:00
Georg Lehmann	a9f8089240	nir: replace nir_opt_remove_phis_block with a single source version This is what callers actually want, and it simplifies nir_opt_remove_phis because we can assume dominance meta data is valid. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031>	2024-09-27 05:19:16 +00:00
Georg Lehmann	41e82b8b8e	nir: sink is_subgroup_invocation_lt_amd Having it closer to the branches means we can eliminate an exec copy. Foz-DB Navi31: Totals from 11615 (14.63% of 79395) affected shaders: Instrs: 6804372 -> 6804903 (+0.01%); split: -0.04%, +0.05% CodeSize: 33684672 -> 33680584 (-0.01%); split: -0.07%, +0.05% VGPRs: 578616 -> 578604 (-0.00%) SpillSGPRs: 1506 -> 1304 (-13.41%) Latency: 29817034 -> 29821320 (+0.01%); split: -0.03%, +0.05% InvThroughput: 3581587 -> 3581217 (-0.01%); split: -0.02%, +0.01% VClause: 124826 -> 124782 (-0.04%); split: -0.04%, +0.00% SClause: 187916 -> 187645 (-0.14%); split: -0.27%, +0.13% Copies: 520969 -> 510027 (-2.10%); split: -2.20%, +0.10% PreSGPRs: 442584 -> 421344 (-4.80%) VALU: 3810755 -> 3810267 (-0.01%); split: -0.01%, +0.00% SALU: 763402 -> 752650 (-1.41%); split: -1.48%, +0.07% Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>	2024-09-26 14:29:14 +00:00
Georg Lehmann	bcfc5c09fa	amd: add offset to is_subgroup_invocation_lt_amd Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>	2024-09-26 14:29:13 +00:00
Marek Olšák	09e64e3682	nir/opt_shrink_vectors: shrink memory loads, not just IO The problem with radeonsi+ACO is that UBO loads from vec4 uniforms using only 1 component always load all 4 components. This fixes that. We are only interested in shrinking UBO and SSBO loads, but I added more intrinsics because why not. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29384>	2024-09-26 03:01:38 +00:00
Timothy Arceri	f6e7520b13	glsl: remove now unused linker code This has all been replaced by a nir based linker implementation. Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:44 +00:00
Timothy Arceri	fe9b93fc1c	nir: handle wildcard array deref Here we add handling of wildcard array derefs when attempting to mark an io as partially used rather than hitting an assert. Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:44 +00:00
Timothy Arceri	6bb6b0e5ad	nir: add nir_intrinsic_deref_implicit_array_length intrinsic This will be used to handle .length() calls on unsized arrays Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:44 +00:00
Timothy Arceri	60937b5286	nir: add implicit_conversion_prohibited field to nir_parameter Will be used in link time validation in following patches. Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:44 +00:00
Timothy Arceri	5645495156	nir: store variable mode in nir_parameter This will be used by the nir glsl linker in following patches. Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:44 +00:00
Timothy Arceri	89a2411c54	nir: serialize nir_parameter type Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:44 +00:00
Timothy Arceri	6ff3e87e5f	nir: add function in/outs to variable modes Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:44 +00:00
Timothy Arceri	1cb115abd2	nir: add nir_function_impl_clone_remap_globals() This will be used by the glsl nir linker when we are combining different shaders from the same shader stage that might have multiple declarations of global variables across the different shaders. Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:43 +00:00
Timothy Arceri	7a1061e0dd	nir: add max_ifc_array_access field to vars This will be used in following patches by the nir based glsl linker code. Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:43 +00:00
Timothy Arceri	7c5b21c032	glsl: add support for converting global instructions to NIR NIR doesn't really support global instructions such as global val initialisation. So here we add functionality to glsl_to_nir() to put these instructions into a temporary function that will be later inlined into main. We give the function a name starting with gl_mesa_tmp_, as functions starting with gl_ are reserved and will not have any clashes with user functions. We finish the name with the blake3 of the shader source to avoid conflicts with multiple shaders attached to a single stage. Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:43 +00:00
Georg Lehmann	e0bcab953d	nir: add amd shared append/consume Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31075>	2024-09-19 16:21:47 +00:00
Boris Brezillon	eeb3512498	nir/lower_ssbo: Extend the load_ssbo_address intrinsic to pass an offset On Mali(Valhall), the bounds checking can be done in hardware, but for this to work properly, we need to pass the offset to the nir_load_ssbo_address() intrinsic. Add an offset source to the intrinsic, and adjust the lowering pass to conditionally lower the offset addition. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31164>	2024-09-18 13:45:57 +00:00
Boris Brezillon	adadb097a3	nir/lower_ssbo: Add an option to conditionally lower loads On Mali(Valhall), we have a way to load SSBO data without going through an SSBO index -> global address translation, so let's provide a way to tell nir_lower_ssbo() when it shouldn't lower loads. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31164>	2024-09-18 13:45:57 +00:00
Georg Lehmann	a3d6a770c0	nir/instr_set: fix fp_fast_math We can't just ignore the flags of the match, we need the union. Fixes: `666647acae` ("nir: track some float controls bits per instruction") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31195>	2024-09-17 20:00:03 +00:00
Ian Romanick	6a09d33549	nir: Add a pass to generate BFI instructions from logical operations Inspired by a commit message in !30934, I set about optimizing the code generated for nir_copysign. It would be possible to just implement an opt_algebraic pattern for the specific values used by nir_copysign, but this casts a slightly larger net. As noted in a comment in the code, there may be variations of the pattern that this pass misses. The opt_algebraic pattern would miss them too. v2: Use nir_def_replace. Suggested by Alyssa. Allow more "root" instruction types. Suggested by Georg. v3: Treat extract_u16(x, 0) as (x & 0x0000ffff), and treat extract_u8(x, 0) as (x & 0x000000ff). v4: Use nir_scalar. Suggested by Georg. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31006>	2024-09-13 00:21:00 +00:00
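
The merging example in commit 8ce43b7765 (entry::num_components) can be sketched in a few lines of C. This is only an illustration of the bookkeeping the message describes, not the code in nir_opt_load_store_vectorize: the struct and function names below are hypothetical, and the model assumes that unused components only ever sit at the end of a vector and that representable sizes are 1 to 5, 8 and 16 components.

```c
#include <stdbool.h>

/* Hypothetical model of one load: the vector size that is emitted and
 * the number of leading components that are really read. */
struct load_entry {
    unsigned alloc_components;
    unsigned used_components;
};

/* Assumption: representable vector sizes are 1..5, 8 and 16. */
static bool is_valid_components(unsigned n)
{
    return (n >= 1 && n <= 5) || n == 8 || n == 16;
}

/* Round a component count up to the next representable size,
 * e.g. 7 -> 8 ("vec7as8"). Returns 0 if nothing fits. */
static unsigned round_up_components(unsigned n)
{
    if (n == 0 || n > 16)
        return 0;
    while (!is_valid_components(n))
        n++;
    return n;
}

/* Merge `second` directly after `first`. Simplification: `first` must
 * have no unused trailing components, otherwise a hole would end up in
 * the middle of the merged vector. */
static bool merge_loads(struct load_entry first, struct load_entry second,
                        struct load_entry *out)
{
    if (first.used_components != first.alloc_components)
        return false;

    /* Count only the components that are really read. */
    unsigned used = first.used_components + second.used_components;
    unsigned alloc = round_up_components(used);
    if (alloc == 0)
        return false;

    out->used_components = used;
    out->alloc_components = alloc;
    return true;
}
```

With this model, vec4 + vec3 gives an 8-component entry that reads 7 components, and merging a vec1 in front of it gives a fully used vec8. Adding the allocated sizes instead (1 + 8) would give 9 components, which is not a representable size; that is the failure case the commit message describes.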


5815 commits