fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 11:38:06 +02:00

Author	SHA1	Message	Date
Iago Toral Quiroga	5cec893384	broadcom/compiler: update comment on load_uniform fast-path The comment for 16-bit applies to 8-bit uniforms as well. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	296fde31aa	broadcom/compiler: allow vectorization to larger scalar type Allow vectorizing operations of a smaller bit-size into scalar operations of a larger bit-size. This allows us to turn a 2x8-bit access into an equivalent scalar 16-bit load/store. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	a248ff0b5b	broadcom/compiler: support 8-bit loads via ldunifa This generalizes the support we added for 16-bit to also handle 8-bit loads via ldunifa. The story is the same: we align the address to 32-bit downwards and we skip any bytes that are not of interest. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	4630f5f016	broadcom/compiler: handle to/from 8-bit integer conversions Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	1b530d948d	broadcom/compiler: support 8-bit general store access Just like with 16-bit, this mode only supports scalar access, but we are already lowering all non 32-bit accesses to scalar. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	f7ff462421	broadcom/compiler: support 16-bit uniforms Since ldunif is a 32-bit instruction we need to demote these to UBO loads, like we do for indirect indexing, with the exception of scalar 16-bit uniforms with an offset that is 32-bit aligned. For the exception where we can use ldunif we read a 32-bit slot from memory where the uniform data is in the lower 16-bit and we will read garbage in the upper 16-bit which we won't use anyway. It should be noted that by using ldunif, we are consuming 32-bit from the uniform stream, but this is fine because if there is valid uniform data in the upper 16-bit (i.e. we had an ivec2 uniform aligned to a 32-bit address), since we scalarize 16-bit loads, we would see another load uniform with an unaligned offset for the second component, which we will demote to UBO. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	49a8fa152c	broadcom/compiler: support f32 to f16 RTZ and RTE rounding modes These are required by VK_KHR_16bit_storage. Our hardware, however, doesn't provide any mechanism to decide on the rounding mode of the conversion and it seems to be using RTE, so we implement RTZ in software. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	1f639d5310	broadcom/compiler: implement 32-bit/16-bit conversion opcodes Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	bdb6201ea1	broadcom/compiler: use ldunifa with unaligned constant offset If we know we have a load with a constant offset, then even if it is not aligned to 32-bit we can still produce an aligned offset and then skip over the bytes we don't need. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	2eb6910d96	broadcom/compiler: support ldunifa with some 16-bit loads Even though ldunifa is strictly 32-bit we may be able to use it to load 16-bit values that sit at 32-bit aligned addresses. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	2a420bdf92	broadcom/compiler: lower packing after vectorization The vectorization pass can inject 32_2x16 (un)packing opcodes upon successful vectorization of 16-bit operations into 32-bit counterparts, so make sure we lower these to something our backend can handle. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	4b24373137	broadcom/compiler: implement TMU general 16-bit load/store This allows us to implement 16-bit access on uniform and storage buffers. Notice that V3D hardware can only do general access on scalar 16-bit elements, which we currently enforce by running a lowering pass during shader compile. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	2443e45e76	broadcom/compiler: better document vectorization implications Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	765d9feb46	broadcom/compiler: add lowering pass to scalarize non 32-bit general load/store V3D hardware doesn't support vector access for general TMU load/store operations like the ones we use for UBO and SSBO, so we need to split these to scalar operations. It should be noted that we also have a vectorization pass (which runs later, during optimization), that may reconstruct some of these into 32-bit operations when possible (i.e. when the resulting operation is 32-bit aligned). Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Dave Airlie	ccbf700d6c	nir: remove gl.h include from nir headers. This saves a lot of pointless gl.h includes across the board, it moves the one place that needs GLenum into a separate file only used in those passes that require it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>	2022-01-19 21:54:58 +00:00
Thomas H.P. Andersen	c32c9014f5	broadcom/compiler: fix compile warning -Wabsolute-value fixes a compile warning with clang Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14302>	2022-01-03 20:20:37 +00:00
Alejandro Piñeiro	1c4f76672d	broadcom/compiler: avoid unneeded sint/unorm clamping when lowering stores They are being used on integer to integer stores. From the Vulkan spec, final paragraph of 16.4.4 "Texel Output Format Conversion": "Each component is converted based on its type and size (as defined in the Format Definition section for each VkFormat). ... Integer outputs are converted such that their value is preserved. The converted value of any integer that cannot be represented in the target format is undefined." I didn't find an equivalent quote for OpenGL as all conversion entries are focused on float to integer, fixed-point to integer, etc, and not on integer to integer. Didn't find any test failure with this change. We didn't get any shader-db stats change with shaderdb (even overriding to OpenGL 4.4 to get more shaders built), so as a reference Vulkan shader-db stats with the pattern dEQP-VK.image..with_format..* total instructions in shared programs: 37534 -> 36522 (-2.70%) instructions in affected programs: 12080 -> 11068 (-8.38%) helped: 241 HURT: 0 Instructions are helped. total uniforms in shared programs: 9100 -> 8550 (-6.04%) uniforms in affected programs: 3004 -> 2454 (-18.31%) helped: 229 HURT: 0 total max-temps in shared programs: 6110 -> 6014 (-1.57%) max-temps in affected programs: 402 -> 306 (-23.88%) helped: 43 HURT: 0 Max-temps are helped. total nops in shared programs: 1523 -> 1526 (0.20%) nops in affected programs: 21 -> 24 (14.29%) helped: 3 HURT: 6 Inconclusive result (value mean confidence interval includes 0). Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14194>	2021-12-15 11:53:20 +00:00
Iago Toral Quiroga	2630c8f546	broadcom/compiler: improve thrsw merge Instead of stopping the merge process when we find an instruction with an incompatible signal (such as a small immediate), keep going and see if we can merge the thrsw in a previous instruction that is compatible. total instructions in shared programs: 13409835 -> 13356648 (-0.40%) instructions in affected programs: 3556860 -> 3503673 (-1.50%) helped: 17457 HURT: 18 Instructions are helped. total max-temps in shared programs: 2353971 -> 2352956 (-0.04%) max-temps in affected programs: 13960 -> 12945 (-7.27%) helped: 703 HURT: 0 Max-temps are helped. total spills in shared programs: 12301 -> 12301 (0.00%) total sfu-stalls in shared programs: 32596 -> 32499 (-0.30%) sfu-stalls in affected programs: 225 -> 128 (-43.11%) helped: 79 HURT: 3 Sfu-stalls are helped. total nops in shared programs: 347204 -> 325234 (-6.33%) nops in affected programs: 99834 -> 77864 (-22.01%) helped: 11515 HURT: 158 Nops are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14172>	2021-12-14 09:50:17 +00:00
Juan A. Suarez Romero	fd47c939f4	st/pbo: add the image format in the download FS In the V3D driver there is a NIR lowering step for the `image_store` intrinsic, where the image store format is required for doing the proper lowering. Thus, let's define it for the download FS instead of keeping it as NONE. v2 (Ilia): - Use format only for drivers not supporting format-less writing. v4 (Ilia): - Use PIPE_CAP_IMAGE_STORE_FORMATTED to reduce combinations. v5 (Ilia): - Use indirect array for download FS in drivers without formatless-store support. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13409>	2021-12-03 15:32:36 +00:00
Iago Toral Quiroga	cc7db1fc53	broadcom/compiler: improve documentation for Z writes Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14037>	2021-12-03 10:39:08 +00:00
Iago Toral Quiroga	a65c605365	broadcom/compiler: track passthrough Z writes In some cases we need to make the shaders write the Z value produced from rasterization (FEP). Track these instances because they are relevant to early EZ setup. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14037>	2021-12-03 10:39:08 +00:00
Iago Toral Quiroga	6d4a645c90	broadcom/compiler: emit passthrough Z write if shader reads Z Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14037>	2021-12-03 10:39:08 +00:00
Iago Toral Quiroga	996f147fef	broadcom/compiler: relax restriction on VPM inst in last thread end slot According to the documentation, only vpmwt is disallowed in the last delay slot of the thread end. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13975>	2021-11-29 14:06:43 +00:00
Iago Toral Quiroga	6923dd687c	broadcom/compiler: allow color TLB writes in last instruction Only Z writes are disallowed. total instructions in shared programs: 11578449 -> 11577369 (<.01%) instructions in affected programs: 38132 -> 37052 (-2.83%) helped: 1080 HURT: 0 Instructions are helped. total max-temps in shared programs: 2334416 -> 2334395 (<.01%) max-temps in affected programs: 218 -> 197 (-9.63%) helped: 21 HURT: 0 Max-temps are helped. total inst-and-stalls in shared programs: 11607890 -> 11606810 (<.01%) inst-and-stalls in affected programs: 38265 -> 37185 (-2.82%) helped: 1080 HURT: 0 Inst-and-stalls are helped. total nops in shared programs: 338316 -> 337236 (-0.32%) nops in affected programs: 2625 -> 1545 (-41.14%) helped: 1080 HURT: 0 Nops are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13964>	2021-11-29 06:44:07 +00:00
Alejandro Piñeiro	a9b4aef0f2	broadcom/compiler: make shaderdb debug output compatible with shaderdb's report tool Even though the option is called shaderdb, it is not really used by shaderdb (for V3D shaderdb uses the debug option "precompile"). And in fact, right now the output format is not compatible with shaderdb. This commit tries to fix that, and as we are here, also tries to make the option more useful for the Vulkan case, as that debug option also works with v3dv. We can't really fully imitate shaderdb use with OpenGL (run with a set of glsl shader tests), but we can at least assign a unique name (the pipeline sha1 in text format) so we can compare executions of the same vulkan application. For that remember to disable the on-disk cache. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13938>	2021-11-24 13:02:08 +00:00
Iago Toral Quiroga	79dee14cc2	broadcom/compiler: don't move ldvary earlier if current instruction has ldunif If we did, we would have the instruction coming right after ldvary write to the same implicit destination as ldvary at the same time. We prevent this when merging instructions, but we should make sure we prevent this when we move ldvary around for pipelining too. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13921>	2021-11-23 10:52:24 +00:00
Iago Toral Quiroga	7fec4f4135	broadcom/compiler: fix scoreboard locking checks According to the spec the hardware locks the scoreboard on the first or last thread switch (selected via shader state) and any TLB accesses executed before this are not synchronized by hardware. This change updates the logic to ensure we respect this requirement and that we don't assume that the lock is acquired automatically on the first TLB access, which is not valid at least since V3D 4.1+. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13910>	2021-11-22 12:53:43 +00:00
Iago Toral Quiroga	bd7584c16b	broadcom/compiler: don't allow RF writes from signals after thrend Writes to physical registers are not allowed after thread end. We were checking this for ALU writes, but we need to check it for signal writes too. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13910>	2021-11-22 12:53:43 +00:00
Juan A. Suarez Romero	457dbb81f5	broadcom/compiler: apply constant folding on early GS lowering This solves a case where a NIR geometry shader was storing the output in a non-constant: vec4 32 ssa_1 = load_const (0xc0800000 /* -4.000000 */, 0xc1100000 /* -9.000000 */, 0x40400000 /* 3.000000 */, 0x40e00000 /* 7.000000 */) vec1 32 ssa_7 = load_const (0x00000000 /* 0.000000 */) vec1 32 ssa_8 = load_const (0x00000001 /* 0.000000 */) vec1 32 ssa_9 = iadd ssa_7, ssa_8 vec1 32 ssa_19 = mov ssa_1.x intrinsic store_output (ssa_19, ssa_9) (1, 1, 0, 160, 288) /* base=1 */ /* wrmask=x */ /* component=0 */ /* src_type=float32 */ /* location=32 slots=2 gs_streams(x=0 y=0 z=0 w=0) */ When lowering the VPM output we check if the destination (ssa_9 in this case) is a constant to add to the VPM offset. We run a constant folding optimization in an earlier VS lowering, and we should do the same for GS. This fixes multiple dEQP-VK.pipeline.interface_matching.* failures. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13884>	2021-11-22 09:32:50 +00:00
Juan A. Suarez Romero	7b21635057	broadcom/compiler: handle array of structs in GS/FS inputs While fragment and geometry shaders were handling structs as inputs, they weren't doing it for arrays of structures. This fixes multiple dEQP-VK.pipeline.interface_matching.* failures and assertions. v2: - Fix style (Iago). Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13884>	2021-11-22 09:32:50 +00:00
Iago Toral Quiroga	5e536c97a9	broadcom/compiler: fix early fragment tests setup When early fragment tests are mandated by the shader, we must use the Z value produced by the FEP even if there are elements that would typically require late fragment tests (such as discards, sample to coverage, etc). This change means we also need to be a bit more careful when we promote shaders to use early fragment tests so we don't promote anything with discards for example. Fixes: dEQP-VK.fragment_operations.early_fragment.discard_early_fragment_tests_depth dEQP-VK.fragment_operations.early_fragment.discard_early_fragment_tests_stencil Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13837>	2021-11-18 07:39:32 +00:00
Connor Abbott	508f917d8c	util/dag: Make edge data a uintptr_t Nobody was actually using it as a pointer, and I'm going to introduce a shared function which relies on it not being a pointer so let's fix this once and for all. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>	2021-11-17 13:41:47 +00:00
Iago Toral Quiroga	0cb58f80d2	v3d: use V3D_MAX_DRAW_BUFFERS instead of hardcoded constant Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13775>	2021-11-12 11:04:07 +00:00
Iago Toral Quiroga	3a95e25e84	v3dv,v3d: don't store swizzle pointer in shader/pipeline keys We had been storing pointers to a driver owned swizzle table rather than storing the actual swizzle value in various shader and pipeline keys on both GL and Vulkan drivers. This doesn't look very robust, particularly since we also compute sha1 hashes from these values and we may store these hashes to disk (for the disk cache). Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13738>	2021-11-10 11:24:26 +00:00
Iago Toral Quiroga	aa5a0e1dad	broadcom/compiler: copy packing when converting add to mul Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13675>	2021-11-04 13:57:39 +00:00
Iago Toral Quiroga	a794bdf953	broadcom/compiler: check that sig packing is valid when pipelining ldvary Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13641>	2021-11-03 10:49:06 +00:00
Iago Toral Quiroga	6b9bd3f038	broadcom/compiler: make opt passes set current block Typically, optimization passes go through all the blocks in a shader and make adjustments on the fly, so we always want them to update the current block or the current block pointer will become outdated. Also, we don't need to keep track of the previous current block pointer to restore it, since optimization passes run after we have completed conversion to VIR, and therefore, anything that comes after that should always set the current block before emitting code. Fixes debug assert crashes when running shader-db: vir.c:1888: try_opt_ldunif: Assertion `found || &c->cur_block->instructions == c->cursor.link' failed Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13625>	2021-11-02 11:17:01 +00:00
Iago Toral Quiroga	3fbd6662b7	broadcom/compiler: rework simultaneous peripheral access checks This was not quite correct in that our checks for the allowed cases were not checking that there were no other peripheral access other than the ones allowed. For example, we allowed wrtmuc signal and TMU write other than TMUC, and we also allowed TMU read and VPM read/write. But we cannot allow wrtmuc with TMU write other than TMUC and at the same time a VPM write for example, so we can't just check if we have a combination of allowed peripherals, we still need to check that those are the only ones in use by the combined instructions. Another example is that even if we allow a TMU write (other than TMUC) with a wrtmuc signal, the resulting instruction must still have just one TMU write other than TMUC, but we were allowing the merge if one instruction signaled wrtmuc and the other wrote to tmu other than tmuc without testing if the combined result would have 2 tmu writes. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13527>	2021-10-27 06:03:12 +00:00
Iago Toral Quiroga	1561d0126a	broadcom/compiler: fix assert that current instruction must be in current block This was not considering the possibility that the driver has called nir_before_block() or nir_after_block() to update the cursor, in which case the cursor link points to the instruction list header and not to an actual instruction. Fixes incorrect debug-assert crash in: dEQP-VK.graphicsfuzz.cov-increment-vector-component-with-matrix-copy Fixes: `265515fa62` ("broadcom/compiler: check instruction belongs to current block") Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13467>	2021-10-22 05:39:05 +00:00
Iago Toral Quiroga	75bd37dc6a	broadcom/compiler: disallow tsy barrier in thrsw delay slots A TSY barrier becomes effective at the point of the next thread switch, so if we have one coming after a previous thread switch we need to be careful not to emit it in its delay slots, or we would be effectively moving the barrier earlier than intended. Fixes simulator assert crash in: dEQP-VK.graphicsfuzz.two-for-loops-with-barrier-function Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13468>	2021-10-21 12:40:00 +02:00
Alejandro Piñeiro	d50be41f8f	broadcom/compiler: remove unused macro and function definition Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13444>	2021-10-20 10:08:27 +00:00
Alejandro Piñeiro	9e41c42ed4	broadcom/compiler: remove qpu_acc helper It is really small, and used just twice, so we just call qpu_magic. We also update how it is used: * QFILE_NULL is an undef so we can just load anything. Previously we were using accumulator 0, but there isn't any real reason to use an accumulator for this. Using reg 0. * QFILE_LOAD_IMM: it seems that we don't use it at all right now, so let's add an assert Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13008>	2021-09-24 08:46:06 +00:00
Alejandro Piñeiro	193898c8b0	broadcom/compiler: remove commented out vir_LOAD_IMM methods They have been commented out for several years now. Let's remove them to reduce the noise. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13008>	2021-09-24 08:46:06 +00:00
Juan A. Suarez Romero	d220d8cb51	broadcom/compiler: add V3D_DEBUG_NO_LOOP_UNROLL debug option Disables loop unrolling. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12803>	2021-09-13 08:51:54 +00:00
Ella-0	53ae5c3aae	v3d/compiler: Handle point_coord_upper_left Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12524>	2021-09-12 21:01:11 +00:00
Juan A. Suarez Romero	c98ddc778a	broadcom/compiler: force a last thrsw for spilling As we don't know if we are going to have spilling or not, always emit a last thrsw at the end of the shader. If later we don't have spilling and we don't need that last thrsw, we remove it and switch back to the previous one. This way we ensure all the spilling always happens before the last thrsw. v2 (Juan): - Rework the code to force a last thrsw and remove later if no spilling v3: - Merge functionality inside vir_emit_last_thrsw (Iago) - Add vir_restore_last_thrsw (Juan) v4 (Iago): - Fix/add new comments - Rename variables/parameters v5 (Iago): - Fix comments - Add assertion Cc: mesa-stable Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4760 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12322>	2021-09-10 09:18:05 +00:00
Juan A. Suarez Romero	53c8b4c093	broadcom: make vir_emit_last_thrsw() private This function is only used in v3d_nir_to_vir(), so make it private. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12322>	2021-09-10 09:18:05 +00:00
Juan A. Suarez Romero	265515fa62	broadcom/compiler: check instruction belongs to current block Check in the ldunif optimization if the current instruction belongs to the current block. This avoids searching for the instruction again when the current block is not correctly set, as happened in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12339 and in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12221. v2: - Remove extra blank line (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12358>	2021-09-06 10:38:06 +00:00
Iago Toral Quiroga	3ef2ca9cbf	broadcom/compiler: don't enable early fragment tests if shader writes Z We had an optimization to auto-enable early fragment tests when a shader didn't have side effects, but of course, we cannot do this if the shader writes Z, as in that case the fragment tests need to use the value written from the shader. Also, if the shader enables early fragment tests, then any shader Z writes should be ignored. Fixes: dEQP-VK.spirv_assembly.instruction.graphics.early_fragment.* Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12736>	2021-09-06 12:01:43 +02:00
Vinson Lee	0a4c4f4459	broadcom/compiler: Fix qpu.flags.muf typo. Fix defect reported by Coverity Scan. Same on both sides (CONSTANT_EXPRESSION_RESULT) pointless_expression: The expression inst->qpu.flags.auf != V3D_QPU_UF_NONE || inst->qpu.flags.auf != V3D_QPU_UF_NONE does not accomplish anything because it evaluates to either of its identical operands, inst->qpu.flags.auf != V3D_QPU_UF_NONE. Fixes: `3f2c54a27f` ("broadcom/compiler: rewrite partial update liveness tracking") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12385>	2021-08-24 08:30:59 +00:00
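
Several of the ldunifa commits listed above (a248ff0b5b, bdb6201ea1, 2eb6910d96) describe the same idea: align the address down to 32-bit, load the whole word, and skip the bytes that are not of interest. The arithmetic can be sketched in plain C as below. This is only an illustration under stated assumptions: the helper names are hypothetical and do not come from the Mesa source, a little-endian byte layout is assumed, the value is assumed not to straddle two 32-bit words, and the real compiler emits QPU instructions rather than running host-side code like this.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not Mesa API. */

/* Round a byte offset down to the start of its enclosing 32-bit word. */
static inline uint32_t
align_down_32(uint32_t byte_offset)
{
        return byte_offset & ~3u;
}

/* Number of bytes in the aligned word that precede the wanted value. */
static inline uint32_t
skip_bytes(uint32_t byte_offset)
{
        return byte_offset & 3u;
}

/* Pick an 8-bit or 16-bit value out of a 32-bit word that was loaded
 * from align_down_32(byte_offset). Assumes little-endian layout and
 * that the value lies entirely inside this word.
 */
static inline uint32_t
extract_small(uint32_t word, uint32_t byte_offset, unsigned bit_size)
{
        assert(bit_size == 8 || bit_size == 16);
        uint32_t shift = skip_bytes(byte_offset) * 8;
        uint32_t mask = (1u << bit_size) - 1u;
        return (word >> shift) & mask;
}
```

For a 16-bit value at constant byte offset 6, the sketch loads the word at offset 4 and skips 2 bytes, so the value comes from the upper half of that word.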


597 commits