fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-15 18:38:05 +02:00

Author	SHA1	Message	Date
Alejandro Piñeiro	cbd299b051	v3dv/device: do not compute per-pipeline limits multiplying per-stage There were two problems here: * We were multiplying by 6, when for graphics pipelines, we only support 2. * Right now we are tracking descriptors through the descriptor maps, and we have one per pipeline. So in practice there is no difference between per-stage and per-pipeline limits. So far this was not a problem, we could revisit in the future. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10207>	2021-04-14 11:00:36 +00:00
Juan A. Suarez Romero	9e5762c387	ci: Update VK-GL-CTS to 1.2.6.0 v2: - Bump up MESA_ROOTFS_TAG instead of arm_build (Michel) Acked-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10136>	2021-04-14 08:06:55 +00:00
Adam Jackson	80b67a3b44	glx: Lift sending the MakeCurrent request to top-level code Somewhat terrifyingly, we never sent this for direct contexts, which means the server never knew the context/drawable bindings. To handle this sanely, pull the request code up out of the indirect backend, and rewrite the context switch path to call it as appropriate. This attempts to preserve the existing behavior of not calling unbind() on the context if its refcount would not drop to zero. Of course, you can't just do this indiscriminately, because this is GLX and extant X servers have bugs and everything is terrible. To wit: - For 1.20.x prior to 1.20.6, you can bind a direct context once, but the second time you try to modify the context's binding you will get GLXBadContextTag. This includes unbinding the context. And "deleting" the context will leak memory, because it will still appear to be current. - For 1.19 and earlier, glXMakeCurrent(dpy, None, ctx) should be legal for GL 3.0+ contexts, but the server will throw BadMatch. To guard against this, we only send the request for indirect contexts unless the server is known good, and only mention one context at a time in such a request; if switching between contexts, we first unbind the old, and then bind the new. Note that the second VendorRelease() version is to catch XFree86 4.x and Xorg [67].x, which almost certainly have the above bugs. Other servers might report different version numbers here, but we can't do direct rendering against them, so this should be safe. Fixes: mesa/mesa#4418 Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9992>	2021-04-13 22:58:58 +00:00
Juan A. Suarez Romero	cbb1e2dcac	v3dv: fix assertion Ensure subpass_idx has a valid value; we use "-1" as invalid one. Fixes CID#1468096 "Macro compares unsigned to 0 (NO_EFFECT)" Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10203>	2021-04-13 16:24:37 +00:00
Juan A. Suarez Romero	64943f2063	broadcom/compiler: use VPM offsets in GS load_per_vertex input Vertex Shader has a store_out lowering pass that converts gallium driver locations into offsets inside the VPM. One of the consequences is that these offsets are consecutive; that is, if the VS stores VARYING_SLOT_VAR0.xyz and VARYING_SLOT_VAR1.xyzw, there isn't a hole in the VPM offsets for the un-stored VARYING_SLOT_VAR0.w. Thus we need to change how the VPM offset is computed in the Geometry Shader when loading the inputs. This bug is exposed by !9050. v2 (Iago): - Include explanatory comment. - Use assert. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10129>	2021-04-13 16:08:00 +00:00
Rhys Perry	a2619b97f5	nir/lower_idiv: add options to use fp32 for 8-bit division lowering Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10081>	2021-04-12 16:19:46 +00:00
Juan A. Suarez Romero	e7f4f1b582	broadcom/compiler: use signed pointers for packed condition `qpu.raddr_b` is an unsigned int, so it is always positive, even after casting to signed int. Fixes CID#1438117 "Operands don't affect result (CONSTANT_EXPRESSION_RESULT)": "result_independent_of_operands: (int)inst->qpu.raddr_b >= -16 is always true regardless of the values of its operands. This occurs as the logical first operand of "&&". v2: - Use signed pointers (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10131>	2021-04-12 15:22:05 +00:00
Iago Toral Quiroga	0a3bfacabb	broadcom/compiler: rename unifa tracking fields The term 'last' may be misleading because the offset represents the current unifa offset, which is the offset used by the last load plus 4 bytes, so rename these to use the term 'current' instead. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10100>	2021-04-09 10:31:40 +00:00
Iago Toral Quiroga	8998666de7	broadcom/compiler: sort constant UBO loads by index and offset This implements a NIR pass that groups together constant UBO loads for the same UBO index in order of increasing offset when the distance between them is small enough that it enables the "skip unifa write" optimization. This may increase register pressure because it can move UBO loads earlier, so we also add a compiler strategy fallback to disable the optimization if we need to drop thread count to compile the shader with this optimization enabled. total instructions in shared programs: 13557555 -> 13550300 (-0.05%) instructions in affected programs: 814684 -> 807429 (-0.89%) helped: 4485 HURT: 2377 Instructions are helped. total uniforms in shared programs: 3777243 -> 3760990 (-0.43%) uniforms in affected programs: 112554 -> 96301 (-14.44%) helped: 7226 HURT: 36 Uniforms are helped. total max-temps in shared programs: 2318133 -> 2333761 (0.67%) max-temps in affected programs: 63230 -> 78858 (24.72%) helped: 23 HURT: 3044 Max-temps are HURT. total sfu-stalls in shared programs: 32245 -> 32567 (1.00%) sfu-stalls in affected programs: 389 -> 711 (82.78%) helped: 139 HURT: 451 Inconclusive result. total inst-and-stalls in shared programs: 13589800 -> 13582867 (-0.05%) inst-and-stalls in affected programs: 817738 -> 810805 (-0.85%) helped: 4478 HURT: 2395 Inst-and-stalls are helped. total nops in shared programs: 354365 -> 342202 (-3.43%) nops in affected programs: 31000 -> 18837 (-39.24%) helped: 4405 HURT: 265 Nops are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10100>	2021-04-09 10:31:40 +00:00
Iago Toral Quiroga	fb2214a441	broadcom/compiler: allow compilation strategies to limit minimum thread count This adds a minimum thread count parameter to each compilation strategy with the intention to limit the minimum allowed thread count that can be used to register allocate with that strategy. For now all strategies allow the minimum thread count supported by the hardware, but we will be using this infrastructure to impose a more strict limit in an upcoming optimization. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10100>	2021-04-09 10:31:40 +00:00
Iago Toral Quiroga	4b244dc64f	broadcom/compiler: add a definition for the unifa skip distance We will be using this distance to set up another optimization in a follow-up patch. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10100>	2021-04-09 10:31:40 +00:00
Iago Toral Quiroga	a45ab46563	v3dv: fix index buffer binding This can be called outside a render pass so we should not expect to have a job available. Also, we should not be emitting state here, instead we should do in the pre-draw handler with all the other draw call state. Fixes cases of crashes in RenderDoc when selecting elements in the Event Browser. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10130>	2021-04-09 10:13:46 +00:00
Juan A. Suarez Romero	cc8d4cd1ae	broadcom/compiler: fix first_component assertion first_component is an uint, and thus if it takes value 0 we can't know if it is because writemask has its first bit to 1, or all bits to 0. As we want to ensure that at least one bit is set, apply the assertion in writemask. Fixes CID#1472829 "Macro compares unsigned to 0 (NO_EFFECT)". v2: - Restore "first_component <= last_component" assertion (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10103>	2021-04-09 07:55:41 +00:00
Bas Nieuwenhuizen	580f1ac473	nir: Extract shader_info->cs.shared_size out of union. It is valid for all stages, just 0 for most of them. In particular mesh/task shaders might be using it. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10094>	2021-04-08 14:39:28 +00:00
Chad Versace	0845cabc72	vulkan: Track dependencies of Python imports The meson.build was unaware of transitive dependencies introduced by Python imports. Android still needs fixing. But I did not update the Android files lest I break the build. Ideally, we would fix this by using a Python runner that generates a depfile, similar to how meson creates depfiles for C files by passing flags -MD -MQ -MF to gcc. But this patch gets the job done, without stalling on the ideal general solution, by manually tracking the Python imports in new 'foo_depend_files' variables. CC: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1466>	2021-04-08 14:15:54 +00:00
Juan A. Suarez Romero	eddbbd8b68	v3d: use uint type in _gen_unpack_uint Use an unsigned int type in the loop to avoid unintended sign extensions. Fixes CID#1414500 (Unintended sign extension [SIGN_EXTENSION]). Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10060>	2021-04-07 09:39:42 +00:00
Alejandro Piñeiro	1e0a69afa7	vulkan: track number of bindings instead of max binding for CreateDescriptorSetLayout This handles the case of bindingCount being zero better and more clearly. For Anvil and Turnip, this avoids allocating an unneeded binding when bindingCount is zero. Inspired by radv, which was already doing this. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4526 Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Hyunjun Ko <zzoon@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9905>	2021-04-05 20:17:53 +00:00
Juan A. Suarez Romero	c1bd3d3afc	ci/broadcom: update expected list Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10041>	2021-04-05 17:40:27 +02:00
Iago Toral Quiroga	9ca0e070e7	broadcom/compiler: optimize branch emission for uniform break/continue A break/continue in a loop is typically emitted like this: if (cond) { break/continue; } else { } If cond is uniform, we'll emit code for a uniform if statement and that will emit a branch right before the if to jump directly to the else (or the block after the else in this case, since the else is empty) in case cond evaluates to false. This means we end up emitting two consecutive branch instructions, one before the if and one for the THEN block right after: branch(!cond) -> jump to else (or after else) if cond is false nop nop nop branch -> unconditional jump to break/continue nop nop nop Instead, if we are in this scenario, we can do better by emitting the conditional jump directly and avoiding the "jump to else" case: branch(cond) -> jump to break/continue if cond is true nop nop nop We need to be careful when emitting the break/continue for the case where all lanes are disabled to avoid infinite loops: if we have a break we always want to take the jump, but we don't want to take it if it is a continue. total instructions in shared programs: 13563672 -> 13557348 (-0.05%) instructions in affected programs: 348034 -> 341710 (-1.82%) helped: 1158 HURT: 10 Instructions are helped. total uniforms in shared programs: 3779137 -> 3777535 (-0.04%) uniforms in affected programs: 90583 -> 88981 (-1.77%) helped: 1169 HURT: 0 Uniforms are helped. total max-temps in shared programs: 2317670 -> 2317575 (<.01%) max-temps in affected programs: 1943 -> 1848 (-4.89%) helped: 85 HURT: 4 Max-temps are helped. total sfu-stalls in shared programs: 32247 -> 32247 (0.00%) sfu-stalls in affected programs: 69 -> 69 (0.00%) helped: 7 HURT: 9 Inconclusive result (value mean confidence interval includes 0). total inst-and-stalls in shared programs: 13595919 -> 13589595 (-0.05%) inst-and-stalls in affected programs: 350674 -> 344350 (-1.80%) helped: 1154 HURT: 11 Inst-and-stalls are helped. total nops in shared programs: 358202 -> 354325 (-1.08%) nops in affected programs: 17367 -> 13490 (-22.32%) helped: 1168 HURT: 1 Nops are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9948>	2021-04-05 06:38:19 +00:00
Iago Toral Quiroga	14843ccc33	broadcom/compiler: implement restriction for branch after setmsf Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9948>	2021-04-05 06:38:19 +00:00
Alyssa Rosenzweig	06ebbde630	vulkan: Deduplicate mesa stage conversion Across every driver... v2: Add casts to appease -fpermissive used on CI. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9477>	2021-04-03 17:34:39 +00:00
Eric Anholt	dee89af505	ci: Uprev piglit to 6a4be9e9946d ("piglit: NOTE! Default branch is now main") Along with other new tests, brings in the perf improvement for gl-1.3-texture-env so we can stop skipping it. Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9806>	2021-04-02 18:42:04 +00:00
Michel Dänzer	6652c5018c	ci: Merge ARM testing docker images to a single arm_test one The merged image contains kernels & rootfs for both arm64 & armhf baremetal test jobs, and is smaller than either arm{64,hf}_test image before. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9955>	2021-04-01 16:35:26 +00:00
Michel Dänzer	4b20bd7425	ci: Build ARM baremetal rootfs in native container Doing so in an x86 container via qemu was slow, and started failing recently after updating to a newer qemu version. This also results in smaller arm_test docker images, since we need to install fewer Debian packages in them. As a bonus, this turns some piglit tests from fail to pass (Or maybe they'll turn out to be flakes? They've passed at least 3 times in a row). Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9955>	2021-04-01 16:35:26 +00:00
Vinson Lee	ddab996589	Remove leftover dead code. Fix defect reported by Coverity Scan. Logically dead code (DEADCODE) dead_error_line: Execution cannot reach this statement: return;. Fixes: `bdf93f4e3b` ("v3dv/cmd_buffer: return early for draw commands if there is nothing to draw") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9890>	2021-03-31 21:04:50 -07:00
Juan A. Suarez Romero	4323279984	broadcom/cle: do not leak spec Fixes CID#1474553 "Resource leak (RESOURCE_LEAK)". Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9944>	2021-03-31 10:13:33 +00:00
Juan A. Suarez Romero	5737cecd45	ci/v3dv: update flaky tests Acked-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9926>	2021-03-31 07:46:02 +00:00
Iago Toral Quiroga	8f7640293d	broadcom/compiler: try to fill up delay slots after unconditional branch If we have an unconditional branch then we can try to fill up its delay slots with the initial instructions of its successor block by copying them into the delay slots and adjusting the branch offset to skip the copied instructions. total nops in shared programs: 365640 -> 364471 (-0.32%) nops in affected programs: 15416 -> 14247 (-7.58%) helped: 462 HURT: 0 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9918>	2021-03-31 05:51:22 +00:00
Iago Toral Quiroga	e266e6c634	broadcom/compiler: try to fill up delay slots after a branch instruction For this we do something similar to what we do with thrsw where we try to move the branch instruction earlier so the previous instructions execute in the delay slots of the branch. Generally, we can do this with any instruction except: - If the instruction reads a uniform: since our branches do as well and uniforms come from an ordered FIFO stream. - If the instruction writes flags, since our branch instruction will probably read them. - If the instruction is in the delay slots of another thread switch, branch, or unifa write, which is disallowed. total instructions in shared programs: 13648140 -> 13613972 (-0.25%) instructions in affected programs: 2209552 -> 2175384 (-1.55%) helped: 6765 HURT: 0 Instructions are helped. total max-temps in shared programs: 2318687 -> 2318436 (-0.01%) max-temps in affected programs: 5046 -> 4795 (-4.97%) helped: 152 HURT: 0 Max-temps are helped. total inst-and-stalls in shared programs: 13680494 -> 13646326 (-0.25%) inst-and-stalls in affected programs: 2220394 -> 2186226 (-1.54%) helped: 6765 HURT: 0 Inst-and-stalls are helped. total nops in shared programs: 399818 -> 365640 (-8.55%) nops in affected programs: 127311 -> 93133 (-26.85%) helped: 6765 HURT: 0 Nops are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9918>	2021-03-31 05:51:22 +00:00
Iago Toral Quiroga	f33ca092da	broadcom/compiler: add a NOP count stat to shader-db Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9918>	2021-03-31 05:51:22 +00:00
Iago Toral Quiroga	062eee7d33	broadcom/compiler: dump instruction index when failing to pack instructions Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9918>	2021-03-31 05:51:22 +00:00
Juan A. Suarez Romero	1f90d51749	v3dv: fix unused value Do not assign to a variable that won't be used. Fixes CID#1468098 "Unused value (UNUSED_VALUE)". Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9910>	2021-03-30 14:15:43 +00:00
Juan A. Suarez Romero	cc1f070a27	broadcom/compiler: fix unused value Do not assign to a variable that won't be used. Fixes CID#1451708 and CID#1451710 "Unused value (UNUSED_VALUE)". Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9910>	2021-03-30 14:15:43 +00:00
Juan A. Suarez Romero	528d66eaa2	ci/v3d: run full GLES3 and GLES31 testsuite There is margin in the time budget to run the full GLES3 and GLES31 CTS instead of only 50%. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9899>	2021-03-30 08:03:16 +00:00
Juan A. Suarez Romero	dc859bb5bb	ci/broadcom: update piglit expected results Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9899>	2021-03-30 08:03:16 +00:00
Bas Nieuwenhuizen	83c92a48b7	vulkan: Fix descriptor set creation with zero bindings. MAX2(count * struct size, 1) results in 1 for count=0, not the size of a struct. Since this MAX only seems to exist so we can keep using NULL for error reporting, just refactor to return a VkResult. Fixes: `ad241b15a9` ("vk: consolidate dynamic descriptor binding sorting") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4522 Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9880>	2021-03-29 23:32:50 +00:00
Iago Toral Quiroga	c3c251d98f	broadcom/compiler: flag TMU reads with a read dependency on last TMU config We were using a write dependency to ensure ordering since LDTMUs sequences are ordered, but by using a write dependency with TMU config we were also preserving ordering with TMU config writes that are not a sequence terminator, which is not required and reduces scheduling flexibility. Instead, use a write dependency to ensure strict ordering of TMU reads, but only a read dependency with TMU config. With this change we also need to update CS barriers to also have a write dependency with TMU reads to ensure that we don't move TMU reads around CS barriers. total instructions in shared programs: 13602500 -> 13597851 (-0.03%) instructions in affected programs: 2681428 -> 2676779 (-0.17%) helped: 6567 HURT: 4960 Instructions are helped. total max-temps in shared programs: 2317927 -> 2317914 (<.01%) max-temps in affected programs: 13861 -> 13848 (-0.09%) helped: 355 HURT: 300 Inconclusive result (value mean confidence interval includes 0). total sfu-stalls in shared programs: 32074 -> 32247 (0.54%) sfu-stalls in affected programs: 848 -> 1021 (20.40%) helped: 160 HURT: 327 Inconclusive result (%-change mean confidence interval includes 0). total inst-and-stalls in shared programs: 13634574 -> 13630098 (-0.03%) inst-and-stalls in affected programs: 2703041 -> 2698565 (-0.17%) helped: 6558 HURT: 5020 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9856>	2021-03-29 06:21:22 +00:00
Iago Toral Quiroga	cbf9a7c3c1	broadcom/compiler: flag TMU read dependencies against last TMU config Instead of last TMU write. According to the documentation, the entries in the output FIFO are pushed with the final input write for the lookup, which is the one terminating the sequence. We flag these with last_tmu_config. This will allow us to move all TMU register writes for a lookup except the last one ahead of the LDTMUs for the previous lookup, possibly allowing us to pair up these writes with the wrtmuc instructions for the same lookup, turning code like this: nop ; nop ; wrtmuc (tex[0].p0 | 0x3) nop ; nop ; wrtmuc (tex[2].p1 | 0x1) nop ; nop ; ldunif (ubo[2]+0xe0) fadd r4, rf33, rf51 ; mov unifa, r5 ; ldunif (ubo[2]+0x110) fmax rf34, 0, r4 ; nop nop ; mov tmut, rf11 nop ; mov tmus, rf0 into: nop ; mov tmut, rf11 ; wrtmuc (tex[0].p0 | 0x3) nop ; nop ; wrtmuc (tex[2].p1 | 0x1) nop ; nop ; ldunif (ubo[2]+0xe0) fadd r4, rf33, rf51 ; mov unifa, r5 ; ldunif (ubo[2]+0x110) fmax rf34, 0, r4 ; nop nop ; mov tmus, rf0 total instructions in shared programs: 13648140 -> 13602500 (-0.33%) instructions in affected programs: 3497402 -> 3451762 (-1.30%) helped: 12044 HURT: 3484 Instructions are helped. total max-temps in shared programs: 2318687 -> 2317927 (-0.03%) max-temps in affected programs: 17234 -> 16474 (-4.41%) helped: 615 HURT: 198 Max-temps are helped. total sfu-stalls in shared programs: 32354 -> 32074 (-0.87%) sfu-stalls in affected programs: 1462 -> 1182 (-19.15%) helped: 461 HURT: 188 Sfu-stalls are helped. total inst-and-stalls in shared programs: 13680494 -> 13634574 (-0.34%) inst-and-stalls in affected programs: 3514405 -> 3468485 (-1.31%) helped: 12062 HURT: 3486 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9856>	2021-03-29 06:21:22 +00:00
Alejandro Piñeiro	ce98967274	v3dv: define a default attribute values with float type We are providing a BO with the default attribute values for the GL_SHADER_STATE_RECORD, that contains 16 vec4. Such default value for each vec4 is (0, 0, 0, 1). As the attribute format could be int or float, the "1" value needs to take into account the attribute format. But in practice, the most common case is all floats. So we create one default attribute values BO assuming that all attributes will be floats, and we store it at v3dv_device and only create a new one if an int format type is defined. That allows us to reduce the amount of BOs needed. Note that we could still try to reduce the amount of BOs used by the pipelines if we create a bigger BO, and we just play with the offsets. But as mentioned, that's not the usual case, and would add extra complexity, so it is not a priority right now. This makes the following test pass when disabling the pipeline cache support: dEQP-VK.api.object_management.max_concurrent.graphics_pipeline Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9845>	2021-03-26 15:00:05 +00:00
Iago Toral Quiroga	e790c20403	broadcom/compiler: try to fill up delay slots after a thrsw The way we handle thrsw instructions is that we try to merge them back into previously scheduled instructions to fill up its delay slots. This is generally safe, because the thrsw won't happen until after the delay slots, so we are not really changing the execution order of the instructions and we just need to make sure we don't violate a few specific restrictions. If we have not managed to fill up all delay slots after doing this, then we emit as many NOPs as needed to fill them. This is to ensure that we don't schedule an instruction that needs to execute after the thread switch before the thread switch happens. However, doing this can lead to inefficient code, since sometimes the instructions we schedule after a thrsw are independent of the thrsw and could be safely executed in its delay slots. This change removes the fixed NOP emission after a thrsw to fill delay slots and instead adds code to ensure that our instruction scheduling is aware of when it is scheduling instructions in the delay slots of a previous thrsw to avoid selecting conflicting instructions. The only case where we still emit fixed NOPs is for the thread end that we emit to terminate the program after scheduling all instructions because we can't end the instruction stream before the thread end is properly executed. total instructions in shared programs: 13691004 -> 13648140 (-0.31%) instructions in affected programs: 4345951 -> 4303087 (-0.99%) helped: 19645 HURT: 652 Instructions are helped. total max-temps in shared programs: 2319317 -> 2318687 (-0.03%) max-temps in affected programs: 10510 -> 9880 (-5.99%) helped: 532 HURT: 9 Max-temps are helped. total sfu-stalls in shared programs: 31752 -> 32354 (1.90%) sfu-stalls in affected programs: 840 -> 1442 (71.67%) helped: 7 HURT: 467 Sfu-stalls are HURT. total inst-and-stalls in shared programs: 13722756 -> 13680494 (-0.31%) inst-and-stalls in affected programs: 4335590 -> 4293328 (-0.97%) helped: 19453 HURT: 758 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9825>	2021-03-26 07:13:07 +00:00
Iago Toral Quiroga	f68f209e39	broadcom/compiler: add a v3d_qpu_writes_accum helper We have helpers to check if an instruction writes to specific accumulators. This one will check if it writes any of the general purpose accumulators, which will come in handy in a follow-up patch. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9825>	2021-03-26 07:13:07 +00:00
Iago Toral Quiroga	22a979be65	broadcom/compiler: convert add to mul when possible to allow merge Integer add/sub can be implemented as either an add or a mul instruction but we always emit them as add instructions at VIR level. We can use this flexibility to improve our QPU scheduling so we can be more effective at instruction merging by converting these to mul instructions when we are attempting to merge them with another add instruction. total instructions in shared programs: 13721549 -> 13691004 (-0.22%) instructions in affected programs: 3340493 -> 3309948 (-0.91%) helped: 12805 HURT: 1656 Instructions are helped. total max-temps in shared programs: 2319528 -> 2319317 (<.01%) max-temps in affected programs: 5285 -> 5074 (-3.99%) helped: 195 HURT: 3 Max-temps are helped. total sfu-stalls in shared programs: 31616 -> 31752 (0.43%) sfu-stalls in affected programs: 469 -> 605 (29.00%) helped: 52 HURT: 161 Sfu-stalls are HURT. total inst-and-stalls in shared programs: 13753165 -> 13722756 (-0.22%) inst-and-stalls in affected programs: 3340383 -> 3309974 (-0.91%) helped: 12782 HURT: 1666 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9769>	2021-03-25 09:51:42 +00:00
Alejandro Piñeiro	bdf93f4e3b	v3dv/cmd_buffer: return early for draw commands if there is nothing to draw So for example, on v3dv_CmdDrawIndexed we can return early if instanceCount is 0. This fixes failures when using the simulator with tests with the following pattern: dEQP-VK.draw.instanced.draw_indexed_vk_primitive_topology* Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9820>	2021-03-25 09:38:04 +00:00
Iago Toral Quiroga	bb201733ac	v3dv/pipeline_cache: fix assert Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Fixes: `e354c5280` ('v3dv/pipeline: try to get the shader variant directly from the cache') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9824>	2021-03-25 09:25:27 +00:00
Eric Anholt	3cc390bf7d	broadcom: Disable CLIF dumping when libexpat isn't available. Given what a niche developer tool CLIF dumps are, no sense requiring libexpat just for that. Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9764>	2021-03-24 17:25:07 +00:00
Alejandro Piñeiro	74785346b4	v3dv: Add support for the on-disk shader cache Quoting Jason's commit message (`afa8f5892`), that also applies here: "The Vulkan API provides a mechanism for applications to cache their own shaders and manage on-disk pipeline caching themselves. Generally, this is what I would recommend to application developers and I've resisted implementing driver-side transparent caching in the Vulkan driver for a long time. However, not all applications do this and, for some use-cases, it's just not practical." Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	cf71280d74	v3dv/device: avoid unused-result warning with asprintf Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	2bee6ffec3	v3dv/pipeline: compute sha1 for no-op fragment shaders correctly We should use the nir shader, as with internal vkShaderModule, instead of just the name. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	9a4099858b	v3dv/pipeline: don't create a variant if compilation failed Also return the proper Vulkan result for this case, which is somewhat tricky. Technically Create[Graphics/Compute]Pipeline only allow OOM errors. So for this case, there is only the alternative of the generic VK_ERROR_UNKNOWN, even if we know the cause of the error. From spec: "VK_ERROR_UNKNOWN will be returned by an implementation when an unexpected error occurs that cannot be attributed to valid behavior of the application and implementation. Under these conditions, it may be returned from any command returning a VkResult" Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	e354c52801	v3dv/pipeline: try to get the shader variant directly from the cache Until now we were always doing a two-step cache lookup, as we were using the NIR shaders to fill up the key to look up the compiled shaders. But since we were already generating the sha1 key with the original SPIR-V shader (or its internal NIR representation) any info we were collecting from NIR is already implicit in the original shader, so we can avoid using the NIR in most cases. Because the v3d_key that is used to compile a shader is populated with data coming directly from the NIR shader or produced during NIR lowerings, we can't use it directly as part of the pipeline cache entry. We could split them, but that would be confusing, so we add a new struct, v3dv_pipeline_key used specifically to search for the compiled shaders on the pipeline cache. v3d_key would be still used to compile the shaders. As we are using the same sha1 key for all compiled shaders in a pipeline, we can also group all of them in the same cache entry, so we don't need a lookup for each stage. This also allows us to cache pipeline data shared by all the stages (like the descriptor maps). While we are here, we also create a single BO to store the assembly for all the pipeline stages. Finally, we remove the link to the variant on the pipeline stage struct, to avoid the confusion of having two links to the same data. This mostly means that we stop using the pipeline stage structures after the pipeline is created, so we can free them. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
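
Commit 83c92a48b7 in the list above notes that MAX2(count * struct size, 1) yields 1 for count=0 rather than the size of a struct. The following is a minimal sketch of that arithmetic only, not the driver code: the `MAX2` macro is redefined locally so the block is self-contained, and `struct example_binding` and `clamped_alloc_size` are hypothetical names invented for illustration.

```c
#include <stddef.h>

/* Local stand-in for a two-argument max macro, defined here only so the
 * sketch compiles on its own. */
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical element type; it exists only to give each array element a
 * size larger than one byte. */
struct example_binding {
   unsigned binding;
   unsigned descriptor_count;
};

/* The pattern described in the commit message: clamp the byte count to at
 * least 1 so that a zero-element request still produces a non-NULL
 * allocation. For count == 0 the result is 1 byte, not one element. */
size_t
clamped_alloc_size(size_t count)
{
   return MAX2(count * sizeof(struct example_binding), 1);
}
```

With `count == 0` the function returns 1, which is smaller than `sizeof(struct example_binding)`, so a one-byte buffer cannot hold even a single element. According to the commit message, the clamp seemed to exist only so that NULL could keep signalling an error, which is why the fix returns a VkResult instead.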


1417 commits