fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 02:38:07 +02:00

Author	SHA1	Message	Date
Iago Toral Quiroga	370495abd1	v3d: disable GLSL loop unrolling again We had re-enabled this because of some test regressions: KHR-GLES31.core.geometry_shader.limits.max_input_components and ext_transform_feedback-max-varyings failed to register allocate, but now that we support indirect indexing on vertex shader outputs natively this is no longer an issue. Piglit's max-samplers tests failed. These tests use indirect indexing on samplers which is not supported and fail to link with this error message: "Failed to link: error: sampler arrays indexed with non-constant expressions is forbidden in GLSL 110". This is expected. The reason these were passing before is that loop unrolling was able to turn indirect indexing into direct indexing. We add them to the expected fail list. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10723>	2021-05-11 09:31:31 +00:00
Iago Toral Quiroga	f0fef41917	broadcom/compiler: don't unroll due to indirect indexing of outputs We can handle this natively now, so there is no point. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10723>	2021-05-11 09:31:31 +00:00
Iago Toral Quiroga	9f5481cf78	v3dv: don't lower indirect derefs on output variables Our backend compiler can handle this for all supported shader stages now. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10723>	2021-05-11 09:31:31 +00:00
Iago Toral Quiroga	0235ed18a7	broadcom/compiler: don't use nir_src_is_dynamically_uniform Now that we have divergence analysis we should use that. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10723>	2021-05-11 09:31:31 +00:00
Iago Toral Quiroga	cb39dca2d3	broadcom/compiler: make vir_VPM_WRITE_indirect handle non-uniform offsets Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10723>	2021-05-11 09:31:31 +00:00
Iago Toral Quiroga	f71893a942	broadcom/compiler: implement non-uniform offset on vertex outputs Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10723>	2021-05-11 09:31:31 +00:00
Iago Toral Quiroga	067ad7eccc	broadcom/compiler: move vertex shader output handling to its own function Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10723>	2021-05-11 09:31:31 +00:00
Juan A. Suarez Romero	54ec9c95cf	broadcom/compiler: fix dynamic-stack-buffer-overflow error When spilling a register, the number of temps can be increased when introducing a temporary variable. Those nodes are not eligible to be spilled, but we need to avoid out-of-bounds accesses to the arrays defined with a size equal to the original number of temps. Fixes address sanitizer error on KHR-GLES3.shaders.uniform_block.random.all_shared_buffer.14 (and many others). v2 (Iago): - Add clarification in assertion. - Use `vir_get_temp` to increase num_temps. v3 (Iago): - Update clarification Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10643>	2021-05-11 07:46:17 +00:00
Juan A. Suarez Romero	ea463f9bff	ci/broadcom: update expected results Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10694>	2021-05-10 10:11:09 +00:00
Iago Toral Quiroga	d81a6e5f1d	broadcom/compiler: change register allocation policy for accumulators The current policy is to always favor accumulators if possible, however, this is not always optimal. Particularly, accumulators play a crucial role in enabling QPU instruction merges, since these are limited to both the ADD and the ALU instructions addressing at most 2 physical registers. For 2-src instructions, this means that to be able to merge we need them to address at least 2 accumulators. While favoring accumulators does help the case for instruction merges in general, it is risky to assign accumulators to variables that have long life spans. Doing so will make the accumulator unavailable for any other instructions during that life span, and since we only have a few accumulators, we can quickly run out and lose our capacity to merge instructions for large parts of the QPU program. On the other hand, we also want to avoid the extreme case where we keep allocating physical registers to the point we run out, even if we have accumulators available, since accumulators have additional restrictions and may not be suitable for everything. This change continues the policy of favoring accumulators, but it only does so if the life span of the temps is short, to ensure that we can recycle accumulators often across instructions and avoid running out for sections of the QPU code, unless we are already running out of physical registers. total instructions in shared programs: 13654647 -> 13336921 (-2.33%) instructions in affected programs: 11015919 -> 10698193 (-2.88%) helped: 39758 HURT: 17325 Instructions are helped. total threads in shared programs: 412046 -> 412038 (<.01%) threads in affected programs: 16 -> 8 (-50.00%) helped: 0 HURT: 4 Threads are HURT. total uniforms in shared programs: 3745726 -> 3746003 (<.01%) uniforms in affected programs: 17296 -> 17573 (1.60%) helped: 76 HURT: 99 Uniforms are HURT.
total max-temps in shared programs: 2364430 -> 2359942 (-0.19%) max-temps in affected programs: 109117 -> 104629 (-4.11%) helped: 2893 HURT: 772 Max-temps are helped. total spills in shared programs: 5727 -> 5746 (0.33%) spills in affected programs: 221 -> 240 (8.60%) helped: 1 HURT: 2 total fills in shared programs: 13121 -> 13139 (0.14%) fills in affected programs: 466 -> 484 (3.86%) helped: 1 HURT: 2 total sfu-stalls in shared programs: 33432 -> 34491 (3.17%) sfu-stalls in affected programs: 18219 -> 19278 (5.81%) helped: 4459 HURT: 5087 Inconclusive result total inst-and-stalls in shared programs: 13688079 -> 13371412 (-2.31%) inst-and-stalls in affected programs: 11030017 -> 10713350 (-2.87%) helped: 39630 HURT: 17429 Inst-and-stalls are helped. total nops in shared programs: 335753 -> 333708 (-0.61%) nops in affected programs: 112659 -> 110614 (-1.82%) helped: 8726 HURT: 7383 Inconclusive result Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10686>	2021-05-08 13:15:42 +02:00
Iago Toral Quiroga	c11e479852	broadcom/compiler: specify maximum thread count in compile strategies Once we have exhausted compile strategies at 4 threads and we start enabling lower thread counts, there is no point in starting compiles with 4 threads for them, we know these will fail, so let's start at 2 in these cases. This also has another nice implication: if the driver compiles at 4 threads and fails to register allocate, we were allowing it to try with 2 threads, but this would only retry the register allocation process and would not really recompile the shader with 2 threads. This is not optimal, because at 2 threads we have more TMU fifo space for each thread and we can do more TMU pipelining, so we were missing that opportunity. This improves performance in Sponza by ~1.5% and also seems to help UE4 slightly. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10647>	2021-05-06 12:27:06 +02:00
Iago Toral Quiroga	d19ce36ff2	broadcom/compiler: refactor compile strategies Until now, if we couldn't compile at 4 threads we would lower thread count with optimizations disabled. However, lowering thread count doubles the amount of registers available per thread, which alone is already a big relief for register pressure, so it makes sense to enable optimizations when we do that, and progressively disable them until we enable spilling as a last resort. This can slightly improve performance for some applications. Sponza, for example, gets a ~1.5% boost. I see several UE4 shaders that also get compiled to better code at 2 threads with this, but it is more difficult to assess how much this improves performance in practice due to the large variance in frame times that we observe with UE4 demos. Also, if a compiler strategy disables an optimization that did not make any progress in the previous compile attempt, we would end up re-compiling the exact same shader code and failing again. This patch keeps track of which strategies won't make progress and skips them in that case to save some CPU time during shader compiles. Care should be taken to ensure that we try to compile with the default NIR scheduler at minimum thread count at least once though, so a specific strategy for this is added, to prevent the scenario where no optimizations are used and we skip directly to the fallback scheduler if the default strategy fails at 4 threads. Similarly, we now also explicitly specify which strategies are allowed to do TMU spills and make sure we take this into account when deciding to skip strategies. This prevents the case where no optimizations are used in a shader and we skip directly to the fallback scheduler after failing compilation at 2 threads with the default NIR scheduler but without trying to spill first. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10647>	2021-05-06 12:27:06 +02:00
Iago Toral Quiroga	296fe4daa6	broadcom/compiler: add a compiler strategy to disable loop unrolling Loop unrolling can increase register pressure significantly, leading to lower thread counts and spilling. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10647>	2021-05-06 12:27:06 +02:00
Iago Toral Quiroga	4742300e6b	v3d: move NIR compiler options to GL driver The Vulkan driver was already creating and using its own set of options, so the ones defined in the compiler are only used with GL, which is confusing. Move them to the GL driver. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10647>	2021-05-06 12:27:06 +02:00
Iago Toral Quiroga	db3fa1cc8c	v3dv: setup loop unrolling We set the maximum at 16 iterations (the GL compiler chooses 32 iterations for the GLSL front-end loop unrolling pass) because we have observed a bunch of shaders from Sascha Willems that spill significantly with 32, leading to massive performance degradation, while 16 avoids spilling and doesn't seem to cause visible performance degradation compared to cases that unroll 32 without spilling. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10647>	2021-05-06 12:25:22 +02:00
Iago Toral Quiroga	ec72b876fe	broadcom/compiler: add a loop unrolling pass Right now this is useful for Vulkan only, because GL gets loop unrolling from the GLSL compiler and/or mesa state tracker NIR front-ends. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10647>	2021-05-06 12:24:29 +02:00
Iago Toral Quiroga	f099fc3e07	v3d: choose a larger CSD supergroup size if possible Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10541>	2021-05-04 15:53:23 +00:00
Iago Toral Quiroga	3ce249e65e	broadcom/common: move CSD supergroup sizing to a common helper We want to use this in GL too. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10541>	2021-05-04 15:53:23 +00:00
Iago Toral Quiroga	afc33a7430	v3dv: limit supergroup size in presence of TSY barriers When a TSY barrier is hit, the entire supergroup will be synchronized. If the supergroup is large and uses all available QPU threads it would mean that we would synchronize and stall all running threads until all of them reach the barrier, which may be inefficient. This patch makes it so that if the compute shader has any such barriers we limit the supergroup size so each supergroup only takes half of the QPU threads available at most, so that if one supergroup hits a barrier we have at least one other supergroup we can run, reducing idle QPU time. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10541>	2021-05-04 15:53:23 +00:00
Iago Toral Quiroga	f514280524	broadcom/compiler: track if a shader has control barriers in prog_data Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10541>	2021-05-04 15:53:23 +00:00
Iago Toral Quiroga	2e0f6e5705	v3dv: choose a larger CSD supergroup size if possible Each supergroup executes a number of batches. Each batch has 16 elements (one per QPU lane), except possibly the last batch which might be incomplete. Until now, we packed a single workgroup in each supergroup, which can lead to more incomplete batches and less efficient use of the QPUs depending on the configuration of workgroups being dispatched. This patch computes a number of workgroups per supergroup so that we reduce or completely eliminate incomplete batches if possible. It should be noted however, that TSY barriers act on supergroups, so larger supergroups lead to larger syncpoints on barriers too. A follow-up patch will try to find a good balance for compute shaders that use such barriers. This improves performance of Sascha Willems' computecloth demo by ~13%. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10541>	2021-05-04 15:53:23 +00:00
Jose Maria Casanova Crespo	ab1d66a111	ci/v3d: Update piglit expectations. As the piglit job is manual, I forgot to update three new tests passing at spec@ext_image_dma_buf_import subgroup after merging https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10524 Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10536>	2021-04-30 10:22:53 +02:00
Juan A. Suarez Romero	33f9b06b0e	v3dv: check dest bitsize in color blit Otherwise, if src_bit_size > 0 and dst_bit_size == 0, we end up doing a bad shift in `1 << (dst_bit_size - 1)`, as `dst_bit_size - 1` is a negative value (in this case would be MAX_UINT32). Fixes CID#1468134 "Bad bit shift operation (BAD_SHIFT)": "large_shift: In expression 1 << dst_bit_size - 1U, left shifting by more than 31 bits has undefined behavior. The shift amount, dst_bit_size - 1U, is 4294967295." v2: - Use an assertion instead (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10251>	2021-04-29 10:31:11 +00:00
Juan A. Suarez Romero	fd8d71ce41	v3dv: rename VC5 to V3D As we no longer use references to the old VC5, let's rename definitions from VC5 to V3D in the Vulkan driver. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10402>	2021-04-29 11:22:12 +02:00
Juan A. Suarez Romero	26618dfb87	broadcom/simulator: change references to VC5 We refer to the driver as V3D instead of the old VC5, so let's update the references. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10402>	2021-04-29 11:22:12 +02:00
Juan A. Suarez Romero	a77002584d	broadcom/qpu: rename from VC5 to V3D Get rid of old references to VC5. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10402>	2021-04-29 11:22:12 +02:00
Juan A. Suarez Romero	dfe50e02c9	ci/broadcom: update expected results Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10501>	2021-04-28 15:21:28 +00:00
Alejandro Piñeiro	79e4451430	v3dv: move extensions table to v3dv_device So one less python generator. Based on anv (MR#8792) and radv (MR#8900). With this change v3dv no longer has a custom python code generator. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10484>	2021-04-28 09:13:55 +00:00
Alejandro Piñeiro	8d72992ed5	v3dv: remove custom icd json generation Most of the stuff needed was moved to vk util. So one less python generator to maintain. anv and radv already migrated. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10484>	2021-04-28 09:13:55 +00:00
Juan A. Suarez Romero	2da01e6123	ci/v3dv: update flakes Add dEQP-VK.synchronization.op.single_queue.binary_semaphore.write_copy_buffer_read_ssbo_compute.buffer_16384 to the flake list. v2: - Add a new flake (jasuarez) Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10479>	2021-04-28 07:38:53 +00:00
Juan A. Suarez Romero	9be055a22a	ci/v3d: fix typo in job name Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10479>	2021-04-28 07:38:53 +00:00
Iago Toral Quiroga	d636c5660c	v3dv: implement wsi hook to decide if we can present directly on device This will prevent the driver from taking the prime blit path for presentation in scenarios where it can avoid it, which can substantially improve performance, particularly at high resolutions. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5917>	2021-04-27 06:37:43 +00:00
Juan A. Suarez Romero	c93bd731f8	v3dv/pipeline_cache: bail out in case of error Currently, in the GetPipelineCacheData() function, in several cases if there is an error the blob is finished and the cache unlocked, but the code continues executing, which can lead to multiple `pthread_mutex_unlock()` calls. Instead, if there's an error just bail out to finish the blob and unlock the cache directly. Fixes CID#1468147 "Double unlock (LOCK)". v2: - Rename "bail_out" to "done" (apinheiro) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10404>	2021-04-22 11:40:40 +02:00
Juan A. Suarez Romero	84c64a0713	ci/v3d: execute all piglit tests Most of the regressions we found are with the piglit testsuite. The difference between executing all tests versus quick_gl + quick_shader is minimal, in the sense that we would need the same number of jobs to execute and stay within the 10-minute budget. Hence replace "quick_gl" + "quick_shader" by "all". Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10378>	2021-04-22 07:53:39 +00:00
Juan A. Suarez Romero	796cb1e9d5	v3dv: check returned values Check if v3dv_ioctl() or v3dv_bo_map() fail, and print a proper error message. This check happens in the rest of the code, so it makes sense to apply here too. Fixes CID#1468162 "Unchecked return value (CHECKED_RETURN)". v2: - Fix message error (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10380>	2021-04-22 07:39:24 +00:00
Juan A. Suarez Romero	ccfe3e4af5	ci/broadcom: add EGL testing jobs Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10352>	2021-04-21 07:33:35 +00:00
Juan A. Suarez Romero	37e7725a5e	ci/vc4: add KHR-GLES2.* job test Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10352>	2021-04-21 07:33:35 +00:00
Juan A. Suarez Romero	79a0eee2fb	ci/broadcom: update expected results Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10352>	2021-04-21 07:33:35 +00:00
Anuj Phogat	12099d51f6	intel: Rename gen_10 to ver_10 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>	2021-04-20 20:06:34 +00:00
Juan A. Suarez Romero	0dde87457e	ci/v3d: add KHR-GLES test jobs Add (manually launched) jobs for KHR-GLES2., KHR-GLES3. and KHR-GLES31.* Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10287>	2021-04-20 12:39:27 +02:00
Alejandro Piñeiro	f5133f6bce	v3dv/pipeline: track descriptor maps per stage, not per pipeline One of the conclusions of our recent clean up on the limits was that the pipeline limits needed to be the per-stage limits multiplied by the number of stages. But until now we only had a set of descriptor maps for the full pipeline. That would work if we could set the same limit per pipeline as per stage, but that is not the case. So if, for example, we have the fragment shader using V3D_MAX_TEXTURE_SAMPLERS textures, and then the vertex shader, with a different descriptor set, using one texture, we would get an index greater than V3D_MAX_TEXTURE_SAMPLERS. We assert that index as an error on the vulkan backend, but fwiw, it would be also asserted on the compiler. With this commit we track and allocate a descriptor map per stage, although we reuse the vertex shader descriptor map for the vertex bin. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10272>	2021-04-19 23:10:35 +00:00
Iago Toral Quiroga	6c80b084f2	v3dv: better tracking of dirty push constant state Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10283>	2021-04-16 12:29:11 +00:00
Iago Toral Quiroga	30f125f04f	v3dv: dirty viewport doesn't affect fragment shaders The uniform state for the viewport is only used with geometry stages. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10283>	2021-04-16 12:29:11 +00:00
Iago Toral Quiroga	35ff75701f	v3dv: improve dirty descriptor set state tracking We were using the pipeline layout to discard uniform updates for stages that don't use descriptors, but we can do better by keeping track of the stages used by the specific dirty descriptor sets and only update uniforms for stages that are included in those. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10283>	2021-04-16 12:29:11 +00:00
Juan A. Suarez Romero	d29b5b9f20	v3dv: avoid dereferencing null value Fixes CID#1468079 "Dereference null return value (NULL_RETURNS)" Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10280>	2021-04-16 11:12:31 +00:00
Iago Toral Quiroga	1cf36797bf	v3dv: fix sRGB blending workaround This workaround needs to set a flag in the current job but it was implemented at pipeline binding time, which can happen outside a render pass. Move it to the pre-draw handler, where it belongs. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4645 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10255>	2021-04-16 06:05:59 +00:00
Adam Jackson	fc9b3b260e	Revert "glx: Lift sending the MakeCurrent request to top-level code" This provokes crashes in Cinnamon for some reason that I haven't diagnosed yet. This reverts commit `80b67a3b44`. Fixes: `80b67a3b44` glx: Lift sending the MakeCurrent request to top-level code Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4639 Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10260>	2021-04-15 19:48:10 +00:00
Michel Dänzer	d200f45875	Use explicit break instead of fall-through to break-only case clang generates a warning if there's no explicit break or fall-through annotation. The latter would be kind of silly in this case, and not robust against any future changes turning the fall-through invalid. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>	2021-04-15 16:01:22 +00:00
Iago Toral Quiroga	bed3f31fc6	v3dv: don't use a dedicated BO for each occlusion query Dedicated BOs waste memory and are also a significant cause of CPU overhead when applications use hundreds of them per frame due to all the work the kernel has to do to page in all these BOs for a job. The UE4 Vehicle demo was hitting this, causing it to freeze and stutter under 1fps. The hardware allows us to set up groups of 16 queries in consecutive 4-byte addresses, requiring only that each group of 16 queries is aligned to a 1024 byte boundary. With this change, we allocate all the queries in a pool in a single BO and we assign them different offsets based on the above restriction. This eliminates the freezes and stutters in the Vehicle sample. One caveat of this solution is that we can only wait or test for completion of a query by testing if the GPU is still using its BO, which basically means that we can only wait for all active queries in a pool to complete and not just the ones being requested by the API. Since the Vulkan recommendation is to use a different query pool per frame this should not be a big issue though. If this ever becomes a problem (for example if an application doesn't follow the recommendation and instead allocates a single pool and splits its queries between frames), we could try to group queries in a pool into a number of BOs to try and find a balance, but for now this should work fine in most cases. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10253>	2021-04-15 12:45:07 +00:00
Iago Toral Quiroga	917049e7d6	v3dv: fix array sizes when tracking BOs during uniform setup The resource indices we get point to descriptor map entries that include all shader stages, so we need to size the arrays to account for more than just one stage. For now we only support up to 2 stages in a pipeline, so we use that. Fixes: `002304482c` ('v3dv: avoid redundant BO job additions for UBO/SSBO') Fixes: `fa170dab4c` ('v3dv: avoid redundant BO job additions for textures and samplers') Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10252>	2021-04-15 11:26:04 +00:00
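
The occlusion query layout described in commit bed3f31fc6 above (groups of 16 queries in consecutive 4-byte addresses, each group aligned to a 1024 byte boundary, all in a single BO per pool) can be sketched roughly as follows. This is only an illustration, not the driver's code: the macro and function names are hypothetical, and it assumes each group simply starts at the next 1024-byte boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Constants taken from the commit message; the names are hypothetical. */
#define QUERIES_PER_GROUP 16u   /* queries packed in one group */
#define QUERY_SIZE_BYTES  4u    /* each query occupies 4 bytes */
#define GROUP_ALIGN_BYTES 1024u /* each group starts on a 1024-byte boundary */

/* Byte offset of query `index` inside the pool's single BO, assuming
 * one group per 1024-byte slot. */
static uint32_t
query_offset(uint32_t index)
{
   uint32_t group = index / QUERIES_PER_GROUP;
   uint32_t slot = index % QUERIES_PER_GROUP;
   return group * GROUP_ALIGN_BYTES + slot * QUERY_SIZE_BYTES;
}

/* Minimum BO size for a pool holding `count` queries under the same
 * assumption (one 1024-byte slot per started group). */
static uint32_t
pool_bo_size(uint32_t count)
{
   assert(count > 0);
   uint32_t groups = (count + QUERIES_PER_GROUP - 1) / QUERIES_PER_GROUP;
   return groups * GROUP_ALIGN_BYTES;
}
```

Under these assumptions, queries 0 to 15 share the first 1024-byte slot and query 16 starts the second one, so a pool with hundreds of queries still needs only one BO.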


1479 commits