This gives a nice boost, +20% at this time on my Vega 56. Shader
ballot should be enabled by default at some point but it reduces
performance a bit (-6%) with Wolfeinstein II. Enable it only for
Youngblood at the moment, like what we did for Talos in the past.
As a bonus point, it gets rid of some minor artifacts that only
happens when ballot is disabled for some reasons.
Cc: 19.2 <mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit a6ad9e8ccf)
Shader ballot will be enabled by default for Wolfenstein
Youngblood. This follows what we did for sisched.
Cc: 19.2 <mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit f202ac27a9)
Otherwise hangs are possible. This register was already set for
GS and NGG.
Fixes: 5eaed7ecfc "radv/gfx10: enable support for NAVI10, NAVI12 and NAVI14"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit e04761d0f9)
Should take the max of the 2.
Fixes: ea337c8b7e "radv/gfx10: fix VS input VGPRs with the legacy path"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 2e763f7c87)
The compute paths in vl are a bit AMD-specific. For example, they (on
nouveau), try to use a BGRX8 image format, which is not supported.
Fixing all this is probably possible, but since the compute paths aren't
in any way better, it's difficult to care.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Fixes: 9364d66cb7 (gallium/auxiliary/vl: Add video compositor compute shader render)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 958390a9bf)
We need two different values of the register, one for NGG and one for
legacy, in order to fix edge flags for the legacy pipeline.
Passing the ngg flag to emit_clip_regs would be too complicated,
so CONTEXT_REG_RMW is used for partial register updates.
ra_get_best_spill_node is what other users of the mesa register
allocator use.
Switching to it now also fixes an infinite loop issue with ppir regalloc
with the ppir control flow patchset, and also provides a small gain over
the previous herusitic on number of spilled nodes testing with
shader-db.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
The SSE2 executor was removed in 4eb3225b38 ("Remove tgsi_sse2.")
Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
This was fixed in 912ed84f83 ("tgsi: move to using vector for system
values.")
Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
The GLES2 CTS takes about 8 minutes of total runtime (at parallel 4 is
~2 minutes in the test stage if runners are free), while GLES3 takes
about 25. Since the GLES3 run is pretty expensive, just do a cheap
touch test of 1 out of every 10 tests in the test list on MRs, until
we can get the runtime down.
v2: Drop the full run for now until we can bring runtime down or bring
up a dedicated mesa runner.
Reviewed-by: Eric Engestrom <eric@engestrom.ch> (v1)
Reviewed-By: Gert Wollny <gert.wollny@collabora.com> (v1)
This fixes the regression introduced on "mesa: refactor
compressed_tex_sub_image function" that started to crash
KHR-GLES2.texture_3d.compressed_texture.negative_compressed_tex_sub_image
Fixes: 7df233d68d ("mesa: refactor compressed_tex_sub_image function")
Reviewed-by: Eric Anholt <eric@anholt.net>
In a descriptor set inline uniform blocks don't use up any bindings.
However, the presence of any inline uniform blocks doed require the
use of the descriptor buffer, which takes up one binding.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This pass expects the shader to be in LCSSA form.
The algorithm is based on 'The Simple Divergence Analysis' from
Diogo Sampaio, Rafael De Souza, Sylvain Collange, Fernando Magno Quintão Pereira.
Divergence Analysis. ACM Transactions on Programming Languages and Systems (TOPLAS)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
ACO depends on LCSSA phis for divergent booleans to work correctly.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Co-authored-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The implementation introduced in "tgsi_to_nir: be careful about not
losing any TGSI properties silently (v2)" updates all the TGSI properties,
but it didn't take into account that the shader_info structure uses a union
to store the different attributes for each shader stage.
Now we only update the attributes if they affect current shader stage,
avoiding to overwrite members of the union that should be overwritten.
This has created hundreds of regressions in v3d.
For example the TGSI_PROPERTY_VS_BLIT_SGPRS_AMD was overwritting the
same position used by TGSI_PROPERY_CS_FIXED_BLOCK_DEPTH.
Fixes: e300365197 ("tgsi_to_nir: be careful about not losing any TGSI properties silently (v2)")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
If a drawbuffer is an fbo without an attachment then its 'Height' will be zero,
and we have to take its 'DefaultGeometry.Height' into account.
Fixes on softpipe (with the exception of tests that use multisample):
dEQP-GLES31.functional.fbo.no_attachments.*
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Create separate SURFACE_STATE for render target read in order to support
non coherent framebuffer fetch on broadwell.
Also we need to resolve framebuffer in order to support CCS_D.
v2: Add outputs_read check (Kenneth Graunke)
v3: 1) Import Curro's comment from get_isl_surf
2) Rename get_isl_surf method
3) Clean up allocation in case of failure
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>