In a3268599f3, I attempted to fix nir_repair_ssa for unreachable
blocks. However, that commit missed the possibility that the use is in
a block which, itself, is unreachable. In this case, we can end up in
an infinite loop trying to replace a def with itself. Even though a
no-op replacement is a fine operation, it keeps extending the end of the
uses list as we're walking it. Instead of explicitly checking for the
group of conditions, just check if the phi builder gives us a different
def. That's guaranteed to be 100% reliable and, while it lacks symmetry
with the is_valid checks, should be more reliable.
Fixes: a3268599 "nir/repair_ssa: Repair dominance for unreachable..."
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit d63162cff0)
Currently there is no way to make no context current w/gallium + osmesa.
The non-gallium version of osmesa does this if the context and buffer
passed to `OSMesaMakeCurrent` are both null. This small change makes it
so that this is also the case with the gallium version.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Hal Gentz <zegentzy@protonmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 57c894334e)
Some shaders do not use 'invariant' in vertex and (possibly) geometry
shader stages on some outputs that are intended to be invariant. For
various reasons, this optimization may not be fully applied in all
shaders used for different rendering passes of the same geometry. This
can result in Z-fighting artifacts (at best). For now, disable this
optimization in these stages.
In tessellation stages applications seem to use 'precise' when
necessary, so allow the optimization in those stages.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111490
Fixes: 09705747d7 ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern")
All Gen8+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16194726 -> 16344745 (0.93%)
instructions in affected programs: 2855172 -> 3005191 (5.25%)
helped: 6
HURT: 20279
helped stats (abs) min: 1 max: 3 x̄: 1.33 x̃: 1
helped stats (rel) min: 0.44% max: 1.00% x̄: 0.54% x̃: 0.44%
HURT stats (abs) min: 1 max: 32 x̄: 7.40 x̃: 7
HURT stats (rel) min: 0.14% max: 42.86% x̄: 8.58% x̃: 6.56%
95% mean confidence interval for instructions value: 7.34 7.45
95% mean confidence interval for instructions %-change: 8.48% 8.67%
Instructions are HURT.
total cycles in shared programs: 364471296 -> 365014683 (0.15%)
cycles in affected programs: 32421530 -> 32964917 (1.68%)
helped: 2925
HURT: 16144
helped stats (abs) min: 1 max: 403 x̄: 18.39 x̃: 5
helped stats (rel) min: <.01% max: 22.61% x̄: 1.97% x̃: 1.15%
HURT stats (abs) min: 1 max: 18471 x̄: 36.99 x̃: 15
HURT stats (rel) min: 0.02% max: 52.58% x̄: 5.60% x̃: 3.87%
95% mean confidence interval for cycles value: 21.58 35.41
95% mean confidence interval for cycles %-change: 4.36% 4.52%
Cycles are HURT.
(cherry picked from commit 92f70df8c3)
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 5b5c5bf833)
Later generations support bindless for samplers, images, and buffers and
thus per-stage descriptors are not limited by the binding table size.
However, gen8 doesn't support bindless images and thus needs to report a
lower per-stage limit so that all combinations of descriptors that fit
within the advertised limits are reported as supported by
vkGetDescriptorSetLayoutSupport.
Fixes test dEQP-VK.api.maintenance3_check.descriptor_set
Fixes: 79fb0d27f3 ("anv: Implement SSBOs bindings with GPU addresses in the descriptor BO")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 5ec5fecc26)
Commit d1e1563bb6 added a NULL check for eglGetSyncAttribKHR
but eglGetSyncAttrib does not do this. Patch adds same check to
happen with eglGetSyncAttrib.
Fixes crashes in (when exposing EGL 1.5):
dEQP-EGL.functional.fence_sync.invalid.get_invalid_value
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 99cbec0a5f)
The current code can create functions with a width of 32, which is not
supported by our hardware. Add some code to simplify how we express
what we want and prevent such cases.
For some unknown reason, all the tests I could run seem to work even
with these unsupported MOVs.
Fixes: b0858c1cc6 "intel/fs: Add a couple of simple helper opcodes"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
(cherry picked from commit 8e614c7a29)
Akin to 1a25980c46 ("egl: drop incorrect pkg-config file for
glvnd") and b01524fff0 ("meson: don't build libGLES*.so with
GLVND") , removes a pkg-config file that shouldn't have been there in
the first place, but was needed because of that GLVND bug.
Now that the glvnd bug has been fixed, it was apparent that this gl.pc
pkg-config file was forgotten to be removed, so let's do just that :)
Suggested-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit a1de3011f3)
No GS copy shader if a pipeline enables NGG GS.
This fixes
dEQP-VK.pipeline.executable_properties.graphics.*geometry_stage*.
Fixes: 86864eedd2 ("radv: Implement radv_GetPipelineExecutablePropertiesKHR.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 99c186fbbe)
--oneline shortens hashes, while --oneline=pretty doesn't, otherwise
they are the same. Having full hashes is convenient as that is the
format that the bin/.cherry-ignore script requires to work correctly.
We have to load 2 32-bit integer and to cast correctly.
This fixes crashes with gs-double-interpolator.vk_shader_test.
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111734
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 68820007fd)
Fixes: 89785e2d56 ("i965: add support for sampling from AYUV")
Fixes: 7cab8d3661 ("i965: Add support for sampling from XYUV images")
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 8a9b81ab9d)
If we have fsin or fcos trigonometric operations with constant values
as inputs, we will multiply the result by 0.99997 in
brw_nir_apply_trig_workarounds, making the result wrong.
Adjusting the rules so they do not apply to const values we let a
later constant fold to deal with it.
v2:
- Do not early constant fold but only apply the trig workaround for
non constants (Caio).
- Add fixes tag to commit log (Caio).
Fixes: bfd17c76c1 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 3c474f8513)
Otherwise it never gets closed, this fixes errors seen with deqp-egl
where we end up opening 1024 files.
Fixes: 2dce0e94 ("iris: Initial commit of a new 'iris' driver for Intel Gen8+ GPUs.")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 631255387f)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: f8d0337299 ('radv: add multiple streams support for the GS copy shader')
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit ffabcbba60)
This was meant to include up to version 23.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0616b7ac90 ("vulkan: add vk_x11_strict_image_count option")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111522
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit dcf13fbac9)
This is embarrasing...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 04dc6074cf ("driconfig: add a new engine name/version parameter")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
(cherry picked from commit 10206ba17b)
A filed of nir_variable.location may be equel to -1.
That may cause copying to invalid address of list-node,
making some internal fields corrupted.
Patch fixes segfault during freeing context due to
corrupted address of ralloc_header.destructor.
v2: copy data if var is constant (Connor Abbott)
CC: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: b6d4753568 (nir/large_constants: De-duplicate constants)
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111676
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
(cherry picked from commit c7b2a2fd36)
This option strictly allocate the minImageCount given by the
application at swapchain creation.
This works around application that do not deal with the fact that the
implementation allocates more images than the minimum specified.
v2: Add values in default drirc (Bas)
v3: specify engine name/version (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111522
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 0616b7ac90)
Vulkan applications can register with the following structure :
typedef struct VkApplicationInfo {
VkStructureType sType;
const void* pNext;
const char* pApplicationName;
uint32_t applicationVersion;
const char* pEngineName;
uint32_t engineVersion;
uint32_t apiVersion;
} VkApplicationInfo;
This enables the Vulkan implementations to apply workarounds based off
matching this description.
Here we add a new parameter for matching the driconfig options with
the following :
<device driver="anv">
<application engine_name_match="MyOwnEngine.*" engine_versions="10:12,40:42">
<option name="blaaah" value="true" />
</application>
</device>
v2: switch engine name match to use regexps
v3: Verify that the regexec returns REG_NOMATCH for match failure (Eric)
v4: Add missing bit that went to the following commit (Eric)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 04dc6074cf)
We'll use this later for a new driconfig matching parameter.
v2: Avoid leak in device creation error case (Bas)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 6d5f11ab34)
It was calloc'd to 0 which is PIPE_PRIM_POINTS, which means that we
fail to notice an initial primitive of points being new, and fail at
updating the "primitive is points or lines" field.
We do not need to reset this on device loss because we're tracking
the last primitive mode sent to us on the CPU via draw_vbo, not the
last primitive mode sent to the GPU.
Fixes several tests:
- dEQP-GLES3.functional.clipping.point.wide_point_clip
- dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_center
- dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_corner
Fixes: dcfca0af7c ("iris: Set XY Clipping correctly.")
(cherry picked from commit c9fb704f72)
streamout_buffers is assigned after that function, so the previous
fix was completely wrong. This probably fix something when streamout
buffers and push constants are used/inlined in the same shader.
Fixes: 378e2d2414 ("radv: fix computing number of user SGPRs for streamout buffers")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
(cherry picked from commit 8137df3a46)
[Juan A. Suarez: fix the structure usage]
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Fixes: 0f2d1dfe65 ("v3d: use the GPU to record primitives written to transform feedback")
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit b9a07eed00)
Translating TGSI_INTERPOLATE_COLOR as INTERP_MODE_SMOOTH made
it for drivers impossible to have flatshaded color inputs.
Translate it to INTERP_MODE_NONE which drivers interpret as
smooth or flat depending on flatshading state.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111467
Fixes: 770faf54 ("tgsi_to_nir: Improve interpolation modes.")
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 175c32e9bd)
This is unsupported by meson and may become a hard error in the future.
Fixes: 5adfc8602c
("lima/ppir: move sin/cos input scaling into NIR")
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
(cherry picked from commit 52cf2d05a7)
This is a pipe format, not a boolean.
Fixes: 5849e0612c ("gallium/auxiliary: Add util_format_get_depth_only() helper.")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit c6d40b5182)
Don't use the UNMAPPED_UNIFORM_LOC (-1) to set the unsigned
max_uniform_location. Those unmapped uniforms don't have to be
accounted at this point.
Fixes: 7a9e5cdfbb ("nir/linker: Add gl_nir_link_uniforms()")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
(cherry picked from commit 4f33f96c45)
Since the system value refactor, we've accidentally only been setting
cbuf->buffer_size in the UBO case, and not in the uploaded-constants
case. We use cbuf->buffer_size to fill out the SURFACE_STATE entry,
so it needs to be initialized in both cases.
Fixes: 3b6d787e40 ("iris: move sysvals to their own constant buffer")
(cherry picked from commit 077a1952cc)
Loops like:
block block_0:
vec1 32 ssa_2 = load_const (0x00000020)
vec1 32 ssa_3 = load_const (0x00000001)
loop {
vec1 32 ssa_7 = phi block_0: ssa_3, block_4: ssa_9
vec1 1 ssa_8 = ige ssa_2, ssa_7
if ssa_8 {
break
} else {
}
vec1 32 ssa_9 = iadd ssa_7, ssa_1
}
Were treated as having more than 1 iteration and after unrolling
produced wrong results, however such loop will exit during
the first iteration if not unrolled.
So we check if loop will actually loop.
Fixes tests/shaders/glsl-fs-loop-while-false-02.shader_test
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit e71fc7f238)
NIR shaders use GLSL types (note: these live outside libglsl), and
nine needs to properly initialize these just like the other state
trackers. This fixes an assertion failure when TTN is used.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
This reverts commit 74470baebb.
This change introduces some significant performance regressions. We
are fixing those on master, but the follow up work is large enough
not to backport to 19.2 .
Fixes: 74470baebb "ac/nir: Lower large indirect variables to scratch"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111576
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
The dead_cf pass calls into the CF manipulation helpers which attempt to
keep NIR's SSA form sane. However, when the only break is removed from
a loop, dominance gets messed up anyway because the CF SSA clean-up code
only looks at phis and doesn't consider the case of code becoming
unreachable. One solution to this would be to put the loop into LCSSA
form before we modify any of its contents. Another (and the approach
taken by this pass) is to just run the repair_ssa pass afterwards
because the CF manipulation helpers are smart enough to keep all the
use/def stuff sane; they just don't always preserve dominance
properties.
While we're here, we clean up some bogus indentation.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111405
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111069
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit c832820ce9)
NIR currently assumes that unreachable blocks are trivially dominated by
everything. However, when considering well-formed SSA, there is no path
from any block to an unreachable block. Therefore, we can break any
use-def chains where the use is in an unreachable block. This removes
any dependencies on code created by uses in unreachable blocks and lets
DCE do a better job of cleaning it up.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit a3268599f3)
We already bail and don't split the vars but we were passing a NULL to
_mesa_hash_table_search which is not allowed.
Fixes: f1cb3348f1 "nir/split_vars: Properly bail in the presence of ..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 37cdb7fc44)