Also fix the leading curly for the new function definitions.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19570>
should not change the output; avoids an additional printf()
for the separator.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19570>
Make the shader_info printing less verbose by skipping the fields that
are likely not used (being zero).
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19570>
This is a refactoring, it is not supposed to change the printed output.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19570>
On cmd_buffer_emit_scissor(), if VkViewport height or width are set to
a value lower than 1.0, y_max or x_max can be attributed negative values,
causing an overflow. That leads to ScissorRectangleYMax or
ScissorRectangleXMax to be set to values on an unsupported range.
Clamping x_max and y_max in the valid range solves the problem.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7471
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20200>
Android may use either DRM or some downstream solution, KGSL is a
downstream kernel driver for Adreno. Don't enable DRM when we want
Turnip to use KGSL instead of DRM.
Fixes: 09ac29cca9
("meson: Enable system_has_kms_drm for android")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20168>
We need to track those states in the cache and flush the cache
if the next glBitmap call uses different states.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19859>
This will be required with the next change, which will remove
the rasterizer state dependency on _NEW_PROGRAM.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19859>
_mesa_update_state() receives the _NEW_PROGRAM flag, so we can handle
any shader changes there.
There may be some overhead reduction because gfx_shaders_may_be_dirty
is removed.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19859>
In a tesselation control shader where an input array is accessed using
the index gl_InvocationID, we can end up accessing elements beyond the
number of input vertices specified in the shader key.
This happens because of the lowering in nir_lower_indirect_derefs().
This lowering will affect compact variables which happens in this
case :
in gl_PerVertex {
vec4 gl_Position;
float gl_ClipDistance[1];
} gl_in[gl_MaxPatchVertices];
The lowered code produced by NIR is somewhat ineffecient (implements a
binary seach) :
if (gl_InvocationID < 16) {
if (gl_InvocationID < 8) {
if (gl_InvocationID < 4) {
vec4 vals = load_at_offset(0);
value = bcsel(vals, gl_InvocationID);
} else {
vec4 vals = load_at_offset(4);
value = bcsel(vals, gl_InvocationID - 4);
}
} else {
if (gl_InvocationID < 12) {
vec4 vals = load_at_offset(8);
value = bcsel(vals, gl_InvocationID - 8);
} else {
vec4 vals = load_at_offset(12);
value = bcsel(vals, gl_InvocationID - 12);
}
}
} else {
if (gl_InvocationID < 24) {
...
} else {
...
}
}
By default the gl_MaxPatchVertices must be set at 32 items and that's
what the lowering code will use to divide the access into chunks of 4.
But when running with 3 input vertices, this means we'll pull one more
item than what was delivered in the shader payload.
This triggers issues further down the register scheduling where the
g5UD (register for the 4th item) is overwritten by a previous SEND,
leading the URB read to use an invalid handle.
This pass clamps any access load_per_vertex_input intrinsic vertex
indice to (input_vertices - 1).
Fixes issues with tests like :
dEQP-VK.clipping.user_defined.clip_distance.vert_tess.*
Also fixes a hang with zink/anv on :
KHR-GL46.draw_elements_base_vertex_tests.AEP_shader_stages
v2: Don't replace source register
v3: Implement in NIR
v4: Clamp per vertex array sizes in NIR (Jason)
v5: Move the clamping on the intel compiler
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9749>
If points or lines are drawn using the polygon mode, the guardband
should be adjusted for large points/lines.
Cc: 22.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20185>
While running VK-CTS with valgrind, the application hit the max
thread count of 500. After further investigation, this was due to
multiple instances being created with the disk cache spinning up
worker threads which wouldn't be cleaned as disk_cache_destroy
wasn't being called.
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20178>
We lost the race in a recent MR of mine.
Fixes: 381e0b43d6 ("mesa: Add test to prevent windows.h to be included in shared headers")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20170>
The `coord_offset` pass is responsible for upgrading any eligible
texture loads into prefetches, but a texture prefetch's capabilities
are limited and cannot handle any interpolation modes aside from
`smooth`.
An exception is carved out for `flat` interpolation modes, but this
doesn't exclude upgrading `noperspective` texture loads and results
in perspective-corrected samples being provided that can severely
break applications depending on this behaviour.
Fixes incorrect lighting projection on Super Mario Odyssey on
Skyline Emulator.
Fixes incorrect dirt texture mapping on Portal 2 trace on Turnip and
Zink on Turnip.
Fixes incorrect lighter shadowing on Half Life 2 trace on Turnip.
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19842>
`coord_offset` is called on the source of `alu` instructions and
it returns -1 for failures, this not explicitly checked for and
as a result the fetch can incorrectly be upgraded to a prefetch
when it isn't appropriate to do so.
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19842>
Otherwise, making a CS using the memory will use the uninitialized .map
value (when checking the size of the CS in in begin's tu_cs_is_empty()
check), causing valgrind noise in
dEQP-VK.binding_model.descriptorset_random.sets4.dynindexed.ubolimitlow.sbolimitlow.sampledimghigh.lowimgsingletex.iublimitlow.nouab.vert.noia.0
(thanks to vi_info->vertexBindingDescriptionCount==0).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20173>
This reduces the number of instructions in task shaders when payload
size is not aligned to vec4 and payload_in_shared WA is enabled,
because nir_lower_task_shader will not need to handle the unaligned
size case.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20080>