otherwise &container_of(..)->foo won't work, need extra parens. gcc version is
fine.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38313>
This adds from_wsi field to v3dv_image, so we can apply simulator stride
alignment only to WSI images.
Handling VK_STRUCTURE_TYPE_WSI_IMAGE_CREATE_INFO_MESA at
v3dv_GetPhysicalDeviceFormatProperties2 also removes debug warnings like:
MESA: debug: v3dv_GetPhysicalDeviceImageFormatProperties2: ignored VkStructureType Unknown VkStructureType value.(1000001002)
Fixes: 562bb8b62b ("v3dv: align width to 256 when using simulator")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38374>
Even though "shader" uniquely means something here and we don't need to
specify what stage, "cs" is still shorter, obvious, and matches what we
do all over the 3D code so it saves some cognitive load when bouncing
back and forth.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38403>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38403>
Previously the type info for nested values was copied from the source
operand, rather than propagating the new type from the destination
operand.
Fixes: 4c363acf94 ("vtn: Allow for OpCopyLogical with different but compatible types")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38248>
VGT_OUTPRIM_TYPE should be programmed correctly when PointMode is only
set in TCS with ESO.
Fixes dEQP-VK.shader_object.tessellation.hlsl.point_mode.
Fixes: c6d9b9b4e0 ("radv: support more tessellation parameters with TCS for ESO unlinked shaders"
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38376>
Non-fast-clear path to clear LRZ is rather slow, plus without LRZ
fast-clear we cannot enable concurrent binning.
VK_EXT_fragment_density_map_offset makes us add the maximum possible
tile size to the depth image size, because we don't know the tile
size that will be selected later on for a framebuffer. In practice,
this caused LRZ fast-clear to be disabled in many cases due to
tile_max_w/tile_max_h being rather large.
Now, instead of the most pessimitic case we can do the following:
- Calculate the biggest possible tile size that could be added to
the depth image that won't disable LRZ fast-clear.
- When calculating tiling config use the info about maximum tile
size from the images, or in case of image-less framebuffer
recalculate maximum possible tile size and it.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38218>
this code is invalid after the refcounting rework
Fixes: b3133e250e - gallium: add pipe_context::resource_release to eliminate buffer refcounting
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38329>
This is what happens when you leave MR unreviewed for months.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d39e443ef8 ("anv: add infrastructure for common vk_pipeline")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38400>
These should be fixed now.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38364>
For GS streamout, we need the following LDS scratch space:
- Repacking streamout vertices takes 1 dword per 4 waves per stream
(max 16 bytes for Wave64, max 32 bytes for Wave32)
- 1 dword per stream for buffer info
(16 bytes)
- 1 dword per buffer for buffer info
(16 bytes)
Previously, the space used for buffer info aliased with the
space for repacking the output vertices in ngg_gs_finale(),
and there was no barrier in between, which caused a race
condition, resulting in random failure.
Fix this by allocating a few more LDS dwords so that aliasing
is not required, which also allows us to remove an extra
workgroup barrier.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12705
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38364>
Blob generates this with the glmark2:texture benchmark on STM32MP257.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38363>
LAVA jobs already have a global 1h timeout in GitLab. This exists because
GitLab jobs must start before we can determine whether a device is
available for testing.
Jobs themselves do not normally run that long, most of the delay comes
from waiting in the LAVA queue.
Dropping these overrides for pre-merge jobs fixes cases where the LAVA
job isn't picked up in time.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38395>
This is used by Steam Link VR (driver_vrlink) to avoid doing YUV conversion itself.
Signed-off-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37500>
1. The prolog needs to have a null check. Libraries don't have prologs.
2. We only need to print the shaders actually included in this pipeline.
Libraries were already printed separately.
3. The traversal shader was wrongly omitted from the output.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38355>
Add the string length parameter to the set_name(),
set_value() function to remove the conversion from
char* to std::string which takes extra work like
calling strlen() to compute the string length.
From the callback sampling in the perfetto tracing,
the ratio of trace_payload_as_extra_intel_end_draw_indexed
to intel_ds_end_draw_indexed drops from 63.80% to 59.65%
with this change.
v2: Add the data of the callback sampling to the description.
Signed-off-by: Andy Hsu <hwandy@google.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38073>
Adds build instructions and workarounds documentation.
Workarounds documentation only has the biggest offenders and
there are probably way more in code that need yet to be
documented.
Reviewed-by: Arcady Goldmints-Orlov <arcady@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38232>
This is more robust than smashing the variable to mediump and then
asking for mediump to be lowered later. It's also faster because it
only involves one compiler pass, not two.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38379>
On Mali, we need not only clamp but also convert to float16 on Valhall+.
We could have a separate pass for this but it fits in nicely with the
rest of nir_lower_point_size() so we might as well put it there.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38379>
Per ARB_vertex_program spec result registers are 4-component and initially
undefined, and the FF fragment program expects its intputs to be
4-component too. So, if the client's vertex program does not write the
whole vector it will cause misrenderings unless the same client also
supplies fragment program that expects less than 4 componens.
This commit adds a workaround that initializes results to vec4(0, 0, 0, 1)
which seems to be an expected behavior for such clients.
Cc: mesa-stable
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38295>
NIR is actually pretty good at optimizing UBO, SSBO, and shared memory
access but in order to do so, we actually have to run the optimizations
before we lower it all. Same for I/O. By doing all our lowering in
panvk before we ever run the optimization loop, we risk hampering it
significantly.
Ignoring loop changes (several get unrolled now), fossil-db on Sascha
Willems demos and a few others looks lik
Instrs: 189054 -> 187802 (-0.66%); split: -0.67%, +0.01%
CodeSize: 1756160 -> 1747072 (-0.52%); split: -0.52%, +0.01%
Estimated normalized CVT cycles: 771.367106999997 -> 766.0311719999971 (-0.69%); split: -1.05%, +0.36%
Estimated normalized SFU cycles: 1407.21875 -> 1406.9375 (-0.02%); split: -0.03%, +0.01%
Estimated normalized Load/Store cycles: 17477.0 -> 16917.0 (-3.20%)
Maximum number of threads: 1257 -> 1213 (-3.50%); split: +0.08%, -3.58%
Number of hardware loops: 283 -> 278 (-1.77%)
Totals from 186 (19.81% of 939) affected shaders:
Instrs: 102588 -> 101336 (-1.22%); split: -1.23%, +0.01%
CodeSize: 834432 -> 825344 (-1.09%); split: -1.10%, +0.02%
Estimated normalized CVT cycles: 463.226562 -> 457.890627 (-1.15%); split: -1.74%, +0.59%
Estimated normalized SFU cycles: 1021.84375 -> 1021.5625 (-0.03%); split: -0.05%, +0.02%
Estimated normalized Load/Store cycles: 8425.0 -> 7865.0 (-6.65%)
Maximum number of threads: 334 -> 290 (-13.17%); split: +0.30%, -13.47%
Number of hardware loops: 63 -> 58 (-7.94%)
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>
We need to lower outputs to get rid of output reads and so that we can
fix up layer writes on Bifrost. However, there's really no point in
lowering reads besides moving them to the top. Even then, NIR can
probably copy propagate the copies and we'll end up reading straight
from the input variable anyway.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>
Neither nir_lower_io() nor nir_lower_indirect_derefs() know what to do
with copy_deref so we need to get rid of those first. Also, there are
some NIR passes which can insert more copy_deref or propagate an
indirect load to the I/O variable so we want to lower those away right
before lowering I/O.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>
These two passes are a prerequisite for basically anything that
optimizes on variables.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayern@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38334>