Weird that only RENOIR fails given that ASTC/ETC2 aren't natively
supported either.
This needs to be investigated, but SDMA seems to support these formats
to some extent.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39230>
Move tfu_supports_tex_format() and
get_internal_type_bpp_for_output_format() from v3dvx_private.h
to v3dvx_format_table.h.
Move v3dv_format_plane and v3dv_format struct from v3dv_private.h
to v3dv_format_table.h.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38732>
If the main CS is SDMA and the gang CS is ACE, this would emit a
SDMA_FENCE packet on ACE which just hangs.
Fixes: b1938901d0 ("radv: Use SDMA fence packet when flushing gang semaphores")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39211>
Add a custom build target to generate and install the compressed
adreno_pm4.xml file, which is needed by rnn, itself used by several tools.
This used to be generated in the generic loop over XML files, but was
left out after adreno_pm4.xml handling was special-cased.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: 950f07748a ("meson: Use adreno-pm4-pack.xml.h instead of custom definitions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39213>
This should be exact, even for all special values:
fsqrt(NaN) -> NaN
fsqrt(-0.0) -> 0.0
fsqrt(-Inf) -> NaN
fsqrt(negative finite) -> NaN
So all of these get saturated to +0.0
All numbers >= 1.0 will have a square root >= 1.0,
which will be saturated to 1.0
Moving the fsat guarantees that it can use an output modifier
for hardware that has those, and shouldn't harm other hardware either.
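As a quick sanity check of the reasoning above, the following Python sketch
compares both orderings on the listed special values. The helpers `fsat` and
`fsqrt` are stand-ins written for this illustration, assuming fsat maps NaN
to 0.0; this is not the NIR implementation, and the comparison ignores the
sign of zero.

```python
import math

def fsat(x):
    # Assumed model of fsat: clamp to [0.0, 1.0], with NaN mapped to 0.0.
    if math.isnan(x):
        return 0.0
    return min(max(x, 0.0), 1.0)

def fsqrt(x):
    # Model of fsqrt: NaN and negative inputs (other than -0.0) give NaN.
    if math.isnan(x) or x < 0.0:
        return math.nan
    return math.sqrt(x)

# Special values from the text plus a few ordinary ones.
CASES = [math.nan, -0.0, -math.inf, -2.0, 0.0, 0.25, 0.5, 1.0, 4.0, math.inf]

def orderings_agree(x):
    # == treats -0.0 and +0.0 as equal, so the sign of zero is ignored here.
    return fsat(fsqrt(x)) == fsqrt(fsat(x))

print(all(orderings_agree(x) for x in CASES))
```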
Foz-DB Navi21:
Totals from 255 (0.31% of 82151) affected shaders:
Instrs: 664906 -> 664194 (-0.11%)
CodeSize: 3623500 -> 3619188 (-0.12%)
Latency: 11336397 -> 11335688 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 2716430 -> 2715726 (-0.03%); split: -0.03%, +0.00%
VALU: 442603 -> 441891 (-0.16%)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39202>
The format parameters should come from the buffer itself,
not be taken from the process_properties,
because the buffer used for geometric scaling does not
originate from an externally provided buffer.
Signed-off-by: Peyton Lee <peytolee@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38948>
The current code walks the instructions, and when needed,
it will scan to find the next "end of scope" and sometimes
the next "end of block". It also has a separate patching
logic for HALTs.
The new code collects the necessary scope information up front,
then walks the instructions backwards, avoiding the need
to scan for the end of scope. It also walks only the
relevant instructions that were previously collected, and
replaces the previous HALT-specific patching logic.
With this change, many cases that used to jump to
intermediate HALTs now jump straight to the end of
scope (or the "end of the program" section). E.g. in
```
if
...
(...) HALT
...
(...) HALT
endif
```
both HALTs will now jump to the end of the scope, instead of the
first HALT jumping to the second one.
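The backwards walk can be illustrated with a toy model in Python. This is
only a sketch under simplifying assumptions: instructions are plain strings,
only if/endif scopes exist, and `halt_targets` is a hypothetical helper, not
the code touched by this change.

```python
def halt_targets(instrs):
    """Map each HALT index to the index it should jump to (toy model)."""
    end_of_program = len(instrs)
    scope_ends = []  # stack of indices of the enclosing scope ends
    targets = {}
    # Walking backwards means every scope end is seen before the HALTs
    # it encloses, so no forward scan is needed.
    for i in range(len(instrs) - 1, -1, -1):
        op = instrs[i]
        if op == "endif":
            scope_ends.append(i)
        elif op == "if":
            scope_ends.pop()
        elif op == "halt":
            targets[i] = scope_ends[-1] if scope_ends else end_of_program
    return targets

program = ["if", "mov", "halt", "mov", "halt", "endif", "mov"]
targets = halt_targets(program)
print(targets)
```

In this model both HALTs (indices 2 and 4) map to the endif at index 5, and
a HALT outside any scope would map to the end of the program.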
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38914>
Missing f in other cases seems to be caught either elsewhere in the
script or by the C compiler.
Fixes: c49d6e0480 ("nir/algebraic: Elide range clamping of f2u sources")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39031>
Without this, nir_algebraic.py was treating "f2i{int_sz}_sat" as the
literal opcode name when it should have been "f2i8_sat" or similar.
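The difference comes down to the missing f prefix on the string. A minimal
Python illustration, where the variable names are chosen for this example
only:

```python
int_sz = 8

# Without the f prefix the braces are kept literally.
without_prefix = "f2i{int_sz}_sat"

# With the f prefix the size is substituted into the opcode name.
with_prefix = f"f2i{int_sz}_sat"

print(without_prefix)
print(with_prefix)
```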
Fixes: c49d6e0480 ("nir/algebraic: Elide range clamping of f2u sources")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39031>
The users of exportable might have different expectations for what can
be exported, and some are stricter. So we need a new exportable_dmabuf
flag to track where dmabuf is actually needed.
If the underlying driver does not advertise the dmabuf extension,
requesting dmabuf export violates the spec VU:
> VUID-VkMemoryGetFdInfoKHR-handleType-00671
>
> handleType must have been included in
> VkExportMemoryAllocateInfo::handleTypes when memory was created
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38439>
This converts from 1D workgroups to 2D ray launch IDs entirely via
shader ALU, including handling partial/cut-off workgroups optimally.
Doing this entirely in-shader means it Just Works(TM) with indirect
dispatches as well. Previous approaches manipulating various things on
CPU depending on the dispatch size couldn't handle indirect dispatches.
The swizzle implemented here also swizzles with a recursive Z-order
pattern, which should be a little more optimal than arranging
invocations linearly within the wave.
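For illustration, a recursive Z-order (Morton) mapping from a linear index to
2D coordinates can be sketched in Python by de-interleaving the index bits.
The helper names are hypothetical, and this sketch leaves out the launch size
and the partial-workgroup handling that the actual shader code performs.

```python
def compact_bits(v):
    """Keep the even-indexed bits of v and pack them together."""
    out = 0
    bit = 0
    while v:
        out |= (v & 1) << bit
        v >>= 2
        bit += 1
    return out

def morton_decode(index):
    """Split a linear index into (x, y) by de-interleaving its bits."""
    return compact_bits(index), compact_bits(index >> 1)

coords = [morton_decode(i) for i in range(8)]
print(coords)
```

Consecutive indices stay within small 2x2 and 4x4 blocks instead of running
along a single row.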
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39142>
Most instructions follow the basic formats (1, 2 and 3 src), so
consolidate their emission code in the generator.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38878>
Move validation, noting that LRP only supports BRW_TYPE_F -- the
previous assert had DF because it also was used by MAD in the past.
With that change, ALU3F can be replaced by ALU3 for LRP.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38878>
When repctrl is used, the swizzle/chansel is ignored. Instead of setting
a swizzle that has all zeros and encoding that, don't encode anything.
For context see e7598c5a62 ("intel/compiler: Set swizzle to BRW_SWIZZLE_XXXX
for scalar region").
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38878>
Otherwise, the following crash is observed on the host:
"Unhandled Vulkan structure type Unhandled VkStructureType [1000010002], aborting"
which corresponds to PHYSICAL_DEVICE_PRESENTATION_PROPERTIES_ANDROID.
We shouldn't be sending those structs down to the host. Don't
post-process vkGetPhysicalDeviceProperties2, pre-process it to
filter the guest-only structs.
Reviewed-by: David Gilhooley <djgilhooley.gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39205>
This helps Meson track when dependencies are modified. If they
are modified, running ninja -C now actually regenerates the code.
Previously this was not the case, contrary to user
expectations.
Reviewed-by: David Gilhooley <djgilhooley.gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39205>
Drop the assignment entirely (and fall back to the default of 1024).
Fixes GL_OUT_OF_MEMORY errors when calling e.g., glTexStorage2D.
Fixes: 24ba57259f ("mesa: remove MaxTextureMbytes, use the cap instead")
Signed-off-by: Alyssa Milburn <amilburn@zall.org>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39143>
The Vulkan feature fillModeNonSolid is used to implement the OpenGL API
glPolygonMode(), which does not exist in OpenGL ES, and hardware
support for it is missing in many mobile GPUs.
The use of this Vulkan feature is only triggered when glPolygonMode() is
actually called, and among current gallium drivers at least lima and
panfrost do not properly handle polygon modes either.
Only warn about this feature being missing when it's actually needed,
instead of warning at screen initialization time. This prevents the
warning from being raised when running OpenGL ES on Zink.
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38897>
Set the per-pixel mask based on the value of skip_helpers.
This slightly increases the performance on several traces.
fps_avg helped: gl_gfxbench_trex.trace: 22.30 -> 22.79 (2.20%)
total fps_avg in all runs: 55.18 -> 55.71 (0.97%)
total fps_avg in affected (through threshold) runs: 22.30 -> 22.79 (2.20%)
helped: 1
HURT: 0
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38759>
Just changing the intrinsic for load_push_constant is wrong, as nothing
guarantees they will have the same indices in the future.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38759>
It will be used with image loads to enable or disable helper invocations.
This fixes a Vulkan CTS test that performs an imageLoad() inside a
fwidth() operation.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38759>
When a time delta is a float, formatting the minutes and seconds can
produce weird output between 0.5 and 0.9, with strings like 1m60s.
Forcing a cast to an integer fixes the bug.
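A minimal Python sketch of how such a string can arise, assuming the seconds
are rounded only at formatting time. Both function names are hypothetical and
do not correspond to the actual script code.

```python
def pretty_duration_float(seconds):
    # divmod() on a float keeps the fractional part in the remainder,
    # and ".0f" then rounds 59.7 up to "60".
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:.0f}m{secs:.0f}s"

def pretty_duration_int(seconds):
    # Truncating to an integer first keeps the remainder in 0..59.
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m{secs}s"

print(pretty_duration_float(119.7))
print(pretty_duration_int(119.7))
```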
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39134>
Commit 411110f7, part of !39105, added an argument to define the polling
period used to monitor a pipeline and check whether there are jobs to be
enabled. Another commit in that MR, 8cf2c50e, includes changes to improve the
experience when using this tool within a GitLab job. However, the pretty_wait
method, meant to show a heartbeat to the user, disturbs the job traces
because '\r' is useless in a non-terminal console.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39134>
Move the enabled storage_8bit property toggle into the base a7xx GPUProps
class. This enables the storageBuffer8BitAccess Vulkan feature on all a7xx
hardware, much like the proprietary driver does. It's also a required
feature in Vulkan 1.4.
Fixes: dEQP-VK.info.device_mandatory_features on pre-a750 a7xx hardware.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39124>
The header should be 0 for older sdma as well. This fixes
DRI_PRIME support for radeonsi.
Fixes: f5ecc5ffd5 ("ac,radv,radeonsi: add ac_emit_sdma_copy_tiled_sub_window()")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39019>
We already optimize the case where the destination format does not
contain alpha. However, there are a few more cases around formats and
blend constants which we can optimize. In particular, float blending
doesn't support constants so we really want to check if the client hands
us a 0/1 constant.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39171>