This converts from 1D workgroups to 2D ray launch IDs entirely via
shader ALU, including handling partial/cut-off workgroups optimally.
Doing this entirely in-shader means it Just Works(TM) with indirect
dispatches as well. Previous approaches manipulating various things on
CPU depending on the dispatch size couldn't handle indirect dispatches.
The swizzle implemented here also uses a recursive Z-order pattern,
which should be slightly better than arranging invocations linearly
within the wave.
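As an illustration of the Z-order idea only (this is not the shader code
from this change), the sketch below decodes a linear invocation index
into 2D coordinates by de-interleaving its bits. The function names are
hypothetical, and the handling of partial workgroups is not modelled.

```python
from typing import Tuple


def compact_even_bits(v: int) -> int:
    """Gather the even-indexed bits of a 32-bit value into the low 16 bits."""
    v &= 0x55555555
    v = (v | (v >> 1)) & 0x33333333
    v = (v | (v >> 2)) & 0x0F0F0F0F
    v = (v | (v >> 4)) & 0x00FF00FF
    v = (v | (v >> 8)) & 0x0000FFFF
    return v


def morton_decode(index: int) -> Tuple[int, int]:
    """Map a linear index to (x, y) along a recursive Z-order curve."""
    x = compact_even_bits(index)        # bits 0, 2, 4, ... form x
    y = compact_even_bits(index >> 1)   # bits 1, 3, 5, ... form y
    return x, y


if __name__ == "__main__":
    # The first four indices cover one 2x2 block before moving to the next.
    for i in range(8):
        print(i, morton_decode(i))
```

Consecutive indices stay within small square blocks, which is the
locality property a linear arrangement lacks.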
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39142>
Most instructions follow the basic formats (1, 2 and 3 src), so
consolidate their emission code in the generator.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38878>
Move validation, noting that LRP only supports BRW_TYPE_F -- the
previous assert had DF because it also was used by MAD in the past.
With that change, ALU3F can be replaced by ALU3 for LRP.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38878>
When repctrl is used, the swizzle/chansel is ignored. Instead of setting
a swizzle that has all zeros and encoding that, don't encode anything.
For context see e7598c5a62 ("intel/compiler: Set swizzle to BRW_SWIZZLE_XXXX
for scalar region").
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38878>
Otherwise, the following crash is observed on the host:
"Unhandled Vulkan structure type Unhandled VkStructureType [1000010002], aborting"
which corresponds to PHYSICAL_DEVICE_PRESENTATION_PROPERTIES_ANDROID.
We shouldn't be sending those structs down to the host. Instead of
post-processing vkGetPhysicalDeviceProperties2, pre-process it to
filter out the guest-only structs.
Reviewed-by: David Gilhooley <djgilhooley.gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39205>
This helps Meson track when dependencies are modified. If they
are modified, running ninja -C now actually regenerates the code.
Previously this was not the case, contrary to user expectations.
Reviewed-by: David Gilhooley <djgilhooley.gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39205>
Drop the assignment entirely (and fall back to the default of 1024).
Fixes GL_OUT_OF_MEMORY errors when calling e.g., glTexStorage2D.
Fixes: 24ba57259f ("mesa: remove MaxTextureMbytes, use the cap instead")
Signed-off-by: Alyssa Milburn <amilburn@zall.org>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39143>
The Vulkan feature fillModeNonSolid is used to implement the OpenGL API
glPolygonMode(), which does not exist in OpenGL ES, and hardware
support for it is missing on many mobile GPUs.
The use of this Vulkan feature is only triggered when glPolygonMode() is
really called, and among current gallium drivers at least lima and
panfrost do not properly handle polygon modes either.
Only warn about this feature being missing when it's really needed,
instead of warning at screen initialization time. This will prevent the
warning from being raised when running OpenGL ES on Zink.
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38897>
Set the per-pixel mask based on the value of skip_helpers.
This slightly increases performance on several traces.
fps_avg helped: gl_gfxbench_trex.trace: 22.30 -> 22.79 (2.20%)
total fps_avg in all runs: 55.18 -> 55.71 (0.97%)
total fps_avg in affected (through threshold) runs: 22.30 -> 22.79 (2.20%)
helped: 1
HURT: 0
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38759>
Just changing the intrinsic for load_push_constant is wrong, as nothing
guarantees they will have the same indices in the future.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38759>
It will be used with image loads to enable or disable helper invocations.
This fixes a Vulkan CTS test that performs an imageLoad() inside a
fwidth() operation.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38759>
When a time delta is a float, formatting the minutes and seconds can
produce odd output such as 1m60s for fractional parts between 0.5 and
0.9. Casting the value to an integer fixes the bug.
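A minimal sketch of how such output can arise, assuming the duration is
split with divmod() and printed with zero decimal places. Both helper
names are hypothetical and are not the tool's actual functions.

```python
def format_duration_float(seconds: float) -> str:
    """Format without casting: the seconds field can round up to 60."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:.0f}m{secs:.0f}s"


def format_duration_int(seconds: float) -> str:
    """Cast to an integer first, so the seconds field stays within 0..59."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m{secs}s"


if __name__ == "__main__":
    delta = 119.7  # 1 minute and 59.7 seconds
    print(format_duration_float(delta))
    print(format_duration_int(delta))
```

With the float version, 59.7 seconds is rounded to 60 at formatting time
while the minutes field has already been fixed at 1.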
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39134>
Commit 411110f7, part of !39105, added an argument to define the polling
period used to monitor a pipeline and check whether there are jobs to be
enabled. Commit 8cf2c50e, part of the same MR, also includes changes to
improve the experience when using this tool within a GitLab job. But the
pretty_wait method, meant to show a heartbeat to the user, disturbs the
job traces because '\r' is useless in a non-terminal console.
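One possible way to handle this, shown only as a sketch: use the
carriage return when the output stream is a terminal, and fall back to
plain lines otherwise. The heartbeat() helper is hypothetical and this
commit message does not state that the fix works this way.

```python
import io
import sys


def heartbeat(message: str, stream=sys.stdout) -> None:
    """Print a progress message suited to the kind of output stream."""
    if stream.isatty():
        # Interactive terminal: rewrite the same line in place.
        stream.write(f"\r{message}")
    else:
        # Log file or CI job trace: '\r' has no effect, so emit whole lines.
        stream.write(f"{message}\n")
    stream.flush()


if __name__ == "__main__":
    captured = io.StringIO()  # Stands in for a non-terminal job trace.
    heartbeat("waiting for pipeline", captured)
    print(repr(captured.getvalue()))
```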
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39134>
Move the enabled storage_8bit property toggle into the base a7xx GPUProps
class. This enables storageBuffer8BitAccess Vulkan feature on all a7xx
hardware, much like the proprietary driver does. It's also a required
feature with Vulkan 1.4.
Fixes: dEQP-VK.info.device_mandatory_features on pre-a750 a7xx hardware.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39124>
The header should be 0 for older sdma as well. This fixes
DRI_PRIME support for radeonsi.
Fixes: f5ecc5ffd5 ("ac,radv,radeonsi: add ac_emit_sdma_copy_tiled_sub_window()")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39019>
We already optimize the case where the destination format does not
contain alpha. However, there are a few more cases around formats and
blend constants which we can optimize. In particular, float blending
doesn't support constants so we really want to check if the client hands
us a 0/1 constant.
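The 0/1 constant check can be sketched as follows. This is an
illustration under assumed names: the factor strings and
fold_constant_factor() are hypothetical and do not correspond to the
driver's actual types or functions.

```python
from typing import Sequence


def fold_constant_factor(factor: str, constant: Sequence[float]) -> str:
    """Replace a constant blend factor with ZERO or ONE when possible."""
    if factor not in ("CONSTANT", "ONE_MINUS_CONSTANT"):
        return factor
    if all(c == 0.0 for c in constant):
        value = 0
    elif all(c == 1.0 for c in constant):
        value = 1
    else:
        return factor  # A genuine constant: it cannot be folded away.
    if factor == "ONE_MINUS_CONSTANT":
        value = 1 - value
    return "ONE" if value else "ZERO"


if __name__ == "__main__":
    print(fold_constant_factor("CONSTANT", (1.0, 1.0, 1.0, 1.0)))
    print(fold_constant_factor("ONE_MINUS_CONSTANT", (1.0, 1.0, 1.0, 1.0)))
    print(fold_constant_factor("CONSTANT", (0.5, 0.5, 0.5, 0.5)))
```

Once folded, the factor no longer depends on a blend constant, so a
blending path without constant support could still be used.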
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39171>
This actually enables blending for 4 of the supported float formats.
Technically, RGB16F blending is possible as well, using RGBA16F
internally but we only support FORMAT_R16G16B16_SFLOAT for vertex
buffers so there's really no point. This eliminates a lot of blend
shaders and improves the performance of the 3DMark Wild Life benchmark
by about 5 FPS (7-8%) on my MediaTek Chromebook.
Reviewed-by: Aksel Hjerpbakk <aksel.hjerpbakk@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39171>
Blend equations that work on float are treated a bit differently, hence
the new is_float on pan_blend_equation.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39171>
This is duplicated between the two drivers and about to get more
complicated.
Reviewed-by: Aksel Hjerpbakk <aksel.hjerpbakk@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39171>
Valhall adds float color target support in hardware, including hardware
blending. This commit just adds the XML and doesn't enable it in the C
code. Annoyingly, even though there are enough bits to do otherwise, the
hardware re-interprets the color (writeback) format field in the render
target descriptor based on the internal format. The easiest way to
handle this in the XML is to just have two different enums and fields in
the Render Target structs which alias. This seems to be the least
duplication while still encoding the necessary information.
Reviewed-by: Aksel Hjerpbakk <aksel.hjerpbakk@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39171>
Add 'force_robustness' to 'MESA_DEBUG_KK' to force robustness in all
shaders.
Reviewed-by: Arcady Goldmints-Orlov <arcady@lunarg.com>
Signed-off-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38881>
Replaying a dump file requires the VM state in order to feed the
replay tool with the necessary VMA properties that describe the hang.
However, these properties are not necessarily useful once the replay
tool re-runs said traces, so this patch makes this optional.
Signed-off-by: Carlos Santa <carlos.santa@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34829>
The tool now can seamlessly support GPU hang dump files from
either the i915 or the Xe drivers.
Signed-off-by: Carlos Santa <carlos.santa@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34829>
The changes as part of the Contexts state now include:
**** Contexts ****
[HWCTX].replay_offset: 0x0
[HWCTX].replay_length: 0xd000
and the changes as part of the VM state now include:
**** VM state ****
VM.uapi_flags: 0x1
[40000].length: 0x2000
[40000].properties: read_write|bo|mem_region=0x1|pat_index=2|cpu_caching=1
[40000].data: &-)\3!!E9mzzzzzzzzzz
In order to be able to replay a GPU hang from a devcore dump file
new properties have been added describing the offset and the length
of the affected hw context as well as a global VM flag and
several VMA property types: memory region, bo caching, pat index,
memory permission and memory type.
Signed-off-by: Carlos Santa <carlos.santa@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34829>
Before bringing in support for Xe, let's create a lib so that
the common code can live there.
Signed-off-by: Carlos Santa <carlos.santa@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34829>
Initial refactoring of the i915 code in preparation
for Xe. No functional changes.
Signed-off-by: Carlos Santa <carlos.santa@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34829>
SHA1_DIGEST_LENGTH was changed to reflect BLAKE3 being exposed through
the SHA1 functions, so switch to BUILD_ID_EXPECTED_HASH_LENGTH.
Fixes: 492a176cbb ("util: increase SHA1_DIGEST_LENGTH to 32 (BLAKE3_KEY_LEN)")
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39192>