Fixes the following building error happening with clang:
FAILED: src/amd/vulkan/libvulkan_radeon.so.p/radv_pipeline_rt.c.o
...
../src/amd/vulkan/radv_pipeline_rt.c:1050:44: error: use of GNU empty initializer extension [-Werror,-Wgnu-empty-initializer]
struct radv_shader_stage prolog_stage = {};
^
1 error generated.
Fixes: afe51940 ("radv: Rewrite the RT prolog in NIR")
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40893>
In all other cases, we immediately set atomic cap after acquiring a drm
fd:
- wsi_init
- AcquireXlib
- AcquireDrm
wsi_GetDrmDisplayEXT is the only function left where the atomic cap is
set by wsi_display_get_connector. This commit moves the set atomic cap
into wsi_GetDrmDisplayEXT.
Signed-off-by: Yuxuan Shui <yshui@codeweavers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39166>
Usually connector property IDs are acquired in
wsi_display_get_connector, which is called by wsi_get_connectors, and in
turn by vkGetPhysicalDeviceDisplayProperties2KHR and
vkGetPhysicalDeviceDisplayPlanePropertiesKHR. Except if the drm fd is
not available when these functions are called. Which will be the case if
vkAcquireXlibDisplayEXT is not called first.
So it goes like this. First, the display is created in
vkGetRandROutputDisplayEXT. Then it's used in
vkGetPhysicalDeviceDisplayPlanePropertiesKHR, but since the drm fd is
not available at this point, connector property IDs are not initialized.
Later, this display is used in vkAcquireXlibDisplayEXT, which also
doesn't touch the property IDs. Finally in drm_atomic_commit, the
atomic commit fails with EINVAL, specifically because of the
uninitialized ID of the "CRTC_ID" property. Since it's one of the
properties drm_atomic_commit tries to set.
This commit makes sure that find_connector_properties is called in
vkAcquireXlibDisplayEXT to initialize the property IDs.
Fixes: 513ffea1d3 ("wsi/display: use atomic mode setting")
Signed-off-by: Yuxuan Shui <yshui@codeweavers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39166>
Without this, printf messages that were sent before a crash will be
lost, which makes shader printf pretty useless for debugging crashes.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40816>
Calling this varaible `i` made it very easy for it to shadow a loop
variable in the enclosing scope, which became an issue if `src` were an
expression referencing a different variable `i`. Rename the variable to
make shadowing less likely.
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40639>
The more general address space we used to have cannot be
implemented on top of ROOT_TABLE because of ROOT_TABLE's bank pattern.
Instead, adjust the address space so it provides a less general index
into dynamic_buffers.
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40639>
Jay is a new SSA-based compiler for Intel GPUs. This is an early
work-in-progress. It isn't ready to ship, but we'd like to move development in
tree rather than rebasing the world every week. Please don't bother testing yet
- we know the status and we're working on it!
Jay's design is similar to other modern NIR backends, particularly ACO, NAK and
AGX. It is fully SSA, deconstructing phis after RA. We use a Colombet register
allocator similar to NAK, allowing us to handle Intel's complex register
regioning restrictions in a straightforward way. Spilling logical registers is
straightforward with Braun-Hack.
Thanks to the SSA-based design, the entire backend is essentially linear time,
regardless of register pressure, addressing brw's excessive compile time when
especially spilling with brw.
In this current early draft, we support a limited subset of all three APIs on
Xe2. A lot works and a lot doesn't. The core compiler is there (spilling,
scoreboarding, SIMD32, etc should more or less work), but there are details to
fill in for both performance and correctness. We essentially pass conformance on
OpenGL ES 3.0 and OpenCL 3.0, and we're busy iterating on Vulkan.
Likewise, additional hardware support will come down the line. There's nothing
fundamentally Xe2-specific here. I just have a Lunarlake laptop on my desk, Ken
has a Battlemage card, and we had to pick _something_ as the first target.
Co-authored-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
In the future this might even do something clever.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
While Jay supports subgroups, efficient reductions are TODO so it's probably
better not to run this pass yet.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
Jay will need more work to handle these payloads properly especially in SIMD32.
For now just disable the optimization for Jay for correctness.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
This is a 2x16 bitpacked version of load_pixel_coord which maps directly to the
hardware value and is much easier for Jay to consume due to the sadness that is
true 16-bit on Intel. Jay will lower to this internally.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
Jay will use this to lower & optimize subgroup shuffles. This is closer to
how Intel hardware works but still much higher level than the hardware
primitive. This gets us NIR optimizations on the multiply however.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
This exposes the underlying render target write message directly, which Jay will
use to lower RT writes in NIR. I'm still on the fence about what exactly this
should look like but this is good enough for GLES3.0 (so, multiple render
targets but not necessarily dual source blending).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
This maps directly to what Intel's thread payload gives us, allowing us to
optimize out frcp's in some cases. Jay will use this.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
These lowered versions map to what Jay can deal with. The hardware is more
flexible but we're not due to data model restrictions. We choose to lower to get
us off the ground, we can revisit later.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40835>
The immediate fs_clear_color shader uses IMM[0] but still declares
CONST[0][0]. That can make drivers try to read a fragment constant
buffer even though one is never uploaded on this path. Only declare
CONST[0][0] when the shader actually uses a constant buffer.
Fixes: 2ff9fa8b72 ("gallium/u_blitter: add a new fs_color_clear variant")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40760>
On big-endian hosts, r300 handles A8R8G8B8 and X8R8G8B8 by using
DWORD swap and programming component order as the matching B8G8R8A8 or
B8G8R8X8 formats. Reuse the same mapping when packing CBZB clear colors.
Fixes the bad lower-screen colors in Extreme TuxRacer on RV350.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40885>
The ordered atomic commits the post-add offset to memory, but overflow was computed using the pre-add offset, causing partial overflows to be missed and counters to become corrupted.
Fixes: "KHR-GL46.transform_feedback_overflow_query_ARB.multiple-streams-one-buffer-per-stream" based on the postwrite buffer offset, rather than the offset before the current workgroups writes.
Reviewed-by: Marek Olsak <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40745>
Some video decoders spit out AFBC(16x16,sparse,split) images. Advertise
support for this modifier so we can import such images.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40746>
Remove unused includes or heavy includes (e.g. `tu_common.h`) when
we could have done with lighter ones.
iwyu was used to find these cases.
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40853>