Vulkan shaderdb stats with pattern dEQP-VK.image.*.with_format.*.*:
total instructions in shared programs: 35993 -> 33245 (-7.63%)
instructions in affected programs: 21153 -> 18405 (-12.99%)
helped: 394
HURT: 1
Instructions are helped.
total uniforms in shared programs: 8550 -> 7418 (-13.24%)
uniforms in affected programs: 5136 -> 4004 (-22.04%)
helped: 399
HURT: 0
Uniforms are helped.
total max-temps in shared programs: 6014 -> 5905 (-1.81%)
max-temps in affected programs: 473 -> 364 (-23.04%)
helped: 58
HURT: 0
Max-temps are helped.
total nops in shared programs: 1515 -> 1504 (-0.73%)
nops in affected programs: 46 -> 35 (-23.91%)
helped: 14
HURT: 2
Inconclusive result (%-change mean confidence interval includes 0).
FWIW, that one HURT on the instructions count is for just one
instruction.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25726>
Since v71, broadcom hw includes specific packing/conversion
instructions, so this commit adds opcodes to be able to make use of
them, especially for image stores:
* pack_2x16_to_unorm_2x8 (on backend vftounorm8/vftosnorm8):
2x16-bit floating point to 2x8-bit unorm/snorm
* f2unorm_16/f2snorm_16 (on backend ftounorm16/ftosnorm16):
floating point to 16-bit unorm/snorm
* pack_2x16_to_unorm_2x10/pack_2x16_to_unorm_10_2 (on backend
vftounorm10lo/vftounorm10hi): used to convert a floating point to
a r10g10b10a2 unorm
* pack_32_to_r11g11b10 (on backend v11fpack): packs two 2x16 FP values
into R11G11B10.
* pack_uint_32_to_r10g10b10a2 (on backend v10pack): packs two 2x16
integers into R10G10B10A2.
* pack_4x16_to_4x8 (on backend v8pack): packs two 2x16-bit integers
into 4x8 bits.
* pack_2x32_to_2x16 (on backend vpack): packs two 32-bit integers into
2x16 bits.
The latter can easily be confused with the existing
pack_32_2x16_split. Note, however, that pack_32_2x16_split receives two
16-bit integers and packs them into a 32-bit integer, whereas the
broadcom opcode takes two 32-bit integers, takes the lower halfword of
each, and packs them as 2x16 into a 32-bit integer.
Interestingly broadcom also defines a similar one that packs the
higher halfword. Not used yet.
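The difference between the two packing flavours described above can be sketched in plain C. This is only an illustration of the semantics, not the NIR or backend implementation; the function names are hypothetical, and placing the first source in the low half of the result is an assumption.

```c
#include <stdint.h>

/* Models the existing opcode: two 16-bit sources combined into one
 * 32-bit value, first source in the low half. */
static inline uint32_t demo_pack_32_2x16_split(uint16_t lo, uint16_t hi)
{
   return (uint32_t)lo | ((uint32_t)hi << 16);
}

/* Models the broadcom-style opcode: two 32-bit sources, of which only
 * the lower halfword is kept. The upper halfword of each source is
 * discarded. Source ordering is an assumption. */
static inline uint32_t demo_pack_2x32_to_2x16(uint32_t a, uint32_t b)
{
   return (a & 0xffffu) | ((b & 0xffffu) << 16);
}
```

With 16-bit inputs both produce the same bits; they only differ in the source type, which is why the two are easy to mix up.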
Note that at this point we use agnostic names, even if we add a _v3d
suffix as they are only available for broadcom, in order to follow
current NIR conventions.
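For reference, the kind of conversion that f2unorm_16 stands for can be sketched as below. This is a generic float to 16-bit unorm conversion, assuming clamping to [0, 1] and round-to-nearest; the exact rounding and NaN behaviour of the hardware instruction is not stated here, and the function name is hypothetical.

```c
#include <stdint.h>

/* Generic float -> 16-bit unorm conversion (a sketch, not the hw
 * definition): clamp to [0, 1], scale by 65535 and round to nearest.
 * NaN is mapped to 0 here, which is an assumption. */
static inline uint16_t demo_float_to_unorm16(float x)
{
   if (!(x > 0.0f))   /* catches negatives, zero and NaN */
      return 0;
   if (x >= 1.0f)
      return 65535;
   return (uint16_t)(x * 65535.0f + 0.5f);
}
```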
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25726>
This is loosely based on PAL. This seems to fix 3D PRT support with
RADV on Polaris10. THIN means the tile is a 2D slice. THICK means the
tile is a 3D box.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26260>
No need to handle f2f16 specially for OpenGL, and we can vectorize
f2f16 when using ACO.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25990>
Fix compilation with ACO of the NIR code created by si_compute_blit.
It uses two 16-bit vector ops:
con 16x2 %11 = u2u16 %10.xy
con 16x2 %25 = f2f16 %22.xy
which are not supported by ACO yet.
PS: ACO now supports vec2 f2f16.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25990>
The code in `emit_load_var` that will attempt to read indirect inputs
expects the entire array of inputs to be there. Additionally the code
that populates `bld->inputs_array` will populate the array using the count
of `inputs_read`, without ensuring the inputs it copies are the ones read.
This change populates `bld->inputs_array` with the entire contents of
`bld->inputs`, so indirect reads will always match up.
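A simplified model of the mismatch, written as standalone C rather than the actual gallivm code: all names (`demo_copy_by_count`, `demo_copy_all`, `DEMO_NUM_SLOTS`) and the slot values are hypothetical. Copying only as many entries as there are bits set in `inputs_read` fills the first slots of the array, which are not necessarily the slots that were read.

```c
#include <stdint.h>

#define DEMO_NUM_SLOTS 8
#define DEMO_UNSET (-1)

/* Old behaviour (modelled): copy "count of inputs_read" entries from
 * the start of the array, regardless of which slots the mask names. */
static inline void demo_copy_by_count(int *dst, const int *src,
                                      uint32_t inputs_read)
{
   unsigned count = 0;
   for (uint32_t m = inputs_read; m; m &= m - 1)
      count++;
   for (unsigned i = 0; i < DEMO_NUM_SLOTS; i++)
      dst[i] = DEMO_UNSET;
   for (unsigned i = 0; i < count && i < DEMO_NUM_SLOTS; i++)
      dst[i] = src[i];
}

/* New behaviour (modelled): copy every slot, so any index works. */
static inline void demo_copy_all(int *dst, const int *src)
{
   for (unsigned i = 0; i < DEMO_NUM_SLOTS; i++)
      dst[i] = src[i];
}

/* Reads slot `index` after populating the array one way or the other.
 * The shader is assumed to read slots 1 and 3 only. */
static inline int demo_indirect_read(int use_full_copy, unsigned index)
{
   const int src[DEMO_NUM_SLOTS] = {10, 11, 12, 13, 14, 15, 16, 17};
   const uint32_t inputs_read = (1u << 1) | (1u << 3);
   int dst[DEMO_NUM_SLOTS];

   if (use_full_copy)
      demo_copy_all(dst, src);
   else
      demo_copy_by_count(dst, src, inputs_read);
   return dst[index % DEMO_NUM_SLOTS];
}
```

In this model an indirect read of slot 3 only sees valid data with the full copy.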
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26153>
For Wayland wsi allocations, v3dv used the wl_drm protocol, which is now
being phased out in favor of dmabuf feedback.
wl_drm is used to figure out the display device (in v3dv assumed to be
vc4) and then to authenticate with the Wayland compositor in order to
allocate scanout-able buffers (in this case, dumb buffers) directly at
the display device.
Recent commit 88c03ddd34 changed the behavior of the wsi code, and
wl_drm is now passing the render device instead, which broke Wayland
wsi.
It turns out that the authentication code is not really needed and since
we would like to remove wl_drm usage and the master device is assumed to
be vc4 anyway, we can just remove some unneeded device-specific wsi code
and get Vulkan Wayland wsi back to work.
Fixes: 88c03ddd34 ("egl/drm: get compatible render-only device fd for kms-only device")
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26200>
The SDMA IP is independent from the GFX IP, so it is technically
wrong to program it based on the GFX level.
This patch changes the RADV SDMA code to use SDMA IP versions
where possible.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26110>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26110>
The SDMA IP is independent from the GFX IP, so it is technically
wrong to program it based on the GFX level.
This patch adds a new enum for SDMA IP version and uses that
to determine functionality such as compression and sparse
support.
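A minimal sketch of the idea, assuming a version enum encoded as (major << 8) | minor. The enum entries, helper names and the version thresholds used below are illustrative assumptions, not necessarily what the driver uses.

```c
#include <stdbool.h>

/* Hypothetical encoding: major version in the high byte. */
#define DEMO_SDMA_VERSION_VALUE(major, minor) (((major) << 8) | (minor))

enum demo_sdma_version {
   DEMO_SDMA_UNKNOWN = 0,
   DEMO_SDMA_2_4 = DEMO_SDMA_VERSION_VALUE(2, 4),
   DEMO_SDMA_4_0 = DEMO_SDMA_VERSION_VALUE(4, 0),
   DEMO_SDMA_5_0 = DEMO_SDMA_VERSION_VALUE(5, 0),
};

/* Feature checks keyed on the SDMA IP version instead of the GFX
 * level. The thresholds are assumptions for illustration only. */
static inline bool demo_sdma_supports_sparse(enum demo_sdma_version v)
{
   return v >= DEMO_SDMA_4_0;
}

static inline bool demo_sdma_supports_compression(enum demo_sdma_version v)
{
   return v >= DEMO_SDMA_5_0;
}
```

Keying the checks on the SDMA version keeps them correct even for chips whose GFX and SDMA IP generations do not line up.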
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26110>
The vast majority of AMD GPUs (except the very first GCN) have
the same SDMA packet format, so let's just call it SDMA instead
of CIK_SDMA.
(And leave the oldest GPUs with SI_SDMA as they are now.)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26110>
This naming is more accurate and closer to the HW.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26110>
Meson produces version scripts with an empty global node for disabled
drivers. This is reported as a syntax error by the linker.
The root cause of the problem is that the version scripts are
accumulated in the `pipe_loader_link_args` variable, which is declared
outside the foreach loop, although each script should be used only once,
for its driver-specific loader library.
Fixes build errors when some of the drivers are disabled, like on arm64,
which disables i915 due to missing dependencies.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10166
Fixes: 667de678a0 ("gallium: Fix undefined symbols in version scripts")
Signed-off-by: Janne Grunau <j@jannau.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26268>
_eglFindDevice() will fail if it's not provided a render node:
the EGLDevice list only contains one entry per render node, plus
the special software device. Passing a primary node for a
display-only device will not work.
Signed-off-by: Simon Ser <contact@emersion.fr>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10142
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Leandro Ribeiro <leandro.ribeiro@collabora.com>
Tested-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Alejandro Piñeiro <apinheiro@igalia.com>
Fixes: 2be404f557 ("egl: error out if we can't find an EGLDevice in _eglFindDevice()")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26205>
dri2_setup_device() will depend on the extensions being set up in
the next commit.
None of the code in-between depends on disp->Device AFAIU.
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Leandro Ribeiro <leandro.ribeiro@collabora.com>
Tested-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Alejandro Piñeiro <apinheiro@igalia.com>
Backport-to: 23.3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26205>
Extract the logic responsible for populating disp->Device via
_eglFindDevice(). This isn't much for now but will grow in a
following commit.
No functional changes.
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Leandro Ribeiro <leandro.ribeiro@collabora.com>
Tested-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Alejandro Piñeiro <apinheiro@igalia.com>
Backport-to: 23.3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26205>
DRM_XE_EXEC_QUEUE_SET_PROPERTY is only the ioctl offset,
while DRM_IOCTL_XE_EXEC_QUEUE_SET_PROPERTY is the actual ioctl request number.
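The distinction can be sketched with a made-up driver ioctl. DRM driver ioctls are built from the 'd' ioctl type and a command number of DRM_COMMAND_BASE (0x40) plus a per-driver offset; everything prefixed DEMO_ below, including the offset value and the argument struct, is hypothetical and only mirrors that pattern. Linux-only, since it relies on `_IOW` from `<sys/ioctl.h>`.

```c
#include <sys/ioctl.h>

#define DEMO_DRM_IOCTL_BASE 'd'    /* ioctl type used by DRM */
#define DEMO_DRM_COMMAND_BASE 0x40 /* start of driver-specific commands */

/* Hypothetical per-driver offset, the analogue of DRM_XE_*. */
#define DEMO_SET_PROPERTY 0x08

/* Hypothetical ioctl argument. */
struct demo_set_property {
   unsigned long long value;
};

/* Full request number, the analogue of DRM_IOCTL_XE_*. This is the
 * value that has to be passed to ioctl(), not the bare offset. */
#define DEMO_IOCTL_SET_PROPERTY                                     \
   _IOW(DEMO_DRM_IOCTL_BASE,                                        \
        DEMO_DRM_COMMAND_BASE + DEMO_SET_PROPERTY,                  \
        struct demo_set_property)

static inline unsigned long demo_request_number(void)
{
   return (unsigned long)DEMO_IOCTL_SET_PROPERTY;
}
```

The offset only ends up in the low byte of the request number; the direction, size and type fields make the two values different.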
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26253>
This adds barycentric-related bits and various others.
We still need to figure out the bits between 640..672 and 800..1024, and check
some "reserved" bits (especially around the GS passthrough bit).
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26224>
We can probably do slightly better than this if we take advantage of the
predicate destination in SHFL, but not by much. All of the insanity is
still required (nvidia basically emits this); we just might be able to
save ourselves a few comparison ops.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26264>
Nothing should currently hit this path.
The next commit adds code to nir_pack_bits and nir_unpack_bits that can
lead to this path being hit.
v2: Change nir_u2uN(..., 8) to nir_u2u8(...). Suggested by Alyssa.
v3: Don't generate nir_extract_u8 if the driver has set
lower_extract_byte. These instructions were causing some problems for
dozen.
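The v3 choice can be modelled in C as picking between two equivalent ways of getting one byte out of a 32-bit value. This is a sketch of the semantics only; the helper names are hypothetical and the real code emits NIR instructions rather than computing values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models an extract_u8-style op: zero-extended byte `index` of the
 * source, at the source's bit size. */
static inline uint32_t demo_extract_u8(uint32_t value, unsigned index)
{
   return (value >> (8 * (index & 3))) & 0xffu;
}

/* Models the fallback: a plain shift followed by a narrowing
 * conversion (u2u8-style), with no byte-extract instruction. */
static inline uint8_t demo_shift_and_narrow(uint32_t value, unsigned index)
{
   return (uint8_t)(value >> (8 * (index & 3)));
}

/* Picks the path depending on whether the driver asked for byte
 * extracts to be lowered. Both paths yield the same byte. */
static inline uint8_t demo_unpack_byte(uint32_t value, unsigned index,
                                       bool lower_extract_byte)
{
   if (lower_extract_byte)
      return demo_shift_and_narrow(value, index);
   return (uint8_t)demo_extract_u8(value, index);
}
```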
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> [v2]
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> [v2]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24741>
It should not be possible for this to happen now as the nir_pack_32_4x8
instruction that is being lowered shouldn't exist. A later commit will
change this.
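For context, the operation being lowered combines four bytes into one 32-bit value. A plain C model of that semantics is shown below; the function name is hypothetical and the first component is assumed to land in the least significant byte.

```c
#include <stdint.h>

/* Models a pack_32_4x8-style op: four 8-bit components packed into a
 * 32-bit value, component 0 in the least significant byte (assumed). */
static inline uint32_t demo_pack_32_4x8(uint8_t b0, uint8_t b1,
                                        uint8_t b2, uint8_t b3)
{
   return (uint32_t)b0 |
          ((uint32_t)b1 << 8) |
          ((uint32_t)b2 << 16) |
          ((uint32_t)b3 << 24);
}
```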
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24741>