It's ill-defined at best since it doesn't even initialize the
vk_object_base and its only use was NVK and that use is now gone.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35624>
This struct gathers up all the sampler state from VkSamplerCreateInfo
and its pNext chain into a single struct. The struct is has no pointers
and has uses -Wpadded to ensure no holes. This means it's hashable and
mem-comparable. We also make give vk_ycbcr_conversion_state -Wpadded
because vk_sampler_state has a copy of vk_ycbcr_conversion_state
embedded in it.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35624>
This removes a very old hack which will also allow us to enable DCC
for multiplanar formats eventually and to reduce the combined
image+sampler descriptor size from 96 to 48 on RDNA3+.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35457>
This will allow us to remove the ycbcr hack and to reduce the number
of bytes written for combined image+sampler from 80 to 64.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35457>
This isn't done in validate_cfg() because that's called less frequently.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35202>
We should update linear_preds so that the predecessors we can remove are
actually removed.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35202>
It might be the case that both the branch and exec mask write in a
divergent branch block are removed. try_remove_simple_block() might then
try to remove it, but fail because it has multiple logical successors.
Instead, just skip these blocks.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.1
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35202>
ACO modifies the NIR, most importantly it updates divergence analysis.
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32990>
The important change here is that we're no longer using state to
determine whether or not we're multiview. We just assume that layer_id
is zero when in multiview (this appears to be the case when SET_RT_LAYER
is set to CONTROL_V_SELECTS_LAYER) and that view_index is zero when not
in multiview (this is required by Vulkan) and add the two together.
This fixes a long-standing bug where multiview input attachments didn't
work properly with shader objects.
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35551>
This hoists all the annoyance of figuring out the current pixel's input
attachment coordinates to the driver. The pass still deals with all the
annoyance of turning an image instruciton into a texture instruction but
it gives the driver more control over the position. For most drivers,
this will be something like ivec3(int(gl_FragCoord.xy), gl_Layer) or
similar, some drivers need something more nuanced. Turnip, for
instance, needs unscaled coordinates for some attachments and NVK
doesn't really want gl_Layer or gl_ViewIndex for the layer. It's better
to just have a new system value that drivers can make what they want.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35551>
The SPIR-V spec is pretty clear that coordinates on subpass attachments
are relative to the current pixel. They're required to be zero but we
should stay consistent with ourselves (we already do this for image
intrinsics) and with the spec.
Fixes: 84b08971fb ("nir/lower_input_attachments: lower nir_texop_fragment_{mask}_fetch")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35551>
There's nothing in NIR which guarantees that the deref is the first
source or that the coordinate is the second. Use
nir_tex_instr_src_index() to get the actual indices.
Fixes: 84b08971fb ("nir/lower_input_attachments: lower nir_texop_fragment_{mask}_fetch")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35551>
Turnip when cross-compiled for i386 needs to be built with SSE2 as a
minimum spec, as it uses clflush unconditionally. Make sure to pass in
the sse2_args, which will be empty on Arm64 targets.
Fixes: 7231eef630
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35621>
Given that the intrinsic will be CSEed at the NIR level, we don't need to
preemptively set it up at the top of the shader. No change in HSW shader-db.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
Unless I've seriously missed something, we have the Z in the payload
(which we can always request if we need access to it and it's not already
passed to us due other WM IZ settings).
total instructions in shared programs: 4408303 -> 4408186 (<.01%)
instructions in affected programs: 1164 -> 1047 (-10.05%)
total cycles in shared programs: 142485036 -> 142484566 (<.01%)
cycles in affected programs: 26820 -> 26350 (-1.75%)
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
NIR catches that if you're just doing something like adding two smooth
inputs, we can do the multiply once on the result instead of on each
input. BRW shader-db results:
total instructions in shared programs: 4409146 -> 4408303 (-0.02%)
instructions in affected programs: 800761 -> 799918 (-0.11%)
total cycles in shared programs: 143203198 -> 142485036 (-0.50%)
cycles in affected programs: 79081682 -> 78363520 (-0.91%)
total sends in shared programs: 363044 -> 363042 (<.01%)
sends in affected programs: 33 -> 31 (-6.06%)
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
This moves some conversions to NIR that may get eliminated, and also
distinguishes gl_FragCoord.z/w loads at the shader info level so we don't
need to flag uses_src_depth/uses_src_w when only gl_FragCoord.xy get used
(as is typical). This reduces thread payload setup on many shaders.
Also, interestingly, blorp shaders stop reserving space for z/w despite
not putting them in the payload (since PS_EXTRA isn't filled out for z/w).
HSW shader-db is noise:
total instructions in shared programs: 9942649 -> 9942997 (<.01%)
instructions in affected programs: 143167 -> 143515 (0.24%)
total cycles in shared programs: 314768862 -> 314299112 (-0.15%)
cycles in affected programs: 62951452 -> 62481702 (-0.75%)
LOST: 44
GAINED: 26
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
This will be used for representing gl_FragCoord in NIR and reducing
payload registers pushed.
HSW results:
total instructions in shared programs: 9940636 -> 9948574 (0.08%)
instructions in affected programs: 852560 -> 860498 (0.93%)
total cycles in shared programs: 314804525 -> 314900080 (0.03%)
cycles in affected programs: 39786599 -> 39882154 (0.24%)
LOST: 5
GAINED: 11
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
This was leftover dead code from 4bb6e6817e ("intel: Use a system value
for gl_FragCoord") -- the sysval doesn't do any interpolation and doesn't
have sources that could use a barycentric.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
This generates one extra instruction to set the rounding mode to RTE due
to f2f16_rtne in the lowering. This changes the result for
fquantize2f16(65505.0) from 65536 to 65504, which fixes SPIR-V
conformance for this value:
If Value is positive with a magnitude too large to represent as a
16-bit floating-point value, the result is positive infinity. If Value
is negative with a magnitude too large to represent as a 16-bit
floating-point value, the result is negative infinity.
SPIR-V doesn't specify whether this overflow check is before or after
rounding, but IEEE specifies rounding first, which is what produces our
65504.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25552>
There's no need for a per driver HMI implementation since the
vk_icdGetInstanceProcAddr implementation can well populate the required
entrypoints for Android icd.
Changes have to be done in this single commit for simplicity. Otherwise,
I would have to create a separate android shared library in the runtime
like how vk_instance is handled today, so that the target is able to
check per driver enablement def. However, after all drivers have
migrated over within this MR, we still have to clean those up. So I
decided to just do those in a single commit instead.
v2: avoid preloading u_gralloc in vulkan hal open
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35561>
Drop explicit u_gralloc init since it will be initialized upon the
initial vk_android_get_ugralloc.
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35561>