These generator scripts use the `write` function that, unlike `print`,
doesn't print a trailing newline. So let's add one to the template.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35697>
The spec says (emphasis mine):
If the color attachment is fixed-point, the components of the source and
destination values **AND BLEND FACTORS** are each clamped to [0,1] or [-1,1]
respectively for an unsigned normalized or signed normalized color attachment
prior to evaluating the blend operations. If the color attachment is
floating-point, no clamping occurs.
However, neither the CTS nor any hardware implement this semantic.
For unsigned normalized formats, the definitions are roughly equivalent (except
perhaps around constant colours). 0 <= x <= 1 implies that 0 <= 1 - x <= 1.
Therefore if the source/destination colours are clamped to [0, 1], then their
complements are also in [0, 1], so clamping any blend factor (except constant
colour) has no effect if the source/dest were already clamped.
For signed normalized formats, however, this difference matters. -1 <= x <= 1
implies that 0 <= 1 - x <= 2... so to implement the spec text faithfully, we
would need to clamp again the complemented colour blend factors to return back
to signed normalized range. Software blending implementations can of course do
that... but doing so causes CTS fails, as the CTS reference renderer does not do
this.
This commit adjusts nir_lower_blend to match what actual hardware does, what CTS
requires, and what the spec should have said.
See https://gitlab.khronos.org/vulkan/vulkan/-/issues/4293 for the spec
resolution.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35519>
This adds a variant of nir_steal_tex_src() which is for derefs as well
as versions that just return the source without removing it.
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35623>
When dumping nir validation errors, flush stderr before
calling abort. Otherwise the errors might not be emitted.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35665>
e4m3fn: 8bit floating point format with 4bit exponent, 3bit mantissa
and no infinities (finite only)
e5m2: 8bit floating point format with 5bit exponent, 2bit mantissa
and with infinities.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>
We're not currently using RANGE_BASE and we'll use BASE for offset
optimizations on Xe2+.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
This hoists all the annoyance of figuring out the current pixel's input
attachment coordinates to the driver. The pass still deals with all the
annoyance of turning an image instruciton into a texture instruction but
it gives the driver more control over the position. For most drivers,
this will be something like ivec3(int(gl_FragCoord.xy), gl_Layer) or
similar, some drivers need something more nuanced. Turnip, for
instance, needs unscaled coordinates for some attachments and NVK
doesn't really want gl_Layer or gl_ViewIndex for the layer. It's better
to just have a new system value that drivers can make what they want.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35551>
The SPIR-V spec is pretty clear that coordinates on subpass attachments
are relative to the current pixel. They're required to be zero but we
should stay consistent with ourselves (we already do this for image
intrinsics) and with the spec.
Fixes: 84b08971fb ("nir/lower_input_attachments: lower nir_texop_fragment_{mask}_fetch")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35551>
There's nothing in NIR which guarantees that the deref is the first
source or that the coordinate is the second. Use
nir_tex_instr_src_index() to get the actual indices.
Fixes: 84b08971fb ("nir/lower_input_attachments: lower nir_texop_fragment_{mask}_fetch")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35551>
There are currently two places where we have to handle values that are
logically 64b: 64b atomics and 64b global addresses. For the former, we
ingest the values as 64b from NIR, while the latter uses 2x32b values.
This commit makes things more consistent by using 64b NIR values for
global addresses as well.
Of course, we could also go the other way around and use 2x32b values
everywhere, which would make things consistent as well. Given that ir3
doesn't actually have 64b registers, and 64b values are represented by
collected 2x32b registers, this could actually make more sense.
In the end, both methods are mostly equivalent and it probably doesn't
matter too much one way or the other. However, the reason I have a
slight preference for ingesting things as 64b is that it allows us to
use more of the generic NIR intrinsics, which use 1-component values for
64b addresses or atomic values. This commit already makes
global_atomic(_swap)_ir3 obsolete and I'm planning to create generic
intrinsics to support ldg.a/stg.a as well.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33797>
nir_unsigned_upper_bound is good enough that this isn't needed anymore.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35514>
If nir_opt_vectorize_io isn't called, 16-bit IO is broken.
This is a workaround to keep RADV working and consume incorrect NIR
while other drivers consume correct NIR.
Hopefully this will be removed ASAP.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315>
NIR represents low bits of 16-bit IO as a separate vec4, and high bits as
another separate vec4. There is no representation that allows vec8.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315>