Add the intrinsic range_base value to the image intrinsics and add
the option to store the image array offset into range_base instead
of adding it to the image array index if the driver requests it.
v2: Always initialize range_base
v3: fix for bindless intrinsics
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19980>
When querying capabilities or creating views using a scoped aspect
mask, we want to return the format for the correct single-channel
format, but when actually creating the resource (aspect mask 0),
we want to use the typeless format, since the single-channel formats
don't report multisampling support.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20614>
Only add SV_SampleIndex if there exists a sample-rate var that has either flat
interpolation or centroid (and therefore can't force sample rate implicitly),
unless there is also a sample-rate var that doesn't have those properties.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20614>
There's VK tests that have mismatching interpolation specifiers between FS
and the previous stage. For structs, that resulted in different types, which
breaks DXIL validation.
We could link the shaders and have that overwrite the interpolation field from
the previous shader, but we could also just not care and always use float.
I don't see any regressions from that.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20614>
As we've done for the AMD one, to prevent any codegen regression from switching
the Midgard lowering.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19350>
NIR has a fsin instruction that takes an argument in radians. Midgard instead
has an fsinpi argument that takes an argument in multiples of pi. So, we had a
NIR pass that would change fsin(x) to fsin(x / pi) and then map fsin to fsinpi
in the backend.
But that's invalid! In NIR, the opcode fsin is well-defined. fsin(x) means
something very different than fsin(x / pi). They won't usually be equal. The
transform fsin(x) -> fsin(x / pi) is fundamentally unsound.
It did work before, by accident. Most NIR passes don't care about the semantics
of ALU instructions. fsin(x) and fsin(x / pi) are both well-defined but
fundamentally different NIR shaders. So while rewriting is wrong -- the NIR we
get out is not equivalent to the NIR we put in, and the Midgard ops we generate
are not equivalent to the NIR -- but if we don't run any passes that care about
the definition of fsin the two wrongs will cancel out to make a right.
However, some NIR passes do care about the definitions of ALU instructions,
instead of treating them as named black boxes. In particular, constant folding
(nir_opt_constant_fold) evaluates ALU instructions when their inputs are
constants, according to the definition in nir_opcodes.py. So our little charade
will only work if we don't call nir_opt_constant_fold, or if all the fsin
instructions have non-constant inputs. At the beginning of this series, that is
the case. With the later scalarization change, that's no longer the case, and
the unsoundness translates to real failing tests rather than a quibble of NIR's
semantics.
To mitigate, we define a new NIR opcode with the semantics we want and translate
fsin(x) = fsin_mdg(x / pi), where that equivalence does hold mathematically. So
the new translation is sound and doesn't rely on lucky pass ordering.
This matches the approach already used for AMD and AGX, which have fsin_amd and
fsin_agx opcodes respectively.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19350>
We already do this in NIR since a04aa4bc08
and 3f97306b95 so there is no effect
for the mesa state tracker now that it can not emit TGSI any more.
This leaves only nine when RADEON_DEBUG=use_tgsi is set. D3D9 however
requires that sin and cos inputs already have the proper range.
This is super important when the nine shader uses relative adressing
and therefore needs all 256 constants we have. If we add our extra
constants for the fixup, we get over the limit and fail compilation.
v2: vertex shaders only
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18933>
This was already handled by apply_extract but missing from
can_apply_extract, therefore may not be properly applied everywhere.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17924>
Depths 24 and 30 happen to have uniform bpc but 16 does not. Pull the
real channel width out of the format description instead. This is still
a bit ignorant of channel order though.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20673>
This allows ACO to combine the addition into the store without checking
for wraparound.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20296>
Currently etnaviv disables TS on all MC1.0 GPUs, since the TS unit
doesn't properly take into account the linear window offset with
MC1.0, creating address aliases on MMUv1 that aren't properly dealt
with.
MMUv2 however doesn't have a linear window, so we can safely enable
TS on those GPUs.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20552>
Add a new environment varible
MESA_DISK_CACHE_READ_ONLY_FOZ_DBS_DYNAMICE_LIST that specifies a text
file containing a list of RO fossilize caches to load. The list file
is modifiable at runtime to allow for loading RO caches after
initialization unlike MESA_DISK_CACHE_READ_ONLY_FOZ_DBS.
The implementation spawns an updater thread that uses inotify to monitor
the list file for modifications, attempting to load new foz dbs added to
the list. Removing files from the list will not evict a loaded cache.
MESA_DISK_CACHE_READ_ONLY_FOZ_DBS_DYNAMIC_LIST takes an absolute path.
The file must exist at initialization for updating to occur.
File names of foz dbs in the list file are new-line separated and take
relative paths to the default cache directory like
MESA_DISK_CACHE_READ_ONLY_FOZ_DBS.
The maximum number of RO foz dbs is kept to 8 and is shared between
MESA_DISK_CACHE_READ_ONLY_FOZ_DBS_DYNAMIC_LIST and
MESA_DISK_CACHE_READ_ONLY_FOZ_DBS.
The intended use case for this feature is to allow prebuilt caches
to be downloaded and loaded asynchronously during app runtime.
Prebuilt caches be large (several GB) and depending on network
conditions would otherwise present extended wait time for caches
to be availible before app launch.
This will be used in Chrome OS.
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19328>