The immediate addition can easily be handled by nir_opt_offsets, which
will also take any driver limits into account.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
In ir3, SSBO offsets are in units of the accessed type size so we want
to start using the new offset_shift index.
Even though the shift is implicit for the ir3 intrinsics, we use
nir_intrinsic_copy_const_indices when creating them so we need to make
sure our indices match the ones used by the generic intrinsics.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
For intrinsics supporting offset_shift, dealing with their offset is a
bit tricky as we cannot simply add a byte offset to it anymore (which is
what most passes want to do). This commit adds some helpers to add byte
offsets (and adjusting offset_shift accordingly) so that individual
passes don't have to worry about this.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
For load/store intrinsics that take an offset, this specifies the amount
the offset is shifted left to calculate the final offset:
offset = (offset_src + base) << offset_shift
This is useful for backends that have memory operations that use offset
units other than bytes (i.e., where the shift is implicit).
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
Values are taken from minStorageBufferOffsetAlignment and
minUniformBufferOffsetAlignment.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
Predicate registers can be written from the scalar ALU by using a
special cat2 encoding: if the dst is encoded as a0.c, the instruction
will execute on the scalar ALU and write to p0.c.
This commit follows the blob and disassembles scalar predicates as
up0.c. The "u" presumably stands for "uniform".
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614>
Predicate registers can be written from the scalar ALU by using a
special cat2 encoding: if the dst is encoded as a0.c, the instruction
will execute on the scalar ALU and write to p0.c.
This commit makes the ir3 backend aware of scalar predicates. A new
register flag (IR3_REG_UNIFORM) is added that can be used to mark
predicate dsts as being written by the scalar ALU. For such dsts, the
same synchronization rules apply as for shared registers written by the
scalar ALU (e.g., (ss) is needed to read them from the vector ALU).
Scalar predicates can be used in the early preamble, which makes control
flow available there.
In many ways, the backend treats IR3_REG_UNIFORM the same as
IR3_REG_SHARED. A new flag was added because IR3_REG_SHARED is mainly
used to denote a separate register file, not as a flag to indicate usage
by the scalar ALU. Scalar predicates still use the normal predicate
register file but allow it to be written from the scalar ALU.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614>
New type was not handled in the switch which lead to hitting following
assert when running tests with pipeline cache:
deqp-vk: ../src/compiler/glsl_types.c:3334: decode_type_from_blob: Assertion `!"Cannot decode type!"' failed.
Fixes: 9e5d7eb88d ("compiler/types: add a bfloat16 type")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36833>
[WHY]
It was found that the caller may call with stream_count = 0, while
streams array is some garbage.
it randomly ends up output_ctx being modified and leading to validation
failure.
[HOW]
Add checking to the stream_count.
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Roy Chan <Roy.Chan@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>
[WHY]
For further debugging need to know about the build cmd variables.
[HOW]
Added these input and output paramaters to vpe events.
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Muhammad Ansari <Muhammad.Ansari@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>
Various small optimizations that have been accumulating, deal with them
in one commit:
- Add erase functionality for vector util, remove memsets for time opt.
- Update should_gen_cmd_info to take in any stream variables.
- Program funcs should directly program - update mpcc mux hook func to
take in blend_mode.
- Add reserved bits for debug flags.
Signed-off-by: Brendan Steven, Leder <BrendanSteven.Leder@amd.com>
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>
Calculation for the worst case scenario in bufs_req should also include
predication command size.
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Andrzei Okenczyc <Andrzej.Okenczyc@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>
../src/gallium/frontends/va/config.c(574): error C2059: syntax error: '}'
MSVC 2019 doesn't support for it yet
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36843>
Usually JIP will be valid, but as part of other changes, it will be
possible to have a shader that have multiple EOT messages and end with
and ENDIF instruction. Its JIP will point after the program ends.
This is fine but was tripping up the compaction code.
Change compaction to not read its internal structures beyond the last
instruction.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36822>
BITFIELD_MASK() returns a 32-bit unsigned integer, and Clang complains
if we assign it to a 16-bit unsigned integer without a cast. Let's add
that cast.
While we're at it, add an assert() to make it clear to the compiler that
the condition in BITFIELD_MASK() can be optimized away.
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36606>
BITFIELD_MASK() returns a 32-bit unsigned integer, and Clang complains
if we assign it to a 16-bit unsigned integer without a cast. Let's add
that cast.
While we're at it, add an assert() to make it clear to the compiler that
the condition in BITFIELD_MASK() can be optimized away.
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36606>
The enum pan_earlyzs is just enum mali_pixel_kill under a different
name, which was needed because the enum was missing from common.xml.
However, because pan_earlyzs_lut is used in files that are both included
with PAN_ARCH unset and set to values including values lower than 6, we
get issues with the way genxml/common_pack.h gets included, resulting in
the enum not being defined.
We don't really depend on the values for this, only on the size. So
let's just use unsigned values in the struct instead, to side-step the
issue.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36606>
While this was also using translate_zs_format() before the commit in
question, that's didn't lead to any real issues, because only a single
value was legal here before. While it's not entirely in-spec to use
other values, it seems the HW doesn't mind.
But when this logic was reworked, the typed field was used instead. This
lead to a compiler warning on Clang.
Let's correct this properly here, rather than papering over the compiler
warning.
Fixes: 7a763bb0a3 ("pan/genxml: Rework the RT/ZS emission logic")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36606>
To generate a nir_component_mask_t, we should use nir_component_mask,
not BITFIELD_MASK()...
But we're also generating the same mask twice here, so let's just
store that to a variable and reuse the mask when shifting it while we're
at it.
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36606>
Currently LRZ writes are disabled when depth writes are enabled but the
fragment shader is using discard. Additionally, LRZ writes should be
disabled when fragment shader is outputting sample coverage or the pipeline
state is enabling alpha-to-coverage which behaves as a discard.
This fixes rendering problems on Assetto Corsa. Conditions now used for
disabling LRZ writes match one set of conditions under which the
EARLY_Z_LATE_Z z-test mode is used. It was assumed that in that mode the
LRZ writes in binning will not happen until the late-Z phase, but that's
apparently not the case.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36848>