This reverts commit 9e67d3f237.
We do not, in fact, implement texture barriers. Texture barriers are
supposed to allow non-overlapping rendering feedback loops. We cannot
support that at non-tile boundaries when texture compression is enabled
without some kind of downgrade path or other special handling.
Fixes Emacs corruption on X/Glamor.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>
Switch our frontends from generating sample_mask_agx to discard_agx, and
switching from legalizing sample_mask_agx to lowering discard_agx to
sample_mask_agx. This is a much easier problem and is done here in a way that is
simple (and inefficient) but obviously correct.
This should fix corruption in Darwinia.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>
We discovered today that these (probably) trigger depth/stencil testing, which
has significant implications for the correct/performant use.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>
sample_mask_agx corresponds directly to the hardware's 2-source instruction, but
it's hard to use correctly and even harder to legalize after the fact, since
it's responsible for not only discard but also late depth/stencil testing. For
our various high-level lowering passes, it's easier to use a one-source discard
(where we don't have to worry about sample masks), which the compiler will
internally lower to the two-source instruction. Introduce such an instruction.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23832>
The preprocess buffer is the buffer used to generate the cmdbuf. It
was aligned to 256 bytes but the correct alignment is actually
ac_gpu_info::ib_alignment.
Otherwise, if a DGC IB is executed like a IB1, this hits an assertion
in radv_amdgpu_cs_submit() because the alignment is incorrect.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23764>
Somewhat counterintuitively, dynamic_state.set contains the bits that
have been loaded from static state, i.e. those that are _not_ dynamic.
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23590>
Issue: H.264 high 10 profile is currently not supported, but is shown as
supported in vainfo.
Reason: Kernel reported capabilities for video encoder/decode doesn't
consider the actual profile (only using reduced profile).
Solution: Use kernel reported capabilities only for basic H.264/HEVC
profiles. Other profiles (e.g. 10 bits) should be checked based on HW.
Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9242
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23824>
Backends never see these instructions.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Suggested-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23831>
Cleanup of various fields with redundant information.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23597>
With this change, ACO is going to rely on the caller to set
the HW stage and will no longer guess it from the input shaders.
This will help enable compiling merged shaders separately,
but that will need further changes in instruction selection.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23597>
ACO will rely on this field instead of guessing
the stage internally.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23597>
ACO will rely on this field instead of guessing
the stage internally.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23597>
Unused in this commit, but this is going to replace the shader
stage selection inside ACO after the drivers set it correctly.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23597>
The new ac_hw_stage is going to be used by drivers as well.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23597>
This is going to be shared between RADV, RadeonSI and ACO.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23597>
The introduction of a workaround adding lots of MI_NOOPs broke our
computation.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b9aa66d5d0 ("anv: disable preemption for 3DPRIMITIVE during streamout")
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23792>
gpu_id is not decodable from chip_id in general case,
so we should use chip_id to search for fd_dev_info and get
GPU generation from that.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23828>
gpu_id is obsolete, chip_id doesn't encode the GPU generation.
Thus we have to manually specify the GPU gen instead of inferring it.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23828>
Header from freedreno/common is used without linking with its
implementation. It worked before because all called functions
were header only, which would change soon.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23828>
Shaders are the largest thing we hash now, so they benefit from a faster
hash.
Change the field name from `sha1` to `hash` to avoid tying the definition
to a particular algorithm. This doubles down as a precaution against
callers still assuming a 20-byte hash (in which case the compilation will
error out).
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22571>
If we consider clearing the group flag of a vec4 register that is used as
source for some instruction we have to take into account that the parent
of the register element may also be part of a group in the parent instruction.
In this case we must not clear the group flag.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9118
Fixes: f3415cb26a (r600/sfn: copy propagate register load chains)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23813>
Cuts down serialized size from 2850288 to 1377780 bytes.
Reduces clinfo with Rusticl time by 40% for debug builds.
(Old data, but the point stands)
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15996>
Some drivers (llvmpipe) postpone some screen initialization until the
first context is created.
Signed-off-by: Karol Herbst <git@karolherbst.de>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15996>
- Fix null deref. VkPipelineLayoutCreateInfo::pSetLayouts is allowed to
contain VK_NULL_HANDLE.
- The loop 'break' was misplaced.
Fixes crash in
dEQP-VK.pipeline.pipeline_library.graphics_library.fast.0_00_11_11 after
VK_EXT_graphics_pipeline_library is enabled in a later patch.
Fixes: 91966f2eff ("venus: extend lifetime of push descriptor set layout")
Signed-off-by: Lina Versace <linyaa@google.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Dawn Han <dawnhan@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23810>
Now that Intel and R600 both do their own modifier propagation, the only
backends that still lower modifiers in NIR are:
* nir-to-tgsi
* lima
* etnaviv
* a2xx
The latter 3 backends do not support integers, and certainly do not support
fp64. So they don't use these.
TGSI in theory supports integer negate modifiers but NTT doesn't use them, so
they're unused there too.
Since they're unused, we remove NIR support for integer and 64-bit modifiers,
leaving only 16/32-bit float modifiers. This will reduce the scope needed for a
replacement to NIR modifiers, being pursued in !23089.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23782>
Venus sync feedback design relies on concurrent host device resource
access. To avoid device flush overwriting host writes, we must
suballocate the slots with a minimum size of the buffer alignment.
Cc: mesa-stable
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23633>
VE_COND_MUX_GTE4 is a nice match for the TGSI CMP opcode, however
there is a big limitation due to the general shortcoming of the
vertex shader engine that any instruction can read only two different
temporary registers. So we still have to lower in some cases.
Shader-db RV530:
total instructions in shared programs: 130872 -> 130333 (-0.41%)
instructions in affected programs: 29854 -> 29315 (-1.81%)
helped: 294
HURT: 83
total temps in shared programs: 16747 -> 16775 (0.17%)
temps in affected programs: 407 -> 435 (6.88%)
helped: 10
HURT: 38
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23691>
I have been running ci stress tests during the last few days
and nights and this is what I needed to get a pass rate > 80%.
There are still many flakes but I think this is a good starting
point to make better use of the ci.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23797>