Extend force_fb_preload to take an optional VkRenderingInfo. When it is
non-NULL, this is the unaligned preload and force_fb_preload should
clear attachments.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31895>
Corrects regression caused by prior commit that created memory
overwrite by not mallocing enough space for filename string.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32013>
The a630-gles-asan has a significant fraction, that's a trade-off for the
pre-merge, but then we need a full test in the nightly run.
The a630-gles-asan-full job usually takes 40-50 minutes. Therefore, the 20
minutes timeout is increased to 1h. The parallel feature is not used because
the nightly run is, with the introduction of this job, using 4 of the 6
devices available.
Signed-off-by: Sergi Blanch Torne <sergi.blanch.torne@collabora.com>
Reviewed-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31713>
I used this for testing when adding r300 driconf support
and it was commited by mistake.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31745>
Also add support for the 0*NaN = NaN IEEE compliant multiply on R500.
All of this is disabled by default, but can be enabled with a
RADEON_DEBUG variable or alternativelly with a driconf tweak.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31745>
In the case of v1=p_extract(v1=p_extract(src, 0, 16, 1), 0, 32, 0).
When we apply extracts with sub-dword definitions, this will also
include v2b=p_extract(v2b=p_extract(src, 0, 8, 1), 0, 16, 0).
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
This only worked when "a" was 16-bit because a pattern above replaced the
shift.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31762>
It only supported scalar intrinsics because it was written before
nir_opt_vectorize_io existed. The introduction of nir_opt_vectorize_io
exposes this issue. The direct path has been tested. The indirect path
hasn't. That's fine because if we see a CLIP_DIST failure with indirect
in the future, this pass is likely the cause.
This is a prerequisite for enabling nir_opt_varyings for all gallium
drivers.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31994>
When we set the visibility stream with CP_SET_PSEUDO_REG, it does two
things (or only one of the two, with concurrent binning):
- Set the "pseudo register" used by CP_SET_BIN_DATA5_OFFSET, which in
turn is used when decoding the vis. streams.
- Set the VSC register used by the binning pass.
Preemption with skipsaverestore obliterates the second, but not the
first. This means that before running the binning pass, we have to
re-emit these registers. I *think* this is what the blob does on a7xx.
On a6xx, where the pseudo register doesn't exist, the blob seems to
re-emit the preamble every time we re-allocate the visibility streams,
but we don't support a6xx yet so we can defer making that decision.
Fixes supertuxkart under zink with preemption enabled in the kernel.
Fixes: 1d2b479a3b ("tu: Allow being preempted on a7xx")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31937>
We never took the time to actually test this, but it works fine.
Improves performance on Navi 10 in the following test cases:
Baldur's Gate 3 Vulkan: up to 10%
Witcher 3 D3D11: around 4%
Granite primitive stress test: 107%
FSR2 sample app: 57%
Notes:
NGG is still disabled on Navi 14.
Not tested on Navi 12.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31971>
Helps performance in Baldur's Gate 3 on Navi 10
when NGG culling is enabled.
Also fix the description of the RADV_PERFTEST=nggc env var.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31971>
This has the best fossil-db results across in a sweep from 0..15.
fossil-db results on Alderlake:
Instructions in all programs: 152849904 -> 152824116 (-0.0%)
SENDs in all programs: 7677830 -> 7677830 (+0.0%)
Loops in all programs: 48470 -> 48470 (+0.0%)
Cycles in all programs: 11988670382 -> 11987530942 (-0.0%)
Spills in all programs: 42863 -> 41777 (-2.5%)
Fills in all programs: 77114 -> 73044 (-5.3%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31990>
\[WHY\]
For color fill only case we see performance on Vpe1.1 are not doubled due
to CD are all 0, no odd CD
\[HOW\]
1. Dummy stream dst rect should be in the middle of target rect so the
two (dummy seg + bg only seg) are balanced, instead of target at upper
left corner which makes it imbalance
2. BG gap generation should consider more for collaboration mode
num_multiple
3. When pure bg case, skip dummy stream handling and go ahead do BG gap
generation
4. Update memory requirement for the new pure BG case flow to avoid run out of embedded buffer
4. Additional -- fix the random Collaborate data generation bug (benign)
\[TESTING\]
Vpelibtest app + nv12torgb case with debug flag bgcolorfill set on in
vpelibtestapp
Media player with/without bgcolorfillonly flag
Teams
Reviewed-by: Roy Chan <Roy.Chan@amd.com>
Reviewed-by: Navid Assadian <Navid.Assadian@amd.com>
Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com>
Signed-off-by: Tomson Chang <tomson.chang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31918>
\[WHY\]
1.Remove TODO comments that don't need action item
2.Delete the legacy command number check as it is now using a vector (i.e. without hard limit)
\[HOW\]
Remove TODO comments and delete the legacy command number check
Signed off by <tvisan@amd.com>
Reviewed-by: Roy Chan <Roy.Chan@amd.com>
Reviewed-by: Jesse Agate <Jesse.Agate@amd.com>
Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31918>