For Tiger Lake and onward, we generally don't need to ambiguate the CCS
before accessing it. This is safe for two reasons:
- Tiger Lake and onward treat all CCS values as legal.
- We enable compression on all writable image layouts. The CCS will
receive all writes and will therefore always be valid.
When dealing with modifiers, we continue to allow ambiguates in some
instances.
Before this patch, I found ~19.5k ambiguates in Wolfenstein:
Youngblood's Riverside benchmark (note that this includes manually
entering the benchmark and exiting the app). With this patch, the number
of ambiguates goes down to zero.
Improves performance of Fallout 4 at 1080p/High settings on Arc A380 by
around 22%.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20118>
Whether or not CCS can be used without initialization depends on the
platform:
- On gfx7-8, each CCS element is 1-bit and encodes "fast-cleared" or
"pass-through". So, those platforms have no illegal values.
- On gfx9-11, each CCS element is 2-bits and some bit combinations
are invalid.
- On gfx12+, each CCS element is 4-bits but they have no truly illegal
values. Unused encodings are interpreted as "pass-through".
Refer to the "MCS/CCS Buffers for Render Target(s)" sections of the
PRMs for more info.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20118>
When INTEL_DEBUG=pc is set and a CCS operation is being performed, the
driver reports that flushes are happing before and after the operation.
It also reports that the operation is a fast clear, but that's not
always the case. We could be resolving for example.
Reporting the specific operation can help avoid confusion.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20118>
Since we know we're loading this Mesa build, we know that no_error is
always supported (the renderer query always returned true).
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20069>
Now that we can access the pipe screen through the dri_screen, we can skip
some indirection.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20069>
Now that we know the driver on the other side is the same version of Mesa
as our build, we can just access the screen instead of having accessor
functions.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20069>
All DRI loaders in Mesa (EGL, GLX, gbm) now require this ext and that the
driver come from a matching build. This will let us use Mesa-internal
types and enums across the loader-driver bounary inside of Mesa.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Eric Engestrom <eric@igalia.com>
LOL-YESed-by: Kristian Høgsberg <krh@bitplanet.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20069>
It's better than EGL's copy of it by having optional ext support in the
match structs, and GLX wishes it had either of the two.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20069>
Return VA_STATUS_ERROR_UNSUPPORTED_PROFILE if given profile is not
supported for both decode and encode.
Return VA_STATUS_ERROR_UNSUPPORTED_ENTRYPOINT if given profile is
supported (for at lease one of decode or encode), but current given
entrypoint is not supported.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20082>
Fixes regressions on GFX6 and the RAGE2 workaround.
Fixes: a297ac10a4 ("radv,aco: stop lowering FS outputs in NIR")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20154>
This is a trivial program to accomplish allocation of local/common
store shared registers, used when no actual program is available or
required.
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20130>
Some fields are to be initialized to a specific non-zero value if
unused; this inline function takes care of that.
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20130>
This was a bad idea because:
- it diverges too much with the fragment shader epilog
- it doesn't allow to implement alpha-to-coverage via MRTZ correctly
- it was supposed to be used by LLVM but this never happened
Reverting this back allows us to fix alpha-to-coverage via MRTZ
on GFX11 easily, including for fragment shader epilogs.
fossils-db (NAVI21):
Totals from 20411 (15.13% of 134913) affected shaders:
VGPRs: 972056 -> 971400 (-0.07%); split: -0.08%, +0.01%
CodeSize: 92284804 -> 92295392 (+0.01%); split: -0.05%, +0.06%
MaxWaves: 465010 -> 465166 (+0.03%); split: +0.03%, -0.00%
Instrs: 17034162 -> 17034963 (+0.00%); split: -0.00%, +0.01%
Latency: 252013190 -> 251971764 (-0.02%); split: -0.03%, +0.02%
InvThroughput: 45859625 -> 45842556 (-0.04%); split: -0.04%, +0.01%
VClause: 324627 -> 324629 (+0.00%); split: -0.03%, +0.03%
SClause: 672918 -> 672826 (-0.01%); split: -0.05%, +0.04%
Copies: 1172126 -> 1158152 (-1.19%); split: -1.20%, +0.01%
Branches: 420602 -> 420604 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1025441 -> 1025481 (+0.00%)
PreVGPRs: 861787 -> 860650 (-0.13%); split: -0.17%, +0.03%
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20126>
16-bit isn't possible. Note that this is currently style broken for
compressed formats because the w channel is never written to.
Ported from RadeonSI ('radeonsi/gfx11: fix alpha-to-coverage with
stencil or samplemask export')
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20126>
nir_opt_if doesn't catch all the possible cases of empty then branches,
so resolve this on the fly when creating the backend IR.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20141>
If we access non-existing images Cayman hardware may lock up
and trigger a reset that is not always successful. Therefore,
make sure the images access is legal.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20141>
Don't add extra movs to construct the swizzles, but just split the
instruction into separate channels, if possible. Idea by Filip Gawin.
shader-db for RV370:
total instructions in shared programs: 84632 -> 83565 (-1.26%)
instructions in affected programs: 12613 -> 11546 (-8.46%)
helped: 295
HURT: 8
total temps in shared programs: 12437 -> 12237 (-1.61%)
temps in affected programs: 1807 -> 1607 (-11.07%)
helped: 153
HURT: 20
LOST: 1
GAINED: 19
The HURT instructions and the single lost shaders are some fluctuations
from pair scheduling. The number of instructions before pair scheduling
is always lower or equivalent.
Partial fix for: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6339
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Tested-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20009>
For any instruction that can be reasonably converted to alpha we check
all of its readers to see if the conversion is possible (including check
for at least one free alpha source) at the beginning of pair scheduling.
However, if the reader instruction has multiples sources that could be
converted to alpha and multiple indeed are, than we could run of of the
alpha sources eventually. So recheck just before converting that there
are still some unused sources left.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Tested-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20009>
Move the streamout code into the streamout-only branch. The code must be
guarded by si_shader_uses_streamout(). Using xfb_stride is not enough.
Fixes: 003cbddfee - radeonsi: use native shader info when init streamout args
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20147>
`dzn_instance_create` will call `dzn_instance_destroy` when the d3d12
library fails to load. Just like the issue in `d3d12_screen`, this will
lead to a crash because `d3d12_mod` is NULL.
To fix this, only close the library after if it was actually opened.
Cc: mesa-stable
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20145>