Align with vkGetBufferMemoryRequirements2 and utilize the cache for
retrieving memory requirements before trying the host call.
Fixes:
  dEQP-VK.api.invariance.memory_requirements_matching
  dEQP-VK.memory.requirements.create_info.buffer.regular
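A minimal sketch of the "cache first, host call second" ordering described
above. It is not the driver's actual code: struct mem_reqs, struct
mem_reqs_cache, host_get_buffer_memory_requirements() and the 256-byte
alignment are all hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for VkMemoryRequirements. */
struct mem_reqs {
   uint64_t size;
   uint64_t alignment;
   uint32_t memory_type_bits;
};

/* Hypothetical cache entry covering buffers up to max_size bytes. */
struct mem_reqs_cache {
   bool valid;
   uint64_t max_size;
   struct mem_reqs reqs;
};

/* Counts simulated round trips to the host. */
static unsigned host_calls;

/* Hypothetical slow path: pretend to ask the host. */
static struct mem_reqs
host_get_buffer_memory_requirements(uint64_t size)
{
   host_calls++;
   return (struct mem_reqs){
      .size = (size + 255) & ~255ull,
      .alignment = 256,
      .memory_type_bits = 0x3,
   };
}

static uint64_t
align_u64(uint64_t v, uint64_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0); /* power of two only */
   return (v + a - 1) & ~(a - 1);
}

/* Try the cache first and only fall back to the host on a miss. */
static struct mem_reqs
get_buffer_memory_requirements(const struct mem_reqs_cache *cache,
                               uint64_t size)
{
   if (cache->valid && size <= cache->max_size) {
      struct mem_reqs reqs = cache->reqs;
      reqs.size = align_u64(size, reqs.alignment);
      return reqs;
   }
   return host_get_buffer_memory_requirements(size);
}
```

Both query entry points would have to go through the same helper for the
invariance test to see matching results.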
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18603>
This mirrors the change we made for vega10 (6bbe3c6d3) in August...
Seems like the chances of a PASS are indeed slim, but possible.
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18590>
Panfrost doesn't expose LATC format support at all, so state-tracker
level RGTC support is sufficient to drop the fake RGTC flag on
Panfrost.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Tested-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18248>
This logic doesn't really do what it pretends to; we don't expose the
RGTC features unless we actually have RGTC support. This is about to
change, but for that logic to work, we need to be able to tell if we're
using a fallback-format or not, and we can't do that unless we keep the
format as RGTC.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Tested-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18248>
This file had a mixture of tabs and spaces for indent.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Tested-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18248>
When tests are already in the flakes list, it's useless to mark them
as expected failures.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18592>
If we get CPU access (such as a read) after an upload transfer, we need
to ensure that the host has handled the upload. Do this by stalling
when the buffer is mapped. (The previous commit ensures we don't try to
do a pointless upload for an already mapped buffer.)
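A rough sketch of the stall-on-map idea, under stated assumptions: struct
bo, bo_upload(), bo_map() and host_wait_idle() are hypothetical names, and
the "host" is simulated by a counter rather than a real fence wait.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer object; storage stands in for the real mapping. */
struct bo {
   void *map;               /* NULL until the buffer is mapped */
   bool has_upload_pending; /* set by an upload the host may not have seen */
   unsigned stalls;         /* number of simulated host waits */
   char storage[64];
};

/* Hypothetical: block until the host has processed pending uploads. */
static void
host_wait_idle(struct bo *bo)
{
   bo->stalls++;
   bo->has_upload_pending = false;
}

/* Hypothetical upload path: queues data without waiting on the host. */
static void
bo_upload(struct bo *bo)
{
   assert(bo != NULL);
   bo->has_upload_pending = true;
}

/* CPU access must observe earlier uploads, so stall before mapping. */
static void *
bo_map(struct bo *bo)
{
   assert(bo != NULL);
   if (bo->has_upload_pending)
      host_wait_idle(bo);
   if (!bo->map)
      bo->map = bo->storage;
   return bo->map;
}
```

The stall happens at most once per batch of uploads; mapping again without
a new upload in between does not wait.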
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18604>
The upload path is intended to avoid stalling on the host in order to
mmap recently allocated buffers. But if we already had to mmap the
buffer, there is no point in taking the upload path.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18604>
A transfer that only partially writes the staging buffer could overwrite
valid buffer contents, unless we are told that it is ok to discard the
entire range.
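To illustrate the hazard, here is a small sketch, assuming a hypothetical
MAP_DISCARD_RANGE flag and an 8-byte staging buffer. None of these names
come from the driver. The staging path is only taken when the caller says
the whole range may be discarded, because unmap copies the entire staging
buffer back.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAP_DISCARD_RANGE (1u << 0) /* hypothetical flag */

/* Hypothetical transfer covering an 8-byte range of the destination. */
struct transfer {
   char staging[8];
   bool uses_staging;
};

/* Only use the staging buffer if the old contents may be thrown away. */
static char *
transfer_map(struct transfer *t, char *dst, unsigned flags)
{
   assert(t != NULL && dst != NULL);
   t->uses_staging = (flags & MAP_DISCARD_RANGE) != 0;
   if (!t->uses_staging)
      return dst; /* direct map: a partial write keeps the other bytes */
   memset(t->staging, 0, sizeof(t->staging));
   return t->staging;
}

/* The staging path copies the whole range back, written or not. */
static void
transfer_unmap(struct transfer *t, char *dst)
{
   if (t->uses_staging)
      memcpy(dst, t->staging, sizeof(t->staging));
}
```

With the flag set, a one-byte write still replaces all eight bytes of the
destination, which is only acceptable because the caller allowed it.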
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18604>
Typically GLSL mediump lowering will have lowered all the ALU ops
generating the values to 16-bit, and once vars_to_ssa happens the mediump
temps disappear. However, if they don't disappear (for example, the var
gets indirected and eventually gets lowered to scratch or indirect
lowering), then you don't want the storage upconverted to 32-bit.
Also, if a CS shared var is declared mediump, then storing it as 16-bit
avoids conversions around the loads and stores, assuming the ALU ops
related to them are 16-bit. For gfxbench aztec ruins, the CS shared var sizes are
cut in half, improving overall perf by 0.805549% +/- 0.0953482% (n=6) on
gl-5-normal.
freedreno shader-db:
total instructions in shared programs: 2917577 -> 2917743 (<.01%)
instructions in affected programs: 46141 -> 46307 (0.36%)
total last-baryf in shared programs: 109712 -> 109492 (-0.20%)
last-baryf in affected programs: 638 -> 418 (-34.48%)
total full in shared programs: 190275 -> 190218 (-0.03%)
full in affected programs: 156 -> 99 (-36.54%)
total constlen in shared programs: 492596 -> 492600 (<.01%)
constlen in affected programs: 8 -> 12 (50.00%)
total cat6 in shared programs: 33019 -> 33107 (0.27%)
cat6 in affected programs: 3604 -> 3692 (2.44%)
total stp in shared programs: 3626 -> 3670 (1.21%)
stp in affected programs: 3336 -> 3380 (1.32%)
total ldp in shared programs: 1718 -> 1762 (2.56%)
ldp in affected programs: 1680 -> 1724 (2.62%)
(this is all in aztec ruins)
total sstall in shared programs: 195656 -> 195182 (-0.24%)
sstall in affected programs: 3249 -> 2775 (-14.59%)
total (ss) in shared programs: 52823 -> 52966 (0.27%)
(ss) in affected programs: 1733 -> 1876 (8.25%)
total systall in shared programs: 507928 -> 508687 (0.15%)
systall in affected programs: 103010 -> 103769 (0.74%)
total (sy) in shared programs: 23185 -> 23196 (0.05%)
(sy) in affected programs: 1276 -> 1287 (0.86%)
total waves in shared programs: 435290 -> 435302 (<.01%)
waves in affected programs: 12 -> 24 (100.00%)
total loops in shared programs: 407 -> 405 (-0.49%)
loops in affected programs: 9 -> 7 (-22.22%)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18452>
I don't know of any GPUs doing 16-bit atomic accesses, nor do I know of
anybody wanting that in shaders. But deqp has GLES CTS cases that set
mediump on shared variables, so just skip lowering for those vars.
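A sketch of the skip, assuming "those vars" means mediump variables that
are the target of atomic operations. struct var and lower_mediump_var()
are hypothetical and do not reflect the real NIR data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a shader variable. */
struct var {
   bool is_mediump;
   bool has_atomic_access; /* assumed to be precomputed by a scan pass */
   unsigned bit_size;
};

/* Narrow mediump storage to 16-bit unless atomics touch the variable. */
static bool
lower_mediump_var(struct var *v)
{
   assert(v != NULL);
   if (!v->is_mediump || v->has_atomic_access)
      return false; /* keep the original storage size */
   v->bit_size = 16;
   return true;
}
```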
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18452>
If every use was a conversion to 16-bit, then ir3_cf would fold it into the
bary instruction. But if something had generated a highp comparison of
the mediump input with a mediump op result, it would get stuck as highp,
even though we could have used 16-bit values without upconverting.
This fixes dEQP-GLES2.functional.shaders.algorithm.rgb_to_hsl_fragment on
ANGLE on turnip, closing #7043. fossil-db results are mixed:
fossil-db:
Totals from 697 (4.65% of 14988) affected shaders:
MaxWaves: 10712 -> 10736 (+0.22%)
Instrs: 82394 -> 83572 (+1.43%); split: -1.31%, +2.74%
CodeSize: 178280 -> 180118 (+1.03%); split: -0.46%, +1.49%
NOPs: 15887 -> 16067 (+1.13%); split: -7.48%, +8.61%
MOVs: 1297 -> 1328 (+2.39%); split: -6.86%, +9.25%
Full: 3730 -> 3842 (+3.00%); split: -1.80%, +4.80%
(ss): 1877 -> 1849 (-1.49%); split: -5.59%, +4.10%
(sy): 1249 -> 1255 (+0.48%); split: -1.04%, +1.52%
(ss)-stall: 6809 -> 6364 (-6.54%); split: -13.85%, +7.31%
(sy)-stall: 17059 -> 17257 (+1.16%); split: -6.51%, +7.67%
Cat0: 17220 -> 17400 (+1.05%); split: -6.90%, +7.94%
Cat1: 5307 -> 6366 (+19.95%); split: -6.93%, +26.89%
Cat2: 39138 -> 39101 (-0.09%); split: -0.31%, +0.22%
Cat3: 16772 -> 16741 (-0.18%)
Cat5: 1269 -> 1276 (+0.55%)
I tried to pick some apps to test that looked the most impacted, and
indeed the results are mixed:
cookie_run_kingdom: +0.275514% +/- 0.0883816% (n=68)
trex_200: +0.0943847% +/- 0.0297073% (n=1463)
command_and_conquer_rivals: no difference (n=131)
war_planet_online: no difference (n=120)
lego_legacy: -0.192131% +/- 0.152083% (n=99)
among_us: -0.625227% +/- 0.385419% (n=60)
Given that the perf results are small and go both ways, and apparently
we're an outlier in not always lowering mediump inputs to 16-bit, just do
it for consistency with other drivers.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18506>
This has already been done for gen7 platforms, so now extend it to all
platforms without has_64bit_int.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18577>